Commit Graph

117 Commits

Author SHA1 Message Date
Joseph Doherty 499c9b9165 feat(validation): allow GalaxyMxGateway under Equipment; rename Galaxy-tag FullName check 2026-06-12 21:11:06 -04:00
Joseph Doherty 57355405a6 chore(security): drop dead audit suppressions; patch OpenTelemetry + Tmds.DBus CVEs
All five suppressed advisories are now resolved at baseline/resolved versions,
so every NuGetAuditSuppress is removed repo-wide:
- System.Security.Cryptography.Xml (GHSA-37gx-xxp4-5rgx / GHSA-w3x6-4m5h-cxqf)
  -> fixed by the .NET 10 baseline (10.0.6)
- OPCFoundation Opc.Ua.Core (GHSA-h958-fxgg-g7w3) -> fixed at resolved 1.5.378.106

Two were still live and are now patched via direct security pins:
- OpenTelemetry.Api 1.9.0 -> 1.15.3 (GHSA-g94r-2vxg-569j) pinned in Cluster;
  Runtime/ControlPlane/AdminUI + tests inherit via project reference
- Tmds.DBus.Protocol 0.20.0 -> 0.21.3 (GHSA-xrw6-gwf8-vvr9) pinned in Client.UI

Also correct the Historian sidecar runtime comments (x86 -> x64, matching the
csproj PlatformTarget). Solution audit: 0 vulnerable packages; full build clean.
2026-06-12 09:03:42 -04:00
Joseph Doherty 9f13101896 feat(validation): require TagConfig.FullName on Galaxy alias tags; reframe Tag doc 2026-06-11 21:21:32 -04:00
Joseph Doherty 2ba2f8a899 feat(commons): TryParseRelayBody — detect pure ctx.GetTag relay scripts 2026-06-11 20:59:10 -04:00
Joseph Doherty 61b230d79a harden(historian): nullable HistorizeToAveva (missing→historize) for rolling-restart-safe deserialize + middle-link test 2026-06-11 13:00:57 -04:00
Joseph Doherty c20d228384 fix(historian): volatile _backoffIndex + read _evictedCount under lock (thread-safety) 2026-06-11 12:49:44 -04:00
Joseph Doherty 8012509584 feat(historian): honor per-alarm HistorizeToAveva opt-out at the durable write 2026-06-11 12:48:13 -04:00
Joseph Doherty 8ac3ac5be9 feat(alarms): carry AlarmTypeName + operator Comment on AlarmTransitionEvent (historian feed prep) 2026-06-11 11:03:00 -04:00
Joseph Doherty f9932f2d8e refactor(admin): use CorrelationId wrapper for alarm ack/shelve commands 2026-06-11 09:27:24 -04:00
Joseph Doherty 370a2b7b48 feat(alerts): AdminUI alarm ack/shelve via AdminOperationsActor singleton
T21: add an AdminUI path for acknowledging/shelving alarms that routes
through the admin-pinned AdminOperationsActor cluster singleton, which
republishes onto the same 'alarm-commands' DPS topic the OPC UA method
path (T18) and the engine subscriber (T19) use. The broadcast + the
ScriptedAlarmHostActor ownership filter handle cross-node routing, so
the singleton needs no knowledge of which node owns the alarm.

- Commons: AcknowledgeAlarmCommand/ShelveAlarmCommand (+ result records)
  and a shared AlarmCommandsTopic const; ScriptedAlarmHostActor now
  re-exports that const (mirrors the DriverControlTopic pattern).
- AdminOperationsActor: two handlers map the control-plane messages to
  AlarmCommand (Acknowledge / OneShotShelve / TimedShelve / Unshelve,
  threading User/Comment/UnshelveAtUtc) and publish via the DPS mediator.
- IAdminOperationsClient + AdminOperationsClient: typed Acknowledge/Shelve
  ask wrappers mirroring StartDeploymentAsync.
- Alerts.razor: per-row DriverOperator-gated Ack/Shelve/Unshelve controls;
  operator name from AuthenticationState. Timed-shelve datetime UI deferred.
- 5 TestKit tests (mediator-probe subscribed to alarm-commands) verifying
  each kind's mapping + reply; 56/56 ControlPlane tests green.
2026-06-11 06:44:27 -04:00
Joseph Doherty 1784eedd3f fix(opcua): exempt OnTimedUnshelve from the client AlarmAck gate (system-initiated)
The SDK fires OnTimedUnshelve with the node manager's system context (no
session, no user identity) when a TimedShelve duration expires. Routing
through the shared HandleAlarmCommand hit the AlarmAck gate and returned
BadUserAccessDenied, leaving the alarm permanently shelved.

Replace the delegated HandleAlarmCommand call with an inline lambda that
bypasses the client gate, extracts the AlarmId the same way, and routes an
Unshelve command so the engine clears its shelve state. The manual-client
Unshelve path via OnShelve(shelving:false) remains gated.

Update the AlarmCommandRouterTests OnTimedUnshelve test to use a real
system context (no UserIdentity) — reproducing the actual SDK invocation
path — and assert Good, AlarmId, Operation==Unshelve, User==empty.

Add a doc note to AlarmCommand.Operation that Enable/Disable are in the
vocabulary but not yet wired at the node-manager seam.
2026-06-11 06:16:30 -04:00
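The gate exemption described in the commit above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the project's C# code; `Caller`, `route_unshelve`, and the `system_initiated` flag are hypothetical names, and only the role name, the status codes, and the asserted command fields come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional

ALARM_ACK_ROLE = "AlarmAck"  # role name from the commit; the constant name is hypothetical


@dataclass(frozen=True)
class Caller:
    """Hypothetical stand-in for a session's user identity."""
    name: str
    roles: frozenset


def route_unshelve(alarm_id: str, caller: Optional[Caller], system_initiated: bool):
    """Return (status, command); command is None when the call is denied."""
    if system_initiated:
        # Timed-shelve expiry: no session and no user, so the client gate is skipped.
        return "Good", {"AlarmId": alarm_id, "Operation": "Unshelve", "User": ""}
    if caller is None or ALARM_ACK_ROLE not in caller.roles:
        # A manual client unshelve stays gated and fails closed.
        return "BadUserAccessDenied", None
    return "Good", {"AlarmId": alarm_id, "Operation": "Unshelve", "User": caller.name}
```

With this shape, an expired TimedShelve always produces an Unshelve command, while a client without the role is still refused.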
Joseph Doherty 63289d377c feat(alarms): route inbound Part 9 alarm methods through AlarmAck gate (T18)
Wire the materialised AlarmConditionState method handlers so a client calling
Acknowledge/Confirm/Shelve/AddComment is gated on the AlarmAck data-plane role
and, when allowed, routed back to the scripted-alarm engine via a new
`alarm-commands` DistributedPubSub topic.

- Commons: new AlarmCommand DTO (AlarmId/Operation/User/Comment/UnshelveAtUtc).
- ScriptedAlarmHostActor: add AlarmCommandsTopic const.
- OtOpcUaNodeManager: settable AlarmCommandRouter + wire OnAcknowledge/OnConfirm/
  OnAddComment/OnShelve/OnTimedUnshelve. Each resolves the principal off
  ISessionOperationContext.UserIdentity as RoleCarryingUserIdentity, fails closed
  (BadUserAccessDenied) when the AlarmAck role is absent or no identity is present, else maps
  + routes an AlarmCommand and returns Good. OnShelve discriminates OneShotShelve/
  TimedShelve/Unshelve from the SDK flags; TimedShelve expiry = UtcNow + ms.
  No Akka/IActorRef handle — only the Action<AlarmCommand> delegate. T20 de-dup
  note left; WriteAlarmCondition untouched.
- OpcUaServer.Security: OpcUaDataPlaneRoles.AlarmAck shared const (the role was a
  bare string everywhere; introduced one symbol for the gate + tests).
- OtOpcUaSdkServer: SetAlarmCommandRouter pass-through.
- Host: boot wiring publishes each command via mediator.Tell(Publish(...)) using a
  lazy ActorSystem accessor (mirrors DpsScriptLogPublisher).
- Tests: 11 new gate + mapping tests (OpcUaServer.Tests 88->99, all green).
2026-06-11 06:05:39 -04:00
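The OnShelve discrimination mentioned in the commit above might look roughly like the following Python sketch. The parameter names (`shelving`, `one_shot`, `shelving_time_ms`) are assumptions about how the SDK flags are shaped; the three operation names and the "expiry = UtcNow + ms" rule are taken from the commit message.

```python
from datetime import datetime, timedelta, timezone


def classify_shelve(shelving, one_shot, shelving_time_ms, now=None):
    """Map the (assumed) shelve flags to an operation name and optional expiry."""
    now = now or datetime.now(timezone.utc)
    if not shelving:
        return "Unshelve", None
    if one_shot:
        return "OneShotShelve", None
    # TimedShelve: expiry is the current UTC time plus the requested duration.
    return "TimedShelve", now + timedelta(milliseconds=shelving_time_ms)
```

Passing `now` explicitly keeps the expiry calculation deterministic in tests.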
Joseph Doherty 4c417f7fb8 fix(scripted-alarms): log failed event-report via SDK trace + correct sink doc (T16 review) 2026-06-10 19:54:37 -04:00
Joseph Doherty 4eb1d65e2b feat(scripted-alarms): richer AlarmConditionState bridge to the OPC UA node (T15) 2026-06-10 19:41:16 -04:00
Joseph Doherty 60d48a2a0a feat(scripted-alarms): materialise real Part 9 AlarmConditionState nodes (T14) 2026-06-10 19:19:10 -04:00
Joseph Doherty fc0d43a3dc refactor(scripted-alarms): retire orphaned ScriptedAlarmActor + F9b evaluator (T11) 2026-06-10 15:22:26 -04:00
Joseph Doherty 8e8ca9efe8 feat(scripted-alarms): DeploymentArtifact byte-parity for the alarm plan (T6) 2026-06-10 14:41:46 -04:00
Joseph Doherty 788bb68d1d fix(scripting): companion sink falls back to ScriptId for the main-log mirror (T3 review) 2026-06-10 12:08:29 -04:00
Joseph Doherty bd2dd05a0c feat(scripting): evaluators log through root script logger → script-log page (F8) 2026-06-10 12:03:51 -04:00
Joseph Doherty bf86b3def6 fix(scripting): explicit companion logger + disposable ScriptRootLogger (T2 review) 2026-06-10 11:56:51 -04:00
Joseph Doherty 73014258ef feat(scripting): root script logger + DPS publisher wired in Host 2026-06-10 11:50:50 -04:00
Joseph Doherty 14fe88fc80 feat(scripting): ScriptLogTopicSink — script LogEvent → ScriptLogEntry → publisher 2026-06-10 11:38:54 -04:00
Joseph Doherty f431504825 feat(commons): EquipmentScriptPaths — derive base + {{equip}} substitution + shared dep extraction 2026-06-10 07:42:14 -04:00
Joseph Doherty 0b5fc44866 fix(adminui): show + clarify driver-less equipment across list/import (Task 1 review) 2026-06-08 07:00:03 -04:00
Joseph Doherty d2dbf7b0d7 feat(config): make Equipment.DriverInstanceId nullable + driver-less AdminUI support + migration 2026-06-08 06:49:28 -04:00
Joseph Doherty b73ce75402 harden(vtag): exclude backslash from passthrough capture + parity tests (A review) 2026-06-07 15:31:54 -04:00
Joseph Doherty 08d7477860 feat(vtag): passthrough fast-path skips Roslyn for mirror scripts (A) 2026-06-07 15:26:20 -04:00
Joseph Doherty 1827c51c42 refactor(scripting): clarify sandbox-pin invariant + add RootNamespace (A0 review) 2026-06-07 15:16:14 -04:00
Joseph Doherty 56cac39216 refactor(scripting): extract script-callable types into Roslyn-free Core.Scripting.Abstractions (A0) 2026-06-07 15:10:00 -04:00
Joseph Doherty 46aba992c5 fix(config): DraftSnapshotFactory loads only active (unreleased) reservations
Filter ExternalIdReservations to WHERE ReleasedAt IS NULL so
DraftSnapshot.ActiveReservations matches its documented semantics and
ValidateReservationPreflight cannot emit spurious BadDuplicateExternalIdentifier
errors from already-released rows. Adds a focused unit test seeding one active
and one released reservation and asserting only the active row is returned.
2026-06-07 10:47:33 -04:00
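As an illustration of the filter in the commit above, here is a small Python sketch of "active means ReleasedAt IS NULL". The `Reservation` record and its field names are hypothetical; the real change is an EF query in the C# codebase.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Reservation:
    """Hypothetical row shape; only the released-at column matters here."""
    external_id: str
    released_at: Optional[datetime] = None


def active_reservations(rows):
    """Keep only rows that have not been released (released_at is None)."""
    return [r for r in rows if r.released_at is None]
```

This mirrors the unit test the commit describes: seed one active and one released row, and expect only the active one back.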
Joseph Doherty 1023209d52 feat(deploy): reject Tag/VirtualTag NodeId collisions at deploy (surgical DraftValidator gate) 2026-06-07 10:42:13 -04:00
Joseph Doherty fce66d104a refactor(config): materialise collision groups once; note VirtualTag folder coupling 2026-06-07 10:37:22 -04:00
Joseph Doherty 83c7149be0 feat(config): DraftValidator rule + DraftSnapshot.VirtualTags for Tag/VirtualTag NodeId collisions 2026-06-07 10:33:45 -04:00
Joseph Doherty b7f5e887ee feat(audit): OtOpcUa ConfigAuditLog.Outcome column + migration + ClusterAudit visibility fix (Task 2.2)
Persist the canonical AuditOutcome and make structured audit rows visible.

- ConfigAuditLog gains a nullable Outcome column, stored as the AuditOutcome
  enum member name (nvarchar(16), mirroring how AdminRole is persisted). The
  AuditWriterActor flush now writes Outcome = evt.Outcome.ToString(). Nullable so
  legacy rows and the bespoke stored-procedure path (no derived outcome) write
  NULL.
- Migration 20260602135350_AddConfigAuditLogOutcome: additive nullable column,
  no backfill. Up adds the column, Down drops it. Chains after
  20260602112419_CanonicalizeAdminRoles; `dotnet ef migrations
  has-pending-model-changes` is clean.
- ClusterAudit visibility fix: the page filtered solely on ClusterId, but the
  structured AuditWriterActor path stamps NodeId (ClusterId null), so those rows
  were invisible. Extracted ClusterAuditQuery.ForClusterAsync (shared by the page
  and tests) which ORs in rows whose NodeId belongs to a node in the cluster —
  membership resolved from ClusterNode (NodeId -> ClusterId). SP-path
  ClusterId-stamped rows still match.

Tests: ControlPlane 45/45 (adds Outcome persistence + Denied-outcome asserts);
new Configuration ClusterAuditQueryTests 3/3 (both-paths visible, other-cluster
excluded, page-size cap); AdminUI 121/121. Configuration Unit suite is green on a
clean run (a pre-existing timing flake in ResilientConfigReaderTests, untouched
here, occasionally fails under parallel load and passes in isolation).
2026-06-02 09:59:22 -04:00
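The ClusterAudit visibility fix above amounts to widening the filter from "ClusterId matches" to "ClusterId matches OR NodeId belongs to a node in the cluster". A minimal Python sketch, assuming dictionary-shaped rows and a `node_to_cluster` lookup (both hypothetical):

```python
def audit_rows_for_cluster(rows, node_to_cluster, cluster_id):
    """Return rows stamped with the cluster directly or via a member node."""
    visible = []
    for row in rows:
        direct = row.get("ClusterId") == cluster_id
        # Structured-path rows carry NodeId with a null ClusterId, so resolve
        # membership through the NodeId -> ClusterId lookup instead.
        via_node = node_to_cluster.get(row.get("NodeId")) == cluster_id
        if direct or via_node:
            visible.append(row)
    return visible
```

Rows from the stored-procedure path (ClusterId stamped) and the structured path (NodeId stamped) both match, while rows from other clusters do not.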
Joseph Doherty 933dd1a874 feat(audit): OtOpcUa adopt canonical ZB.MOM.WW.Audit.AuditEvent + AuditWriterActor:IAuditWriter + Outcome derivation (Task 2.1)
Deep-adopt the shared audit record. Deletes the bespoke 8-field positional
Commons AuditEvent and repoints the writer path at ZB.MOM.WW.Audit.AuditEvent
(0.1.0, feed-mapped via dohertj2-gitea). Adds the package reference to both
Commons and ControlPlane.

- AuditWriterActor now implements IAuditWriter: WriteAsync(evt, ct) is a
  best-effort, never-throwing entry point that Self.Tell()s the event onto the
  same batching/dedup/flush pipeline and returns Task.CompletedTask. Existing
  Receive<AuditEvent> + 500/5s batching + two-layer dedup unchanged.
- Flush mapping updated for the canonical field types: OccurredAtUtc is now
  DateTimeOffset (.UtcDateTime into the datetime2 column), SourceNode is string?
  (was NodeId.Value), CorrelationId is Guid? (stored null when null). Outcome is
  NOT yet persisted (column lands in Task 2.2).
- New AuditOutcomeMapper.FromAction maps the OtOpcUa action vocabulary to the
  required canonical Outcome: OpcUaAccessDenied / CrossClusterNamespaceAttempt ->
  Denied; config verbs (DraftCreated/Edited, Published, RolledBack, NodeApplied,
  ClusterCreated, NodeAdded, CredentialAdded/Disabled, ExternalIdReleased) ->
  Success. OtOpcUa emits no Failure events.

The Akka message shape changed, but the structured audit path is dormant (zero
production emit/Tell sites; all live audit flows through the bespoke SP path),
so there is no rolling-deploy wire-compat concern. Tested-not-exercised by
design.

ControlPlane.Tests: 44/44 green (AuditWriterActor suite rewritten to construct
the canonical record + assert the Outcome derivation table + the WriteAsync
best-effort/mailbox-routing contract + null SourceNode/CorrelationId handling).
2026-06-02 09:53:12 -04:00
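The Outcome derivation table described above can be written out as a lookup. This Python sketch is illustrative only: the action names are read from the commit message (with "DraftCreated/Edited" and "CredentialAdded/Disabled" expanded to two names each), and raising on an unknown action is an assumption, since the commit does not say how unmapped actions are handled.

```python
DENIED_ACTIONS = {"OpcUaAccessDenied", "CrossClusterNamespaceAttempt"}
SUCCESS_ACTIONS = {
    "DraftCreated", "DraftEdited", "Published", "RolledBack", "NodeApplied",
    "ClusterCreated", "NodeAdded", "CredentialAdded", "CredentialDisabled",
    "ExternalIdReleased",
}


def outcome_from_action(action: str) -> str:
    """Map an action name to "Denied" or "Success"; no action maps to Failure."""
    if action in DENIED_ACTIONS:
        return "Denied"
    if action in SUCCESS_ACTIONS:
        return "Success"
    # Assumption: unknown actions are rejected rather than silently defaulted.
    raise ValueError(f"unmapped audit action: {action}")
```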
Joseph Doherty c1619d95f5 feat(auth)!: OtOpcUa canonical control-plane roles + config-DB migration (Task 1.7)
Standardize the control-plane admin role VALUES on the canonical six
(ZB.MOM.WW.Auth CanonicalRole). OtOpcUa uses four:
  ConfigViewer   -> Viewer
  ConfigEditor   -> Designer
  FleetAdmin     -> Administrator
  DriverOperator -> Operator   (appsettings-only string role)

This is a rename, not a permission change: enforcement semantics are
preserved (whoever could deploy/administer/operate before still can).

- AdminRole enum members renamed (persisted as string names via
  HasConversion<string>); RoleGrants.razor dropdown default updated.
- EF DATA migration CanonicalizeAdminRoles rewrites existing
  LdapGroupRoleMapping.Role rows old->new (Up) and back (Down); schema /
  model snapshot byte-identical (no pending model changes).
- Enforcement role STRINGS canonicalized:
  * Security policies keep their NAMES ("DriverOperator"/"FleetAdmin")
    but require canonical roles: RequireRole("Operator","Administrator")
    and RequireRole("Administrator").
  * Deployments.razor [Authorize(Roles="Administrator,Designer")].
  * DevStub now grants "Administrator"; LdapOptions/doc-comment examples
    canonicalized.
- Data-plane authorization (NodePermissions/NodeAcl/IPermissionEvaluator/
  TriePermissionEvaluator/UserAuthorizationState) UNTOUCHED.
- New CanonicalAdminRolesTests pins canonical claim values end-to-end and
  the real registered policies; existing role-string tests updated.
2026-06-02 07:30:00 -04:00
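The rename in the commit above is a reversible value mapping. The Python sketch below shows the Up and Down directions over a list of role strings; it lists all four renames for illustration, although the commit notes that DriverOperator is an appsettings-only string role, so the database rows may only ever hold the other three. Passing unknown values through unchanged is an assumption.

```python
UP = {
    "ConfigViewer": "Viewer",
    "ConfigEditor": "Designer",
    "FleetAdmin": "Administrator",
    "DriverOperator": "Operator",
}
# Down is the exact inverse, which is what makes the migration reversible.
DOWN = {new: old for old, new in UP.items()}


def rename_roles(roles, mapping):
    """Rewrite each role value through the mapping; unknown values pass through."""
    return [mapping.get(role, role) for role in roles]
```

Applying `UP` and then `DOWN` returns the original values, which matches the commit's claim that this is a rename and not a permission change.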
Joseph Doherty 32d7fd7cc9 fix(galaxy): complete PR 7.2 rename — use canonical GalaxyMxGateway driver type
CI (v2-ci, push): build failing after 48s.
Skipped: unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests); integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests, tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests).
The driver/factory/seed use 'GalaxyMxGateway' (legacy 'Galaxy' was retired),
but the AdminUI editor router, GalaxyDriverPage, address picker, identity
dropdown, the Galaxy browser/probe, and DraftValidator still keyed on 'Galaxy'.
Result: the seeded GalaxyMxGateway driver couldn't be edited ('no editor
registered'), UI-created Galaxy drivers wrote a type with no factory, and a
SystemPlatform-bound GalaxyMxGateway driver failed publish validation.
Align all stragglers to GalaxyMxGateway (+ failing-test-first DraftValidator
coverage). ShouldStub's 'Galaxy' legacy safety-net left intact.
2026-05-29 12:31:55 -04:00
Joseph Doherty bc40388914 chore(di): register ILdapGroupRoleMappingService 2026-05-29 09:47:10 -04:00
Joseph Doherty 81f09a7054 feat(commons): add IDriverBrowser/IBrowseSession/BrowseNode abstractions 2026-05-28 15:32:01 -04:00
Joseph Doherty 662f3f9f5c refactor(driver-pages): address Phase 6/8 deep-review findings
CI (v2-ci, push): build failing after 32s.
Skipped: unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests); integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests, tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests).
- Topic-name drift fix: DriverHealthChanged.TopicName and
  DriverControlTopic.Name now live on the message contracts in
  Commons. AkkaDriverHealthPublisher, DriverStatusSignalRBridge,
  DriverHostActor, and AdminOperationsActor all delegate to the
  single constant so a rename can't silently desynchronise
  publisher and subscriber.
- DriverStatusPanel._opResultClearTimer switched from
  System.Timers.Timer to System.Threading.Timer + awaited
  DisposeAsync. Prevents an in-flight 8s clear-callback from
  invoking StateHasChanged on a component whose hub has already
  been released.
- PublishHealthSnapshot deduplicates against the last published
  (state, lastSuccess, lastError, errorCount) fingerprint. The
  30s heartbeat no longer floods the SignalR layer with identical
  Healthy snapshots — newly-joined clients still warm up via the
  snapshot store on JoinDriver.
2026-05-28 11:52:20 -04:00
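The snapshot deduplication described in the last bullet above can be sketched as a "publish only when the fingerprint changes" guard. This is a Python illustration under assumptions: `HealthPublisher` and its `send` callback are hypothetical names, and only the four fingerprint fields come from the commit message.

```python
class HealthPublisher:
    """Forwards a health snapshot only when its fingerprint differs from the last one."""

    def __init__(self, send):
        self._send = send
        self._last = None  # fingerprint of the most recently published snapshot

    def publish(self, state, last_success, last_error, error_count):
        fingerprint = (state, last_success, last_error, error_count)
        if fingerprint == self._last:
            return False  # identical heartbeat: nothing is sent downstream
        self._last = fingerprint
        self._send(fingerprint)
        return True
```

A repeated Healthy heartbeat is swallowed, while any change in state, timestamps, or error count goes out immediately.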
Joseph Doherty ffcc8d1065 feat(adminui): Reconnect/Restart on DriverStatusPanel (DriverOperator-gated)
- RestartDriver / ReconnectDriver messages + AdminOperationsActor
  handlers (broadcast via driver-control DPS topic; audited via
  ConfigEdits).
- DriverHostActor subscribes to driver-control; locates the
  matching child DriverInstanceActor and stops+respawns it
  (Restart) or sends it a ForceReconnect internal message
  (Reconnect — re-enters Reconnecting state without full stop).
  DriverInstanceSpec constructor call uses named args to handle
  the full 6-parameter signature.
- New DriverOperator authorization policy mapped to DriverOperator
  or FleetAdmin role; documented in docs/security.md. Map LDAP
  group via GroupToRole (e.g. "ot-driver-operator": "DriverOperator").
- DriverStatusPanel renders Reconnect + Restart buttons when the
  user holds the DriverOperator policy (hidden otherwise). Restart
  requires an in-page Razor confirm block (no JS confirm, keeps
  SignalR event loop unblocked). Both buttons show a spinner and
  are disabled during in-flight; result chip auto-clears after 8s.
  Username sourced from AuthenticationStateProvider.

Reconnect resolves to "ForceReconnect" (re-enter Reconnecting,
not full stop+respawn) — transport drops and retries while actor
and in-memory state are preserved. All DriverInstanceActor states
handle ForceReconnect safely (no-op when already in transition).
2026-05-28 11:14:04 -04:00
Joseph Doherty 4b374fd177 feat(adminui): Test Connect button on every typed driver page
- AdminProbeService routes TestDriverConnect through
  IAdminOperationsClient with a 65s outer guard (actor side already
  clamps to [1,60]).
- Added generic AskAsync<T> to IAdminOperationsClient interface and
  AdminOperationsClient impl, delegating straight to the Akka proxy.
- DriverTestConnectButton renders the button + inline result chip,
  auto-clears after 30s, disables during in-flight.
- Wired into all 9 typed driver pages directly under the
  identity section. Sources timeout from the form's
  ProbeTimeoutSeconds; sources config JSON from the form's
  current Options (operator can test BEFORE saving).
2026-05-28 11:02:49 -04:00
Joseph Doherty f3f328c25c feat(adminops): IDriverProbe + TestDriverConnect actor handler
- IDriverProbe abstraction in Core.Abstractions; one impl per driver
  type, resolved by DriverType string. Phase 7.3 + 7.4 add concrete
  probes for the 9 supported driver types.
- TestDriverConnect / TestDriverConnectResult messages.
- AdminOperationsActor.HandleTestDriverConnectAsync looks up the probe
  by DriverType, runs it with a [1,60]s clamped timeout, and returns
  success/latency or failure/message. Probes that throw or time out
  surface as soft failures.
2026-05-28 10:44:00 -04:00
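The handler behaviour described above (probe lookup by driver type, a timeout clamped to [1, 60] seconds, and soft failures for probes that throw or time out) could be sketched as follows. This is a Python illustration, not the actor code: `run_test_connect`, the result dictionary shape, and the "no probe registered" message are all assumptions.

```python
import asyncio
import time


def clamp_timeout(seconds):
    """Clamp the requested timeout into the [1, 60] second range."""
    return max(1.0, min(60.0, float(seconds)))


async def run_test_connect(probes, driver_type, timeout_seconds):
    """Run the probe registered for driver_type; never raise to the caller."""
    probe = probes.get(driver_type)
    if probe is None:
        # Assumption: a missing probe is reported as a soft failure.
        return {"ok": False, "message": f"no probe registered for {driver_type}"}
    started = time.monotonic()
    try:
        await asyncio.wait_for(probe(), timeout=clamp_timeout(timeout_seconds))
    except asyncio.TimeoutError:
        return {"ok": False, "message": "probe timed out"}
    except Exception as exc:  # a throwing probe surfaces as a soft failure
        return {"ok": False, "message": str(exc)}
    return {"ok": True, "latency_s": time.monotonic() - started}
```

Because every failure path returns a result object, the caller can render success/latency or failure/message without its own exception handling.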
Joseph Doherty 4203b84d51 feat(runtime): publish DriverHealthChanged via DriverInstanceActor
- IDriverHealthPublisher in Core.Abstractions + NullDriverHealthPublisher
  no-op for tests/dev-stub paths.
- AkkaDriverHealthPublisher in Runtime forwards to the cluster-wide
  `driver-health` DPS topic.
- DriverInstanceActor instrumented to publish snapshots on every
  observable state change + a periodic 30s heartbeat so the AdminUI
  snapshot store warms up for newly-joined SignalR clients.
- Sliding 5-minute Faulted-count tracked per actor via Queue<DateTime>.
- DriverHostActor.SpawnChild threads clusterId (_localNode.Value) and
  the health publisher down to every DriverInstanceActor child.
- ServiceCollectionExtensions.AddOtOpcUaRuntime registers
  AkkaDriverHealthPublisher as IDriverHealthPublisher singleton.
2026-05-28 10:22:44 -04:00
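The sliding 5-minute Faulted count mentioned above is a queue of timestamps that is pruned on read. A minimal Python sketch, assuming a hypothetical `FaultWindow` class; the commit only states that the count is tracked per actor via a queue of timestamps, and keeping a fault that sits exactly on the cutoff is an assumption.

```python
from collections import deque
from datetime import datetime, timedelta


class FaultWindow:
    """Counts fault timestamps that fall inside a sliding time window."""

    def __init__(self, window=timedelta(minutes=5)):
        self._window = window
        self._faults = deque()  # oldest timestamp on the left

    def record(self, at: datetime) -> None:
        self._faults.append(at)

    def count(self, now: datetime) -> int:
        cutoff = now - self._window
        # Drop everything older than the window before counting.
        while self._faults and self._faults[0] < cutoff:
            self._faults.popleft()
        return len(self._faults)
```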
Joseph Doherty 4d5c6ac892 feat(messages): add DriverHealthChanged DPS contract 2026-05-28 10:10:16 -04:00
Joseph Doherty 64e3fbe035 docs: backfill XML documentation across 756 files
CI (v2-ci, push): build failing after 1m43s.
Skipped: unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests); integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests, tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests).
Adds <summary>, <param>, <typeparam>, and <inheritdoc/> tags to public
members surfaced by commentchecker — resolves 5,847 of 5,869 issues
(99.6%) across three /fixdocs passes.
2026-05-28 08:10:17 -04:00
Joseph Doherty 7dfbca6469 feat(opcua): materialise SystemPlatform tags (Galaxy) as OPC UA variables
CI (v2-ci, push): build failing after 47s.
Skipped: unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests); integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests, tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests).
Closes the gap where Tag rows with EquipmentId=NULL + Namespace.Kind=SystemPlatform
(Galaxy hierarchy) existed in ConfigDb but were never surfaced in the OPC UA
address space. Now they materialise as Variable nodes under a folder named for
their FolderPath, browseable through any OPC UA client.

Layers touched:

- IOpcUaAddressSpaceSink: new EnsureVariable(nodeId, parentFolderId, displayName,
  dataType) signature on the sink interface, NullSink, DeferredSink, SdkSink.
- OtOpcUaNodeManager.EnsureVariable: creates a BaseDataVariableState parented
  under the named folder (or root), initial Value=null +
  StatusCode=BadWaitingForInitialData; resolves Tag.DataType strings to the
  matching OPC UA built-in NodeId. Idempotent.
- Phase7CompositionResult: new GalaxyTags collection of GalaxyTagPlan records
  carrying (TagId, DriverInstanceId, FolderPath, DisplayName, DataType,
  MxAccessRef). Constructor overloads keep existing call sites compiling.
- Phase7Composer.Compose: now takes Tag + Namespace inputs, filters for
  SystemPlatform-namespace tags with EquipmentId=NULL, emits GalaxyTagPlan
  rows with MXAccess ref "FolderPath.Name".
- Phase7Plan: new AddedGalaxyTags / RemovedGalaxyTags / ChangedGalaxyTags
  collections + GalaxyTagDelta record; IsEmpty + needsRebuild updated.
- Phase7Planner.Compute: diffs GalaxyTags by TagId via existing DiffById helper.
- DeploymentArtifact.ParseComposition: reads the Tags + Namespaces +
  DriverInstances arrays the ConfigComposer already emits, applies the same
  SystemPlatform filter, returns the same GalaxyTagPlan list as the composer
  so artifact-side and compose-side plans agree.
- Phase7Applier: new MaterialiseGalaxyTags pass that ensures one folder per
  distinct FolderPath then one Variable per tag. NodeId for the variable is
  "<FolderPath>.<Name>" matching the MXAccess ref so the future Galaxy
  SubscribeBulk wiring can address them directly.
- OpcUaPublishActor.RebuildAddressSpace: invokes MaterialiseGalaxyTags after
  MaterialiseHierarchy. _lastApplied initialiser updated for the new ctor.
- seed-clusters.sql: pre-existing TestMachine_001.TestAlarm001..003 rows
  needed no change — the composer/applier now picks them up automatically.

Verified end-to-end via docker-dev: deploy click → driver-a logs
"Phase7Applier: Galaxy tags materialised (tags=3, folders=1)" → OPC UA Client
CLI browses the three Variable nodes under TestMachine_001 folder. Reads
return BadWaitingForInitialData status (expected — Galaxy driver's
SubscribeBulk wiring to push values into the nodes is the remaining
follow-up).
2026-05-26 15:43:22 -04:00
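The composer filter described above (SystemPlatform-namespace tags with EquipmentId=NULL, emitted with an MXAccess ref of "FolderPath.Name") can be sketched in Python as below. The dictionary keys, including `NamespaceId`, and the `namespace_kinds` lookup are assumptions about the input shape; the filter rule and the ref format come from the commit message.

```python
def galaxy_tag_plans(tags, namespace_kinds):
    """Select equipment-less SystemPlatform tags and derive their MXAccess refs."""
    plans = []
    for tag in tags:
        is_system_platform = namespace_kinds.get(tag["NamespaceId"]) == "SystemPlatform"
        if tag["EquipmentId"] is None and is_system_platform:
            plans.append({
                "TagId": tag["TagId"],
                "FolderPath": tag["FolderPath"],
                "DisplayName": tag["Name"],
                # The same string doubles as the variable's NodeId per the commit.
                "MxAccessRef": f'{tag["FolderPath"]}.{tag["Name"]}',
            })
    return plans
```

Running the same function on both the compose side and the artifact side is one way to keep the two plans in agreement, which is the property the commit calls out.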
Joseph Doherty 607dc51dec feat(opcua): #85 UNS Area/Line/Equipment folder hierarchy in SDK
CI (v2-ci, push): build failing after 42s.
Skipped: unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests); integration.
Phase7Composer now carries UnsAreaProjection + UnsLineProjection lists so
the applier can materialise the full UNS topology in the OPC UA address
space. New IOpcUaAddressSpaceSink.EnsureFolder(folderNodeId, parentNodeId,
displayName) seam (no-op default, recorded in tests, forwarded by
DeferredAddressSpaceSink, implemented by SdkAddressSpaceSink). The SDK-
side OtOpcUaNodeManager gains an EnsureFolder API that creates
FolderState nodes with proper parent linkage; RebuildAddressSpace now
clears folders too so re-applies don't accumulate stale topology.

Phase7Applier.MaterialiseHierarchy walks composition.UnsAreas →
composition.UnsLines → composition.EquipmentNodes, calling EnsureFolder
with the correct parent at each level. Idempotent — calling twice with
the same composition is a no-op. OpcUaPublishActor.HandleRebuild invokes
it after Phase7Applier.Apply so OPC UA clients browsing the server now
see Area/Line/Equipment as proper folders rather than flat tag ids.

DeploymentArtifact.ParseComposition reads UnsAreas + UnsLines from the
JSON snapshot the ControlPlane emits, populating the new fields when
present.

Phase7Composer.Compose now accepts UnsAreas + UnsLines; a 3-arg overload
preserves the old signature for legacy callers + existing tests. The
Phase7CompositionResult convenience ctor likewise keeps the planner
tests working without UNS data.

3 new hierarchy tests (pure unit + boot-verify against a real
OtOpcUaSdkServer); OpcUaServer suite is 48/48 green (was 45, +3),
Runtime 74/74 unchanged.

Closes #85.
2026-05-26 10:48:56 -04:00
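The hierarchy walk described above (areas, then lines, then equipment, each calling an idempotent EnsureFolder with the right parent) can be illustrated with a small Python sketch. `FolderSink`, the dictionary keys, and the choice of `None` for the root parent are hypothetical; the real seam is the C# IOpcUaAddressSpaceSink.EnsureFolder(folderNodeId, parentNodeId, displayName).

```python
class FolderSink:
    """Records folders by node id; ensuring an existing folder is a no-op."""

    def __init__(self):
        self.folders = {}

    def ensure_folder(self, node_id, parent_id, display_name):
        self.folders.setdefault(node_id, (parent_id, display_name))


def materialise_hierarchy(sink, areas, lines, equipment):
    """Create Area -> Line -> Equipment folders, parents before children."""
    for area in areas:
        sink.ensure_folder(area["id"], None, area["name"])  # None = root (assumed)
    for line in lines:
        sink.ensure_folder(line["id"], line["area_id"], line["name"])
    for item in equipment:
        sink.ensure_folder(item["id"], item["line_id"], item["name"])
```

Calling `materialise_hierarchy` twice with the same composition leaves the sink unchanged, which is the idempotence the commit describes.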
Joseph Doherty 2697af31d1 feat(opcua,host): #81 ServiceLevel SDK publisher
SdkServiceLevelPublisher writes Server.ServiceLevel through the SDK's
ServerObjectState — the standard OPC UA non-transparent-redundancy signal
clients use to pick a primary. Writes are guarded by DiagnosticsLock so
concurrent SDK diagnostics scans don't fight with our updates.

DeferredServiceLevelPublisher mirrors the DeferredAddressSpaceSink late-
binding pattern: Akka actors resolve IServiceLevelPublisher at construction,
hosted service swaps the SDK publisher in after StandardServer.Start. Host
Program.cs registers DeferredServiceLevelPublisher as the singleton bound
to IServiceLevelPublisher; OtOpcUaServerHostedService gets it injected and
fills it once IServerInternal is available.

Tests boot a real StandardServer on a free port (cross-platform), call
Publish, then verify ServerObject.ServiceLevel.Value reflects the write.
5 new tests; OpcUaServer suite now 45/45 green (was 40, +5).

Closes #81 residual. Unblocks Task 60 (OPC UA dual-endpoint + ServiceLevel
tests).
2026-05-26 10:37:42 -04:00
Joseph Doherty 52997ee164 feat(observability): F13d Prometheus + OpenTelemetry instrumentation
CI (v2-ci, push): build failing after 38s.
Skipped: unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests, tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests); integration.
OtOpcUaTelemetry (Commons/Observability) centralizes the project's Meter
+ ActivitySource so all instrumentation points emit through a single
named surface. Counters cover the hot paths:

  otopcua.deploy.applied               (outcome=ack|reject)
  otopcua.deploy.apply.duration        (s, histogram)
  otopcua.driver.lifecycle             (event=spawn|spawn_stub|stop|fault)
  otopcua.virtualtag.eval              (outcome=ok|fail|skip)
  otopcua.scriptedalarm.transition     (state=activated|acknowledged|cleared)
  otopcua.opcua.sink.write             (kind=value|alarm|rebuild)
  otopcua.redundancy.service_level_change (level=byte)

Plus two ActivitySource spans:

  otopcua.deploy.apply                 wraps DriverHostActor.ApplyAndAck
  otopcua.opcua.address_space_rebuild  wraps OpcUaPublishActor.HandleRebuild

Instruments are no-op until a listener attaches, so tests + dev hosts
pay nothing for unread telemetry.

Host Program.cs gains AddOtOpcUaObservability() (binds the OtOpcUa Meter
+ ActivitySource to OpenTelemetry, attaches a Prometheus exporter) and
MapOtOpcUaMetrics() (mounts /metrics scrape endpoint). Driver-side
internals + ASP.NET request metrics deliberately stay off — the scrape
payload is scoped to OtOpcUa signals only.

Tests use MeterListener + ActivityListener to verify
VirtualTagActor.eval, OpcUaPublishActor.AttributeValueUpdate, and
RebuildAddressSpace actually emit on the central instruments. Runtime
suite is 72 / 72 green (+3).

Closes #105. Path A (F13b/c/d) complete; next batch options: #85 UNS
folder hierarchy in SDK, or F8b/F9b production engine bindings.
2026-05-26 10:29:40 -04:00