Commit Graph

127 Commits

Author SHA1 Message Date
Joseph Doherty 0f7c47a559 feat(commons): IOpcUaNodeWriteGateway + NodeWriteOutcome for write-outcome routing 2026-06-14 01:20:14 -04:00
Joseph Doherty a23fb2b82e feat(server): equipment-tag node writability from Tag.AccessLevel (parity-safe, no migration) 2026-06-13 11:46:00 -04:00
Joseph Doherty eaa335d779 feat(drivers): shared EquipmentTagRefResolver (by-name + parse-on-miss + cache) 2026-06-13 11:07:23 -04:00
Joseph Doherty 26816fd17e feat(commons): EquipmentNodeIds — single source of truth for folder-scoped equipment NodeIds 2026-06-13 06:26:59 -04:00
Joseph Doherty b8277922b6 fix(config): also clean NodeAcl + release ExternalIdReservation in SystemPlatform cleanup migration 2026-06-12 22:34:13 -04:00
Joseph Doherty 7d25480fee docs(galaxy): neutralize remaining stale SystemPlatform/alias terminology in comments + a test name
Replace "SystemPlatform mirror tag", "Galaxy alias", and "SystemPlatform-kind" in doc-comments and
test names with neutral accurate wording ("FolderPath-scoped tag", "EquipmentId == null", etc.).
No code, logic, or test bodies changed — comments and one test method name only.
2026-06-12 22:30:50 -04:00
Joseph Doherty ba42bed538 feat(config): forward-only migration deleting orphaned SystemPlatform namespace data (clean break) 2026-06-12 22:25:00 -04:00
Joseph Doherty dcbaf63ab1 feat(config): remove the SystemPlatform NamespaceKind (capstone) — Galaxy is Equipment-kind 2026-06-12 22:18:56 -04:00
Joseph Doherty 5edea52bd7 docs(galaxy): fix stale SystemPlatform/alias/Galaxy doc comments (review follow-up)
Resolves the code-review notes on 95be607a + the AdminUI bundle: the
EnsureVariable docs (IOpcUaAddressSpaceSink, OtOpcUaNodeManager) and the Tag
entity doc no longer say 'Galaxy / SystemPlatform / alias'; the DriverHostActor
ForwardToMux comment now states the real equipment-tag value-routing gap (the
FullName→NodeId 'live values' milestone) instead of claiming Galaxy values map
straight through.
2026-06-12 22:00:52 -04:00
Joseph Doherty b4b7cd7d0f feat(authz): remove SystemPlatform scope + permission-trie walk (Galaxy resolves via Equipment) 2026-06-12 21:54:28 -04:00
Joseph Doherty 499c9b9165 feat(validation): allow GalaxyMxGateway under Equipment; rename Galaxy-tag FullName check 2026-06-12 21:11:06 -04:00
Joseph Doherty 57355405a6 chore(security): drop dead audit suppressions; patch OpenTelemetry + Tmds.DBus CVEs
All five suppressed advisories are now resolved at baseline/resolved versions,
so every NuGetAuditSuppress is removed repo-wide:
- System.Security.Cryptography.Xml (GHSA-37gx-xxp4-5rgx / GHSA-w3x6-4m5h-cxqf)
  -> fixed by the .NET 10 baseline (10.0.6)
- OPCFoundation Opc.Ua.Core (GHSA-h958-fxgg-g7w3) -> fixed at resolved 1.5.378.106

Two were still live and are now patched via direct security pins:
- OpenTelemetry.Api 1.9.0 -> 1.15.3 (GHSA-g94r-2vxg-569j) pinned in Cluster;
  Runtime/ControlPlane/AdminUI + tests inherit via project reference
- Tmds.DBus.Protocol 0.20.0 -> 0.21.3 (GHSA-xrw6-gwf8-vvr9) pinned in Client.UI

Also correct the Historian sidecar runtime comments (x86 -> x64, matching the
csproj PlatformTarget). Solution audit: 0 vulnerable packages; full build clean.
2026-06-12 09:03:42 -04:00
Joseph Doherty 9f13101896 feat(validation): require TagConfig.FullName on Galaxy alias tags; reframe Tag doc 2026-06-11 21:21:32 -04:00
Joseph Doherty 2ba2f8a899 feat(commons): TryParseRelayBody — detect pure ctx.GetTag relay scripts 2026-06-11 20:59:10 -04:00
Joseph Doherty 61b230d79a harden(historian): nullable HistorizeToAveva (missing→historize) for rolling-restart-safe deserialize + middle-link test 2026-06-11 13:00:57 -04:00
Joseph Doherty c20d228384 fix(historian): volatile _backoffIndex + read _evictedCount under lock (thread-safety) 2026-06-11 12:49:44 -04:00
Joseph Doherty 8012509584 feat(historian): honor per-alarm HistorizeToAveva opt-out at the durable write 2026-06-11 12:48:13 -04:00
Joseph Doherty 8ac3ac5be9 feat(alarms): carry AlarmTypeName + operator Comment on AlarmTransitionEvent (historian feed prep) 2026-06-11 11:03:00 -04:00
Joseph Doherty f9932f2d8e refactor(admin): use CorrelationId wrapper for alarm ack/shelve commands 2026-06-11 09:27:24 -04:00
Joseph Doherty 370a2b7b48 feat(alerts): AdminUI alarm ack/shelve via AdminOperationsActor singleton
T21: add an AdminUI path for acknowledging/shelving alarms that routes
through the admin-pinned AdminOperationsActor cluster singleton, which
republishes onto the same 'alarm-commands' DPS topic the OPC UA method
path (T18) and the engine subscriber (T19) use. The broadcast + the
ScriptedAlarmHostActor ownership filter handle cross-node routing, so
the singleton needs no knowledge of which node owns the alarm.

- Commons: AcknowledgeAlarmCommand/ShelveAlarmCommand (+ result records)
  and a shared AlarmCommandsTopic const; ScriptedAlarmHostActor now
  re-exports that const (mirrors the DriverControlTopic pattern).
- AdminOperationsActor: two handlers map the control-plane messages to
  AlarmCommand (Acknowledge / OneShotShelve / TimedShelve / Unshelve,
  threading User/Comment/UnshelveAtUtc) and publish via the DPS mediator.
- IAdminOperationsClient + AdminOperationsClient: typed Acknowledge/Shelve
  ask wrappers mirroring StartDeploymentAsync.
- Alerts.razor: per-row DriverOperator-gated Ack/Shelve/Unshelve controls;
  operator name from AuthenticationState. Timed-shelve datetime UI deferred.
- 5 TestKit tests (mediator-probe subscribed to alarm-commands) verifying
  each kind's mapping + reply; 56/56 ControlPlane tests green.
2026-06-11 06:44:27 -04:00
Joseph Doherty 1784eedd3f fix(opcua): exempt OnTimedUnshelve from the client AlarmAck gate (system-initiated)
The SDK fires OnTimedUnshelve with the node manager's system context (no
session, no user identity) when a TimedShelve duration expires. Routing
through the shared HandleAlarmCommand hit the AlarmAck gate and returned
BadUserAccessDenied, leaving the alarm permanently shelved.

Replace the delegated HandleAlarmCommand call with an inline lambda that
bypasses the client gate, extracts the AlarmId the same way, and routes an
Unshelve command so the engine clears its shelve state. The manual-client
Unshelve path via OnShelve(shelving:false) remains gated.

Update the AlarmCommandRouterTests OnTimedUnshelve test to use a real
system context (no UserIdentity) — reproducing the actual SDK invocation
path — and assert Good, AlarmId, Operation==Unshelve, User==empty.

Add a doc note to AlarmCommand.Operation that Enable/Disable are in the
vocabulary but not yet wired at the node-manager seam.
2026-06-11 06:16:30 -04:00
Joseph Doherty 63289d377c feat(alarms): route inbound Part 9 alarm methods through AlarmAck gate (T18)
Wire the materialised AlarmConditionState method handlers so a client calling
Acknowledge/Confirm/Shelve/AddComment is gated on the AlarmAck data-plane role
and, when allowed, routed back to the scripted-alarm engine via a new
`alarm-commands` DistributedPubSub topic.

- Commons: new AlarmCommand DTO (AlarmId/Operation/User/Comment/UnshelveAtUtc).
- ScriptedAlarmHostActor: add AlarmCommandsTopic const.
- OtOpcUaNodeManager: settable AlarmCommandRouter + wire OnAcknowledge/OnConfirm/
  OnAddComment/OnShelve/OnTimedUnshelve. Each resolves the principal off
  ISessionOperationContext.UserIdentity as RoleCarryingUserIdentity, fails closed
  (BadUserAccessDenied) when the AlarmAck role is absent or no identity, else maps
  + routes an AlarmCommand and returns Good. OnShelve discriminates OneShotShelve/
  TimedShelve/Unshelve from the SDK flags; TimedShelve expiry = UtcNow + ms.
  No Akka/IActorRef handle — only the Action<AlarmCommand> delegate. T20 de-dup
  note left; WriteAlarmCondition untouched.
- OpcUaServer.Security: OpcUaDataPlaneRoles.AlarmAck shared const (the role was a
  bare string everywhere; introduced one symbol for the gate + tests).
- OtOpcUaSdkServer: SetAlarmCommandRouter pass-through.
- Host: boot wiring publishes each command via mediator.Tell(Publish(...)) using a
  lazy ActorSystem accessor (mirrors DpsScriptLogPublisher).
- Tests: 11 new gate + mapping tests (OpcUaServer.Tests 88->99, all green).
2026-06-11 06:05:39 -04:00
Joseph Doherty 4c417f7fb8 fix(scripted-alarms): log failed event-report via SDK trace + correct sink doc (T16 review) 2026-06-10 19:54:37 -04:00
Joseph Doherty 4eb1d65e2b feat(scripted-alarms): richer AlarmConditionState bridge to the OPC UA node (T15) 2026-06-10 19:41:16 -04:00
Joseph Doherty 60d48a2a0a feat(scripted-alarms): materialise real Part 9 AlarmConditionState nodes (T14) 2026-06-10 19:19:10 -04:00
Joseph Doherty fc0d43a3dc refactor(scripted-alarms): retire orphaned ScriptedAlarmActor + F9b evaluator (T11) 2026-06-10 15:22:26 -04:00
Joseph Doherty 8e8ca9efe8 feat(scripted-alarms): DeploymentArtifact byte-parity for the alarm plan (T6) 2026-06-10 14:41:46 -04:00
Joseph Doherty 788bb68d1d fix(scripting): companion sink falls back to ScriptId for the main-log mirror (T3 review) 2026-06-10 12:08:29 -04:00
Joseph Doherty bd2dd05a0c feat(scripting): evaluators log through root script logger → script-log page (F8) 2026-06-10 12:03:51 -04:00
Joseph Doherty bf86b3def6 fix(scripting): explicit companion logger + disposable ScriptRootLogger (T2 review) 2026-06-10 11:56:51 -04:00
Joseph Doherty 73014258ef feat(scripting): root script logger + DPS publisher wired in Host 2026-06-10 11:50:50 -04:00
Joseph Doherty 14fe88fc80 feat(scripting): ScriptLogTopicSink — script LogEvent → ScriptLogEntry → publisher 2026-06-10 11:38:54 -04:00
Joseph Doherty f431504825 feat(commons): EquipmentScriptPaths — derive base + {{equip}} substitution + shared dep extraction 2026-06-10 07:42:14 -04:00
Joseph Doherty 0b5fc44866 fix(adminui): show + clarify driver-less equipment across list/import (Task 1 review) 2026-06-08 07:00:03 -04:00
Joseph Doherty d2dbf7b0d7 feat(config): make Equipment.DriverInstanceId nullable + driver-less AdminUI support + migration 2026-06-08 06:49:28 -04:00
Joseph Doherty b73ce75402 harden(vtag): exclude backslash from passthrough capture + parity tests (A review) 2026-06-07 15:31:54 -04:00
Joseph Doherty 08d7477860 feat(vtag): passthrough fast-path skips Roslyn for mirror scripts (A) 2026-06-07 15:26:20 -04:00
Joseph Doherty 1827c51c42 refactor(scripting): clarify sandbox-pin invariant + add RootNamespace (A0 review) 2026-06-07 15:16:14 -04:00
Joseph Doherty 56cac39216 refactor(scripting): extract script-callable types into Roslyn-free Core.Scripting.Abstractions (A0) 2026-06-07 15:10:00 -04:00
Joseph Doherty 46aba992c5 fix(config): DraftSnapshotFactory loads only active (unreleased) reservations
Filter ExternalIdReservations to WHERE ReleasedAt IS NULL so
DraftSnapshot.ActiveReservations matches its documented semantics and
ValidateReservationPreflight cannot emit spurious BadDuplicateExternalIdentifier
errors from already-released rows. Adds a focused unit test seeding one active
and one released reservation and asserting only the active row is returned.
2026-06-07 10:47:33 -04:00
Joseph Doherty 1023209d52 feat(deploy): reject Tag/VirtualTag NodeId collisions at deploy (surgical DraftValidator gate) 2026-06-07 10:42:13 -04:00
Joseph Doherty fce66d104a refactor(config): materialise collision groups once; note VirtualTag folder coupling 2026-06-07 10:37:22 -04:00
Joseph Doherty 83c7149be0 feat(config): DraftValidator rule + DraftSnapshot.VirtualTags for Tag/VirtualTag NodeId collisions 2026-06-07 10:33:45 -04:00
Joseph Doherty b7f5e887ee feat(audit): OtOpcUa ConfigAuditLog.Outcome column + migration + ClusterAudit visibility fix (Task 2.2)
Persist the canonical AuditOutcome and make structured audit rows visible.

- ConfigAuditLog gains a nullable Outcome column, stored as the AuditOutcome
  enum member name (nvarchar(16), mirroring how AdminRole is persisted). The
  AuditWriterActor flush now writes Outcome = evt.Outcome.ToString(). Nullable so
  legacy rows and the bespoke stored-procedure path (no derived outcome) write
  NULL.
- Migration 20260602135350_AddConfigAuditLogOutcome: additive nullable column,
  no backfill. Up adds the column, Down drops it. Chains after
  20260602112419_CanonicalizeAdminRoles; `dotnet ef migrations
  has-pending-model-changes` is clean.
- ClusterAudit visibility fix: the page filtered solely on ClusterId, but the
  structured AuditWriterActor path stamps NodeId (ClusterId null), so those rows
  were invisible. Extracted ClusterAuditQuery.ForClusterAsync (shared by the page
  and tests) which ORs in rows whose NodeId belongs to a node in the cluster —
  membership resolved from ClusterNode (NodeId -> ClusterId). SP-path
  ClusterId-stamped rows still match.

Tests: ControlPlane 45/45 (adds Outcome persistence + Denied-outcome asserts);
new Configuration ClusterAuditQueryTests 3/3 (both-paths visible, other-cluster
excluded, page-size cap); AdminUI 121/121. Configuration Unit suite is green on a
clean run (a pre-existing timing flake in ResilientConfigReaderTests, untouched
here, occasionally fails under parallel load and passes in isolation).
2026-06-02 09:59:22 -04:00
Joseph Doherty 933dd1a874 feat(audit): OtOpcUa adopt canonical ZB.MOM.WW.Audit.AuditEvent + AuditWriterActor:IAuditWriter + Outcome derivation (Task 2.1)
Deep-adopt the shared audit record. Deletes the bespoke 8-field positional
Commons AuditEvent and repoints the writer path at ZB.MOM.WW.Audit.AuditEvent
(0.1.0, feed-mapped via dohertj2-gitea). Adds the package reference to both
Commons and ControlPlane.

- AuditWriterActor now implements IAuditWriter: WriteAsync(evt, ct) is a
  best-effort, never-throwing entry point that Self.Tell()s the event onto the
  same batching/dedup/flush pipeline and returns Task.CompletedTask. Existing
  Receive<AuditEvent> + 500/5s batching + two-layer dedup unchanged.
- Flush mapping updated for the canonical field types: OccurredAtUtc is now
  DateTimeOffset (.UtcDateTime into the datetime2 column), SourceNode is string?
  (was NodeId.Value), CorrelationId is Guid? (stored null when null). Outcome is
  NOT yet persisted (column lands in Task 2.2).
- New AuditOutcomeMapper.FromAction maps the OtOpcUa action vocabulary to the
  required canonical Outcome: OpcUaAccessDenied / CrossClusterNamespaceAttempt ->
  Denied; config verbs (DraftCreated/Edited, Published, RolledBack, NodeApplied,
  ClusterCreated, NodeAdded, CredentialAdded/Disabled, ExternalIdReleased) ->
  Success. OtOpcUa emits no Failure events.

The Akka message shape changed, but the structured audit path is dormant (zero
production emit/Tell sites; all live audit flows through the bespoke SP path),
so there is no rolling-deploy wire-compat concern. Tested-not-exercised by
design.

ControlPlane.Tests: 44/44 green (AuditWriterActor suite rewritten to construct
the canonical record + assert the Outcome derivation table + the WriteAsync
best-effort/mailbox-routing contract + null SourceNode/CorrelationId handling).
2026-06-02 09:53:12 -04:00
Joseph Doherty c1619d95f5 feat(auth)!: OtOpcUa canonical control-plane roles + config-DB migration (Task 1.7)
Standardize the control-plane admin role VALUES on the canonical six
(ZB.MOM.WW.Auth CanonicalRole). OtOpcUa uses four:
  ConfigViewer   -> Viewer
  ConfigEditor   -> Designer
  FleetAdmin     -> Administrator
  DriverOperator -> Operator   (appsettings-only string role)

This is a rename, not a permission change: enforcement semantics are
preserved (whoever could deploy/administer/operate before still can).

- AdminRole enum members renamed (persisted as string names via
  HasConversion<string>); RoleGrants.razor dropdown default updated.
- EF DATA migration CanonicalizeAdminRoles rewrites existing
  LdapGroupRoleMapping.Role rows old->new (Up) and back (Down); schema /
  model snapshot byte-identical (no pending model changes).
- Enforcement role STRINGS canonicalized:
  * Security policies keep their NAMES ("DriverOperator"/"FleetAdmin")
    but require canonical roles: RequireRole("Operator","Administrator")
    and RequireRole("Administrator").
  * Deployments.razor [Authorize(Roles="Administrator,Designer")].
  * DevStub now grants "Administrator"; LdapOptions/doc-comment examples
    canonicalized.
- Data-plane authorization (NodePermissions/NodeAcl/IPermissionEvaluator/
  TriePermissionEvaluator/UserAuthorizationState) UNTOUCHED.
- New CanonicalAdminRolesTests pins canonical claim values end-to-end and
  the real registered policies; existing role-string tests updated.
2026-06-02 07:30:00 -04:00
Joseph Doherty 32d7fd7cc9 fix(galaxy): complete PR 7.2 rename — use canonical GalaxyMxGateway driver type
v2-ci / build (push) Failing after 48s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
The driver/factory/seed use 'GalaxyMxGateway' (legacy 'Galaxy' was retired),
but the AdminUI editor router, GalaxyDriverPage, address picker, identity
dropdown, the Galaxy browser/probe, and DraftValidator still keyed on 'Galaxy'.
Result: the seeded GalaxyMxGateway driver couldn't be edited ('no editor
registered'), UI-created Galaxy drivers wrote a type with no factory, and a
SystemPlatform-bound GalaxyMxGateway driver failed publish validation.
Align all stragglers to GalaxyMxGateway (+ failing-test-first DraftValidator
coverage). ShouldStub's 'Galaxy' legacy safety-net left intact.
2026-05-29 12:31:55 -04:00
Joseph Doherty bc40388914 chore(di): register ILdapGroupRoleMappingService 2026-05-29 09:47:10 -04:00
Joseph Doherty 81f09a7054 feat(commons): add IDriverBrowser/IBrowseSession/BrowseNode abstractions 2026-05-28 15:32:01 -04:00
Joseph Doherty 662f3f9f5c refactor(driver-pages): address Phase 6/8 deep-review findings
v2-ci / build (push) Failing after 32s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
- Topic-name drift fix: DriverHealthChanged.TopicName and
  DriverControlTopic.Name now live on the message contracts in
  Commons. AkkaDriverHealthPublisher, DriverStatusSignalRBridge,
  DriverHostActor, and AdminOperationsActor all delegate to the
  single constant so a rename can't silently desynchronise
  publisher and subscriber.
- DriverStatusPanel._opResultClearTimer switched from
  System.Timers.Timer to System.Threading.Timer + awaited
  DisposeAsync. Prevents an in-flight 8s clear-callback from
  invoking StateHasChanged on a component whose hub has already
  been released.
- PublishHealthSnapshot deduplicates against the last published
  (state, lastSuccess, lastError, errorCount) fingerprint. The
  30s heartbeat no longer floods the SignalR layer with identical
  Healthy snapshots — newly-joined clients still warm up via the
  snapshot store on JoinDriver.
2026-05-28 11:52:20 -04:00