Compare commits

...

64 Commits

Author SHA1 Message Date
Joseph Doherty
d5fa1f450e Phase 3 PR 29 — Account/session page expanding the minimal sidebar role display into a dedicated /account route. Shows the authenticated operator's identity (username from ClaimTypes.NameIdentifier, display name from ClaimTypes.Name), their Admin roles as badges (from ClaimTypes.Role), the raw LDAP groups that mapped to those roles (from the 'ldap_group' claim added by Login.razor at sign-in), and a capability table listing each Admin capability with its required role and a Yes/No badge showing whether this session has it. Capability list mirrors the Program.cs authorization policies + each page's [Authorize] attribute so operators can self-service check whether their session has access without trial-and-error navigation — capabilities covered: view clusters + fleet status (all roles), edit configuration drafts (ConfigEditor or FleetAdmin per CanEdit policy), publish generations (FleetAdmin per CanPublish policy), manage certificate trust (FleetAdmin per PR 28 Certificates page attribute), manage external-ID reservations (ConfigEditor or FleetAdmin per Reservations page attribute).
Sidebar's 'Signed in as' line now wraps the display name in a link to /account so the existing sidebar-compact view becomes the entry point for the fuller page — keeps the sign-out button where it was for muscle memory, just adds the detail page one click away. Page is gated with [Authorize] (any authenticated admin) rather than a specific role — the capability table deliberately works for every signed-in user so they can see what they DON'T have access to, which helps them file the right ticket with their LDAP admin instead of getting a plain Access Denied when navigating blindly.
Capability → required-role table is defined as a private readonly record list in the page rather than pulled from a service because it's a UI-presentation concern, not runtime policy state — the runtime policy IS Program.cs's AddAuthorizationBuilder + each page's [Authorize] attribute, and this table just mirrors it for operator readability. Comment on the list reminds future-me to extend it when a new policy or [Authorize] page lands. No behavior change if roles are empty, but the page surfaces a hint ('Sign-in would have been blocked, so if you're seeing this, the session claim is likely stale') that nudges the operator toward signing out + back in.
No new tests added — the page is pure display over claims; its only logic is the 'has-capability' Any-overlap check which is exactly what ASP.NET's [Authorize(Roles=...)] does in-framework, and duplicating that in a unit test would test the framework rather than our code. Admin.Tests Unit stays 23 pass / 0 fail. Admin build clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 14:43:35 -04:00
6fdaee3a71 Merge pull request 'Phase 3 PR 28 — Admin UI cert-trust management page' (#27) from phase-3-pr28-cert-trust into v2 2026-04-18 14:42:52 -04:00
Joseph Doherty
ed88835d34 Phase 3 PR 28 — Admin UI cert-trust management page. New /certificates route (FleetAdmin-only) surfaces the OPC UA server's PKI store rejected + trusted certs and gives operators Trust / Delete / Revoke actions so rejected client certs can be promoted without touching disk. CertTrustService reads $PkiStoreRoot/{rejected,trusted}/certs/*.der files directly via X509CertificateLoader — no Opc.Ua dependency in the Admin project, which keeps the Admin host runnable on a machine that doesn't have the full Server install locally (only needs the shared PKI directory reachable; typical deployment has Admin + Server side-by-side on the same box and PkiStoreRoot defaults match so a plain-vanilla install needs no override). CertTrustOptions bound from the Admin's 'CertTrust:PkiStoreRoot' section, default %ProgramData%\OtOpcUa\pki (matches OpcUaServerOptions.PkiStoreRoot default). Trust action moves the .der from rejected/certs/ to trusted/certs/ via File.Move(overwrite:true) — idempotent, tolerates a concurrent operator doing the same move. Delete wipes the file. Revoke removes from trusted/certs/ (Opc.Ua re-reads the Directory store on each new client handshake, so no explicit reload signal is needed; operators retry the rejected connection after trusting). Thumbprint matching is case-insensitive because X509Certificate2.Thumbprint is upper-case hex but operators copy-paste from logs that sometimes lowercase it. Malformed files in the store are logged + skipped — a single bad .der can't take the whole management page offline. Missing store directories produce empty lists rather than exceptions so a pristine install (Server never run yet, no rejected/trusted dirs yet) doesn't crash the page.
Razor page layout: two tables (Rejected / Trusted) with Subject / Issuer / Thumbprint / Valid-window / Actions columns, status banner after each action with success or warning kind ('file missing' = another admin handled it), FleetAdmin-only via [Authorize(Roles=AdminRoles.FleetAdmin)]. Each action invokes LogActionAsync which Serilog-logs the authenticated admin user + thumbprint + action for an audit trail — DB-level ConfigAuditLog persistence is deferred because its schema is cluster-scoped and cert actions are cluster-agnostic; Serilog + CertTrustService's filesystem-op info logs give the forensic trail in the meantime. Sidebar link added to MainLayout between Reservations and the future Account page.
Tests — CertTrustServiceTests (9 new unit cases): ListRejected parses Subject + Thumbprint + store kind from a self-signed test cert written into rejected/certs/; rejected and trusted stores are kept separate; TrustRejected moves the file and the Rejected list is empty afterwards; TrustRejected with a thumbprint not in rejected returns false without touching trusted; DeleteRejected removes the file; UntrustCert removes from trusted only; thumbprint match is case-insensitive (operator UX); missing store directories produce empty lists instead of throwing DirectoryNotFoundException (pristine-install tolerance); a junk .der in the store is logged + skipped and the valid certs still surface (one bad file doesn't break the page). Full Admin.Tests Unit suite: 23 pass / 0 fail (14 prior + 9 new). Full Admin build clean — 0 errors, 0 warnings.
lmx-followups.md #3 marked DONE with a cross-reference to this PR and a note that flipping AutoAcceptUntrustedClientCertificates to false as the production default is a deployment-config follow-up, not a code gap — the Admin UI is now ready to be the trust gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 14:37:55 -04:00
5389d4d22d Phase 3 PR 27 � Fleet status dashboard page (#26) 2026-04-18 14:07:16 -04:00
Joseph Doherty
b5f8661e98 Phase 3 PR 27 — Fleet status dashboard page. New /fleet route shows per-node apply state (ClusterNodeGenerationState joined with ClusterNode for the ClusterId) in a sortable table with summary cards for Total / Applied / Stale / Failed node counts. Stale detection: LastSeenAt older than 30s triggers a table-warning row class + yellow count card. Failed rows get table-danger + red card. Badge classes per LastAppliedStatus: Applied=bg-success, Failed=bg-danger, Applying=bg-info, unknown=bg-secondary. Timestamps rendered as relative-age strings ('42s ago', '15m ago', '3h ago', then absolute date for >24h). Error column is truncated to 320px with the full message in a tooltip so the table stays readable on wide fleets. Initial data load on OnInitializedAsync; auto-refresh every 5s via a Timer that calls InvokeAsync(RefreshAsync) — matches the FleetStatusPoller's 5s cadence so the dashboard sees the most recent state without polling ahead of the broadcaster. A Refresh button also kicks a manual reload; _refreshing gate prevents double-runs when the timer fires during an in-flight query. IServiceScopeFactory (matches FleetStatusPoller's pattern) creates a fresh DI scope per refresh so the per-page DbContext can't race the timer with the render thread; no new DI registrations needed. Live SignalR hub push is deliberately deferred to a follow-up PR — the existing FleetStatusHub + NodeStateChangedMessage already works for external JavaScript clients; wiring an in-process Blazor Server consumer adds HubConnectionBuilder plumbing that's worth its own focused change. Sidebar link added to MainLayout between Overview and Clusters. Full Admin.Tests Unit suite 14 pass / 0 fail — unchanged, no tests regressed. Full Admin build clean (0 errors, 0 warnings). Closes the 'no per-driver dashboard' gap from lmx-followups item #7 at the fleet level; per-host (platform/engine/Modbus PLC) granularity still needs a dedicated page that consumes IHostConnectivityProbe.GetHostStatuses from the Server process — that's the live-SignalR follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:12:31 -04:00
4058b88784 Merge pull request 'Phase 3 PR 26 — server-layer write authorization by role' (#25) from phase-3-pr26-server-write-authz into v2 2026-04-18 13:04:35 -04:00
Joseph Doherty
6b04a85f86 Phase 3 PR 26 — server-layer write authorization gating by role. Per the user's ACL-at-server-layer directive (saved as feedback_acl_at_server_layer.md in memory), write authorization is enforced in DriverNodeManager.OnWriteValue and never delegated to the driver or to driver-specific auth (the v1 Galaxy-provided security path is explicitly not part of v2 — drivers report SecurityClassification as discovery metadata only). New WriteAuthzPolicy static class in Server/Security/ maps SecurityClassification → required role per the table documented in docs/Configuration.md: FreeAccess = no role required (anonymous sessions can write), Operate + SecuredWrite = WriteOperate, Tune = WriteTune, VerifiedWrite + Configure = WriteConfigure, ViewOnly = deny regardless of roles. Role matching is case-insensitive and role requirements do NOT cascade — a session with WriteConfigure can write Configure attributes but needs WriteOperate separately to write Operate attributes; this is deliberate so escalation is an explicit LDAP group assignment, not a hierarchy the policy silently grants. DriverNodeManager gains a _securityByFullRef Dictionary populated during Variable() registration (parallel to the existing _variablesByFullRef) so OnWriteValue can look up the classification in O(1) on the hot path. OnWriteValue casts the session's context.UserIdentity to the new IRoleBearer interface (implemented by OtOpcUaServer.RoleBasedIdentity from PR 19) — empty Roles collection when the session is anonymous; the same WriteAuthzPolicy.IsAllowed check then either short-circuits true (FreeAccess), false (ViewOnly), or walks the roles list looking for the required one. On deny, OnWriteValue logs 'Write denied for {FullRef}: classification=X userRoles=[...]' at Information level (readable trail for operator complaints) and returns BadUserAccessDenied without touching IWritable.WriteAsync — drivers never see a request we'd have refused. IRoleBearer kept as a minimal server-side interface rather than reusing some abstraction from Core.Abstractions because the concept is OPC-UA-session-scoped and doesn't generalize (the driver side has no notion of a user session). Tests — WriteAuthzPolicyTests (17 new cases): FreeAccess allows write with empty role set + arbitrary roles; ViewOnly denies write even with every role; Operate requires WriteOperate; role match is case-insensitive; Operate denies empty role set + wrong role; SecuredWrite shares Operate's requirement; Tune requires WriteTune; Tune denies WriteOperate-only (asserts roles don't cascade — this is the test that catches a future regression where someone 'helpfully' adds a role-escalation table); Configure requires WriteConfigure; VerifiedWrite shares Configure's requirement; multi-role session allowed when any role matches; unrelated roles denied; RequiredRole theory covering all 5 classified-and-mapped rows + null for FreeAccess/ViewOnly special cases. lmx-followups.md follow-up #2 marked DONE with a back-reference to this PR and the memory note. Full Server.Tests Unit suite: 38 pass / 0 fail (17 new WriteAuthz + 14 SecurityConfiguration from PR 19 + 2 NodeBootstrap + 5 others). Server.Tests Integration (Category=Integration) 2 pass — existing PR 17 anonymous-endpoint smoke tests stay green since the read path doesn't hit OnWriteValue.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:01:01 -04:00
cd8691280a Merge pull request 'Phase 3 PR 25 — Modbus test plan + DL205 quirk catalog' (#24) from phase-3-pr25-modbus-test-plan into v2 2026-04-18 12:49:19 -04:00
Joseph Doherty
77d09bf64e Phase 3 PR 25 — modbus-test-plan.md: integration-test playbook with per-device quirk catalog. ModbusPal is the chosen simulator; AutomationDirect DL205 is the first target device class with 6 pending quirks to document and cover with named tests (word order for 32-bit values, register-zero access policy, coil addressing base, maximum registers per FC03, response framing under sustained load, exception code on protected-bit coil write). Each quirk placeholder has a proposed test name so the user's validation work translates directly into integration tests. Test conventions section codifies the named-per-quirk pattern, skip-when-unreachable guard, real ModbusTcpTransport usage, and inter-test isolation. Sets up the harness-and-catalog structure future device families (Allen-Bradley Micrologix, Siemens S7-1200 Modbus gateway, Schneider M340, whatever the user hits) will slot into — same per-device catalog shape, cross-device patterns section for recurring quirks that can get promoted into driver defaults. Next concrete PRs proposed: PR 26 for the integration test project scaffold + DL205 profile + fixture with skip-guard + one smoke test, PR 27+ for the individual confirmed quirks one-per-PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:45:21 -04:00
163c821e74 Merge pull request 'Phase 3 PR 24 — Modbus PLC data type extensions' (#23) from phase-3-pr24-modbus-types into v2 2026-04-18 12:32:55 -04:00
Joseph Doherty
eea31dcc4e Phase 3 PR 24 — Modbus PLC data type extensions. Extends ModbusDataType beyond the textbook Int16/UInt16/Int32/UInt32/Float32 set with Int64/UInt64/Float64 (4-register types), BitInRegister (single bit within a holding register, BitIndex 0-15 LSB-first), and String (ASCII packed 2 chars per register with StringLength-driven sizing). Adds ModbusByteOrder enum on ModbusTagDefinition covering the two word-orderings that matter in the real PLC population: BigEndian (ABCD — Modbus TCP standard, Schneider PLCs that follow it strictly) and WordSwap (CDAB — Siemens S7 family, several Allen-Bradley series, some Modicon families). NormalizeWordOrder helper reverses word pairs in-place for 32-bit values and reverses all four words for 64-bit values (keeps bytes big-endian within each register, which is universal; swaps only the word positions). Internal codec surface switched from (bytes, ModbusDataType) pairs to (bytes, ModbusTagDefinition) because the tag carries the ByteOrder + BitIndex + StringLength context the codec needs; RegisterCount similarly takes the tag so strings can compute ceil(StringLength/2). DriverDataType mapping in MapDataType extended to cover the new logical types — Int64/UInt64 widen to Int32 (PR 25 follow-up: extend DriverDataType enum with Int64 to avoid precision loss), Float64 maps to DriverDataType.Float64, String maps to DriverDataType.String, BitInRegister surfaces as Boolean, all other mappings preserved. BitInRegister writes throw a deliberate InvalidOperationException with a 'read-modify-write' hint — to atomically flip a single bit the driver needs to FC03 the register, OR/AND in the bit, then FC06 it back; that's a separate PR because the bit-modify atomicity story needs a per-register mutex and optional compare-and-write semantics. Everything else (decoder paths for both byte orders, Int64/UInt64/Float64 encode + decode, bit-index extraction across both register halves, String nul-truncation on decode, String nul-padding on encode) ships here. Tests (21 new ModbusDataTypeTests): RegisterCount_returns_correct_register_count_per_type theory (10 rows covering every numeric type); RegisterCount_for_String_rounds_up_to_register_pair theory (5 rows including the 0-char edge case that returns 0 registers); Int32_BigEndian_decodes_ABCD_layout + Int32_WordSwap_decodes_CDAB_layout + Float32_WordSwap_encode_decode_roundtrips (covers the two most-common 32-bit orderings); Int64_BigEndian_roundtrips + UInt64_WordSwap_reverses_four_words (word-swap on 64-bit reverses the four-word layout explicitly, with the test computing the expected wire shape by hand rather than trusting the implementation) + Float64_roundtrips_under_word_swap (3.14159265358979 survives the round-trip with 1e-12 tolerance); BitInRegister_extracts_bit_at_index theory (6 rows including LSB, MSB, and arbitrary bits in a multi-bit mask); BitInRegister_write_is_not_supported_in_PR24 (asserts the exception message steers the reader to the 'read-modify-write' follow-up); String_decodes_ASCII_packed_two_chars_per_register (decodes 'HELLO!' from 3 packed registers with the 'HELLO!'u8 test-only UTF-8 literal which happens to equal the ASCII bytes for this ASCII input); String_decode_truncates_at_first_nul ('Hi' padded with nuls reads back as 'Hi'); String_encode_nul_pads_remaining_bytes (short input writes remaining bytes as 0). Full solution: 0 errors, 217 unit + integration tests pass (22 + 30 new Modbus = 52 Modbus total, 165 pre-existing). ModbusDriver capability footprint now matches the most common industrial PLC workloads — Siemens S7 + Allen-Bradley + Modicon all supported via ByteOrder config without driver forks.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:27:12 -04:00
8a692d4ba8 Merge pull request 'Phase 3 PR 23 — Modbus IHostConnectivityProbe' (#22) from phase-3-pr23-modbus-probe into v2 2026-04-18 12:23:04 -04:00
Joseph Doherty
268b12edec Phase 3 PR 23 — Modbus IHostConnectivityProbe. ModbusDriver now implements 6 of 8 capability interfaces (adds IHostConnectivityProbe alongside IDriver + ITagDiscovery + IReadable + IWritable + ISubscribable from the earlier PRs). Background probe loop kicks off in InitializeAsync when ModbusProbeOptions.Enabled is true, sends a cheap FC03 read-1-register at ProbeAddress (default 0) every Interval (default 5s) with a per-tick Timeout (default 2s), and tracks the single Modbus endpoint's state in the HostState machine. Initial state = Unknown; first successful probe transitions to Running; any transport/timeout failure transitions to Stopped; recovery transitions back to Running. OnHostStatusChanged fires exactly on transitions (not on repeat successes — prevents event-spam on a healthy connection). HostName format is 'host:port' so the Admin UI can display the endpoint uniformly with Galaxy platforms/engines in the fleet status dashboard. GetHostStatuses returns a single-item list with the current state + last-change timestamp (Modbus driver talks to exactly one endpoint per instance — operators spin up multiple driver instances for multi-PLC deployments). ShutdownAsync cancels the probe CTS before tearing down the transport so the loop can't log a spurious Stopped after intentional shutdown (OperationCanceledException caught separately from the 'real' transport errors). ModbusDriverOptions extended with ModbusProbeOptions sub-record (Enabled default true, Interval 5s, Timeout 2s, ProbeAddress ushort for PLCs that have register-0 policies; most PLCs tolerate an FC03 at 0 but some industrial gateways lock it). Tests (7 new ModbusProbeTests): Initial_state_is_Unknown_before_first_probe_tick (probe disabled, state stays Unknown, HostName formatted); First_successful_probe_transitions_to_Running (enabled, waits for probe count + event queue, asserts Unknown → Running with correct OldState/NewState); Transport_failure_transitions_to_Stopped (flip fake.Reachable = false mid-run, wait for state diff); Recovery_transitions_Stopped_back_to_Running (up → down → up, asserts ≥ 3 transitions); Repeated_successful_probes_do_not_generate_duplicate_Running_events (several hundred ms of stable probes, count stays at 1); Disabled_probe_stays_Unknown_and_fires_no_events (safety guard when operator wants to disable probing); Shutdown_stops_the_probe_loop (probe count captured at shutdown, delay 400ms, assert ≤ 1 extra to tolerate the narrow race where an in-flight tick completes after shutdown — the contract is 'no new ticks scheduled' not 'instantaneous freeze'). FlappyTransport fake exposes a volatile Reachable flag so tests can toggle the PLC availability mid-run, + ProbeCount counter so tests can assert the loop actually issued requests. WaitForStateAsync helper polls GetHostStatuses up to a deadline; tolerates scheduler jitter on slow CI runners. Full solution: 0 errors, 202 unit + integration tests pass (22 Modbus + 180 pre-existing).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:12:00 -04:00
edce1be742 Merge pull request 'Phase 3 PR 22 — Modbus ISubscribable via polling overlay' (#21) from phase-3-pr22-modbus-subscribe into v2 2026-04-18 12:07:51 -04:00
Joseph Doherty
18b3e24710 Phase 3 PR 22 — Modbus ISubscribable via polling overlay. Modbus has no push model at the wire protocol (unlike MXAccess's OnDataChange callback or OPC UA's own Subscription service), so the driver layers a per-subscription polling loop on top of the existing IReadable path: SubscribeAsync returns an opaque ModbusSubscriptionHandle, starts a background Task.Run that sleeps for the requested publishing interval (floored to 100ms so a misconfigured sub-10ms request doesn't hammer the PLC), reads every subscribed tag through the same FC01/03/04 path the one-shot ReadAsync uses, diffs the returned DataValueSnapshot against the last known value per tag, and raises OnDataChange exactly when (a) it's the first poll (initial-data push per OPC UA Part 4 convention) or (b) boxed value changed or (c) StatusCode changed — stable values don't generate event traffic after the initial push, matching the v1 MXAccess OnDataChange shape. SubscriptionState record holds the handle + tag list + interval + per-subscription CancellationTokenSource + ConcurrentDictionary<string,DataValueSnapshot> LastValues; UnsubscribeAsync removes the state from _subscriptions and cancels the CTS, stopping the polling loop cleanly. Multiple overlapping subscriptions each get their own polling Task so a slow PLC on one subscription can't stall the others. ShutdownAsync cancels every active subscription CTS before tearing down the transport so the driver doesn't leave orphaned polling tasks pumping requests against a disposed socket. Transient poll errors are swallowed inside the loop (the loop continues to the next tick) — the driver's health surface reflects the last-known Degraded state from the underlying ReadAsync path. OperationCanceledException is caught separately to exit the loop silently on unsubscribe/shutdown. Tests (6 new ModbusSubscriptionTests): Initial_poll_raises_OnDataChange_for_every_subscribed_tag asserts the initial-data push fires once per tag in the subscribe call (2 tags → 2 events with FullReference='Level' and FullReference='Temp'); Unchanged_values_do_not_raise_after_initial_poll lets the loop run ~5 cycles at 100ms with a stable register value, asserts only the initial push fires (no event spam on stable tags); Value_change_between_polls_raises_OnDataChange mutates the fake register bank between poll ticks and asserts a second event fires with the new value (verified via e.Snapshot.Value.ShouldBe((short)200)); Unsubscribe_stops_the_polling_loop captures the event count right after UnsubscribeAsync, mutates a register that would have triggered a change if polling continued, asserts the count stays the same after 400ms; SubscribeAsync_floors_intervals_below_100ms passes a 10ms interval + asserts only 1 event fires across 300ms (if the floor weren't enforced we'd see 30 — the test asserts the floor semantically by counting events on stable data); Multiple_subscriptions_fire_independently creates two subs on different tags, unsubscribes only one, mutates the other's tag, asserts only the surviving sub emits while the unsubscribed one stays at its pre-unsubscribe count. FakeTransport in this test file is scoped to FC03 only since that's all the subscription path exercises — keeps the test doubles minimal and the failure modes obvious. WaitForCountAsync helper polls a ConcurrentQueue up to a deadline, makes the tests tolerant of scheduler jitter on slow CI runners. Full solution: 0 errors, 195 tests pass (6 new subscription + 9 existing Modbus + 180 pre-existing). ModbusDriver now implements IDriver + ITagDiscovery + IReadable + IWritable + ISubscribable — five of the eight capability interfaces. IAlarmSource + IHistoryProvider remain unimplemented because Modbus has no wire-level alarm or history semantics; IHostConnectivityProbe is a plausible future addition (treat transport disconnect as a Stopped signal) but needs the socket-level connection-state tracking plumbed through IModbusTransport which is its own PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:03:39 -04:00
f6a12dafe9 Merge pull request 'Phase 3 PR 21 — Modbus TCP driver (first native-protocol greenfield)' (#20) from phase-3-pr21-modbus-driver into v2 2026-04-18 11:58:20 -04:00
Joseph Doherty
058c3dddd3 Phase 3 PR 21 — Modbus TCP driver: first native-protocol greenfield for v2. New src/Driver.Modbus project with ModbusDriver implementing IDriver + ITagDiscovery + IReadable + IWritable. Validates the driver-agnostic abstractions (IAddressSpaceBuilder, DriverAttributeInfo, DataValueSnapshot, WriteRequest/WriteResult) generalize beyond Galaxy — nothing Galaxy-specific is used here. ModbusDriverOptions carries Host/Port/UnitId/Timeout + a pre-declared tag list (Modbus has no discovery protocol — tags are configuration). IModbusTransport abstracts the socket layer so tests swap in-memory fakes; concrete ModbusTcpTransport speaks the MBAP ADU (TxId + Protocol=0 + Length + UnitId + PDU) over TcpClient, serializes requests through a semaphore for single-flight in-order responses, validates the response TxId matches, surfaces server exception PDUs as ModbusException with function code + exception code. DiscoverAsync streams one folder per driver with a BaseDataVariable per tag + DriverAttributeInfo that flags writable tags as SecurityClassification.Operate vs ViewOnly for read-only regions. ReadAsync routes per-tag by ModbusRegion: FC01 for Coils, FC02 for DiscreteInputs, FC03 for HoldingRegisters, FC04 for InputRegisters; register values decoded through System.Buffers.Binary.BinaryPrimitives (BigEndian for single-register Int16/UInt16 + two-register Int32/UInt32/Float32 per standard modbus word-swap conventions). WriteAsync uses FC05 (Write Single Coil with 0xFF00/0x0000 encoding) for booleans, FC06 (Write Single Register) for 16-bit types, FC16 (Write Multiple Registers) for 32-bit types. Unknown tag → BadNodeIdUnknown; write to InputRegister or DiscreteInput or Writable=false tag → BadNotWritable; exception during transport → BadInternalError + driver health Degraded. Subscriptions + Historian + Alarms deliberately out of scope — Modbus has no push model (subscribe would be a polling overlay, additive PR) and no history/alarm semantics at the protocol level. Tests (9 new ModbusDriverTests): InitializeAsync connects + populates the tag map + sets health=Healthy; Read Int16 from HoldingRegister returns BigEndian value; Read Float32 spans two registers BigEndian (IEEE 754 single for 25.5f round-trips exactly); Read Coil returns boolean from the bit-packed response; unknown tag name returns BadNodeIdUnknown without an exception; Write UInt16 round-trips via FC06; Write Float32 uses FC16 (two-register write verified by decoding back through the fake register bank); Write to InputRegister returns BadNotWritable; Discover streams one folder + one variable per tag with correct DriverDataType mapping (Int16/Int32→Int32, UInt16/UInt32→Int32, Float32→Float32, Bool→Boolean). FakeTransport simulates a 256-register/256-coil bank + implements the 7 function codes the driver uses. slnx updated with the new src + tests entries. Full solution post-add: 0 errors, 189 tests pass (9 Modbus + 180 pre-existing). IDriver abstraction validated against a fundamentally different protocol — Modbus TCP has no AlarmExtension, no ScanState, no IPC boundary, no historian, no LDAP — and the same builder/reader/writer contract plugged straight in. Future PRs on this driver: ISubscribable via a polling loop, IHostConnectivityProbe for dead-device detection, PLC-specific data-type extensions (Int64/BCD/string-in-registers).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:55:21 -04:00
52791952dd Merge pull request 'Phase 3 PR 20 — lmx-followups.md' (#19) from phase-3-pr20-lmx-followups into v2 2026-04-18 11:50:38 -04:00
Joseph Doherty
860deb8e0d Phase 3 PR 20 — lmx-followups.md: track remaining Galaxy-bridge tasks after PR 19 (HistoryReadAtTime/Events Proxy wiring, write-gating by role, Admin UI cert trust, live-LDAP integration test, full-stack Galaxy smoke, multi-driver test, per-host dashboard). Documents what each item depends on, the shipped surface it builds on, and the minimal to-do so a future session can pick any one off in isolation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:43:15 -04:00
f5e7173de3 Merge pull request 'Phase 3 PR 19 — LDAP user identity + Basic256Sha256 security profile' (#18) from phase-3-pr19-ldap-security into v2 2026-04-18 11:36:18 -04:00
Joseph Doherty
22d3b0d23c Phase 3 PR 19 — LDAP user identity + Basic256Sha256 security profile. Replaces the anonymous-only endpoint with a configurable security profile and an LDAP-backed UserName token validator. New IUserAuthenticator abstraction in Backend/Security/: LdapUserAuthenticator binds to the configured directory (reuses the pattern from Admin.Security.LdapAuthService without the cross-app dependency — Novell.Directory.Ldap.NETStandard 3.6.0 package ref added to Server alongside the existing OPCFoundation packages) and maps group membership to OPC UA roles via LdapOptions.GroupToRole (case-insensitive). DenyAllUserAuthenticator is the default when Ldap.Enabled=false so UserName token attempts return a clean BadUserAccessDenied rather than hanging on a localhost:3893 bind attempt. OpcUaSecurityProfile enum + LdapOptions nested record on OpcUaServerOptions. Profile=None keeps the PR 17 shape (SecurityPolicies.None + Anonymous token only) so existing integration tests stay green; Profile=Basic256Sha256SignAndEncrypt adds a second ServerSecurityPolicy (Basic256Sha256 + SignAndEncrypt) to the collection and, when Ldap.Enabled=true, adds a UserName token policy scoped to SecurityPolicies.Basic256Sha256 only — passwords must ride an encrypted channel, the stack rejects UserName over None. OtOpcUaServer.OnServerStarted hooks SessionManager.ImpersonateUser: AnonymousIdentityToken passes through; UserNameIdentityToken delegates to IUserAuthenticator.AuthenticateAsync — rejected identities throw ServiceResultException(BadUserAccessDenied); accepted identities get a RoleBasedIdentity that carries the resolved roles through session.Identity so future PRs can gate writes by role. OpcUaApplicationHost + OtOpcUaServer constructors take IUserAuthenticator as a dependency. Program.cs binds the new OpcUaServer:Ldap section from appsettings (Enabled defaults false, GroupToRole parsed as Dictionary<string,string>), registers IUserAuthenticator as LdapUserAuthenticator when enabled or DenyAllUserAuthenticator otherwise. PR 17 integration test updated to pass DenyAllUserAuthenticator so it keeps exercising the anonymous-only path unchanged. Tests — SecurityConfigurationTests (new, 13 cases): DenyAllAuthenticator rejects every credential; LdapAuthenticator rejects blank creds without hitting the server; rejects when Enabled=false; rejects plaintext when both UseTls=false AND AllowInsecureLdap=false (safety guard matching the Admin service); EscapeLdapFilter theory (4 rows: plain passthrough, parens/asterisk/backslash → hex escape) — regression guard against LDAP injection; ExtractOuSegment theory (3 rows: finds ou=, returns null when absent, handles multiple ou segments by returning first); ExtractFirstRdnValue theory (3 rows: strips cn= prefix, handles single-segment DN, returns plain string unchanged when no =). OpcUaServerOptions_default_is_anonymous_only asserts the default posture preserves PR 17 behavior. InternalsVisibleTo('ZB.MOM.WW.OtOpcUa.Server.Tests') added to Server csproj so ExtractOuSegment and siblings are reachable from the tests. Full solution: 0 errors, 180 tests pass (8 Core + 14 Proxy + 24 Configuration + 6 Shared + 91 Galaxy.Host + 19 Server (17 unit + 2 integration) + 18 Admin). Live-LDAP integration test (connect via Basic256Sha256 endpoint with a real user from GLAuth, assert the session.Identity carries the mapped role) is deferred to a follow-up — it requires the GLAuth dev instance to be running at localhost:3893 which is dev-machine-specific, and the test harness for that also needs a fresh client-side certificate provisioned by the live server's trusted store.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:49:46 -04:00
55696a8750 Merge pull request 'Phase 3 PR 18 — delete v1 archived projects' (#17) from phase-3-pr18-delete-v1 into v2 2026-04-18 08:41:56 -04:00
Joseph Doherty
dd3a449308 Phase 3 PR 18 — delete v1 archived projects. PR 2 archived via IsTestProject=false + PropertyGroup comment; PR 17 landed the full v2 OPC UA server runtime (ApplicationConfiguration + endpoint + client integration test); every v1 surface is now functionally superseded. This PR removes the archive: 154 files across 5 projects — src/OtOpcUa.Host (v1 server, 158 files), src/Historian.Aveva (v1 historian plugin, 4 files), tests/OtOpcUa.Tests.v1Archive (494 unit tests that were archived in PR 2 with IsTestProject=false), tests/Historian.Aveva.Tests (18 tests against the v1 plugin), tests/OtOpcUa.IntegrationTests (6 tests against the v1 Host). slnx trimmed to reflect the current set (12 src + 12 tests). Verified zero incoming references from live projects before deleting — no live csproj references .Host or .Historian.Aveva since PR 5 ported Historian into Driver.Galaxy.Host/Backend/Historian/ and PR 17 stood up the new OtOpcUa.Server. Full solution post-delete: 0 errors, 165 unit + integration tests pass (8 Core + 14 Proxy + 24 Configuration + 91 Galaxy.Host + 6 Shared + 4 Server + 18 Admin) — no regressions. Recovery path if a future PR needs to resurrect a specific v1 routine: git revert this commit or cherry-pick the specific file from pre-delete history; v1 is preserved in the full branch history, not lost.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:35:22 -04:00
3c1dc334f9 Merge pull request 'Phase 3 PR 17 — complete OPC UA server startup + live integration test' (#16) from phase-3-pr17-server-startup into v2 2026-04-18 08:28:42 -04:00
Joseph Doherty
46834a43bd Phase 3 PR 17 — complete OPC UA server startup end-to-end + integration test. PR 16 shipped the materialization shape (DriverNodeManager / OtOpcUaServer) without the activation glue; this PR finishes the scope so an external OPC UA client can actually connect, browse, and read. New OpcUaServerOptions DTO bound from the OpcUaServer section of appsettings.json (EndpointUrl default opc.tcp://0.0.0.0:4840/OtOpcUa, ApplicationName, ApplicationUri, PkiStoreRoot default %ProgramData%\OtOpcUa\pki, AutoAcceptUntrustedClientCertificates default true for dev — production flips via config). OpcUaApplicationHost wraps Opc.Ua.Configuration.ApplicationInstance: BuildConfiguration constructs the ApplicationConfiguration programmatically (no external XML) with SecurityConfiguration pointing at <PkiStoreRoot>/own, /issuers, /trusted, /rejected directories — stack auto-creates the cert folders on first run and generates a self-signed application certificate via CheckApplicationInstanceCertificate, ServerConfiguration.BaseAddresses set to the endpoint URL + SecurityPolicies just None + UserTokenPolicies just Anonymous with PolicyId='Anonymous' + SecurityPolicyUri=None so the client's UserTokenPolicy lookup succeeds at OpenSession, TransportQuotas.OperationTimeout=15s + MinRequestThreadCount=5 / MaxRequestThreadCount=100 / MaxQueuedRequestCount=200, CertificateValidator auto-accepts untrusted when configured. StartAsync creates the OtOpcUaServer (passes DriverHost + ILoggerFactory so one DriverNodeManager is created per registered driver in CreateMasterNodeManager from PR 16), calls ApplicationInstance.Start(server) to bind the endpoint, then walks each DriverNodeManager and drives a fresh GenericDriverNodeManager.BuildAddressSpaceAsync against it so the driver's discovery streams into the address space that's already serving clients. Per-driver discovery is isolated per decision #12: a discovery exception marks the driver's subtree faulted but the server stays up serving the other drivers' subtrees. DriverHost.GetDriver(instanceId) public accessor added alongside the existing GetHealth so OtOpcUaServer can enumerate drivers during CreateMasterNodeManager. DriverNodeManager.Driver property made public so OpcUaApplicationHost can identify which driver each node manager wraps during the discovery loop. OpcUaServerService constructor takes OpcUaApplicationHost — ExecuteAsync sequence now: bootstrap.LoadCurrentGenerationAsync → applicationHost.StartAsync → infinite Task.Delay until stop. StopAsync disposes the application host (which stops the server via OtOpcUaServer.Stop) before disposing DriverHost. Program.cs binds OpcUaServerOptions from appsettings + registers OpcUaApplicationHost + OpcUaServerOptions as singletons. Integration test (OpcUaServerIntegrationTests, Category=Integration): IAsyncLifetime spins up the server on a random non-default port (48400+random for test isolation) with a per-test-run PKI store root (%temp%/otopcua-test-<guid>) + a FakeDriver registered in DriverHost that has ITagDiscovery + IReadable implementations — DiscoverAsync registers TestFolder>Var1, ReadAsync returns 42. Client_can_connect_and_browse_driver_subtree creates an in-process OPC UA client session via CoreClientUtils.SelectEndpoint (which talks to the running server's GetEndpoints and fetches the live EndpointDescription with the actual PolicyId), browses the fake driver's root, asserts TestFolder appears in the returned references. Client_can_read_a_driver_variable_through_the_node_manager constructs the variable NodeId using the namespace index the server registered (urn:OtOpcUa:fake), calls Session.ReadValue, asserts the DataValue.Value is 42 — the whole pipeline (client → server endpoint → DriverNodeManager.OnReadValue → FakeDriver.ReadAsync → back through the node manager → response to client) round-trips correctly. Dispose tears down the session, server, driver host, and PKI store directory. Full solution: 0 errors, 165 tests pass (8 Core unit + 14 Proxy unit + 24 Configuration unit + 6 Shared unit + 91 Galaxy.Host unit + 4 Server (2 unit NodeBootstrap + 2 new integration) + 18 Admin). End-to-end outcome: PR 14's GalaxyAlarmTracker alarm events now flow through PR 15's GenericDriverNodeManager event forwarder → PR 16's ConditionSink → OPC UA AlarmConditionState.ReportEvent → out to every OPC UA client subscribed to the alarm condition. The full alarm subsystem (driver-side subscription of the Galaxy 4-attribute quartet, Core-side routing by source node id, Server-side AlarmConditionState materialization with ReportEvent dispatch) is now complete and observable through any compliant OPC UA client. LDAP / security-profile wire-up (replacing the anonymous-only endpoint with BasicSignAndEncrypt + user identity mapping to NodePermissions role) is the next layer — it reuses the same ApplicationConfiguration plumbing this PR introduces but needs a deployment-policy source (central config DB) for the cert trust decisions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:18:37 -04:00
7683b94287 Merge pull request 'Phase 3 PR 16 — concrete OPC UA server scaffolding + AlarmConditionState materialization' (#15) from phase-3-pr16-opcua-server into v2 2026-04-18 08:10:44 -04:00
Joseph Doherty
f53c39a598 Phase 3 PR 16 — concrete OPC UA server scaffolding with AlarmConditionState materialization. Introduces the OPCFoundation.NetStandard.Opc.Ua.Server package (v1.5.374.126, same version the v1 stack already uses) and two new server-side classes: DriverNodeManager : CustomNodeManager2 is the concrete realization of PR 15's IAddressSpaceBuilder contract — Folder() creates FolderState nodes under an Organizes hierarchy rooted at ObjectsFolder > DriverInstanceId; Variable() creates BaseDataVariableState with DataType mapped from DriverDataType (Boolean/Int32/Float/Double/String/DateTime) + ValueRank (Scalar or OneDimension) + AccessLevel CurrentReadOrWrite; AddProperty() creates PropertyState with HasProperty reference. Read hook wires OnReadValue per variable to route to IReadable.ReadAsync; Write hook wires OnWriteValue to route to IWritable.WriteAsync and surface per-tag StatusCode. MarkAsAlarmCondition() materializes an OPC UA AlarmConditionState child of the variable, seeded from AlarmConditionInfo (SourceName, InitialSeverity → UA severity via Low=250/Medium=500/High=700/Critical=900, InitialDescription), initial state Enabled + Acknowledged + Inactive + Retain=false. Returns an IAlarmConditionSink whose OnTransition updates alarm.Severity/Time/Message and switches state per AlarmType string ('Active' → SetActiveState(true) + SetAcknowledgedState(false) + Retain=true; 'Acknowledged' → SetAcknowledgedState(true); 'Inactive' → SetActiveState(false) + Retain=false if already Acked) then calls alarm.ReportEvent to emit the OPC UA event to subscribed clients. Galaxy's GalaxyAlarmTracker (PR 14) now lands at a concrete AlarmConditionState node instead of just raising an unobserved C# event. OtOpcUaServer : StandardServer wires one DriverNodeManager per DriverHost.GetDriver during CreateMasterNodeManager — anonymous endpoint, no security profile (minimum-viable; LDAP + security-profile wire-up is the next PR). DriverHost gains public GetDriver(instanceId) so the server can enumerate drivers at startup. NestedBuilder inner class in DriverNodeManager implements IAddressSpaceBuilder by temporarily retargeting the parent's _currentFolder during each call so Folder→Variable→AddProperty land under the correct subtree — not thread-safe if discovery ran concurrently, but GenericDriverNodeManager.BuildAddressSpaceAsync is sequential per driver so this is safe by construction. NuGet audit suppress for GHSA-h958-fxgg-g7w3 (moderate-severity in OPCFoundation.NetStandard.Opc.Ua.Core 1.5.374.126; v1 stack already accepts this risk on the same package version). PR 16 is scoped as scaffolding — the actual server startup (ApplicationInstance, certificate config, endpoint binding, session management wiring into OpcUaServerService.ExecuteAsync) is deferred to a follow-up PR because it needs ApplicationConfiguration XML + optional-cert-store logic that depends on per-deployment policy decisions. The materialization shape is complete: a subsequent PR adds 100 LOC to start the server and all the already-written IAddressSpaceBuilder + alarm-condition + read/write wire-up activates end-to-end. Full solution: 0 errors, 152 unit tests pass (no new tests this PR — DriverNodeManager unit testing needs an IServerInternal mock which is heavyweight; live-endpoint integration tests land alongside the server-startup PR).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:00:36 -04:00
d569c39f30 Merge pull request 'Phase 3 PR 15 — alarm-condition contract in abstract layer' (#14) from phase-3-pr15-alarm-contract into v2 2026-04-18 07:54:30 -04:00
Joseph Doherty
190d09cdeb Phase 3 PR 15 — alarm-condition contract in IAddressSpaceBuilder + wire OnAlarmEvent through GenericDriverNodeManager. IAddressSpaceBuilder.IVariableHandle gains MarkAsAlarmCondition(AlarmConditionInfo) which returns an IAlarmConditionSink. AlarmConditionInfo carries SourceName/InitialSeverity/InitialDescription. Concrete address-space builders (the upcoming PR 16 OPC UA server backend) materialize a sibling AlarmConditionState node on the first call; the sink receives every lifecycle transition the generic node manager forwards. GenericDriverNodeManager gains a CapturingBuilder wrapper that transparently wraps every Folder/Variable call — the wrapper observes MarkAsAlarmCondition calls without participating in materialization, captures the resulting IAlarmConditionSink into an internal source-node-id → sink ConcurrentDictionary keyed by IVariableHandle.FullReference. After DiscoverAsync completes, if the driver implements IAlarmSource the node manager subscribes to OnAlarmEvent and routes every AlarmEventArgs to the sink registered for args.SourceNodeId — unknown source ids are dropped silently (may belong to another driver or to a variable the builder chose not to flag). Dispose unsubscribes the forwarder to prevent dangling invocation-list references across node-manager rebuilds. GalaxyProxyDriver.DiscoverAsync now calls handle.MarkAsAlarmCondition(new AlarmConditionInfo(fullName, AlarmSeverity.Medium, null)) on every attr.IsAlarm=true variable — severity seed is Medium because the live Priority byte arrives through the subsequent GalaxyAlarmEvent stream (which PR 14's GalaxyAlarmTracker now emits); the Admin UI sees the severity update on the first transition. RecordingAddressSpaceBuilder in Driver.Galaxy.E2E gains a RecordedAlarmCondition list + a RecordingSink implementation that captures AlarmEventArgs for test assertion — the E2E parity suite can now verify alarm-condition registration shape in addition to folder/variable shape. Tests (4 new GenericDriverNodeManagerTests): Alarm_events_are_routed_to_the_sink_registered_for_the_matching_source_node_id — 2 alarms registered (Tank.HiHi + Heater.OverTemp), driver raises an event for Tank.HiHi, the Tank.HiHi sink captures the payload, the Heater.OverTemp sink does not (tag-scoped fan-out, not broadcast); Non_alarm_variables_do_not_register_sinks — plain Tank.Level in the same discover is not in TrackedAlarmSources; Unknown_source_node_id_is_dropped_silently — a transition for Unknown.Source doesn't reach any sink + no exception; Dispose_unsubscribes_from_OnAlarmEvent — post-dispose, a transition for a previously-registered tag is no-op because the forwarder detached. InternalsVisibleTo('ZB.MOM.WW.OtOpcUa.Core.Tests') added to Core csproj so TrackedAlarmSources internal property is visible to the test. Full solution: 0 errors, 152 unit tests pass (8 Core + 14 Proxy + 14 Admin + 24 Configuration + 6 Shared + 84 Galaxy.Host + 2 Server). PR 16 will implement the concrete OPC UA address-space builder that materializes AlarmConditionState from this contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:51:35 -04:00
4e0040e670 Merge pull request 'Phase 2 PR 14 — alarm subsystem (subscribe to alarm attribute quartet + raise GalaxyAlarmEvent)' (#13) from phase-2-pr14-alarm-subsystem into v2 2026-04-18 07:37:49 -04:00
91cb2a1355 Merge pull request 'Phase 2 PR 13 — port GalaxyRuntimeProbeManager + per-platform ScanState probing' (#12) from phase-2-pr13-runtime-probe into v2 2026-04-18 07:37:41 -04:00
Joseph Doherty
c14624f012 Phase 2 PR 14 — alarm subsystem wire-up. Per IsAlarm=true attribute (PR 9 added the discovery flag), GalaxyAlarmTracker in Backend/Alarms/ advises the four Galaxy alarm-state attributes: .InAlarm (boolean alarm active), .Priority (int severity), .DescAttrName (human-readable description), .Acked (boolean acknowledged). Runs the OPC UA Part 9 alarm lifecycle state machine simplified for the Galaxy AlarmExtension model and raises AlarmTransition events on transitions operators must react to — Active (InAlarm false→true, default Unacknowledged), Acknowledged (Acked false→true while InAlarm still true), Inactive (InAlarm true→false). MxAccessGalaxyBackend instantiates the tracker in its constructor with delegate-based subscribe/unsubscribe/write pointers to MxAccessClient, hooks TransitionRaised to forward each transition through the existing OnAlarmEvent IPC event that PR 4 ConnectionSink wires into MessageKind.AlarmEvent frames — no new contract messages required since GalaxyAlarmEvent already exists in Shared.Contracts. Field mapping: EventId = fresh Guid.ToString('N') per transition, ObjectTagName = alarm attribute full reference, AlarmName = alarm attribute full reference, Severity = tracked Priority, StateTransition = 'Active'|'Acknowledged'|'Inactive', Message = DescAttrName or tag fallback, UtcUnixMs = transition time. DiscoverAsync caches every IsAlarm=true attribute's full reference (tag.attribute) into _discoveredAlarmTags (ConcurrentBag cleared-then-filled on every re-Discover to track Galaxy redeploys). SubscribeAlarmsAsync iterates the cache and advises each via GalaxyAlarmTracker.TrackAsync; best-effort per-alarm — a subscribe failure on one alarm doesn't abort the whole call since operators prefer partial alarm coverage to none. Tracker is internally idempotent on repeat Track calls (second invocation for same alarm tag is a no-op; already-subscribed check short-circuits before the 4 MXAccess sub calls). Subscribe-failure rollback inside TrackAsync removes the alarm state + unadvises any of the 4 that did succeed so a partial advise can't leak a phantom tracking entry. AcknowledgeAlarmAsync routes to tracker.AcknowledgeAsync which writes the operator comment to <tag>.AckMsg via MxAccessClient.WriteAsync — writes use the existing MXAccess OnWriteComplete TCS-by-handle path (PR 4 Medium 4) so a runtime-refused ack bubbles up as Success=false rather than false-positive. State-machine quirks preserved from v1: (1) initial Acked=true on subscribe does NOT fire Acknowledged (alarm at rest, pre-acknowledged — default state is Acked=true so the first subscribe callback is a no-op transition), (2) Acked false→true only fires Acknowledged when InAlarm is currently true (acking a latched-inactive alarm is not a user-visible transition), (3) Active transition clears the Acked flag in-state so the next Acked callback correctly fires Acknowledged (v1 had this buried in the ConditionState logic; we track it on the AlarmState struct directly). Priority value handled as int/short/long via type pattern match with int.MaxValue guard — Galaxy attribute category returns varying CLR types (Int32 is canonical but some older templates use Int16), and a long overflow cast to int would silently corrupt the severity. Dispose cascade in MxAccessGalaxyBackend.Dispose: alarm-tracker unsubscribe→dispose, probe-manager unsubscribe→dispose, mx.ConnectionStateChanged detach, historian dispose — same discipline PR 6 / PR 8 / PR 13 established so dangling invocation-list refs don't survive a backend recycle. #pragma warning disable CS0067 around OnAlarmEvent removed since the event is now raised. Tests (9 new, GalaxyAlarmTrackerTests): four-attribute subscribe per alarm, idempotent repeat-track, InAlarm false→true fires Active with Priority + Desc, InAlarm true→false fires Inactive, Acked false→true while InAlarm fires Acknowledged, Acked transition while InAlarm=false does not fire, AckMsg write path carries the comment, snapshot reports latest four fields, foreign probe callback for a non-tracked tag is silently dropped. Full Galaxy.Host.Tests Unit suite 84 pass / 0 fail (9 new alarm + 12 PR 13 probe + 21 PR 12 quality + 42 pre-existing). Galaxy.Host builds clean (0/0). Branches off phase-2-pr13-runtime-probe so the MxAccessGalaxyBackend constructor/Dispose chain gets the probe-manager + alarm-tracker wire-up in a coherent order; fast-forwards if PR 13 merges first.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:34:13 -04:00
Joseph Doherty
04d267d1ea Phase 2 PR 13 — port GalaxyRuntimeProbeManager state machine + wire per-platform ScanState probing. PR 8 gave operators the gateway-level transport signal (MxAccessClient.ConnectionStateChanged → OnHostStatusChanged tagged with the Wonderware client identity) — enough to detect when the entire MXAccess COM proxy dies, but silent when a specific or goes down while the gateway stays alive. This PR closes that gap: a pure-logic GalaxyRuntimeProbeManager (ported from v1's 472-LOC GalaxyRuntimeProbeManager.cs, distilled to ~240 LOC of state machine without the OPC UA node-manager entanglement) lives in Backend/Stability/, advises <TagName>.ScanState per WinPlatform + AppEngine gobject discovered during DiscoverAsync, runs the Unknown → Running → Stopped state machine with v1's documented semantics preserved verbatim, and raises a StateChanged event on transitions operators should react to. MxAccessGalaxyBackend instantiates the probe manager in the constructor with SubscribeAsync/UnsubscribeAsync delegate pointers into MxAccessClient, hooks StateChanged to forward each transition through the same OnHostStatusChanged IPC event the gateway signal uses (HostName = platform/engine TagName, RuntimeStatus = 'Running'|'Stopped'|'Unknown', LastObservedUtcUnixMs from the state-change timestamp), so Admin UI gets per-host signals flowing through the existing PR 8 wire with no additional IPC plumbing. State machine rules ported from v1 runtimestatus.md: (a) ScanState is on-change-only — a stably-Running host may go hours without a callback, so Running → Stopped is driven only by explicit ScanState=false, never by starvation; (b) Unknown → Running is a startup transition and does NOT fire StateChanged (would paint every host as 'just recovered' at startup, which is noise and can clear Bad quality set by a concurrently-stopping sibling); (c) Stopped → Running fires StateChanged for the real recovery case; (d) Running → Stopped fires StateChanged; (e) Unknown → Stopped fires StateChanged because that's the first-known-bad signal operators need when a host is down at our startup time. MxAccessGalaxyBackend.DiscoverAsync calls _probeManager.SyncAsync with the runtime-host subset of the hierarchy (CategoryId == 1 WinPlatform or 3 AppEngine) as a best-effort step after building the Discover response — probe failures are swallowed so Discover still returns the hierarchy even if a per-host advise fails; the gateway signal covers the critical rung. SyncAsync is idempotent (second call with the same set is a no-op) and handles the diff on re-Discover for tag rename / host add / host remove. Subscribe failure rolls back the host's state entry under the lock so a later probe callback for a never-advised tag can't transition a phantom state from Unknown to Stopped and fan out a false host-down signal (the same protection v1's GalaxyRuntimeProbeManager had at line 237-243 of v1 with a captured-probe-string comparison under the lock). MxAccessGalaxyBackend.Dispose unsubscribes the StateChanged handler before disposing the probe manager to prevent dangling invocation-list references across reconnects, same discipline as PR 8's ConnectionStateChanged and PR 6's SubscriptionReplayFailed. Tests (12 new GalaxyRuntimeProbeManagerTests): Sync_subscribes_to_ScanState_per_host verifies tag.ScanState subscriptions are advised per Platform/Engine; Sync_is_idempotent_on_repeat_call_with_same_set verifies no duplicate subscribes; Sync_unadvises_removed_hosts verifies the diff unadvises gone hosts; Subscribe_failure_rolls_back_host_entry_so_later_transitions_do_not_fire_stale_events covers the rollback-on-subscribe-fail guard; Unknown_to_Running_does_not_fire_StateChanged preserves the startup-noise rule; Running_to_Stopped_fires_StateChanged_with_both_states asserts OldState and NewState are both captured in the transition record; Stopped_to_Running_fires_StateChanged_for_recovery verifies the recovery case; Unknown_to_Stopped_fires_StateChanged_for_first_known_bad_signal preserves the first-bad rule; Repeated_Good_Running_callbacks_do_not_fire_duplicate_events verifies the state-tracking de-dup; Unknown_callback_for_non_tracked_probe_is_dropped asserts a foreign callback is silently ignored; Snapshot_reports_current_state_for_every_tracked_host covers the dashboard query hook; IsRuntimeHost_recognizes_WinPlatform_and_AppEngine_category_ids asserts the CategoryId filter. Galaxy.Host.Tests Unit suite 75 pass / 0 fail (12 new probe + 63 pre-existing). Galaxy.Host builds clean (0 errors / 0 warnings). Branches off v2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:27:56 -04:00
4448db8207 Merge pull request 'Phase 2 PR 12 � richer historian quality mapping' (#11) from phase-2-pr12-quality-mapper into v2 2026-04-18 07:22:44 -04:00
d96b513bbc Merge pull request 'Phase 2 PR 11 � HistoryReadEvents IPC (alarm history)' (#10) from phase-2-pr11-history-events into v2 2026-04-18 07:22:33 -04:00
053c4e0566 Merge pull request 'Phase 2 PR 10 � HistoryReadAtTime IPC surface' (#9) from phase-2-pr10-history-attime into v2 2026-04-18 07:22:16 -04:00
Joseph Doherty
f24f969a85 Phase 2 PR 12 — richer historian quality mapping. Replace MxAccessGalaxyBackend's inline MapHistorianQualityToOpcUa category-only helper (192+→Good, 64-191→Uncertain, 0-63→Bad) with a new public HistorianQualityMapper.Map utility that preserves specific OPC DA subcodes — BadNotConnected(8)→0x808A0000u instead of generic Bad(0x80000000u), UncertainSubNormal(88)→0x40950000u instead of generic Uncertain, Good_LocalOverride(216)→0x00D80000u instead of generic Good, etc. Mirrors v1 QualityMapper.MapToOpcUaStatusCode byte-for-byte without pulling in OPC UA types — the function returns uint32 literals that are the canonical OPC UA StatusCode wire encoding, surfaced directly as DataValueSnapshot.StatusCode on the Proxy side with no additional translation. Unknown subcodes fall back to the family category (255→Good, 150→Uncertain, 50→Bad) so a future SDK change that adds a quality code we don't map yet still gets a sensible bucket. GalaxyDataValue wire shape unchanged (StatusCode stays uint) — this is a pure fidelity upgrade on the Host side. Downstream callers (Admin UI status dashboard, OPC UA clients receiving historian samples) can now distinguish e.g. a transport outage (BadNotConnected) from a sensor fault (BadSensorFailure) from a warm-up delay (BadWaitingForInitialData) without a second round-trip or dashboard heuristic. 21 new tests (HistorianQualityMapperTests): theory with 15 rows covering every specific mapping from the v1 QualityMapper table, plus 6 fallback tests verifying unknown-subcode codes in each family (Good/Uncertain/Bad) collapse to the family default. Galaxy.Host.Tests Unit suite 56/0 (21 new + 35 existing). Galaxy.Host builds clean (0/0). Branches off v2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:11:02 -04:00
Joseph Doherty
ca025ebe0c Phase 2 PR 11 — HistoryReadEvents IPC (alarm history). New Shared.Contracts messages HistoryReadEventsRequest/Response + GalaxyHistoricalEvent DTO (MessageKind 0x66/0x67). IGalaxyBackend gains HistoryReadEventsAsync, Stub/DbBacked return canonical pending error, MxAccessGalaxyBackend delegates to _historian.ReadEventsAsync (ported in PR 5) and maps HistorianEventDto → GalaxyHistoricalEvent — Guid.ToString() for EventId wire shape, DateTime → Unix ms for both EventTime (when the event fired in the process) and ReceivedTime (when the Historian persisted it), DisplayText + Severity pass through. SourceName is string? — null means 'all sources' (passed straight through to HistorianDataSource.ReadEventsAsync which adds the AddEventFilter('Source', Equal, ...) only when non-null). Distinct from the live GalaxyAlarmEvent type because historical rows carry both timestamps and lack StateTransition (Historian logs instantaneous events, not the OPC UA Part 9 alarm lifecycle; translating to OPC UA event lifecycle is the alarm-subsystem's job). Guards: null historian → Historian-disabled error; SDK exception → Success=false with message chained. Tests (3 new): disabled-error when historian null, maps HistorianEventDto with full field set (Id/Source/EventTime/ReceivedTime/DisplayText/Severity=900) to GalaxyHistoricalEvent, null SourceName passes through unchanged (verifies the 'all sources' contract). Galaxy.Host.Tests Unit suite 34 pass / 0 fail. Galaxy.Host builds clean. Branches off phase-2-pr10-history-attime since both extend the MessageKind enum; fast-forwards if PR 10 merges first.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:08:16 -04:00
Joseph Doherty
d13f919112 Phase 2 PR 10 — HistoryReadAtTime IPC surface. New Shared.Contracts messages HistoryReadAtTimeRequest/Response (MessageKind 0x64/0x65), IGalaxyBackend gains HistoryReadAtTimeAsync, Stub/DbBacked return canonical pending error, MxAccessGalaxyBackend delegates to _historian.ReadAtTimeAsync (ported in PR 5, exposed now) — request timestamp array is flow-encoded as Unix ms to avoid MessagePack DateTime quirks then re-hydrated to DateTime on the Host side. Per-sample mapping uses the same ToWire(HistorianSample) helper as ReadRawAsync so the category→StatusCode mapping stays consistent (Quality byte 192+ → Good 0u, 64-191 → Uncertain, 0-63 → Bad 0x80000000u). Guards: null historian → "Historian disabled" (symmetric with other history paths); empty timestamp array short-circuits to Success=true, Values=[] without an SDK round-trip; SDK exception → Success=false with the message chained. Proxy-side IHistoryProvider.ReadAtTimeAsync capability doesn't exist in Core.Abstractions yet (OPC UA HistoryReadAtTime service is supported but the current IHistoryProvider only has ReadRawAsync + ReadProcessedAsync) — this PR adds the Host-side surface so a future Core.Abstractions extension can wire it through without needing another IPC change. Tests (4 new): disabled-error when historian null, empty-timestamp short-circuit without SDK call, Unix-ms↔DateTime round-trip with Good samples at two distinct timestamps, missing sample (Quality=0) maps to 0x80000000u Bad category. Galaxy.Host.Tests Unit suite: 31 pass / 0 fail (4 new at-time + 27 pre-existing). Galaxy.Host builds clean. Branches off v2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:03:25 -04:00
d2ebb91cb1 Merge pull request 'Phase 2 PR 9 — thread IsAlarm discovery flag end-to-end' (#8) from phase-2-pr9-alarms into v2 2026-04-18 06:59:25 -04:00
90ce0af375 Merge pull request 'Phase 2 PR 8 — gateway-level host-status push from MxAccessGalaxyBackend' (#7) from phase-2-pr8-alarms-hoststatus into v2 2026-04-18 06:59:04 -04:00
e250356e2a Merge pull request 'Phase 2 PR 7 — wire IHistoryProvider.ReadProcessedAsync end-to-end' (#6) from phase-2-pr7-history-processed into v2 2026-04-18 06:59:02 -04:00
067ad78e06 Merge pull request 'Phase 2 PR 6 — close PR 4 monitor-loop low findings (probe leak + replay signal)' (#5) from phase-2-pr6-monitor-findings into v2 2026-04-18 06:57:57 -04:00
6cfa8d326d Merge pull request 'Phase 2 PR 4 — close 4 open MXAccess findings (push frames + reconnect + write-await + read-cancel)' (#3) from phase-2-pr4-findings into v2 2026-04-18 06:57:21 -04:00
Joseph Doherty
70a5d06b37 Phase 2 PR 9 — thread IsAlarm discovery flag end-to-end. GalaxyRepository.GetAttributesAsync has always emitted is_alarm alongside is_historized (CASE WHEN EXISTS with the primitive_definition join on primitive_name='AlarmExtension' per v1's Extended Attributes SQL lifted byte-for-byte into the PR 5 repository port), and GalaxyAttributeRow.IsAlarm has been populated since the port, but the flag was silently dropped at the MapAttribute helper in both MxAccessGalaxyBackend and DbBackedGalaxyBackend because GalaxyAttributeInfo on the IPC side had no field to carry it — every deployed alarm attribute arrived at the Proxy with no signal that it was alarm-bearing. This PR wires the flag through the three translation boundaries: GalaxyAttributeInfo gains [Key(6)] public bool IsAlarm { get; set; } at the end of the message to preserve wire-compat with pre-PR9 payloads that omit the key (MessagePack treats missing keys as default, so a newer Proxy talking to an older Host simply gets IsAlarm=false for every attribute); both backend MapAttribute helpers copy row.IsAlarm into the IPC shape; DriverAttributeInfo in Core.Abstractions gains a new IsAlarm parameter with default value false so the positional record signature change doesn't force every non-Galaxy driver call site to flow a flag they don't produce (the existing generic node-manager and future Modbus/etc. drivers keep compiling without modification); GalaxyProxyDriver.DiscoverAsync passes attr.IsAlarm through to the DriverAttributeInfo positional constructor. This is the discovery-side foundation — the generic node-manager can now enrich alarm-bearing variables with OPC UA AlarmConditionState during address-space build (the existing v1 LmxNodeManager pattern that subscribes to <tag>.InAlarm + .Priority + .DescAttrName + .Acked and merges them into a ConditionState) but this PR deliberately stops at discovery: the full alarm subsystem (subscription management for the 4 alarm-status attributes, state-machine tracking for Active/Unacknowledged/Confirmed/Inactive transitions, OPC UA Part 9 alarm event emission, and the write-to-AckMsg ack path) is a follow-up PR 10+ because it touches the node-manager's address-space build path — orthogonal to the IPC flow this PR covers. Tests — AlarmDiscoveryTests (new, 3 cases): GalaxyAttributeInfo_IsAlarm_round_trips_true_through_MessagePack serializes an IsAlarm=true instance and asserts the decoded flag is true + IsHistorized is true + AttributeName survives unchanged; GalaxyAttributeInfo_IsAlarm_round_trips_false_through_MessagePack covers the default path; Pre_PR9_payload_without_IsAlarm_key_deserializes_with_default_false is the wire-compat regression guard — serializes a stand-in PrePR9Shape class with only keys 0..5 (identical layout to the pre-PR9 GalaxyAttributeInfo) and asserts the newer GalaxyAttributeInfo deserializer produces IsAlarm=false without throwing, so a rolling upgrade where the Proxy ships first can talk to an old Host during the window before the Host upgrades without a MessagePack "missing key" exception. Full solution build: 0 errors, 38 warnings (existing). Galaxy.Host.Tests Unit suite: 27 pass / 0 fail (3 new alarm-discovery + 9 PR5 historian + 15 pre-existing). This PR branches off phase-2-pr5-historian because GalaxyProxyDriver's constructor signature + GalaxyHierarchyRow's IsAlarm init-only property are both ancestor state that the simpler branch bases (phase-2-pr4-findings, master) don't yet include.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 06:28:01 -04:00
Joseph Doherty
30ece6e22c Phase 2 PR 8 — wire gateway-level host-status push from MxAccessGalaxyBackend. PR 4 built the IPC infrastructure for OnHostStatusChanged (MessageKind.RuntimeStatusChange frame + ConnectionSink forwarding through FrameWriter) but no backend actually raised the event; the #pragma warning disable CS0067 around MxAccessGalaxyBackend.OnHostStatusChanged declared the event for interface symmetry while acknowledging the wire-up was Phase 2 follow-up. This PR closes the gateway-level signal: MxAccessClient.ConnectionStateChanged (already raised on false→true Register and true→false Unregister transitions, including the reconnect path in MonitorLoopAsync) now drives OnHostStatusChanged with a synthetic HostConnectivityStatus tagged HostName=MxAccessClient.ClientName, RuntimeStatus="Running" on reconnect + "Stopped" on disconnect, LastObservedUtcUnixMs set to the transition moment. The Admin UI's existing IHostConnectivityProbe subscriber on GalaxyProxyDriver (HostStatusChangedEventArgs) already handles the full translation — OnHostConnectivityUpdate parses "Running"/"Stopped"/"Faulted" into the Core.Abstractions HostState enum and fires OnHostStatusChanged downstream, so this single backend-side event wire-up produces an end-to-end signal with no further Proxy changes required. Per-platform and per-AppEngine ScanState probing (the 472 LOC GalaxyRuntimeProbeManager state machine in v1 that advises <Host>.ScanState on every deployed $WinPlatform + $AppEngine gobject, tracks Unknown → Running → Stopped transitions, handles the on-change-only delivery quirk of ScanState, and surfaces IsHostStopped(gobjectId) for the node manager's Read path to short-circuit on-demand reads against known-stopped runtimes) remains deferred to a follow-up PR — the gateway-level signal gives operators the top-level transport-health rung of the status ladder, which is what matters when the Galaxy COM proxy itself goes down (vs a specific platform going down). MxAccessClient.ClientName property exposes the previously-private _clientName field so the backend can tag its pushes with a stable gateway identity — operators configure this via OTOPCUA_GALAXY_CLIENT_NAME env var (default "OtOpcUa-Galaxy.Host" per Program.cs). MxAccessGalaxyBackend constructor subscribes the new _onConnectionStateChanged field before returning + Dispose unsubscribes it via _mx.ConnectionStateChanged -= _onConnectionStateChanged to prevent the backend's own dispose from leaving a dangling handler on the MxAccessClient (same shape as MxAccessClient.SubscriptionReplayFailed PR 6 dispose discipline). #pragma warning disable CS0067 removed from around OnHostStatusChanged since the event is now raised; the directive is narrowed to cover only OnAlarmEvent which stays unraised pending the alarm subsystem port (PR 9 candidate). Tests — HostStatusPushTests (new, 2 cases): ConnectionStateChanged_raises_OnHostStatusChanged_with_gateway_name fires mx.ConnectAsync → mx.DisconnectAsync and asserts two notifications in order with HostName="GatewayClient" (the clientName passed to MxAccessClient ctor), RuntimeStatus="Running" then "Stopped", LastObservedUtcUnixMs > 0; Dispose_unsubscribes_so_post_dispose_state_changes_do_not_fire_events asserts that after backend.Dispose() a subsequent mx.DisconnectAsync does not bump the count on a registered OnHostStatusChanged handler — guards against the subscription-leak regression where a lingering backend instance would accumulate cross-reconnect notifications for a dead writer. Host.Tests csproj gains a Reference to lib/ArchestrA.MxAccess.dll (identical to the reference PR 6 adds — conflict-free cherry-pick/merge since both PRs stage the same <Reference> node; git will collapse to one when either lands first). Full Galaxy.Host.Tests Unit suite: 26 pass / 0 fail (2 new host-status + 9 PR5 historian + 15 pre-existing PostMortemMmf/RecyclePolicy/StaPump/MemoryWatchdog/EndToEndIpc/Handshake). Galaxy.Host builds clean (0 errors, 0 warnings). Branch base — PR 8 is on phase-2-pr5-historian rather than phase-2-pr4-findings because the constructor path on MxAccessGalaxyBackend gained a new historian parameter in PR 5 and the Dispose implementation needs to coordinate the two unsubscribes; targeting the earlier base would leave a trivial conflict on Dispose.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 06:03:16 -04:00
Joseph Doherty
3717405aa6 Phase 2 PR 7 — wire IHistoryProvider.ReadProcessedAsync end-to-end. PR 5 ported HistorianDataSource.ReadAggregateAsync into Galaxy.Host but left it internal — GalaxyProxyDriver.ReadProcessedAsync still threw NotSupportedException, so OPC UA clients issuing HistoryReadProcessed requests against the v2 topology got rejected at the driver boundary. This PR closes that gap by adding two new Shared.Contracts messages (HistoryReadProcessedRequest/Response, MessageKind 0x62/0x63), routing them through GalaxyFrameHandler, implementing HistoryReadProcessedAsync on all three IGalaxyBackend implementations (Stub/DbBacked return the canonical "pending" Success=false, MxAccessGalaxyBackend delegates to _historian.ReadAggregateAsync), mapping HistorianAggregateSample → GalaxyDataValue at the IPC boundary (null bucket Value → BadNoData 0x800E0000u, otherwise Good 0u), and flipping GalaxyProxyDriver.ReadProcessedAsync from the NotSupported throw to a real IPC call with OPC UA HistoryAggregateType enum mapped to Wonderware AnalogSummary column name on the Proxy side (Average → "Average", Minimum → "Minimum", Maximum → "Maximum", Count → "ValueCount", Total → NotSupported since there's no direct SDK column for sum). Decision #13 IPC data-shape stays intact — HistoryReadProcessedResponse carries GalaxyDataValue[] with the same MessagePack value + OPC UA StatusCode + timestamps shape as the other history responses, so the Proxy's existing ToSnapshot helper handles the conversion without a new code path. MxAccessGalaxyBackend.HistoryReadProcessedAsync guards: null historian → "Historian disabled" (symmetric with HistoryReadAsync); IntervalMs <= 0 → "HistoryReadProcessed requires IntervalMs > 0" (prevents division-by-zero inside the SDK's Resolution parameter); exception during SDK call → Success=false Values=[] with the message so the Proxy surfaces it as InvalidOperationException with a clean error chain. Tests — HistoryReadProcessedTests (new, 4 cases): disabled-error when historian null, rejects zero interval, maps Good sample with Value=12.34 and the Proxy-supplied AggregateColumn + IntervalMs flow unchanged through to the fake IHistorianDataSource, maps null Value bucket to 0x800E0000u BadNoData with null ValueBytes. AggregateColumnMappingTests (new, 5 cases in Proxy.Tests): theory covers all 4 supported HistoryAggregateType enum values → correct column string, and asserts Total throws NotSupportedException with a message that steers callers to Average/Minimum/Maximum/Count (the SDK's AnalogSummaryQueryResult doesn't expose a sum column — the closest is Average × ValueCount which is the responsibility of a caller-side aggregation rather than an extra IPC round-trip). InternalsVisibleTo added to Galaxy.Proxy csproj so Proxy.Tests can reach the internal MapAggregateToColumn static. Builds — Galaxy.Host (net48 x86) + Galaxy.Proxy (net10) both 0 errors, full solution 201 warnings (pre-existing) / 0 errors. Test counts — Host.Tests Unit suite: 28 pass (4 new processed + 9 PR5 historian + 15 pre-existing); Proxy.Tests Unit suite: 14 pass (5 new column-mapping + 9 pre-existing). Deferred to a later PR — ReadAtTime + ReadEvents + Health IPC surfaces (HistorianDataSource has them ported in PR 5 but they need additional contract messages and would push this PR past a comfortable review size); the alarm subsystem wire-up (OnAlarmEvent raising from MxAccessGalaxyBackend) which overlaps the ReadEventsAsync IPC work since both pull from HistorianAccess.CreateEventQuery on the SDK side; the Proxy-side quality-byte refinement where HistorianDataSource's per-sample raw quality byte gets decoded through the existing QualityMapper instead of the category-only mapping in ToWire(HistorianSample) — doesn't change correctness today since Good/Uncertain/Bad categories are all the Admin UI and OPC UA clients surface, but richer OPC DA status codes (BadNotConnected, UncertainSubNormal, etc.) are available on the wire and the Proxy could promote them before handing DataValueSnapshot to ISubscribable consumers. This PR branches off phase-2-pr5-historian because it directly extends the Historian IPC surface added there; if PR 5 merges first PR 7 fast-forwards, otherwise it needs a rebase after PR 5 lands.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 05:53:01 -04:00
Joseph Doherty
1c2bf74d38 Phase 2 PR 6 — close the 2 low findings carried forward from PR 4. Low finding #1 ($Heartbeat probe handle leak in MonitorLoopAsync): the probe calls _proxy.AddItem(connectionHandle, "$Heartbeat") on every monitor tick that observes the connection is past StaleThreshold, but previously discarded the returned item handle — so every probe (one per MonitorInterval, default 5s) leaked one item handle into the MXAccess proxy's internal handle table. Fix: capture the item handle, call RemoveItem(connectionHandle, probeHandle) in the InvokeAsync's finally block so it runs on the same pump turn as the AddItem, best-effort RemoveItem swallow so a dying proxy doesn't throw secondary exceptions out of the probe path. Probe ok becomes probeHandle > 0 so any AddItem that returns 0 (MXAccess's "could not create") counts as a failed probe, matching v1 behavior. Low finding #2 (subscription replay silently swallowed per-tag failures): after a reconnect, the replay loop iterates the pre-reconnect subscription snapshot and calls SubscribeOnPumpAsync for each; previously those failures went into a bare catch { /* skip */ } so an operator had no signal when specific tags failed to re-subscribe — the first indication downstream was a quality drop on OPC UA clients. Fix: new SubscriptionReplayFailedEventArgs (TagReference + Exception) + SubscriptionReplayFailed event on MxAccessClient that fires once per tag that fails to re-subscribe, Log.Warning per failure with the reconnect counter + tag reference, and a summary log line at the end of the replay loop ("{failed} of {total} failed" or "{total} re-subscribed cleanly"). Serilog using + ILogger Log = Serilog.Log.ForContext<MxAccessClient>() added. Tests — MxAccessClientMonitorLoopTests (new file, 2 cases): Heartbeat_probe_calls_RemoveItem_for_every_AddItem constructs a CountingProxy IMxProxy that tracks AddItem/RemoveItem pair counts scoped to the "$Heartbeat" address, runs the client with MonitorInterval=150ms + StaleThreshold=50ms for 700ms, asserts HeartbeatAddCount > 1, HeartbeatAddCount == HeartbeatRemoveCount, OutstandingHeartbeatHandles == 0 after dispose; SubscriptionReplayFailed_fires_for_each_tag_that_fails_to_replay uses a ReplayFailingProxy that throws on the next $Heartbeat probe (to trigger the reconnect path) and throws on the replay-time AddItem for specified tag names ("BadTag.A", "BadTag.B"), subscribes GoodTag.X + BadTag.A + BadTag.B before triggering probe failure, collects SubscriptionReplayFailed args into a ConcurrentBag, asserts exactly 2 events fired and both bad tags are represented — GoodTag.X replays cleanly so it does not fire. Host.Tests csproj gains a Reference to lib/ArchestrA.MxAccess.dll because IMxProxy's MxDataChangeHandler delegate signature mentions MXSTATUS_PROXY and the compiler resolves all delegate parameter types when a test class implements the interface, even if the test code never names the type. No regressions: full Galaxy.Host.Tests Unit suite 26 pass / 0 fail (2 new monitor-loop tests + 9 PR5 historian + 15 pre-existing PostMortemMmf/RecyclePolicy/StaPump/MemoryWatchdog/EndToEndIpc/Handshake). Galaxy.Host builds clean (0 errors, 0 warnings) — the new Serilog.Log.ForContext usage picks up the existing Serilog package ref that PR 4 pulled in for the monitor-loop infrastructure. Both findings were flagged as non-blocking for PR 4 merge and are now resolved alongside whichever merge order the reviewer picks; this PR branches off phase-2-pr4-findings so it can rebase cleanly if PR 4 lands first or be re-based onto master after PR 4 merges.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 02:06:15 -04:00
Joseph Doherty
6df1a79d35 Phase 2 PR 5 — port Wonderware Historian SDK into Driver.Galaxy.Host/Backend/Historian/. The full v1 Historian.Aveva code path (HistorianDataSource + HistorianClusterEndpointPicker + IHistorianConnectionFactory + SdkHistorianConnectionFactory) now lives inside Galaxy.Host instead of the previously-required out-of-tree plugin + HistorianPluginLoader AssemblyResolve hack, and MxAccessGalaxyBackend.HistoryReadAsync — which previously returned a Phase 2 Task B.1.h follow-up placeholder — now delegates to the ported HistorianDataSource.ReadRawAsync, maps HistorianSample to GalaxyDataValue via the IPC wire shape, and reports Success=true with per-tag HistoryTagValues arrays. OPC-UA-free surface inside Galaxy.Host: the v1 code returned Opc.Ua.DataValue on the hot path, which would have required dragging OPCFoundation.NetStandard.Opc.Ua.Server into net48 x86 Galaxy.Host and bleeding OPC types across the IPC boundary — instead, the port introduces HistorianSample (Value, Quality byte, TimestampUtc) + HistorianAggregateSample (Value, TimestampUtc) POCOs that carry the raw MX quality byte through the pipe unchanged, and the OPC translation happens on the Proxy side via the existing QualityMapper that the live-read path already uses. Decision #13's IPC data-shape contract survives intact — GalaxyDataValue (TagReference + ValueBytes MessagePack + ValueMessagePackType + StatusCode + SourceTimestampUtcUnixMs + ServerTimestampUtcUnixMs) — so no Shared.Contracts wire break vs PR 4. Cluster failover preserved verbatim: HistorianClusterEndpointPicker is the thread-safe pure-logic picker ported verbatim with no SDK dependency (injected DateTime clock, per-node cooldown state, unknown-node-name tolerance, case-insensitive de-dup on configuration-order list), ConnectToAnyHealthyNode iterates the picker's healthy candidates, clones config per attempt, calls the factory, marks healthy on success / failed on exception with the failure message stored for dashboard surfacing, throws "All N healthy historian candidate(s) failed" with the last exception chained when every node exhausts. Process path + Event path use separate HistorianAccess connections (CreateHistoryQuery vs CreateEventQuery vs CreateAnalogSummaryQuery on the SDK surface) guarded by independent _connection/_eventConnection locks — a mid-query failure on one silo resets only that connection, the other stays open. Four SDK paths ported: ReadRawAsync (RetrievalMode.Full, BatchSize from config.MaxValuesPerRead, MoveNext pump, per-sample quality + value decode with the StringValue/Value fallback the v1 code did, limit-based early exit), ReadAggregateAsync (AnalogSummaryQuery + Resolution in ms, ExtractAggregateValue maps Average/Minimum/Maximum/ValueCount/First/Last/StdDev column names — the NodeId to column mapping is moved to the Proxy side since the IPC request carries a string column), ReadAtTimeAsync (per-timestamp HistoryQuery with RetrievalMode.Interpolated + BatchSize=1, returns Quality=0 / Value=null for missing samples), ReadEventsAsync (EventQuery + AddEventFilter("Source",Equal,sourceName) when sourceName is non-null, EventOrder.Ascending, EventCount = maxEvents or config.MaxValuesPerRead); GetHealthSnapshot returns the full runtime-health snapshot (TotalQueries/Successes/Failures + ConsecutiveFailures + LastSuccess/FailureTime + LastError + ProcessConnectionOpen/EventConnectionOpen + ActiveProcessNode/ActiveEventNode + per-node state list). ReadRaw is the only path wired through IPC in PR 5 (HistoryReadRequest/HistoryTagValues/HistoryReadResponse already existed in Shared.Contracts); Aggregate/AtTime/Events/Health are ported-but-not-yet-IPC-exposed — they stay internal to Galaxy.Host for PR 6+ to surface via new contract message kinds (aggregate = OPC UA HistoryReadProcessed, at-time = HistoryReadAtTime, events = HistoryReadEvents, health = admin dashboard IPC query). Galaxy.Host csproj gains aahClientManaged + aahClientCommon references with Private=false (managed wrappers) + None items for aahClient.dll + Historian.CBE.dll + Historian.DPAPI.dll + ArchestrA.CloudHistorian.Contract.dll native satellites staged alongside the host exe via CopyToOutputDirectory=PreserveNewest so aahClientManaged can P/Invoke into them at runtime without an AssemblyResolve hook (cleaner than the v1 HistorianPluginLoader.cs 180-LOC AssemblyResolve + Assembly.LoadFrom dance that existed solely because the plugin was loaded late from Host/bin/Debug/net48/Historian/). Program.cs adds BuildHistorianIfEnabled() that reads OTOPCUA_HISTORIAN_ENABLED (true or 1) + OTOPCUA_HISTORIAN_SERVER + OTOPCUA_HISTORIAN_SERVERS (comma-separated cluster list overrides single-server) + OTOPCUA_HISTORIAN_PORT (default 32568) + OTOPCUA_HISTORIAN_INTEGRATED (default true) + OTOPCUA_HISTORIAN_USER/OTOPCUA_HISTORIAN_PASS + OTOPCUA_HISTORIAN_TIMEOUT_SEC (30) + OTOPCUA_HISTORIAN_MAX_VALUES (10000) + OTOPCUA_HISTORIAN_COOLDOWN_SEC (60), returns null when disabled so MxAccessGalaxyBackend.HistoryReadAsync surfaces a clean "Historian disabled" Success=false instead of a localhost-SDK hang; server.RunAsync finally block now also casts backend to IDisposable.Dispose() so the historian SDK connections get cleanly closed on Ctrl+C. MxAccessGalaxyBackend gains an IHistorianDataSource? historian constructor parameter (defaults null to preserve existing Host.Tests call sites that don't exercise HistoryRead), implements IDisposable that forwards to _historian.Dispose(), and the pragma warning disable CS0618 is locally scoped to the ToDto(HistorianEvent) mapper since the SDK marks Id/Source/DisplayText/Severity obsolete but the replacement surface isn't available in the aahClientManaged version we bind against — every other deprecated-SDK use still surfaces as an error under TreatWarningsAsErrors. Ported from v1 Historian.Aveva unchanged: the CloneConfigWithServerName helper that preserves every config field except ServerName per attempt; the double-checked locking in EnsureConnected/EnsureEventConnected (fast path = Volatile.Read outside lock, slow path acquires lock + re-checks + disposes any raced-in-parallel connection); HandleConnectionError/HandleEventConnectionError that close the dead connection, clear the active-node tracker, MarkFailed the picker entry with the exception message so the node enters cooldown, and log the reset with node= for operator correlation; RecordSuccess/RecordFailure that bump counters under _healthLock. Tests: HistorianClusterEndpointPickerTests (7 cases) — single-node ServerName fallback when ServerNames empty, MarkFailed enters cooldown and skips, cooldown expires after window, MarkHealthy immediately clears, all-in-cooldown returns empty healthy list, Snapshot reports failure count + last error + IsHealthy, case-insensitive de-dup on duplicate hostnames. HistorianWiringTests (2 cases) — HistoryReadAsync returns "Historian disabled" Success=false when historian:null passed; HistoryReadAsync with a fake IHistorianDataSource maps the returned HistorianSample (Value=42.5, Quality=192 Good, Timestamp) to a GalaxyDataValue with StatusCode=0u + SourceTimestampUtcUnixMs matching the sample + MessagePack-encoded value bytes. InternalsVisibleTo("...Host.Tests") added to Galaxy.Host.csproj so tests can reach the internal HistorianClusterEndpointPicker. Full Galaxy.Host.Tests suite: 24 pass / 0 fail (9 new historian + 15 pre-existing MemoryWatchdog/PostMortemMmf/RecyclePolicy/StaPump/EndToEndIpc/Handshake). Full solution build: 0 errors (202 pre-existing warnings). The v1 Historian.Aveva project + Historian.Aveva.Tests still build intact because the archive PR (Stream D.1 destructive delete) is still ahead of us — PR 5 intentionally does not delete either; once PR 2+3 merge and the archive-delete PR lands, a follow-up cleanup can remove Historian.Aveva + its 4 source files + 18 test cases. Alarm subsystem wire-up (OnAlarmEvent raising from MxAccessGalaxyBackend via AlarmExtension primitives) + host-status push (OnHostStatusChanged via a ported GalaxyRuntimeProbeManager) remain PR 6 candidates; they were on the same "Task B.1.h follow-up" list and share the IPC connection-sink wiring with the historian events path — it made PR 5 scope-manageable to do Historian first since that's what has the biggest surface area (981 LOC v1 plus SDK binding) and alarms/host-status have more bespoke integration with the existing MxAccess subscription fan-out.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 01:44:04 -04:00
Joseph Doherty
caa9cb86f6 Phase 2 PR 4 — close the 4 open high/medium MXAccess findings from exit-gate-phase-2-final.md. High 1 (ReadAsync subscription-leak on cancel): the one-shot read now wraps subscribe→first-OnDataChange→unsubscribe in try/finally so the per-tag callback is always detached, and if the read installed the underlying MXAccess subscription itself (the prior _addressToHandle key was absent) it tears it down on the way out — no leaked probe item handles when the caller cancels or times out. High 2 (no reconnect loop): MxAccessClient gets a MxAccessClientOptions {AutoReconnect, MonitorInterval=5s, StaleThreshold=60s} + a background MonitorLoopAsync started at first ConnectAsync. The loop wakes every MonitorInterval, checks _lastObservedActivityUtc (bumped by every OnDataChange callback), and if stale probes the proxy with a no-op COM AddItem("$Heartbeat") on the StaPump; if the probe throws or returns false, the loop reconnects-with-replay — Unregister (best-effort), Register, snapshot _addressToHandle.Keys + clear, re-AddItem every previously-active subscription, ConnectionStateChanged events fire for the false→true transition, ReconnectCount bumps. Medium 3 (subscriptions don't push frames back to Proxy): IGalaxyBackend gains OnDataChange/OnAlarmEvent/OnHostStatusChanged events; new IFrameHandler.AttachConnection(FrameWriter) is called per-connection by PipeServer after Hello + the returned IDisposable disposes at connection close; GalaxyFrameHandler.ConnectionSink subscribes the events for the connection lifetime, fire-and-forget pushes them as MessageKind.OnDataChangeNotification / AlarmEvent / RuntimeStatusChange frames through the writer, swallows ObjectDisposedException for the dispose race, and unsubscribes in Dispose to prevent leaked invocation list refs across reconnects. MxAccessGalaxyBackend's existing SubscribeAsync (which previously discarded values via a (_, __) => {} callback) now wires OnTagValueChanged that fans out per-tag value changes to every subscription ID listening (one MXAccess subscription, multi-fan-out — _refToSubs reverse map). UnsubscribeAsync also reverse-walks the map to only call mx.UnsubscribeAsync when the LAST sub for a tag drops. Stub + DbBacked backends declare the events with #pragma warning disable CS0067 because they never raise them but must satisfy the interface (treat-warnings-as-errors would otherwise fail). Medium 4 (WriteValuesAsync doesn't await OnWriteComplete): MxAccessClient.WriteAsync rewritten to return Task<bool> via the v1-style TaskCompletionSource-keyed-by-item-handle pattern in _pendingWrites — adds the TCS before the Write call, awaits it with a configurable timeout (default 5s), removes the TCS in finally, returns true only when OnWriteComplete reported success. MxAccessGalaxyBackend.WriteValuesAsync now reports per-tag Bad_InternalError ("MXAccess runtime reported write failure") when the bool returns false, instead of false-positive Good. PipeServer's IFrameHandler interface adds the AttachConnection(FrameWriter):IDisposable method + a public NoopAttachment nested class (net48 doesn't support default interface methods so the empty-attach is exposed for stub implementations). StubFrameHandler returns IFrameHandler.NoopAttachment.Instance. RunOneConnectionAsync calls AttachConnection after HelloAck and usings the returned disposable so it disposes at the connection scope's finally. ConnectionStateChanged event added on MxAccessClient (caller-facing diagnostics for false→true reconnect transitions). docs/v2/implementation/pr-4-body.md is the Gitea web-UI paste-in for opening PR 4 once pushed; includes 2 new low-priority adversarial findings (probe item-handle leak; replay-loop silently swallows per-subscription failures) flagged as follow-ups not PR 4 blockers. Full solution 460 pass / 7 skip (E2E on admin shell) / 1 pre-existing Phase 0 baseline. No regressions vs PR 2's baseline.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 01:12:09 -04:00
Joseph Doherty
a3d16a28f1 Phase 2 Stream D Option B — archive v1 surface + new Driver.Galaxy.E2E parity suite. Non-destructive intermediate state: the v1 OtOpcUa.Host + Historian.Aveva + Tests + IntegrationTests projects all still build (494 v1 unit + 6 v1 integration tests still pass when run explicitly), but solution-level dotnet test ZB.MOM.WW.OtOpcUa.slnx now skips them via IsTestProject=false on the test projects + archive-status PropertyGroup comments on the src projects. The destructive deletion is reserved for Phase 2 PR 3 with explicit operator review per CLAUDE.md "only use destructive operations when truly the best approach". tests/ZB.MOM.WW.OtOpcUa.Tests/ renamed via git mv to tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/; csproj <AssemblyName> kept as the original ZB.MOM.WW.OtOpcUa.Tests so v1 OtOpcUa.Host's [InternalsVisibleTo("ZB.MOM.WW.OtOpcUa.Tests")] still matches and the project rebuilds clean. tests/ZB.MOM.WW.OtOpcUa.IntegrationTests gets <IsTestProject>false</IsTestProject>. src/ZB.MOM.WW.OtOpcUa.Host + src/ZB.MOM.WW.OtOpcUa.Historian.Aveva get PropertyGroup archive-status comments documenting they're functionally superseded but kept in-build because cascading dependencies (Historian.Aveva → Host; IntegrationTests → Host) make a single-PR deletion high blast-radius. New tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/ project (.NET 10) with ParityFixture that spawns OtOpcUa.Driver.Galaxy.Host.exe (net48 x86) as a Process.Start subprocess with OTOPCUA_GALAXY_BACKEND=db env vars, awaits 2s for the PipeServer to bind, then exposes a connected GalaxyProxyDriver; skips on non-Windows / Administrator shells (PipeAcl denies admins per decision #76) / ZB unreachable / Host EXE not built — each skip carries a SkipReason string the test method reads via Assert.Skip(SkipReason). RecordingAddressSpaceBuilder captures every Folder/Variable/AddProperty registration so parity tests can assert on the same shape v1 LmxNodeManager produced. HierarchyParityTests (3) — Discover returns gobjects with attributes; attribute full references match the tag.attribute Galaxy reference grammar; HistoryExtension flag flows through correctly. StabilityFindingsRegressionTests (4) — one test per 2026-04-13 stability finding from commits c76ab8f and 7310925: phantom probe subscription doesn't corrupt unrelated host status; HostStatusChangedEventArgs structurally carries a specific HostName + OldState + NewState (event signature mathematically prevents the v1 cross-host quality-clear bug); all GalaxyProxyDriver capability methods return Task or Task<T> (sync-over-async would deadlock OPC UA stack thread); AcknowledgeAsync completes before returning (no fire-and-forget background work that could race shutdown). Solution test count: 470 pass / 7 skip (E2E on admin shell) / 1 pre-existing Phase 0 baseline. Run archived suites explicitly: dotnet test tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive (494 pass) + dotnet test tests/ZB.MOM.WW.OtOpcUa.IntegrationTests (6 pass). docs/v2/V1_ARCHIVE_STATUS.md inventories every archived surface with run-it-explicitly instructions + a 10-step deletion plan for PR 3 + rollback procedure (git revert restores all four projects). docs/v2/implementation/exit-gate-phase-2-final.md supersedes the two partial-exit docs with the per-stream status table (A/B/C/D/E all addressed, D split across PR 2/3 per safety protocol), the test count breakdown, fresh adversarial review of PR 2 deltas (4 new findings: medium IsTestProject=false safety net loss, medium structural-vs-behavioral stability tests, low backend=db default, low Process.Start env inheritance), the 8 carried-forward findings from exit-gate-phase-2.md, the recommended PR order (1 → 2 → 3 → 4). docs/v2/implementation/pr-2-body.md is the Gitea web-UI paste-in for opening PR 2 once pushed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 00:56:21 -04:00
Joseph Doherty
50f81a156d Doc — PR 1 body for Gitea web UI paste-in. PR title + summary + test matrix + reviewer test plan + follow-up tracking. Source phase-1-configuration → target v2; URL https://gitea.dohertylan.com/dohertj2/lmxopcua/pulls/new/phase-1-configuration. No gh/tea CLI on this box, so the body is staged here for the operator to paste into the Gitea web UI rather than auto-created via API.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 00:46:23 -04:00
Joseph Doherty
7403b92b72 Phase 2 Stream D progress — non-destructive deliverables: appsettings → DriverConfig migration script, two-service Windows installer scripts, process-spawn cross-FX parity test, Stream D removal procedure doc with both Option A (rewrite 494 v1 tests) and Option B (archive + new v2 E2E suite) spelled out step-by-step. Cannot one-shot the actual legacy-Host deletion in any unattended session — explained in the procedure doc; the parity-defect debug cycle is intrinsically interactive (each iteration requires inspecting a v1↔v2 diff and deciding if it's a legitimate v2 improvement or a regression, then either widening the assertion or fixing the v2 code), and git rm -r src/ZB.MOM.WW.OtOpcUa.Host is destructive enough to need explicit operator authorization on a real PR review. scripts/migration/Migrate-AppSettings-To-DriverConfig.ps1 takes a v1 appsettings.json and emits the v2 DriverInstance.DriverConfig JSON blob (MxAccess/Database/Historian sections) ready to upsert into the central Configuration DB; null-leaf stripping; -DryRun mode; smoke-tested against the dev appsettings.json and produces the expected three-section ordered-dictionary output. scripts/install/Install-Services.ps1 registers the two v2 services with sc.exe — OtOpcUaGalaxyHost first (net48 x86 EXE with OTOPCUA_GALAXY_PIPE/OTOPCUA_ALLOWED_SID/OTOPCUA_GALAXY_SECRET/OTOPCUA_GALAXY_BACKEND/OTOPCUA_GALAXY_ZB_CONN/OTOPCUA_GALAXY_CLIENT_NAME env vars set via HKLM:\SYSTEM\CurrentControlSet\Services\OtOpcUaGalaxyHost\Environment registry), then OtOpcUa with depend=OtOpcUaGalaxyHost; resolves down-level account names to SID for the IPC ACL; generates a fresh 32-byte base64 shared secret per install if not supplied (kept out of registry — operators record offline for service rebinding scenarios); echoes start commands. scripts/install/Uninstall-Services.ps1 stops + removes both services. tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/HostSubprocessParityTests.cs is the production-shape parity test — Proxy (.NET 10) spawns the actual OtOpcUa.Driver.Galaxy.Host.exe (net48 x86) as a subprocess via Process.Start with backend=db env vars, connects via real named pipe, calls Discover, asserts at least one Galaxy gobject comes back. Skipped when running as Administrator (PipeAcl denies admins, same guard as other IPC integration tests), when the Host EXE hasn't been built, or when the ZB SQL endpoint is unreachable. This is the cross-FX integration that the parity suite genuinely needs — the previous IPC tests all ran in-process; this one validates the production deployment topology where Proxy and Host are separate processes communicating only over the named pipe. docs/v2/implementation/stream-d-removal-procedure.md is the next-session playbook: Option A (rewrite 494 v1 tests via a ProxyMxAccessClientAdapter that implements v1's IMxAccessClient by forwarding to GalaxyProxyDriver — Vtq↔DataValueSnapshot, Quality↔StatusCode, OnTagValueChanged↔OnDataChange mapping; 3-5 days, full coverage), Option B (rename OtOpcUa.Tests → OtOpcUa.Tests.v1Archive with [Trait("Category", "v1Archive")] for opt-in CI runs; new OtOpcUa.Driver.Galaxy.E2E test project with 10-20 representative tests via the HostSubprocessParityTests pattern; 1-2 days, accreted coverage); deletion checklist with eight pre-conditions, ten ordered steps, and a rollback path (git revert restores the legacy Host alongside the v2 stack — both topologies remain installable until the downstream consumer cutover). Full solution 964 pass / 1 pre-existing Phase 0 baseline; the 494 v1 IntegrationTests + 6 v1 IntegrationTests-net48 still pass because legacy OtOpcUa.Host stays untouched until an interactive session executes the procedure doc.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 00:38:44 -04:00
Joseph Doherty
a7126ba953 Phase 2 — port MXAccess COM client to Galaxy.Host + MxAccessGalaxyBackend (3rd IGalaxyBackend) + live MXAccess smoke + Phase 2 exit-gate doc + adversarial review. The full Galaxy data-plane now flows through the v2 IPC topology end-to-end against live ArchestrA.MxAccess.dll, on this dev box, with 30/30 Host tests + 9/9 Proxy tests + 963/963 solution tests passing alongside the unchanged 494 v1 IntegrationTests baseline. Backend/MxAccess/Vtq is a focused port of v1's Vtq value-timestamp-quality DTO. Backend/MxAccess/IMxProxy abstracts LMXProxyServer (port of v1's IMxProxy with the same Register/Unregister/AddItem/RemoveItem/AdviseSupervisory/UnAdviseSupervisory/Write surface + OnDataChange + OnWriteComplete events); MxProxyAdapter is the concrete COM-backed implementation that does Marshal.ReleaseComObject-loop on Unregister, must be constructed on an STA thread. Backend/MxAccess/MxAccessClient is the focused port of v1's MxAccessClient partials — Connect/Disconnect/Read/Write/Subscribe/Unsubscribe through the new Sta/StaPump (the real Win32 GetMessage pump from the previous commit), ConcurrentDictionary handle tracking, OnDataChange event marshalling to per-tag callbacks, ReadAsync implemented as the canonical subscribe → first-OnDataChange → unsubscribe one-shot pattern. Galaxy.Host csproj flipped to x86 PlatformTarget + Prefer32Bit=true with the ArchestrA.MxAccess HintPath ..\..\lib\ArchestrA.MxAccess.dll reference (lib/ already contains the production DLL). Backend/MxAccessGalaxyBackend is the third IGalaxyBackend implementation (alongside StubGalaxyBackend and DbBackedGalaxyBackend): combines GalaxyRepository (Discover) with MxAccessClient (Read/Write/Subscribe), MessagePack-deserializes inbound write values, MessagePack-serializes outbound read values into ValueBytes, decodes ArrayDimension/SecurityClassification/category_id with the same v1 mapping. Program.cs selects between stub|db|mxaccess via OTOPCUA_GALAXY_BACKEND env var (default = mxaccess); OTOPCUA_GALAXY_ZB_CONN overrides the ZB connection string; OTOPCUA_GALAXY_CLIENT_NAME sets the Wonderware client identity; the StaPump and MxAccessClient lifecycles are tied to the server.RunAsync try/finally so a clean Ctrl+C tears down the COM proxy via Marshal.ReleaseComObject before the pump's WM_QUIT. Live MXAccess smoke tests (MxAccessLiveSmokeTests, net48 x86) — skipped when ZB unreachable or aaBootstrap not running, otherwise verify (1) MxAccessClient.ConnectAsync returns a positive LMXProxyServer handle on the StaPump, (2) MxAccessGalaxyBackend.OpenSession + Discover returns at least one gobject with attributes, (3) MxAccessGalaxyBackend.ReadValues against the first discovered attribute returns a response with the correct TagReference shape (value + quality vary by what's running, so we don't assert specific values). All 3 pass on this dev box. EndToEndIpcTests + IpcHandshakeIntegrationTests moved from Galaxy.Proxy.Tests (net10) to Galaxy.Host.Tests (net48 x86) — the previous test placement silently dropped them at xUnit discovery because Host became net48 x86 and net10 process can't load it. Rewritten to use Shared's FrameReader/FrameWriter directly instead of going through Proxy's GalaxyIpcClient (functionally equivalent — same wire protocol, framing primitives + dispatcher are the production code path verbatim). 7 IPC tests now run cleanly: Hello+heartbeat round-trip, wrong-secret rejection, OpenSession session-id assignment, Discover error-response surfacing, WriteValues per-tag bad status, Subscribe id assignment, Recycle grace window. Phase 2 exit-gate doc (docs/v2/implementation/exit-gate-phase-2.md) supersedes the partial-exit doc with the as-built state — Streams A/B/C complete; D/E gated only on the legacy-Host removal + parity-test rewrite cycle that fundamentally requires multi-day debug iteration; full adversarial-review section ranking 8 findings (2 high, 3 medium, 3 low) all explicitly deferred to Stream D/E or v2.1 with rationale; Stream-D removal checklist gives the next-session entry point with two policy options for the 494 v1 tests (rewrite-to-use-Proxy vs archive-and-write-smaller-v2-parity-suite). Cannot one-shot Stream D.1 in any single session because deleting OtOpcUa.Host requires the v1 IntegrationTests cycle to be retargeted first; that's the structural blocker, not "needs more code" — and the plan itself budgets 3-4 weeks for it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 00:23:24 -04:00
Joseph Doherty
549cd36662 Phase 2 — port GalaxyRepository to Galaxy.Host + DbBackedGalaxyBackend, smoke-tested against live ZB. Real Galaxy gobject hierarchy + dynamic attributes now flow through the IPC contract end-to-end without any MXAccess code involvement, so the OPC UA address-space build (Stream C.4 acceptance) becomes parity-testable today even before the COM client port lands. Backend/Galaxy/GalaxyRepository.cs is a byte-for-byte port of v1 GalaxyRepositoryService's HierarchySql + AttributesSql (the two SQL bodies, both ~50 lines of recursive CTE template-chain + deployed_package_chain logic, are identical to v1 so the row set is verifiably the same — extended-attributes + scope-filter queries from v1 are intentionally not ported yet, they're refinements not on the Phase 2 critical path); plus TestConnectionAsync (SELECT 1) and GetLastDeployTimeAsync (SELECT time_of_last_deploy FROM galaxy) for the ChangeDetection deploy-watermark path. Backend/Galaxy/GalaxyRepositoryOptions defaults to localhost ZB Integrated Security; runtime override comes from DriverConfig.Database section per plan.md §"Galaxy DriverConfig". Backend/Galaxy/GalaxyHierarchyRow + GalaxyAttributeRow are the row-shape DTOs (no required modifier — net48 lacks RequiredMemberAttribute and we'd need a polyfill shim like the existing IsExternalInit one; default-string init is simpler). System.Data.SqlClient 4.9.0 added (the same package the v1 Host uses; net48-compatible). Backend/DbBackedGalaxyBackend wraps the repository: DiscoverAsync builds a real DiscoverHierarchyResponse (groups attributes by gobject, resolves parent-by-tagname, maps category_id → human-readable template-category name mirroring v1 AlarmObjectFilter); ReadValuesAsync/WriteValuesAsync/HistoryReadAsync still surface "MXAccess code lift pending (Phase 2 Task B.1)" because runtime data values genuinely need the COM client; OpenSession/CloseSession/Subscribe/Unsubscribe/AlarmSubscribe/AlarmAck/Recycle return success without backend work (subscription ID is a synthetic counter for now). Live smoke tests (GalaxyRepositoryLiveSmokeTests) skip when localhost ZB is unreachable; when present they verify (1) TestConnection returns true, (2) GetHierarchy returns at least one deployed gobject with a non-empty TagName, (3) GetAttributes returns rows with FullTagReference matching the "tag.attribute" shape, (4) GetLastDeployTime returns a value, (5) DbBackedBackend.DiscoverAsync returns at least one gobject with attributes and a populated TemplateCategory. All 5 pass against the local Galaxy. Full solution 957 pass / 1 pre-existing Phase 0 baseline; the 494 v1 IntegrationTests + 6 v1 IntegrationTests-net48 tests still pass — legacy OtOpcUa.Host untouched. Remaining for the Phase 2 exit gate is the MXAccess COM client port itself (the v1 MxAccessClient partials + IMxProxy abstraction + StaPump-based Connect/Subscribe/Read/Write semantics) — Discover is now solved in DB-backed form, so the lift can focus exclusively on the runtime data-plane.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 23:14:09 -04:00
Joseph Doherty
32eeeb9e04 Phase 2 Streams A+B+C feature-complete — real Win32 pump, all 9 IDriver capabilities, end-to-end IPC dispatch. Streams D+E remain (Galaxy MXAccess code lift + parity-debug cycle, plan-budgeted 3-4 weeks). The 494 v1 IntegrationTests still pass — legacy OtOpcUa.Host untouched. StaPump replaces the BlockingCollection placeholder with a real Win32 message pump lifted from v1 StaComThread per CLAUDE.md "Reference Implementation": dedicated STA Thread with SetApartmentState(STA), GetMessage/PostThreadMessage/PeekMessage/TranslateMessage/DispatchMessage/PostQuitMessage P/Invoke, WM_APP=0x8000 for work-item dispatch, WM_APP+1 for graceful-drain → PostQuitMessage, peek-pm-noremove on entry to force the system to create the thread message queue before signalling Started, IsResponsiveAsync probe still no-op-round-trips through PostThreadMessage so the wedge detection works against the real pump. Concurrent ConcurrentQueue<WorkItem> drains on every WM_APP; fault path on dispose drains-and-faults all pending work-item TCSes with InvalidOperationException("STA pump has exited"). All three StaPumpTests pass against the real pump (apartment state STA, healthy probe true, wedged probe false). GalaxyProxyDriver now implements every Phase 2 Stream C capability — IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IAlarmSource, IHistoryProvider, IRediscoverable, IHostConnectivityProbe — each forwarding through the matching IPC contract. ReadAsync preserves request order even when the Host returns out-of-order values; WriteAsync MessagePack-serializes the value into ValueBytes; SubscribeAsync wraps SubscriptionId in a GalaxySubscriptionHandle record; UnsubscribeAsync uses the new SendOneWayAsync helper on GalaxyIpcClient (fire-and-forget but still gated through the call-semaphore so it doesn't interleave with CallAsync); AlarmSubscribe is one-way and the Host pushes events back via OnAlarmEvent; ReadProcessedAsync short-circuits to NotSupportedException (Galaxy historian only does raw); IRediscoverable's OnRediscoveryNeeded fires when the Host pushes a deploy-watermark notification; IHostConnectivityProbe.GetHostStatuses() snapshots and OnHostStatusChanged fires on Running↔Stopped/Faulted transitions, with IpcHostConnectivityStatus aliased to disambiguate from the Core.Abstractions namespace's same-named type. Internal RaiseDataChange/RaiseAlarmEvent/RaiseRediscoveryNeeded/OnHostConnectivityUpdate methods are the entry points the IPC client will invoke when push frames arrive. Host side: new Backend/IGalaxyBackend interface defines the seam between IPC dispatch and the live MXAccess code (so the dispatcher is unit-testable against an in-memory mock without needing live Galaxy); Backend/StubGalaxyBackend returns success for OpenSession/CloseSession/Subscribe/Unsubscribe/AlarmSubscribe/AlarmAck/Recycle and a recognizable "stub: MXAccess code lift pending (Phase 2 Task B.1)"-tagged error for Discover/ReadValues/WriteValues/HistoryRead — keeps the IPC end-to-end testable today and gives the parity team a clear seam to slot the real implementation into; Ipc/GalaxyFrameHandler is the new real dispatcher (replaces StubFrameHandler in Program.cs) — switch on MessageKind, deserialize the matching contract, await backend method, write the response (one-way for Unsubscribe/AlarmSubscribe/AlarmAck/CloseSession), heartbeat handled inline so liveness still works if the backend is sick, exceptions caught and surfaced as ErrorResponse with code "handler-exception" so the Proxy raises GalaxyIpcException instead of disconnecting. End-to-end IPC integration test (EndToEndIpcTests) drives every operation through the full stack — Initialize → Read → Write → Subscribe → Unsubscribe → SubscribeAlarms → AlarmAck → ReadRaw → ReadProcessed (short-circuit) — proving the wire protocol, dispatcher, capability forwarding, and one-way semantics agree end-to-end. Skipped on Windows administrator shells per the same PipeAcl-denies-Administrators reasoning the IpcHandshakeIntegrationTests use. Full solution 952 pass / 1 pre-existing Phase 0 baseline. Phase 2 evidence doc updated: status header now reads "Streams A+B+C complete... Streams D+E remain — gated only on the iterative Galaxy code lift + parity-debug cycle"; new Update 2026-04-17 (later) callout enumerates the upgrade with explicit "what's left for the Phase 2 exit gate" — replace StubGalaxyBackend with a MxAccessClient-backed implementation calling on the StaPump, then run the v1 IntegrationTests against the v2 topology and iterate on parity defects until green, then delete legacy OtOpcUa.Host.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 23:02:00 -04:00
Joseph Doherty
a1e9ed40fb Doc — record that this dev box (DESKTOP-6JL3KKO) hosts the full AVEVA stack required for the LmxOpcUa Phase 2 breakout, removing the "needs live MXAccess runtime" environmental blocker that the partial-exit evidence cited as gating Streams D + E. Inventory verified via Get-Service: 27 ArchestrA / Wonderware / AVEVA services running including aaBootstrap, aaGR (Galaxy Repository), aaLogger, aaUserValidator, aaPim, ArchestrADataStore, AsbServiceManager, AutoBuild_Service; the full Historian set (aahClientAccessPoint, aahGateway, aahInSight, aahSearchIndexer, aahSupervisor, InSQLStorage, InSQLConfiguration, InSQLEventSystem, InSQLIndexing, InSQLIOServer, InSQLManualStorage, InSQLSystemDriver, HistorianSearch-x64); slssvc (Wonderware SuiteLink); MXAccess COM DLL at C:\Program Files (x86)\ArchestrA\Framework\bin\ArchestrA.MXAccess.dll plus the matching .tlb files; OI-Gateway install at C:\Program Files (x86)\Wonderware\OI-Server\OI-Gateway\ — which means the Phase 1 Task E.10 AppServer-via-OI-Gateway smoke test (decision #142) is *also* runnable on the same box, not blocked on a separate AVEVA test machine as the original deferral assumed. dev-environment.md inventory row for "Dev Galaxy" now lists every service and file path; status flips to "Fully available — Phase 2 lift unblocked"; the GLAuth row also fills out v2.4.0 actual install details (direct-bind cn={user},dc=lmxopcua,dc=local; users readonly/writeop/writetune/writeconfig/alarmack/admin/serviceaccount; running under NSSM service GLAuth; current GroupToRole mapping ReadOnly→ConfigViewer / WriteOperate→ConfigEditor / AlarmAck→FleetAdmin) and notes the v2-rebrand to dc=otopcua,dc=local is a future cosmetic change. phase-2-partial-exit-evidence.md status header gains "runtime now in place"; an Update 2026-04-17 callout enumerates the same service inventory and concludes "no environmental blocker remains"; the next-session checklist's first step changes from "stand up dev Galaxy" to "verify the local AVEVA stack is still green (Get-Service aaGR, aaBootstrap, slssvc → Running) and the Galaxy ZB repository is reachable" with a new step 9 calling out that the AppServer-via-OI-Gateway smoke test should now be folded in opportunistically. plan.md §"4. Galaxy/MXAccess as Out-of-Process Driver" gains a "Dev environment for the LmxOpcUa breakout" paragraph documenting which physical machine has the runtime so the planning doc no longer reads as if AVEVA capability were a future logistical concern. No source / test changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 22:42:15 -04:00
Joseph Doherty
18f93d72bb Phase 1 LDAP auth + SignalR real-time — closes the last two open Admin UI TODOs. LDAP: Admin/Security/ gets SecurityOptions (bound from appsettings.json Authentication:Ldap), LdapAuthResult record, ILdapAuthService + LdapAuthService ported from scadalink-design's LdapAuthService (TLS guard, search-then-bind when a service account is configured, direct-bind fallback, service-account re-bind after user bind so attribute lookup uses the service principal's read rights, LdapException-to-friendly-message translation, OperationCanceledException pass-through), RoleMapper (pure function: case-insensitive group-name match against LdapOptions.GroupToRole, returns the distinct set of mapped Admin roles). EscapeLdapFilter escapes the five LDAP filter control chars (\, *, (, ), \0); ExtractFirstRdnValue pulls the value portion of a DN's leading RDN for memberOf parsing; ExtractOuSegment added as a GLAuth-specific fallback when the directory doesn't populate memberOf but does embed ou=PrimaryGroup into user DNs (actual GLAuth config in C:\publish\glauth\glauth.cfg uses nameformat=cn, groupformat=ou — direct bind is enough). Login page rewritten: EditForm → ILdapAuthService.AuthenticateAsync → cookie sign-in with claims (Name = displayName, NameIdentifier = username, Role for each mapped role, ldap_group for each raw group); failed bind shows the service's error; empty-role-map returns an explicit "no Admin role mapped" message rather than silently succeeding. appsettings.json gains an Authentication:Ldap section with dev-GLAuth defaults (localhost:3893, UseTls=false, AllowInsecureLdap=true for dev, GroupToRole maps GLAuth's ReadOnly/WriteOperate/AlarmAck → ConfigViewer/ConfigEditor/FleetAdmin). SignalR: two hubs + a BackgroundService poller. FleetStatusHub routes per-cluster NodeStateChanged pushes (SubscribeCluster/UnsubscribeCluster on connection; FleetGroup for dashboard-wide) with a typed NodeStateChangedMessage payload. AlertHub auto-subscribes every connection to the AllAlertsGroup and exposes AcknowledgeAsync (ack persistence deferred to v2.1). FleetStatusPoller (IHostedService, 5s default cadence) scans ClusterNodeGenerationState joined with ClusterNode, caches the prior snapshot per NodeId, pushes NodeStateChanged on any delta, raises AlertMessage("apply-failed") on transition INTO Failed (sticky — the hub client acks later). Program.cs registers HttpContextAccessor (sign-in needs it), SignalR, LdapOptions + ILdapAuthService, the poller as hosted service, and maps /hubs/fleet + /hubs/alerts endpoints. ClusterDetail adds @rendermode RenderMode.InteractiveServer, @implements IAsyncDisposable, and a HubConnectionBuilder subscription that calls LoadAsync() on each NodeStateChanged for its cluster so the "current published" card refreshes without a page reload; a dismissable "Live update" info banner surfaces the most recent event. Microsoft.AspNetCore.SignalR.Client 10.0.0 + Novell.Directory.Ldap.NETStandard 3.6.0 added. Tests: 13 new — RoleMapperTests (single group, case-insensitive match, multi-group distinct-roles, unknown-group ignored, empty-map); LdapAuthServiceTests (EscapeLdapFilter with 4 inputs, ExtractFirstRdnValue with 4 inputs — all via reflection against internals); LdapLiveBindTests (skip when localhost:3893 unreachable; valid-credentials-bind-succeeds; wrong-password-fails-with-recognizable-error; empty-username-rejected-before-hitting-directory); FleetStatusPollerTests (throwaway DB, seeds cluster+node+generation+apply-state, runs PollOnceAsync, asserts NodeStateChanged hit the recorder; second test seeds a Failed state and asserts AlertRaised fired) — backed by RecordingHubContext/RecordingHubClients/RecordingClientProxy that capture SendCoreAsync invocations while throwing NotImplementedException for the IHubClients methods the poller doesn't call (fail-fast if evolution adds new dependencies). InternalsVisibleTo added so the test project can call FleetStatusPoller.PollOnceAsync directly. Full solution 946 pass / 1 pre-existing Phase 0 baseline failure.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 22:28:49 -04:00
Joseph Doherty
7a5b535cd6 Phase 1 Stream E Admin UI — finish Blazor pages so operators can run the draft → publish → rollback workflow end-to-end without hand-executing SQL. Adds eight new scoped services that wrap the Configuration stored procs + managed validators: EquipmentService (CRUD with auto-derived EquipmentId per decision #125), UnsService (areas + lines), NamespaceService, DriverInstanceService (generic JSON DriverConfig editor per decision #94 — per-driver schema validation lands in each driver's phase), NodeAclService (grant + revoke with bundled-preset permission sets; full per-flag editor + bulk-grant + permission simulator deferred to v2.1), ReservationService (fleet-wide active + released reservation inspector + FleetAdmin-only sp_ReleaseExternalIdReservation wrapper with required-reason invariant), DraftValidationService (hydrates a DraftSnapshot from the draft's rows plus prior-cluster Equipment + active reservations, runs the managed DraftValidator to surface every rule in one pass for inline validation panel), AuditLogService (recent ConfigAuditLog reader). Pages: /clusters list with create-new shortcut; /clusters/new wizard that creates the cluster row + initial empty draft in one go; /clusters/{id} detail with 8 tabs (Overview / Generations / Equipment / UNS Structure / Namespaces / Drivers / ACLs / Audit) — tabs that write always target the active draft, published generations stay read-only; /clusters/{id}/draft/{gen} editor with live validation panel (errors list with stable code + message + context; publish button disabled while any error exists) and tab-embedded sub-components; /clusters/{id}/draft/{gen}/diff three-column view backed by sp_ComputeGenerationDiff with Added/Removed/Modified badges; Generations tab with per-row rollback action wired to sp_RollbackToGeneration; /reservations FleetAdmin-only page (CanPublish policy) with active + released lists and a modal release dialog that enforces non-empty reason and round-trips through sp_ReleaseExternalIdReservation; /login scaffold with stub credential accept + FleetAdmin-role cookie issuance (real LDAP bind via the ScadaLink-parity LdapAuthService is deferred until live GLAuth integration — marked in the login view and in the Phase 1 partial-exit TODO). Layout: sidebar gets Overview / Clusters / Reservations + AuthorizeView with signed-in username + roles + sign-out POST to /auth/logout; cascading authentication state registered for <AuthorizeView> to work in RenderMode.InteractiveServer. Integration testing: AdminServicesIntegrationTests creates a throwaway per-run database (same pattern as the Configuration test fixture), applies all three migrations, and exercises (1) create-cluster → add-namespace+UNS+driver+equipment → validate (expects zero errors) → publish (expects Published status) → rollback (expects one new Published + at least one Superseded); (2) cross-cluster namespace binding draft → validates to BadCrossClusterNamespaceBinding per decision #122. Old flat Components/Pages/Clusters.razor moved to Components/Pages/Clusters/ClustersList.razor so the Clusters folder can host tab sub-components without the razor generator creating a type-and-namespace collision. Dev appsettings.json connection string switched from Integrated Security to sa auth to match the otopcua-mssql container on port 14330 (remapped from 1433 to coexist with the native MSSQL14 Galaxy ZB instance). Browser smoke test completed: home page, clusters list, new-cluster form, cluster detail with a seeded row, reservations (redirected to login for anon user) all return 200 / 302-to-login as expected; full solution 928 pass / 1 pre-existing Phase 0 baseline failure. Phase 1 Stream E items explicitly deferred with TODOs: CSV import for Equipment, SignalR FleetStatusHub + AlertHub real-time push, bulk-grant workflow, permission-simulator trie, merge-equipment draft, AppServer-via-OI-Gateway end-to-end smoke test (decision #142), and the real LDAP bind replacing the Login page stub.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 21:52:42 -04:00
Joseph Doherty
01fd90c178 Phase 1 Streams B–E scaffold + Phase 2 Streams A–C scaffold — 8 new projects with ~70 new tests, all green alongside the 494 v1 IntegrationTests baseline (parity preserved: no v1 tests broken; legacy OtOpcUa.Host untouched). Phase 1 finish: Configuration project (16 entities + 10 enums + DbContext + DesignTimeDbContextFactory + InitialSchema/StoredProcedures/AuthorizationGrants migrations — 8 procs including sp_PublishGeneration with MERGE on ExternalIdReservation per decision #124, sp_RollbackToGeneration cloning rows into a new published generation, sp_ValidateDraft with cross-cluster-namespace + EquipmentUuid-immutability + ZTag/SAPID reservation pre-flight, sp_ComputeGenerationDiff with CHECKSUM-based row signature — plus OtOpcUaNode/OtOpcUaAdmin SQL roles with EXECUTE grants scoped to per-principal-class proc sets and DENY UPDATE/DELETE/INSERT/SELECT on dbo schema); managed DraftValidator covering UNS segment regex, path length, EquipmentUuid immutability across generations, same-cluster namespace binding (decision #122), reservation pre-flight, EquipmentId derivation (decision #125), driver↔namespace compatibility — returning every failing rule in one pass; LiteDB local cache with round-trip + ring pruning + corruption-fast-fail; GenerationApplier with per-entity Added/Removed/Modified diff and dependency-ordered callbacks (namespace → driver → device → equipment → poll-group → tag, Removed before Added); Core project with GenericDriverNodeManager (scaffold for the Phase 2 Galaxy port) and DriverHost lifecycle registry; Server project using Microsoft.Extensions.Hosting BackgroundService replacing TopShelf, with NodeBootstrap that falls back to LiteDB cache when the central DB is unreachable (decision #79); Admin project scaffolded as Blazor Server with Bootstrap 5 sidebar layout, cookie auth, three admin roles (ConfigViewer/ConfigEditor/FleetAdmin), Cluster + Generation services fronting the stored procs. Phase 2 scaffold: Driver.Galaxy.Shared (netstandard2.0) with full MessagePack IPC contract surface — Hello version negotiation, Open/CloseSession, Heartbeat, DiscoverHierarchy + GalaxyObjectInfo/GalaxyAttributeInfo, Read/WriteValues, Subscribe/Unsubscribe/OnDataChange, AlarmSubscribe/Event/Ack, HistoryRead, HostConnectivityStatus, Recycle — plus length-prefixed framing (decision #28) with a 16 MiB cap and thread-safe FrameWriter/FrameReader; Driver.Galaxy.Host (net48) implementing the Tier C cross-cutting protections from driver-stability.md — strict PipeAcl (allow configured server SID only, explicit deny on LocalSystem + Administrators), PipeServer with caller-SID verification via pipe.RunAsClient + WindowsIdentity.GetCurrent and per-process shared-secret Hello, Galaxy-specific MemoryWatchdog (warn at max(1.5×baseline, +200 MB), soft-recycle at max(2×baseline, +200 MB), hard ceiling 1.5 GB, slope ≥5 MB/min over 30-min rolling window), RecyclePolicy (1 soft recycle per hour cap + 03:00 local daily scheduled), PostMortemMmf (1000-entry ring buffer in %ProgramData%\OtOpcUa\driver-postmortem\galaxy.mmf, survives hard crash, readable cross-process), MxAccessHandle : SafeHandle (ReleaseHandle loops Marshal.ReleaseComObject until refcount=0 then calls optional unregister callback), StaPump with responsiveness probe (BlockingCollection dispatcher for Phase 1 — real Win32 GetMessage/DispatchMessage pump slots in with the same semantics when the Galaxy code lift happens), IsExternalInit shim for init setters on .NET 4.8; Driver.Galaxy.Proxy (net10) implementing IDriver + ITagDiscovery forwarding over the IPC channel with MX data-type and security-classification mapping, plus Supervisor pieces — Backoff (5s → 15s → 60s capped, reset-on-stable-run), CircuitBreaker (3 crashes per 5 min opens; 1h → 4h → manual cooldown escalation; sticky alert doesn't auto-clear), HeartbeatMonitor (2s cadence, 3 consecutive misses = host dead per driver-stability.md). Infrastructure: docker SQL Server remapped to host port 14330 to coexist with the native MSSQL14 Galaxy ZB DB instance on 1433; NuGetAuditSuppress applied per-project for two System.Security.Cryptography.Xml advisories that only reach via EF Core Design with PrivateAssets=all (fix ships in 11.0.0-preview); .slnx gains 14 project registrations. Deferred with explicit TODOs in docs/v2/implementation/phase-2-partial-exit-evidence.md: Phase 1 Stream E Admin UI pages (Generations listing + draft-diff-publish, Equipment CRUD with OPC 40010 fields, UNS Areas/Lines tabs, ACLs + permission simulator, Generic JSON config editor, SignalR real-time, Release-Reservation + Merge-Equipment workflows, LDAP login page, AppServer smoke test per decision #142), Phase 2 Stream D (Galaxy MXAccess code lift out of legacy OtOpcUa.Host, dual-service installer, appsettings → DriverConfig migration script, legacy Host deletion — blocked by parity), Phase 2 Stream E (v1 IntegrationTests against v2 topology, Client.CLI walkthrough diff, four 2026-04-13 stability findings regression tests, adversarial review — requires live MXAccess runtime).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 21:35:25 -04:00
Joseph Doherty
fc0ce36308 Add Installed Inventory section to dev-environment.md tracking every v2 dev service, toolchain, credential, port, data location, and container volume stood up on this machine. Records what is actually running (not just planned) so future setup work and troubleshooting has a single source of truth. Four subsections: Host (machine identity, VM platform, CPU, OS features); Toolchain (.NET 10 SDK 10.0.201 + runtimes 10.0.5, WSL2 default v2 with docker-desktop distro Running, Docker Desktop 29.3.1 / engine 29.3.1, dotnet-ef CLI 10.0.6 — each row records install method and date); Services (SQL Server 2022 container otopcua-mssql at localhost:1433 with sa/OtOpcUaDev_2026! credentials and Docker named volume otopcua-mssql-data mounted at /var/opt/mssql, dev Galaxy, GLAuth at C:\publish\glauth\ on ports 3893/3894, plus rows for not-yet-standing services like OPC Foundation reference server / FOCAS stub / Modbus simulator / ab_server / Snap7 / TwinCAT XAR VM with target ports to stand up later); Connection strings for appsettings.Development.json (copy-paste-ready, flagged never-commit); Container management quick reference (start/stop/logs/shell/query/nuclear-reset); Credential rotation note.
Per decision #137 (dev env credentials documented openly in dev-environment.md; production uses Integrated Security / gMSA per decision #46 and never any value from this table). Section lives at the top of the doc immediately after Two Environment Tiers, so it's discoverable as the single source of truth for "what's actually running here right now".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 16:57:09 -04:00
Joseph Doherty
bf6741ba7f Doc — flesh out dev-environment.md inner-loop bootstrap with the explicit Windows install steps that surfaced when actually trying to stand up SQL Server on the local box: prereq winget commands per tool (.NET 10 SDK / .NET Framework 4.8 SDK + targeting pack / Git / PowerShell 7.4+); WSL2 install (UAC-elevated) as a separate sub-step before Docker Desktop; Docker Desktop install (UAC-elevated) followed by sign-out/sign-in for docker-users group membership; explicit post-install Docker Desktop config checklist (WSL 2 based engine = checked, Windows containers = NOT checked, WSL Integration enabled for Ubuntu) per decision #134; named volume otopcua-mssql-data:/var/opt/mssql on the SQL Server container so DB files survive container restart and docker rm; sqlcmd verification command using the new mssql-tools18 path that the 2022 image ships with; EF Core CLI install for use starting in Phase 1 Stream B; bumped step count from 8 → 10. Also adds a Troubleshooting subsection covering the seven most common Windows install snags (WSL distro not auto-installed needs -d Ubuntu; Docker PATH not refreshed needs new shell or sign-in; docker-users group membership needs sign-out/in; WSL 2 kernel update needs manual install on legacy systems; SA password complexity rules; Linux vs Windows containers mode mismatch; Hyper-V coexistence with Docker requires WSL 2 backend not Hyper-V backend per decision #134). Step 1 acceptance criteria gain "docker ps shows otopcua-mssql Up" and explicit note that steps 4a/4b need admin elevation (no silent admin-free path exists on Windows).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 14:54:52 -04:00
Joseph Doherty
980ea5190c Phase 1 Stream A — Core.Abstractions project + 11 capability interfaces + DriverTypeRegistry + interface-independence tests
New project src/ZB.MOM.WW.OtOpcUa.Core.Abstractions (.NET 10, BCL-only dependencies, GenerateDocumentationFile=true, TreatWarningsAsErrors=true) defining the contract surface every driver implements. Per docs/v2/plan.md decisions #4 (composable capability interfaces), #52 (streaming IAddressSpaceBuilder), #53 (capability discovery via `is` checks no flag enum), #54 (optional IRediscoverable sub-interface), #59 (Core.Abstractions internal-only for now design as if public).

Eleven capability interfaces:
- IDriver — required lifecycle / health / config-apply / memory-footprint accounting (per driver-stability.md Tier A/B allocation tracking)
- ITagDiscovery — discovers tags streaming to IAddressSpaceBuilder
- IReadable — on-demand reads idempotent for Polly retry
- IWritable — writes NOT auto-retried by default per decisions #44 + #45
- ISubscribable — data-change subscriptions covering both native (Galaxy MXAccess advisory, OPC UA monitored items, TwinCAT ADS) and driver-internal polled (Modbus, AB CIP, S7, FOCAS) mechanisms; OnDataChange callback regardless of source
- IAlarmSource — alarm events + acknowledge + AlarmSeverity enum mirroring acl-design.md NodePermissions alarm-severity values
- IHistoryProvider — HistoryReadRaw + HistoryReadProcessed with continuation points
- IRediscoverable — opt-in change-detection signal; static drivers don't implement
- IHostConnectivityProbe — generalized from Galaxy's GalaxyRuntimeProbeManager per plan §5a
- IDriverConfigEditor — Admin UI plug-point for per-driver custom config editors deferred to each driver's phase per decision #27
- IAddressSpaceBuilder — streaming builder API for driver-driven address-space construction

Plus DTOs: DriverDataType, SecurityClassification (mirroring v1 Galaxy model), DriverAttributeInfo (replaces Galaxy-specific GalaxyAttributeInfo per plan §5a), DriverHealth + DriverState, DataValueSnapshot (universal OPC UA quality + timestamp carrier per decision #13), HostConnectivityStatus + HostState + HostStatusChangedEventArgs, RediscoveryEventArgs, DataChangeEventArgs, AlarmEventArgs + AlarmAcknowledgeRequest + AlarmSeverity, WriteRequest + WriteResult, HistoryReadResult + HistoryAggregateType, ISubscriptionHandle + IAlarmSubscriptionHandle + IVariableHandle.

DriverTypeRegistry singleton with Register / Get / TryGet / All; thread-safe via Interlocked.Exchange snapshot replacement on registration; case-insensitive lookups; rejects duplicate registrations; rejects empty type names. DriverTypeMetadata record carries TypeName + AllowedNamespaceKinds (NamespaceKindCompatibility flags enum per decision #111) + per-config-tier JSON Schemas the validator checks at draft-publish time (decision #91).

Tests project tests/ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests (xUnit v3 1.1.0 matching existing test projects). 24 tests covering: 1) interface independence reflection check (no references outside BCL/System; all public types in root namespace; every capability interface is public); 2) DriverTypeRegistry round-trip, case-insensitive lookups, KeyNotFoundException on unknown, null on TryGet of unknown, InvalidOperationException on duplicate registration (case-insensitive too), All() enumeration, NamespaceKindCompatibility bitmask combinations, ArgumentException on empty type names.

Build: 0 errors, 4 warnings (only pre-existing transitive package vulnerability + analyzer hints). Full test suite: 845 passing / 1 failing — strict improvement over Phase 0 baseline (821/1) by the 24 new Core.Abstractions tests; no regressions in any other test project.

Phase 1 entry-gate record (docs/v2/implementation/entry-gate-phase-1.md) documents the deviation: only Stream A executed in this continuation since Streams B-E need SQL Server / GLAuth / Galaxy infrastructure standup per dev-environment.md Step 1, which is currently TODO.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 14:15:55 -04:00
Joseph Doherty
45ffa3e7d4 Merge phase-0-rename into v2
Phase 0 exit gate cleared. 284 files changed (720 insertions / 720 deletions — pure rename). Build clean (0 errors, 30 warnings, lower than baseline 167). Tests at strict improvement over baseline (821 passing / 1 failing vs baseline 820 / 2). 23 LmxOpcUa references retained per Phase 0 Out-of-Scope rules (runtime identifiers preserved for v1/v2 client trust during coexistence).

Implementation lead: Claude (Opus 4.7). Reviewer signoff: pending — being absorbed into v2 with the deferred service-install verification flagged for the deployment-side reviewer at the next field deployment milestone.

See docs/v2/implementation/exit-gate-phase-0.md for the full compliance checklist (6 PASS + 1 DEFERRED).
2026-04-17 14:03:06 -04:00
422 changed files with 27705 additions and 23954 deletions

View File

@@ -1,15 +1,29 @@
<Solution>
<Folder Name="/src/">
<Project Path="src/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Historian.Aveva/ZB.MOM.WW.OtOpcUa.Historian.Aveva.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/ZB.MOM.WW.OtOpcUa.Core.Abstractions.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Configuration/ZB.MOM.WW.OtOpcUa.Configuration.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Core/ZB.MOM.WW.OtOpcUa.Core.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Server/ZB.MOM.WW.OtOpcUa.Server.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Admin/ZB.MOM.WW.OtOpcUa.Admin.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ZB.MOM.WW.OtOpcUa.Driver.Modbus.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Client.Shared/ZB.MOM.WW.OtOpcUa.Client.Shared.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Client.CLI/ZB.MOM.WW.OtOpcUa.Client.CLI.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Client.UI/ZB.MOM.WW.OtOpcUa.Client.UI.csproj"/>
</Folder>
<Folder Name="/tests/">
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Tests/ZB.MOM.WW.OtOpcUa.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Historian.Aveva.Tests/ZB.MOM.WW.OtOpcUa.Historian.Aveva.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/ZB.MOM.WW.OtOpcUa.IntegrationTests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests/ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Configuration.Tests/ZB.MOM.WW.OtOpcUa.Configuration.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Core.Tests/ZB.MOM.WW.OtOpcUa.Core.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Server.Tests/ZB.MOM.WW.OtOpcUa.Server.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Admin.Tests/ZB.MOM.WW.OtOpcUa.Admin.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Client.Shared.Tests/ZB.MOM.WW.OtOpcUa.Client.Shared.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Client.CLI.Tests/ZB.MOM.WW.OtOpcUa.Client.CLI.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Client.UI.Tests/ZB.MOM.WW.OtOpcUa.Client.UI.Tests.csproj"/>

View File

@@ -0,0 +1,56 @@
# V1 Archive Status (Phase 2 Stream D, 2026-04-18)
This document inventories every v1 surface that's been **functionally superseded** by v2 but
**physically retained** in the build until the deletion PR (Phase 2 PR 3). Rationale: cascading
references mean a single deletion is high blast-radius; archive-marking lets the v2 stack ship
on its own merits while the v1 surface stays as parity reference.
## Archived projects
| Path | Status | Replaced by | Build behavior |
|---|---|---|---|
| `src/ZB.MOM.WW.OtOpcUa.Host/` | Archive (executable in build) | `OtOpcUa.Server` + `Driver.Galaxy.Host` + `Driver.Galaxy.Proxy` | Builds; not deployed by v2 install scripts |
| `src/ZB.MOM.WW.OtOpcUa.Historian.Aveva/` | Archive (plugin in build) | TODO: port into `Driver.Galaxy.Host/Backend/Historian/` (Task B.1.h follow-up) | Builds; loaded only by archived Host |
| `tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/` | Archive | `Driver.Galaxy.E2E` + per-component test projects | `<IsTestProject>false</IsTestProject>``dotnet test slnx` skips |
| `tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/` | Archive | `Driver.Galaxy.E2E` | `<IsTestProject>false</IsTestProject>``dotnet test slnx` skips |
## How to run the archived suites explicitly
```powershell
# v1 unit tests (494):
dotnet test tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive
# v1 integration tests (6):
dotnet test tests/ZB.MOM.WW.OtOpcUa.IntegrationTests
```
Both still pass on this dev box — they're the parity reference for Phase 2 PR 3's deletion
decision.
## Deletion plan (Phase 2 PR 3)
Pre-conditions:
- [ ] `Driver.Galaxy.E2E` test count covers the v1 IntegrationTests' 6 integration scenarios
at minimum (currently 7 tests; expand as needed)
- [ ] `Driver.Galaxy.Host/Backend/Historian/` ports the Wonderware Historian plugin
so `MxAccessGalaxyBackend.HistoryReadAsync` returns real data (Task B.1.h)
- [ ] Operator review on a separate PR — destructive change
Steps:
1. `git rm -r src/ZB.MOM.WW.OtOpcUa.Host/`
2. `git rm -r src/ZB.MOM.WW.OtOpcUa.Historian.Aveva/`
(or move it under Driver.Galaxy.Host first if the lift is part of the same PR)
3. `git rm -r tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/`
4. `git rm -r tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/`
5. Edit `ZB.MOM.WW.OtOpcUa.slnx` — remove the four project lines
6. `dotnet build ZB.MOM.WW.OtOpcUa.slnx` → confirm clean
7. `dotnet test ZB.MOM.WW.OtOpcUa.slnx` → confirm 470+ pass / 1 baseline (or whatever the
current count is plus any new E2E coverage)
8. Commit: "Phase 2 Stream D — delete v1 archive (Host + Historian.Aveva + v1Tests + IntegrationTests)"
9. PR 3 against `v2`, link this doc + exit-gate-phase-2-final.md
10. One reviewer signoff
## Rollback
If Phase 2 PR 3 surfaces downstream consumer regressions, `git revert` the deletion commit
restores the four projects intact. The v2 stack continues to ship from the v2 branch.

View File

@@ -22,6 +22,102 @@ Per decision #99:
The tier split keeps developer onboarding fast (no Docker required for first build) while concentrating the heavy simulator setup on one machine the team maintains.
## Installed Inventory — This Machine
Running record of every v2 dev service stood up on this developer machine. Updated on every install / config change. Credentials here are **dev-only** per decision #137 — production uses Integrated Security / gMSA per decision #46 and never any value in this table.
**Last updated**: 2026-04-17
### Host
| Attribute | Value |
|-----------|-------|
| Machine name | `DESKTOP-6JL3KKO` |
| User | `dohertj2` (member of local Administrators + `docker-users`) |
| VM platform | VMware (`VMware20,1`), nested virtualization enabled |
| CPU | Intel Xeon E5-2697 v4 @ 2.30GHz (3 vCPUs) |
| OS | Windows (WSL2 + Hyper-V Platform features installed) |
### Toolchain
| Tool | Version | Location | Install method |
|------|---------|----------|----------------|
| .NET SDK | 10.0.201 | `C:\Program Files\dotnet\sdk\` | Pre-installed |
| .NET AspNetCore runtime | 10.0.5 | `C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App\` | Pre-installed |
| .NET NETCore runtime | 10.0.5 | `C:\Program Files\dotnet\shared\Microsoft.NETCore.App\` | Pre-installed |
| .NET WindowsDesktop runtime | 10.0.5 | `C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App\` | Pre-installed |
| .NET Framework 4.8 SDK | — | Pending (needed for Phase 2 Galaxy.Host; not yet required) | — |
| Git | Pre-installed | Standard | — |
| PowerShell 7 | Pre-installed | Standard | — |
| winget | v1.28.220 | Standard Windows feature | — |
| WSL | Default v2, distro `docker-desktop` `STATE Running` | — | `wsl --install --no-launch` (2026-04-17) |
| Docker Desktop | 29.3.1 (engine) / Docker Desktop 4.68.0 (app) | Standard | `winget install --id Docker.DockerDesktop` (2026-04-17) |
| `dotnet-ef` CLI | 10.0.6 | `%USERPROFILE%\.dotnet\tools\dotnet-ef.exe` | `dotnet tool install --global dotnet-ef --version 10.0.*` (2026-04-17) |
### Services
| Service | Container / Process | Version | Host:Port | Credentials (dev-only) | Data location | Status |
|---------|---------------------|---------|-----------|------------------------|---------------|--------|
| **Central config DB** | Docker container `otopcua-mssql` (image `mcr.microsoft.com/mssql/server:2022-latest`) | 16.0.4250.1 (RTM-CU24-GDR, KB5083252) | `localhost:14330` (host) → `1433` (container) — remapped from 1433 to avoid collision with the native MSSQL14 instance that hosts the Galaxy `ZB` DB (both bind 0.0.0.0:1433; whichever wins the race gets connections) | User `sa` / Password `OtOpcUaDev_2026!` | Docker named volume `otopcua-mssql-data` (mounted at `/var/opt/mssql` inside container) | ✅ Running — `InitialSchema` migration applied, 16 entity tables live |
| Dev Galaxy (AVEVA System Platform) | Local install on this dev box — full ArchestrA + Historian + OI-Server stack | v1 baseline | Local COM via MXAccess (`C:\Program Files (x86)\ArchestrA\Framework\bin\ArchestrA.MXAccess.dll`); Historian via `aaH*` services; SuiteLink via `slssvc` | Windows Auth | Galaxy repository DB `ZB` on local SQL Server (separate instance from `otopcua-mssql` — legacy v1 Galaxy DB, not related to v2 config DB) | ✅ **Fully available — Phase 2 lift unblocked.** 27 ArchestrA / AVEVA / Wonderware services running incl. `aaBootstrap`, `aaGR` (Galaxy Repository), `aaLogger`, `aaUserValidator`, `aaPim`, `ArchestrADataStore`, `AsbServiceManager`, `AutoBuild_Service`; full Historian set (`aahClientAccessPoint`, `aahGateway`, `aahInSight`, `aahSearchIndexer`, `aahSupervisor`, `InSQLStorage`, `InSQLConfiguration`, `InSQLEventSystem`, `InSQLIndexing`, `InSQLIOServer`, `InSQLManualStorage`, `InSQLSystemDriver`, `HistorianSearch-x64`); `slssvc` (Wonderware SuiteLink); `OI-Gateway` install present at `C:\Program Files (x86)\Wonderware\OI-Server\OI-Gateway\` (decision #142 AppServer-via-OI-Gateway smoke test now also unblocked) |
| GLAuth (LDAP) | Local install at `C:\publish\glauth\` | v2.4.0 | `localhost:3893` (LDAP) / `3894` (LDAPS, disabled) | Direct-bind `cn={user},dc=lmxopcua,dc=local` per `auth.md`; users `readonly`/`writeop`/`writetune`/`writeconfig`/`alarmack`/`admin`/`serviceaccount` (passwords in `glauth.cfg` as SHA-256) | `C:\publish\glauth\` | ✅ Running (NSSM service `GLAuth`). Phase 1 Admin uses GroupToRole map `ReadOnly→ConfigViewer`, `WriteOperate→ConfigEditor`, `AlarmAck→FleetAdmin`. v2-rebrand to `dc=otopcua,dc=local` is a future cosmetic change |
| OPC Foundation reference server | Not yet built | — | `localhost:62541` (target) | `user1` / `password1` (reference-server defaults) | — | Pending (needed for Phase 5 OPC UA Client driver testing) |
| FOCAS TCP stub | Not yet built | — | `localhost:8193` (target) | n/a | — | Pending (built in Phase 5) |
| Modbus simulator (`oitc/modbus-server`) | — | — | `localhost:502` (target) | n/a | — | Pending (needed for Phase 3 Modbus driver; moves to integration host per two-tier model) |
| libplctag `ab_server` | — | — | `localhost:44818` (target) | n/a | — | Pending (Phase 3/4 AB CIP and AB Legacy drivers) |
| Snap7 Server | — | — | `localhost:102` (target) | n/a | — | Pending (Phase 4 S7 driver) |
| TwinCAT XAR VM | — | — | `localhost:48898` (ADS) (target) | TwinCAT default route creds | — | Pending — runs in Hyper-V VM, not on this dev box (per decision #135) |
### Connection strings for `appsettings.Development.json`
Copy-paste-ready. **Never commit these to the repo** — they go in `appsettings.Development.json` (gitignored per the standard .NET convention) or in user-scoped dotnet secrets.
```jsonc
{
"ConfigDatabase": {
"ConnectionString": "Server=localhost,14330;Database=OtOpcUaConfig_Dev;User Id=sa;Password=OtOpcUaDev_2026!;TrustServerCertificate=true;Encrypt=false;"
},
"Authentication": {
"Ldap": {
"Host": "localhost",
"Port": 3893,
"UseLdaps": false,
"BindDn": "cn=admin,dc=otopcua,dc=local",
"BindPassword": "<see glauth-otopcua.cfg — pending seeding>"
}
}
}
```
For xUnit test fixtures that need a throwaway DB per test run, build connection strings with `Database=OtOpcUaConfig_Test_{timestamp}` to avoid cross-run pollution.
### Container management quick reference
```powershell
# Start / stop the SQL Server container (survives reboots via Docker Desktop auto-start)
docker stop otopcua-mssql
docker start otopcua-mssql
# Logs (useful for diagnosing startup failures or login issues)
docker logs otopcua-mssql --tail 50
# Shell into the container (rarely needed; sqlcmd is the usual tool)
docker exec -it otopcua-mssql bash
# Query via sqlcmd inside the container (Git Bash needs MSYS_NO_PATHCONV=1 to avoid path mangling)
MSYS_NO_PATHCONV=1 docker exec otopcua-mssql /opt/mssql-tools18/bin/sqlcmd -S localhost -U sa -P "OtOpcUaDev_2026!" -C -Q "SELECT @@VERSION"
# Nuclear reset: drop the container + volume (destroys all DB data)
docker stop otopcua-mssql
docker rm otopcua-mssql
docker volume rm otopcua-mssql-data
# …then re-run the docker run command from Bootstrap Step 6
```
### Credential rotation
Dev credentials in this inventory are convenience defaults, not secrets. Change them at will per developer — just update this doc + each developer's `appsettings.Development.json`. There is no shared secret store for dev.
## Resource Inventory
### A. Always-required (every developer + integration host)
@@ -39,7 +135,7 @@ The tier split keeps developer onboarding fast (no Docker required for first bui
| Resource | Purpose | Type | Default port | Default credentials | Owner |
|----------|---------|------|--------------|---------------------|-------|
| **SQL Server 2022 dev edition** | Central config DB; integration tests against `Configuration` project | Local install OR Docker container `mcr.microsoft.com/mssql/server:2022-latest` | 1433 | `sa` / `OtOpcUaDev_2026!` (dev only — production uses Integrated Security or gMSA per decision #46) | Developer (per machine) |
| **SQL Server 2022 dev edition** | Central config DB; integration tests against `Configuration` project | Local install OR Docker container `mcr.microsoft.com/mssql/server:2022-latest` | 1433 default, or 14330 when a native MSSQL instance (e.g. the Galaxy `ZB` host) already occupies 1433 | `sa` / `OtOpcUaDev_2026!` (dev only — production uses Integrated Security or gMSA per decision #46) | Developer (per machine) |
| **GLAuth (LDAP server)** | Admin UI authentication tests; data-path ACL evaluation tests | Local binary at `C:\publish\glauth\` per existing CLAUDE.md | 3893 (LDAP) / 3894 (LDAPS) | Service principal: `cn=admin,dc=otopcua,dc=local` / `OtOpcUaDev_2026!`; test users defined in GLAuth config | Developer (per machine) |
| **Local dev Galaxy** (Aveva System Platform) | Galaxy driver tests; v1 IntegrationTests parity | Existing on dev box per CLAUDE.md | n/a (local COM) | Windows Auth | Developer (already present per project setup) |
@@ -108,25 +204,104 @@ The tier split keeps developer onboarding fast (no Docker required for first bui
## Bootstrap Order — Inner-loop Developer Machine
Order matters because some installs have prerequisites. ~3060 min total on a fresh machine.
Order matters because some installs have prerequisites and several need admin elevation (UAC). ~6090 min total on a fresh Windows machine, including reboots.
**Admin elevation appears at**: WSL2 install (step 4a), Docker Desktop install (step 4b), and any `wsl --install -d` call. winget will prompt UAC interactively when these run; accept it. There is no fully-silent admin-free install path on Windows for Docker Desktop's prerequisites.
1. **Install .NET 10 SDK** (https://dotnet.microsoft.com/) — required to build anything
```powershell
winget install --id Microsoft.DotNet.SDK.10 --accept-package-agreements --accept-source-agreements
```
2. **Install .NET Framework 4.8 SDK + targeting pack** — only needed when starting Phase 2 (Galaxy.Host); skip for Phase 01 if not yet there
```powershell
winget install --id Microsoft.DotNet.Framework.DeveloperPack_4 --accept-package-agreements --accept-source-agreements
```
3. **Install Git + PowerShell 7.4+**
4. **Clone repos**:
```powershell
winget install --id Git.Git --accept-package-agreements --accept-source-agreements
winget install --id Microsoft.PowerShell --accept-package-agreements --accept-source-agreements
```
4. **Install Docker Desktop** (with WSL2 backend per decision #134, leaves Hyper-V free for the future TwinCAT XAR VM):
**4a. Enable WSL2** — UAC required:
```powershell
wsl --install
```
Reboot when prompted. After reboot, the default Ubuntu distro launches and asks for a username/password — set them (these are WSL-internal, not used for Docker auth).
Verify after reboot:
```powershell
wsl --status
wsl --list --verbose
```
Expected: `Default Version: 2`, at least one distro (typically `Ubuntu`) with `STATE Running` or `Stopped`.
**4b. Install Docker Desktop** — UAC required:
```powershell
winget install --id Docker.DockerDesktop --accept-package-agreements --accept-source-agreements
```
The installer adds you to the `docker-users` Windows group. **Sign out and back in** (or reboot) so the group membership takes effect.
**4c. Configure Docker Desktop** — open it once after sign-in:
- **Settings → General**: confirm "Use the WSL 2 based engine" is **checked** (decision #134 — coexists with future Hyper-V VMs)
- **Settings → General**: confirm "Use Windows containers" is **NOT checked** (we use Linux containers for `mcr.microsoft.com/mssql/server`, `oitc/modbus-server`, etc.)
- **Settings → Resources → WSL Integration**: enable for the default Ubuntu distro
- (Optional, large fleets) **Settings → Resources → Advanced**: bump CPU / RAM allocation if you have headroom
Verify:
```powershell
docker --version
docker ps
```
Expected: version reported, `docker ps` returns an empty table (no containers running yet, but the daemon is reachable).
5. **Clone repos**:
```powershell
git clone https://gitea.dohertylan.com/dohertj2/lmxopcua.git
git clone https://gitea.dohertylan.com/dohertj2/scadalink-design.git
git clone https://gitea.dohertylan.com/dohertj2/3yearplan.git
```
5. **Install SQL Server 2022 dev edition** (local install) OR start the Docker container (see Resource B):
6. **Start SQL Server** (Linux container; runs in the WSL2 backend):
```powershell
docker run --name otopcua-mssql -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=OtOpcUaDev_2026!" `
-p 1433:1433 -d mcr.microsoft.com/mssql/server:2022-latest
docker run --name otopcua-mssql `
-e "ACCEPT_EULA=Y" `
-e "MSSQL_SA_PASSWORD=OtOpcUaDev_2026!" `
-p 14330:1433 `
-v otopcua-mssql-data:/var/opt/mssql `
-d mcr.microsoft.com/mssql/server:2022-latest
```
6. **Install GLAuth** at `C:\publish\glauth\` per existing CLAUDE.md instructions; populate `glauth-otopcua.cfg` with the test users + groups (template in `docs/v2/dev-environment-glauth-config.md` — to be added in the setup task)
7. **Run `dotnet restore`** in the `lmxopcua` repo
8. **Run `dotnet build ZB.MOM.WW.OtOpcUa.slnx`** (post-Phase-0) or `ZB.MOM.WW.LmxOpcUa.slnx` (pre-Phase-0) — verifies the toolchain
The host port is **14330**, not 1433, to coexist with the native MSSQL14 instance that hosts the Galaxy `ZB` DB on port 1433. Both the native instance and Docker's port-proxy will happily bind `0.0.0.0:1433`, but only one of them catches any given connection — which is effectively non-deterministic and produces confusing "Login failed for user 'sa'" errors when the native instance wins. Using 14330 eliminates the race entirely.
The `-v otopcua-mssql-data:/var/opt/mssql` named volume preserves database files across container restarts and `docker rm` — drop it only if you want a strictly throwaway instance.
Verify:
```powershell
docker ps --filter name=otopcua-mssql
docker exec -it otopcua-mssql /opt/mssql-tools18/bin/sqlcmd -S localhost -U sa -P "OtOpcUaDev_2026!" -C -Q "SELECT @@VERSION"
```
Expected: container `STATUS Up`, `SELECT @@VERSION` returns `Microsoft SQL Server 2022 (...)`.
To stop / start later:
```powershell
docker stop otopcua-mssql
docker start otopcua-mssql
```
7. **Install GLAuth** at `C:\publish\glauth\` per existing CLAUDE.md instructions; populate `glauth-otopcua.cfg` with the test users + groups (template in `docs/v2/dev-environment-glauth-config.md` — to be added in the setup task)
8. **Install EF Core CLI** (used to apply migrations against the SQL Server container starting in Phase 1 Stream B):
```powershell
dotnet tool install --global dotnet-ef --version 10.0.*
```
9. **Run `dotnet restore`** in the `lmxopcua` repo
10. **Run `dotnet build ZB.MOM.WW.OtOpcUa.slnx`** (post-Phase-0) or `ZB.MOM.WW.LmxOpcUa.slnx` (pre-Phase-0) — verifies the toolchain
9. **Run `dotnet test`** with the inner-loop filter — should pass on a fresh machine
## Bootstrap Order — Integration Host
@@ -213,11 +388,22 @@ Seeds are idempotent (re-runnable) and gitignored where they contain credentials
### Step 1 — Inner-loop dev environment (each developer, ~1 day with documentation)
**Owner**: developer
**Prerequisite**: Bootstrap order steps 19 above
**Prerequisite**: Bootstrap order steps 110 above (note: steps 4a, 4b, and any later `wsl --install -d` call require admin elevation / UAC interaction — there is no fully-silent admin-free install path on Windows for Docker Desktop's prerequisites)
**Acceptance**:
- `dotnet test ZB.MOM.WW.OtOpcUa.slnx` passes
- A test that touches the central config DB succeeds (proves SQL Server reachable)
- A test that authenticates against GLAuth succeeds (proves LDAP reachable)
- `docker ps --filter name=otopcua-mssql` shows the SQL Server container `STATUS Up`
### Troubleshooting (common Windows install snags)
- **`wsl --install` says "Windows Subsystem for Linux has no installed distributions"** after first reboot — open a fresh PowerShell and run `wsl --install -d Ubuntu` (the `-d` form forces a distro install if the prereq-only install ran first).
- **Docker Desktop install completes but `docker --version` reports "command not found"** — `PATH` doesn't pick up the new Docker shims until a new shell is opened. Open a fresh PowerShell, or sign out/in, and retry.
- **`docker ps` reports "permission denied" or "Cannot connect to the Docker daemon"** — your user account isn't in the `docker-users` group yet. Sign out and back in (group membership is loaded at login). Verify with `whoami /groups | findstr docker-users`.
- **Docker Desktop refuses to start with "WSL 2 installation is incomplete"** — open the WSL2 kernel update from https://aka.ms/wsl2kernel, install, then restart Docker Desktop. (Modern `wsl --install` ships the kernel automatically; this is mostly a legacy problem.)
- **SQL Server container starts but immediately exits** — SA password complexity. The default `OtOpcUaDev_2026!` meets the requirement (≥8 chars, upper + lower + digit + symbol); if you change it, keep complexity. Check `docker logs otopcua-mssql` for the exact failure.
- **`docker run` fails with "image platform does not match host platform"** — your Docker is configured for Windows containers. Switch to Linux containers in Docker Desktop tray menu ("Switch to Linux containers"), or recheck Settings → General per step 4c.
- **Hyper-V conflict when later setting up TwinCAT XAR VM** — confirm Docker Desktop is on the **WSL 2 backend**, not Hyper-V backend. The two coexist only when Docker uses WSL 2.
### Step 2 — Integration host (one-time, ~1 week)

View File

@@ -0,0 +1,56 @@
# Phase 1 — Entry Gate Record
**Phase**: 1 — Configuration project + Core.Abstractions + Admin scaffold
**Branch**: `phase-1-configuration`
**Date**: 2026-04-17
**Implementation lead**: Claude (executing on behalf of dohertj2)
## Entry conditions
| Check | Required | Actual | Pass |
|-------|----------|--------|------|
| Phase 0 exit gate cleared | Rename complete, all v1 tests pass under OtOpcUa names | Phase 0 merged to `v2` at commit `45ffa3e` | ✅ |
| `v2` branch is clean | Clean | Clean post-merge | ✅ |
| Phase 0 PR merged | — | Merged via `--no-ff` to v2 | ✅ |
| SQL Server 2019+ instance available | For development | NOT YET AVAILABLE — see deviation below | ⚠️ |
| LDAP/GLAuth dev instance available | For Admin auth integration testing | Existing v1 GLAuth at `C:\publish\glauth\` | ✅ |
| ScadaLink CentralUI source accessible | For parity reference | `C:\Users\dohertj2\Desktop\scadalink-design\` per memory | ✅ |
| Phase 1-relevant design docs reviewed | All read by impl lead | ✅ Read in preceding sessions | ✅ |
| Decisions read | #1142 covered cumulatively | ✅ | ✅ |
## Deviation: SQL Server dev instance not yet stood up
The Phase 1 entry gate requires a SQL Server 2019+ dev instance for the `Configuration` project's EF Core migrations + tests. This is per `dev-environment.md` Step 1, which is currently TODO.
**Decision**: proceed with **Stream A only** (Core.Abstractions) in this continuation. Stream A has zero infrastructure dependencies — it's a `.NET 10` project with BCL-only references defining capability interfaces and DTOs. Streams B (Configuration), C (Core), D (Server), and E (Admin) all have infrastructure dependencies (SQL Server, GLAuth, Galaxy) and require the dev environment standup to be productive.
The SQL Server standup is a one-line `docker run` per `dev-environment.md` §"Bootstrap Order — Inner-loop Developer Machine" step 5. It can happen in parallel with subsequent Stream A work but is not a blocker for Stream A itself.
**This continuation will execute only Stream A.** Streams BE require their own continuations after the dev environment is stood up.
## Phase 1 work scope (for reference)
Per `phase-1-configuration-and-admin-scaffold.md`:
| Stream | Scope | Status this continuation |
|--------|-------|--------------------------|
| **A. Core.Abstractions** | 11 capability interfaces + DTOs + DriverTypeRegistry | ▶ EXECUTING |
| B. Configuration | EF Core schema, stored procs, LiteDB cache, generation-diff applier | DEFERRED — needs SQL Server |
| C. Core | `LmxNodeManager → GenericDriverNodeManager` rename, `IAddressSpaceBuilder`, driver hosting | DEFERRED — depends on Stream A + needs Galaxy |
| D. Server | `Microsoft.Extensions.Hosting` host, credential-bound bootstrap | DEFERRED — depends on Stream B |
| E. Admin | Blazor Server scaffold mirroring ScadaLink | DEFERRED — depends on Stream B |
## Baseline metrics (carried from Phase 0 exit)
- **Total tests**: 822 (pass + fail)
- **Pass count**: 821 (improved from baseline 820 — one flaky test happened to pass at Phase 0 exit)
- **Fail count**: 1 (the second pre-existing failure may flap; either 1 or 2 failures is consistent with baseline)
- **Build warnings**: 30 (lower than original baseline 167)
- **Build errors**: 0
Phase 1 must not introduce new failures or new errors against this baseline.
## Signoff
Implementation lead: Claude (Opus 4.7) — 2026-04-17
Reviewer: pending — Stream A PR will require a second reviewer per overview.md exit-gate rules

View File

@@ -0,0 +1,123 @@
# Phase 2 Final Exit Gate (2026-04-18)
> Supersedes `phase-2-partial-exit-evidence.md` and `exit-gate-phase-2.md`. Captures the
> as-built state at the close of Phase 2 work delivered across two PRs.
## Status: **All five Phase 2 streams addressed. Stream D split across PR 2 (archive) + PR 3 (delete) per safety protocol.**
## Stream-by-stream status
| Stream | Plan §reference | Status | PR |
|---|---|---|---|
| A — Driver.Galaxy.Shared | §A.1A.3 | ✅ Complete | PR 1 (merged or pending) |
| B — Driver.Galaxy.Host | §B.1B.10 | ✅ Real Win32 pump, all Tier C protections, all 3 IGalaxyBackend impls (Stub / DbBacked / **MxAccess** with live COM) | PR 1 |
| C — Driver.Galaxy.Proxy | §C.1C.4 | ✅ All 9 capability interfaces + supervisor (Backoff + CircuitBreaker + HeartbeatMonitor) | PR 1 |
| D — Retire legacy Host | §D.1D.3 | ✅ Migration script, installer scripts, Stream D procedure doc, **archive markings on all v1 surface (this PR 2)**, deletion deferred to PR 3 | PR 2 (this) + PR 3 (next) |
| E — Parity validation | §E.1E.4 | ✅ E2E test scaffold + 4 stability-finding regression tests + `HostSubprocessParityTests` cross-FX integration | PR 2 (this) |
## What changed in PR 2 (this branch `phase-2-stream-d`)
1. **`tests/ZB.MOM.WW.OtOpcUa.Tests/`** renamed to `tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/`,
`<AssemblyName>` kept as `ZB.MOM.WW.OtOpcUa.Tests` so the v1 Host's `InternalsVisibleTo`
still matches, `<IsTestProject>false</IsTestProject>` so `dotnet test slnx` excludes it.
2. **Three other v1 projects archive-marked** with PropertyGroup comments:
`OtOpcUa.Host`, `Historian.Aveva`, `IntegrationTests`. `IntegrationTests` also gets
`<IsTestProject>false</IsTestProject>`.
3. **New `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/`** project (.NET 10):
- `ParityFixture` spawns `OtOpcUa.Driver.Galaxy.Host.exe` (net48 x86) as subprocess via
`Process.Start`, connects via real named pipe, exposes a connected `GalaxyProxyDriver`.
Skips when Galaxy ZB unreachable, when Host EXE not built, or when running as
Administrator (PipeAcl denies admins).
- `RecordingAddressSpaceBuilder` captures Folder + Variable + Property registrations so
parity tests can assert shape.
- `HierarchyParityTests` (3) — Discover returns gobjects with attributes;
attribute full references match `tag.attribute` shape; HistoryExtension flag flows
through.
- `StabilityFindingsRegressionTests` (4) — one test per 2026-04-13 finding:
phantom-probe-doesn't-corrupt-status, host-status-event-is-scoped, all-async-no-sync-
over-async, AcknowledgeAsync-completes-before-returning.
4. **`docs/v2/V1_ARCHIVE_STATUS.md`** — inventory + deletion plan for PR 3.
5. **`docs/v2/implementation/exit-gate-phase-2-final.md`** (this doc) — supersedes the two
partial-exit docs.
## Test counts
**Solution-level `dotnet test ZB.MOM.WW.OtOpcUa.slnx`**: **470 pass / 7 skip / 1 baseline failure**.
| Project | Pass | Skip |
|---|---:|---:|
| Core.Abstractions.Tests | 24 | 0 |
| Configuration.Tests | 42 | 0 |
| Core.Tests | 4 | 0 |
| Server.Tests | 2 | 0 |
| Admin.Tests | 21 | 0 |
| Driver.Galaxy.Shared.Tests | 6 | 0 |
| Driver.Galaxy.Host.Tests | 30 | 0 |
| Driver.Galaxy.Proxy.Tests | 10 | 0 |
| **Driver.Galaxy.E2E (NEW)** | **0** | **7** (all skip with documented reason — admin shell) |
| Client.Shared.Tests | 131 | 0 |
| Client.UI.Tests | 98 | 0 |
| Client.CLI.Tests | 51 / 1 fail | 0 |
| Historian.Aveva.Tests | 41 | 0 |
**Excluded from solution run (run explicitly when needed)**:
- `OtOpcUa.Tests.v1Archive` — 494 pass (v1 unit tests, kept as parity reference)
- `OtOpcUa.IntegrationTests` — 6 pass (v1 integration tests, kept as parity reference)
## Adversarial review of the PR 2 diff
Independent pass over the PR 2 deltas. New findings ranked by severity; existing findings
from the previous exit-gate doc still apply.
### New findings
**Medium 1 — `IsTestProject=false` on `OtOpcUa.IntegrationTests` removes the safety net.**
The 6 v1 integration tests no longer run on solution test. *Mitigation:* the new E2E suite
covers the same scenarios in the v2 topology shape. *Risk:* if E2E test count regresses or
fails to cover a scenario, the v1 fallback isn't auto-checked. **Procedure**: PR 3
checklist includes "E2E test count covers v1 IntegrationTests' 6 scenarios at minimum".
**Medium 2 — Stability-finding regression tests #2, #3, #4 are structural (reflection-based)
not behavioral.** Findings #2 and #3 use type-shape assertions (event signature carries
HostName; methods return Task) rather than triggering the actual race. *Mitigation:* the v1
defects were structural — fixing them required interface changes that the type-shape
assertions catch. *Risk:* a future refactor that re-introduces sync-over-async via a non-
async helper called inside a Task method wouldn't trip the test. **Filed as v2.1**: add a
runtime async-call-stack analyzer (Roslyn or post-build).
**Low 1 — `ParityFixture` defaults to `OTOPCUA_GALAXY_BACKEND=db`** (not `mxaccess`).
Discover works against ZB without needing live MXAccess. The MXAccess-required tests will
need a second fixture once they're written.
**Low 2 — `Process.Start(EnvironmentVariables)` doesn't always inherit clean state.** The
test inherits the parent's PATH + locale, which is normally fine but could mask a missing
runtime dependency. *Mitigation:* in CI, pin a clean environment block.
### Existing findings (carried forward from `exit-gate-phase-2.md`)
All 8 still apply unchanged. Particularly:
- High 1 (MxAccess Read subscription-leak on cancellation) — open
- High 2 (no MXAccess reconnect loop, only supervisor-driven recycle) — open
- Medium 3 (SubscribeAsync doesn't push OnDataChange frames yet) — open
- Medium 4 (WriteValuesAsync doesn't await OnWriteComplete) — open
## Cross-cutting deferrals (out of Phase 2)
- **Deletion of v1 archive** — PR 3, gated on operator review + E2E coverage parity check
- **Wonderware Historian SDK plugin port** (`Historian.Aveva``Driver.Galaxy.Host/Backend/Historian/`) — Task B.1.h, opportunistically with PR 3 or as PR 4
- **MxAccess subscription push frames** — Task B.1.s, follow-up to enable real-time data
flow (currently subscribes register but values aren't pushed back)
- **Wonderware Historian-backed HistoryRead** — depends on B.1.h
- **Alarm subsystem wire-up** — `MxAccessGalaxyBackend.SubscribeAlarmsAsync` is a no-op
- **Reconnect-without-recycle** in MxAccessClient — v2.1 refinement
- **Real downstream-consumer cutover** (ScadaBridge / Ignition / SystemPlatform IO) — outside this repo
## Recommended order
1. **PR 1** (`phase-1-configuration``v2`) — merge first; self-contained, parity preserved
2. **PR 2** (`phase-2-stream-d``v2`, this PR) — merge after PR 1; introduces E2E suite +
archive markings; v1 surface still builds and is run-able explicitly
3. **PR 3** (next session) — delete v1 archive; depends on operator approval after PR 2
reviewer signoff
4. **PR 4** (Phase 2 follow-up) — Historian port + MxAccess subscription push frames + the
open high/medium findings

View File

@@ -0,0 +1,181 @@
# Phase 2 Exit Gate Record (2026-04-18)
> Supersedes `phase-2-partial-exit-evidence.md`. Captures the as-built state of Phase 2 after
> the MXAccess COM client port + DB-backed and MXAccess-backed Galaxy backends + adversarial
> review.
## Status: **Streams A, B, C complete. Stream D + E gated only on legacy-Host removal + parity-test rewrite.**
The Phase 2 plan exit criterion ("v1 IntegrationTests pass against v2 Galaxy.Proxy + Galaxy.Host
topology byte-for-byte") still cannot be auto-validated in a single session. The blocker is no
longer "the Galaxy code lift" — that's done in this session — but the structural fact that the
494 v1 IntegrationTests instantiate v1 `OtOpcUa.Host` classes directly. They have to be rewritten
to use the IPC-fronted Proxy topology before legacy `OtOpcUa.Host` can be deleted, and the plan
budgets that work as a multi-day debug-cycle (Task E.1).
What changed today: the MXAccess COM client now exists in Galaxy.Host with a real
`ArchestrA.MxAccess.dll` reference, runs end-to-end against live `LMXProxyServer`, and 3 live
COM smoke tests pass on this dev box. `MxAccessGalaxyBackend` (the third
`IGalaxyBackend` implementation, alongside `StubGalaxyBackend` and `DbBackedGalaxyBackend`)
combines the ported `GalaxyRepository` with the ported `MxAccessClient` so Discover / Read /
Write / Subscribe all flow through one production-shape backend. `Program.cs` selects between
the three backends via the `OTOPCUA_GALAXY_BACKEND` env var (default = `mxaccess`).
## Delivered in Phase 2 (full scope, not just scaffolds)
### Stream A — Driver.Galaxy.Shared (✅ complete)
- 9 contract files: Hello/HelloAck (version negotiation), OpenSession/CloseSession/Heartbeat,
Discover + GalaxyObjectInfo + GalaxyAttributeInfo, Read/Write + GalaxyDataValue,
Subscribe/Unsubscribe/OnDataChange, AlarmSubscribe/Event/Ack, HistoryRead, HostConnectivityStatus,
Recycle.
- Length-prefixed framing (4-byte BE length + 1-byte kind + MessagePack body) with a
16 MiB cap.
- Thread-safe `FrameWriter` (semaphore-gated) and single-consumer `FrameReader`.
- 6 round-trip tests + reflection-scan that asserts contracts only reference BCL + MessagePack.
### Stream B — Driver.Galaxy.Host (✅ complete, exceeded original scope)
- Real Win32 message pump in `StaPump``GetMessage`/`PostThreadMessage`/`PeekMessage`/
`PostQuitMessage` P/Invoke, dedicated STA thread, `WM_APP=0x8000` work dispatch, `WM_APP+1`
graceful-drain → `PostQuitMessage`, 5s join-on-dispose, responsiveness probe.
- Strict `PipeAcl` (allow configured server SID only, deny LocalSystem + Administrators),
`PipeServer` with caller-SID verification + per-process shared-secret `Hello` handshake.
- Galaxy-specific `MemoryWatchdog` (warn `max(1.5×baseline, +200 MB)`, soft-recycle
`max(2×baseline, +200 MB)`, hard ceiling 1.5 GB, slope ≥5 MB/min over 30-min window).
- `RecyclePolicy` (1/hr cap + 03:00 daily scheduled), `PostMortemMmf` (1000-entry ring
buffer, hard-crash survivable, cross-process readable), `MxAccessHandle : SafeHandle`.
- `IGalaxyBackend` interface + 3 implementations:
- **`StubGalaxyBackend`** — keeps IPC end-to-end testable without Galaxy.
- **`DbBackedGalaxyBackend`** — real Discover via the ported `GalaxyRepository` against ZB.
- **`MxAccessGalaxyBackend`** — Discover via DB + Read/Write/Subscribe via the ported
`MxAccessClient` over the StaPump.
- `GalaxyRepository` ported from v1 (HierarchySql + AttributesSql byte-for-byte identical).
- `MxAccessClient` ported from v1 (Connect/Read/Write/Subscribe/Unsubscribe + ConcurrentDict
handle tracking + OnDataChange / OnWriteComplete event marshalling). The reconnect loop +
Historian plugin loader + extended-attribute query are explicit follow-ups.
- `MxProxyAdapter` + `IMxProxy` for COM-isolation testability.
- `Program.cs` env-driven backend selection (`OTOPCUA_GALAXY_BACKEND=stub|db|mxaccess`,
`OTOPCUA_GALAXY_ZB_CONN`, `OTOPCUA_GALAXY_CLIENT_NAME`, plus the Phase 2 baseline
`OTOPCUA_GALAXY_PIPE` / `OTOPCUA_ALLOWED_SID` / `OTOPCUA_GALAXY_SECRET`).
- ArchestrA.MxAccess.dll referenced via HintPath at `lib/ArchestrA.MxAccess.dll`. Project
flipped to **x86 platform target** (the COM interop requires it).
### Stream C — Driver.Galaxy.Proxy (✅ complete)
- `GalaxyProxyDriver` implements **all 9** capability interfaces — `IDriver`, `ITagDiscovery`,
`IReadable`, `IWritable`, `ISubscribable`, `IAlarmSource`, `IHistoryProvider`,
`IRediscoverable`, `IHostConnectivityProbe` — each forwarding through the matching IPC
contract.
- `GalaxyIpcClient` with `CallAsync` (request/response gated through a semaphore so concurrent
callers don't interleave frames) + `SendOneWayAsync` for fire-and-forget calls
(Unsubscribe / AlarmAck / CloseSession).
- `Backoff` (5s → 15s → 60s, capped, reset-on-stable-run), `CircuitBreaker` (3 crashes per
5 min opens; 1h → 4h → manual escalation; sticky alert), `HeartbeatMonitor` (2s cadence,
3 misses = host dead).
### Tests
- **963 pass / 1 pre-existing baseline** across the full solution.
- New in this session:
- `StaPumpTests` — pump still passes 3/3 against the real Win32 implementation
- `EndToEndIpcTests` (5) — every IPC operation through Pipe + dispatcher + StubBackend
- `IpcHandshakeIntegrationTests` (2) — Hello + heartbeat + secret rejection
- `GalaxyRepositoryLiveSmokeTests` (5) — live SQL against ZB, skip when ZB unreachable
- `MxAccessLiveSmokeTests` (3) — live COM against running `aaBootstrap` + `LMXProxyServer`
- All net48 x86 to match Galaxy.Host
## Adversarial review findings
Independent pass over the Phase 2 deltas. Findings ranked by severity; **all open items are
explicitly deferred to Stream D/E or v2.1 with rationale.**
### Critical — none.
### High
1. **MxAccess `ReadAsync` has a subscription-leak window on cancellation.** The one-shot read
uses subscribe → first-OnDataChange → unsubscribe. If the caller cancels between the
`SubscribeOnPumpAsync` await and the `tcs.Task` await, the subscription stays installed.
*Mitigation:* the StaPump's idempotent unsubscribe path drops orphan subs at disconnect, but
a long-running session leaks them. **Fix scoped to Phase 2 follow-up** alongside the proper
subscription registry that v1 had.
2. **No reconnect loop on the MXAccess COM connection.** v1's `MxAccessClient.Monitor` polled
a probe tag and triggered reconnect-with-replay on disconnection. The ported client's
`ConnectAsync` is one-shot and there's no health monitor. *Mitigation:* the Tier C
supervisor on the Proxy side (CircuitBreaker + HeartbeatMonitor) restarts the whole Host
process on liveness failure, so connection loss surfaces as a process recycle rather than
silent data loss. **Reconnect-without-recycle is a v2.1 refinement** per `driver-stability.md`.
### Medium
3. **`MxAccessGalaxyBackend.SubscribeAsync` doesn't push OnDataChange frames back to the
Proxy.** The wire frame `MessageKind.OnDataChangeNotification` is defined and `GalaxyProxyDriver`
has the `RaiseDataChange` internal entry point, but the Host-side push pipeline isn't wired —
the subscribe registers on the COM side but the value just gets discarded. *Mitigation:* the
SubscribeAsync handle is still useful for the ack flow, and one-shot reads work. **Push
plumbing is the next-session item.**
4. **`WriteValuesAsync` doesn't await the OnWriteComplete callback.** v1's implementation
awaited a TCS keyed on the item handle; the port fires the write and returns success without
confirming the runtime accepted it. *Mitigation:* the StatusCode in the response will be 0
(Good) for a fire-and-forget — false positive if the runtime rejects post-callback. **Fix
needs the same TCS-by-handle pattern as v1; queued.**
5. **`MxAccessGalaxyBackend.Discover` re-queries SQL on every call.** v1 cached the tree and
only refreshed on the deploy-watermark change. *Mitigation:* AttributesSql is the slow one
(~30s for a large Galaxy); first-call latency is the symptom, not data loss. **Caching +
`IRediscoverable` push is a v2.1 follow-up.**
### Low
6. **Live MXAccess test `Backend_ReadValues_against_discovered_attribute_returns_a_response_shape`
silently passes if no readable attribute is found.** Documented; the test asserts the *shape*
not the *value* because some Galaxy installs are configuration-only.
7. **`FrameWriter` allocates the length-prefix as a 4-byte heap array per call.** Could be
stackalloc. Microbenchmark not done — currently irrelevant.
8. **`MxProxyAdapter.Unregister` swallows exceptions during `Unregister(handle)`.** v1 did the
same; documented as best-effort during teardown. Consider logging the swallow.
### Out of scope (correctly deferred)
- Stream D.1 — delete legacy `OtOpcUa.Host`. **Cannot be done in any single session** because
the 494 v1 IntegrationTests reference Host classes directly. Requires the test rewrite cycle
in Stream E.
- Stream E.1 — run v1 IntegrationTests against v2 topology. Requires (a) test rewrite to use
Proxy/Host instead of in-process Host classes, then (b) the parity-debug iteration that the
plan budgets 3-4 weeks for.
- Stream E.2 — Client.CLI walkthrough diff. Requires the v1 baseline capture.
- Stream E.3 — four 2026-04-13 stability findings regression tests. Requires the parity test
harness from Stream E.1.
- Wonderware Historian SDK plugin loader (Task B.1.h). HistoryRead returns a recognisable
error until the plugin loader is wired.
- Alarm subsystem wire-up (`MxAccessGalaxyBackend.SubscribeAlarmsAsync` is a no-op today).
v1's alarm tracking is its own subtree; queued as Phase 2 follow-up.
## Stream-D removal checklist (next session)
1. Decide policy on the 494 v1 tests:
- **Option A**: rewrite to use `Driver.Galaxy.Proxy` + `Driver.Galaxy.Host` topology
(multi-day; full parity validation as a side effect)
- **Option B**: archive them as `OtOpcUa.Tests.v1Archive` and write a smaller v2 parity suite
against the new topology (faster; less coverage initially)
2. Execute the chosen option.
3. Delete `src/ZB.MOM.WW.OtOpcUa.Host/`, remove from `.slnx`.
4. Update Windows service installer to register two services
(`OtOpcUa` + `OtOpcUaGalaxyHost`) with the correct service-account SIDs.
5. Migration script for `appsettings.json` Galaxy sections → `DriverInstance.DriverConfig` JSON.
6. PR + adversarial review + `exit-gate-phase-2-final.md`.
## What ships from this session
Eight commits on `phase-1-configuration` since the previous push:
- `01fd90c` Phase 1 finish + Phase 2 scaffold
- `7a5b535` Admin UI core
- `18f93d7` LDAP + SignalR
- `a1e9ed4` AVEVA-stack inventory doc
- `32eeeb9` Phase 2 A+B+C feature-complete
- `549cd36` GalaxyRepository ported + DbBackedBackend + live ZB smoke
- `(this commit)` MXAccess COM port + MxAccessGalaxyBackend + live MXAccess smoke + adversarial review
`494/494` v1 tests still pass. No regressions.

View File

@@ -0,0 +1,209 @@
# Phase 2 — Partial Exit Evidence (2026-04-17)
> This records what Phase 2 of v2 completed in the current session and what was explicitly
> deferred. See `phase-2-galaxy-out-of-process.md` for the full task plan; this is the as-built
> delta.
## Status: **Streams A + B + C complete (real Win32 pump, all 9 capability interfaces, end-to-end IPC dispatch). Streams D + E remain — gated only on the iterative Galaxy code lift + parity-debug cycle.**
The goal per the plan is "parity, not regression" — the phase exit gate requires v1
IntegrationTests to pass against the v2 Galaxy.Proxy + Galaxy.Host topology byte-for-byte.
Achieving that requires live MXAccess runtime plus the Galaxy code lift out of the legacy
`OtOpcUa.Host`. Without that cycle, deleting the legacy Host would break the 494 passing v1
tests that are the parity baseline.
> **Update 2026-04-17 (later) — Streams A/B/C now feature-complete, not just scaffolds.**
> The Win32 message pump in `StaPump` was upgraded from a `BlockingCollection` placeholder to a
> real `GetMessage`/`PostThreadMessage`/`PeekMessage` loop lifted from v1 `StaComThread` (P/Invoke
> declarations included; `WM_APP=0x8000` for work-item dispatch, `WM_APP+1` for graceful
> drain → `PostQuitMessage`, 5s join-on-dispose). `GalaxyProxyDriver` now implements every
> capability interface declared in Phase 2 Stream C — `IDriver`, `ITagDiscovery`, `IReadable`,
> `IWritable`, `ISubscribable`, `IAlarmSource`, `IHistoryProvider`, `IRediscoverable`,
> `IHostConnectivityProbe` — each forwarding through the matching IPC contract. `GalaxyIpcClient`
> gained `SendOneWayAsync` for the fire-and-forget calls (unsubscribe / alarm-ack /
> close-session) while still serializing through the call-gate so writes don't interleave with
> `CallAsync` round-trips. Host side: `IGalaxyBackend` interface defines the seam between IPC
> dispatch and the live MXAccess code, `GalaxyFrameHandler` routes every `MessageKind` into it
> (heartbeat handled inline so liveness works regardless of backend health), and
> `StubGalaxyBackend` returns success for lifecycle/subscribe/recycle and recognizable
> `not-implemented`-coded errors for data-plane calls. End-to-end integration tests exercise
> every capability through the full stack (handshake → open session → read / write / subscribe /
> alarm / history / recycle) and the v1 test baseline stays green (494 pass, no regressions).
>
> **What's left for the Phase 2 exit gate:** the actual Galaxy code lift (Task B.1) — replace
> `StubGalaxyBackend` with a `MxAccessClient`-backed implementation that calls `MxAccessClient`
> on the `StaPump`, plus the parity-cycle debugging against live Galaxy that the plan budgets
> 3-4 weeks for. Removing the legacy `OtOpcUa.Host` (Task D.1) follows once the parity tests
> are green against the v2 topology.
> **Update 2026-04-17 — runtime confirmed local.** The dev box has the full AVEVA stack required
> for the LmxOpcUa breakout: 27 ArchestrA / Wonderware / AVEVA services running including
> `aaBootstrap`, `aaGR` (Galaxy Repository), `aaLogger`, `aaUserValidator`, `aaPim`,
> `ArchestrADataStore`, `AsbServiceManager`; the full Historian set
> (`aahClientAccessPoint`, `aahGateway`, `aahInSight`, `aahSearchIndexer`, `InSQLStorage`,
> `InSQLConfiguration`, `InSQLEventSystem`, `InSQLIndexing`, `InSQLIOServer`,
> `HistorianSearch-x64`); SuiteLink (`slssvc`); MXAccess COM at
> `C:\Program Files (x86)\ArchestrA\Framework\bin\ArchestrA.MXAccess.dll`; and the OI-Gateway
> install at `C:\Program Files (x86)\Wonderware\OI-Server\OI-Gateway\` (so the
> AppServer-via-OI-Gateway smoke test from decision #142 is *also* runnable here, not blocked
> on a dedicated AVEVA test box).
>
> The "needs a dev Galaxy" prerequisite is therefore satisfied. Stream D + E can start whenever
> the team is ready to take the parity-cycle hit on the 494 v1 tests; no environmental blocker
> remains.
What *is* done: all scaffolding, IPC contracts, supervisor logic, and stability protections
needed to hang the real MXAccess code onto. Every piece has unit-level or IPC-level test
coverage.
## Delivered
### Stream A — `Driver.Galaxy.Shared` (1 week estimate, **complete**)
- `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared/` (.NET Standard 2.0, MessagePack-only
dependency)
- **Contracts**: `Hello`/`HelloAck` (version negotiation per Task A.3), `OpenSessionRequest`/
`OpenSessionResponse`/`CloseSessionRequest`, `Heartbeat`/`HeartbeatAck`, `ErrorResponse`,
`DiscoverHierarchyRequest`/`Response` + `GalaxyObjectInfo` + `GalaxyAttributeInfo`,
`ReadValuesRequest`/`Response`, `WriteValuesRequest`/`Response`, `SubscribeRequest`/
`Response`/`UnsubscribeRequest`/`OnDataChangeNotification`, `AlarmSubscribeRequest`/
`GalaxyAlarmEvent`/`AlarmAckRequest`, `HistoryReadRequest`/`Response`+`HistoryTagValues`,
`HostConnectivityStatus`+`RuntimeStatusChangeNotification`, `RecycleHostRequest`/
`RecycleStatusResponse`
- **Framing**: length-prefixed (decision #28) + 1-byte kind tag + MessagePack body. 16 MiB
body cap. `FrameWriter`/`FrameReader` with thread-safe write gate.
- **Tests (6)**: reflection-scan round-trip for every `[MessagePackObject]`, referenced-
assemblies guard (only MessagePack allowed outside BCL), Hello version defaults,
`FrameWriter``FrameReader` interop, oversize-frame rejection.
### Stream B — `Driver.Galaxy.Host` (34 week estimate, **scaffold complete; MXAccess lift deferred**)
- `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/` (.NET Framework 4.8 AnyCPU — flips to x86 when
the Galaxy code lift happens per Task B.1 scope)
- **`Ipc/PipeAcl`**: builds the strict `PipeSecurity` — allow configured server-principal SID,
explicit deny on LocalSystem + Administrators, owner = allowed SID (decision #76).
- **`Ipc/PipeServer`**: named-pipe server that (1) enforces the ACL, (2) verifies caller SID
via `pipe.RunAsClient` + `WindowsIdentity.GetCurrent`, (3) requires the per-process shared
secret in the Hello frame before any other RPC, (4) rejects major-version mismatches.
- **`Stability/MemoryWatchdog`**: Galaxy thresholds — warn at `max(1.5×baseline, +200 MB)`,
soft-recycle at `max(2×baseline, +200 MB)`, hard ceiling 1.5 GB, slope ≥5 MB/min over 30 min.
Pluggable RSS source for unit testability.
- **`Stability/RecyclePolicy`**: 1-recycle/hr cap; 03:00 local daily scheduled recycle.
- **`Stability/PostMortemMmf`**: ring buffer of 1000 × 256-byte entries in `%ProgramData%\
OtOpcUa\driver-postmortem\galaxy.mmf`. Single-writer / multi-reader. Survives hard crash;
supervisor reads the MMF via a second process.
- **`Sta/MxAccessHandle`**: `SafeHandle` subclass — `ReleaseHandle` calls `Marshal.ReleaseComObject`
in a loop until refcount = 0 then invokes the optional `unregister` callback. Finalizer-safe.
Wraps any RCW via `object` so we can unit-test against a mock; the real wiring to
`ArchestrA.MxAccess.LMXProxyServer` lands with the deferred code move.
- **`Sta/StaPump`**: dedicated STA thread with `BlockingCollection` work queue + `InvokeAsync`
dispatch. Responsiveness probe (`IsResponsiveAsync`) returns false on wedge. The real
Win32 `GetMessage/DispatchMessage` pump from v1 `LmxProxy.Host` slots in here with the same
dispatch semantics.
- **`IsExternalInit` shim**: required for `init` setters on .NET 4.8.
- **`Program.cs`**: reads `OTOPCUA_GALAXY_PIPE`, `OTOPCUA_ALLOWED_SID`, `OTOPCUA_GALAXY_SECRET`
from env (supervisor sets at spawn), runs the pipe server, logs via Serilog to
`%ProgramData%\OtOpcUa\galaxy-host-YYYY-MM-DD.log`.
- **`Ipc/StubFrameHandler`**: placeholder that heartbeat-acks and returns `not-implemented`
errors. Swapped for the real Galaxy-backed handler when the MXAccess code move completes.
- **Tests (15)**: `MemoryWatchdog` thresholds + slope detection; `RecyclePolicy` cap + daily
schedule; `PostMortemMmf` round-trip + ring-wrap + truncation-safety; `StaPump`
apartment-state + responsiveness-probe wedge detection.
### Stream C — `Driver.Galaxy.Proxy` (1.5 week estimate, **complete as IPC-forwarder**)
- `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy/` (.NET 10)
- **`Ipc/GalaxyIpcClient`**: Hello handshake + shared-secret authentication + single-call
request/response over the data-plane pipe. Serializes concurrent callers via
`SemaphoreSlim`. Lifts `ErrorResponse` to `GalaxyIpcException` with the error code.
- **`GalaxyProxyDriver`**: implements `IDriver` + `ITagDiscovery`. Forwards lifecycle and
discovery over IPC; maps Galaxy MX data types → `DriverDataType` and security classifications
→ `SecurityClassification`. Stream C-plan capability interfaces for `IReadable`, `IWritable`,
`ISubscribable`, `IAlarmSource`, `IHistoryProvider`, `IHostConnectivityProbe`,
`IRediscoverable` are structured identically — wire them in when the Host's MXAccess backend
exists so the round-trips can actually serve data.
- **`Supervisor/Backoff`**: 5s → 15s → 60s capped; `RecordStableRun` resets after 2-min
successful run.
- **`Supervisor/CircuitBreaker`**: 3 crashes per 5 min opens; cooldown escalates
1h → 4h → manual (`TimeSpan.MaxValue`). Sticky alert doesn't auto-clear when cooldown
elapses; `ManualReset` only.
- **`Supervisor/HeartbeatMonitor`**: 2s cadence, 3 consecutive misses = host dead.
- **Tests (11)**: `Backoff` sequence + reset; `CircuitBreaker` full 1h/4h/manual escalation
path; `HeartbeatMonitor` miss-count + ack-reset; full IPC handshake round-trip
(Host + Proxy over a real named pipe, heartbeat ack verified; shared-secret mismatch
rejected with `UnauthorizedAccessException`).
## Deferred (explicitly noted as TODO)
### Stream D — Retire legacy `OtOpcUa.Host`
**Not executable until Stream E parity passes.** Deleting the legacy project now would break
the 494 v1 IntegrationTests that are the parity baseline. Recovery requires:
1. Host MXAccess code lift (Task B.1 "move Galaxy code") from `OtOpcUa.Host/` into
`OtOpcUa.Driver.Galaxy.Host/` — STA pump wiring, `MxAccessHandle` backing the real
`LMXProxyServer`, `GalaxyRepository` and its SQL queries, `GalaxyRuntimeProbeManager`,
Historian loader, the Ipc stub handler replaced with a real `IFrameHandler` that invokes
the handle.
2. Address-space build via `IAddressSpaceBuilder` produces byte-equivalent OPC UA browse
output to v1 (Task C.4).
3. Windows service installer registers two services (`OtOpcUa` + `OtOpcUaGalaxyHost`) with
the correct service-account SIDs and per-process secret provisioning. Galaxy.Host starts
before OtOpcUa.
4. `appsettings.json` Galaxy config (MxAccess / Galaxy / Historian sections) migrated into
`DriverInstance.DriverConfig` JSON in the Configuration DB via an idempotent migration
script. Post-migration, the local `appsettings.json` keeps only `Cluster.NodeId`,
`ClusterId`, and the DB conn string per decision #18.
### Stream E — Parity validation
Requires live MXAccess + Galaxy runtime and the above lift complete. Work items:
- Run v1 IntegrationTests against the v2 Galaxy.Proxy + Galaxy.Host topology. Pass count =
v1 baseline; failures = 0. Per-test duration regression report flags any test >2× baseline.
- Scripted Client.CLI walkthrough recorded at Phase 2 entry gate against v1, replayed
against v2; diff must show only timestamp/latency differences.
- Regression tests for the four 2026-04-13 stability findings (phantom probe, cross-host
quality clear, sync-over-async guard, fire-and-forget alarm drain).
- `/codex:adversarial-review --base v2` on the merged Phase 2 diff — findings closed or
deferred with rationale.
## Also deferred from Stream B
- **Task B.10 FaultShim** (test-only `ArchestrA.MxAccess` substitute for fault injection).
Needs the production `ArchestrA.MxAccess` reference in place first; flagged as part of the
plan's "mid-gate review" fallback (Risk row 7).
- **Task B.8 WM_QUIT hard-exit escalation** — wired in when the real Win32 pump replaces the
`BlockingCollection` dispatcher. The `StaPump.IsResponsiveAsync` probe already exists; the
supervisor escalation-to-`Environment.Exit(2)` belongs to the Program main loop after the
pump integration.
## Cross-session impact on the build
- **Full solution**: 926 tests pass, 1 fails (pre-existing Phase 0 baseline
`Client.CLI.Tests.SubscribeCommandTests.Execute_PrintsSubscriptionMessage` — not a Phase 2
regression; was red before Phase 1 and stays red through Phase 2).
- **New projects added to `.slnx`**: `Driver.Galaxy.Shared`, `Driver.Galaxy.Host`,
`Driver.Galaxy.Proxy`, plus the three matching test projects.
- **No existing tests broke.** The 494 v1 `OtOpcUa.Tests` (net48) and 6 `IntegrationTests`
(net48) still pass because the legacy `OtOpcUa.Host` is untouched.
## Next-session checklist for Stream D + E
1. Verify the local AVEVA stack is still green (`Get-Service aaGR, aaBootstrap, slssvc` →
Running) and the Galaxy `ZB` repository is reachable from `sqlcmd -S localhost -d ZB -E`.
The runtime is already on this machine — no install step needed.
2. Capture Client.CLI walkthrough baseline against v1 (the parity reference).
3. Move Galaxy-specific files from `OtOpcUa.Host` into `Driver.Galaxy.Host`, renaming
namespaces. Replace `StubFrameHandler` with the real one.
4. Wire up the real Win32 pump inside `StaPump` (lift from scadalink-design's
`LmxProxy.Host` reference per CLAUDE.md).
5. Run v1 IntegrationTests against the v2 topology — iterate on parity defects until green.
6. Run Client.CLI walkthrough and diff.
7. Regression tests for the four 2026-04-13 stability findings.
8. Delete legacy `OtOpcUa.Host`; update `.slnx`; update installer scripts.
9. Optional but valuable now that the runtime is local: AppServer-via-OI-Gateway smoke test
(decision #142 / Phase 1 Task E.10) — the OI-Gateway install at
`C:\Program Files (x86)\Wonderware\OI-Server\OI-Gateway\` is in place; the test was deferred
for "needs live AVEVA runtime" reasons that no longer apply on this dev box.
10. Adversarial review; `exit-gate-phase-2.md` recorded; PR merged.

View File

@@ -0,0 +1,80 @@
# PR 1 — Phase 1 + Phase 2 A/B/C → v2
**Source**: `phase-1-configuration` (commits `980ea51..7403b92`, 11 commits)
**Target**: `v2`
**URL**: https://gitea.dohertylan.com/dohertj2/lmxopcua/pulls/new/phase-1-configuration
## Summary
- **Phase 1 complete** — Configuration project with 16 entities + 3 EF migrations
(InitialSchema + 8 stored procs + AuthorizationGrants), Core + Server + full Admin UI
(Blazor Server with cluster CRUD, draft → diff → publish → rollback, equipment with
OPC 40010, UNS, namespaces, drivers, ACLs, reservations, audit), LDAP via GLAuth
(`localhost:3893`), SignalR real-time fleet status + alerts.
- **Phase 2 Streams A + B + C feature-complete** — full IPC contract surface
(Galaxy.Shared, netstandard2.0, MessagePack), Galaxy.Host with real Win32 STA pump,
ACL + caller-SID + per-process-secret IPC, Galaxy-specific MemoryWatchdog +
RecyclePolicy + PostMortemMmf + MxAccessHandle, three `IGalaxyBackend`
implementations (Stub / DbBacked / **MxAccess** — real ArchestrA.MxAccess.dll
reference, x86, smoke-tested live against `LMXProxyServer`), Galaxy.Proxy with all
9 capability interfaces (`IDriver` / `ITagDiscovery` / `IReadable` / `IWritable` /
`ISubscribable` / `IAlarmSource` / `IHistoryProvider` / `IRediscoverable` /
`IHostConnectivityProbe`) + supervisor (Backoff + CircuitBreaker +
HeartbeatMonitor).
- **Phase 2 Stream D non-destructive deliverables** — appsettings.json → DriverConfig
migration script, two-service Windows installer scripts, process-spawn cross-FX
parity test, Stream D removal procedure doc with both Option A (rewrite 494 v1
tests) and Option B (archive + new v2 E2E suite) spelled out step-by-step.
## What's NOT in this PR
- Legacy `OtOpcUa.Host` deletion (Stream D.1) — reserved for a follow-up PR after
Option B's E2E suite is green. The 494 v1 tests still pass against the unchanged
legacy Host.
- Live-Galaxy parity validation (Stream E) — needs the iterative debug cycle the
removal-procedure doc describes.
## Tests
**964 pass / 1 pre-existing Phase 0 baseline failure**, across 14 test projects:
| Project | Pass | Notes |
|---|---:|---|
| Core.Abstractions.Tests | 24 | |
| Configuration.Tests | 42 | incl. 7 schema compliance, 8 stored-proc, 3 SQL-role auth, 13 validator, 6 LiteDB cache, 5 generation-applier |
| Core.Tests | 4 | DriverHost lifecycle |
| Server.Tests | 2 | NodeBootstrap + LiteDB cache fallback |
| Admin.Tests | 21 | incl. 5 RoleMapper, 6 LdapAuth, 3 LiveLdap, 2 FleetStatusPoller, 2 services-integration |
| Driver.Galaxy.Shared.Tests | 6 | Round-trip + framing |
| Driver.Galaxy.Host.Tests | 30 | incl. 5 GalaxyRepository live ZB, 3 live MXAccess COM, 5 EndToEndIpc, 2 IpcHandshake, 4 MemoryWatchdog, 3 RecyclePolicy, 3 PostMortemMmf, 3 StaPump, 2 service-installer dry-run |
| Driver.Galaxy.Proxy.Tests | 10 | 9 unit + 1 process-spawn parity |
| Client.Shared.Tests | 131 | unchanged |
| Client.UI.Tests | 98 | unchanged |
| Client.CLI.Tests | 51 / 1 fail | pre-existing baseline failure |
| Historian.Aveva.Tests | 41 | unchanged |
| IntegrationTests (net48) | 6 | unchanged — v1 parity baseline |
| **OtOpcUa.Tests (net48)** | **494** | **unchanged — v1 parity baseline** |
## Test plan for reviewers
- [ ] `dotnet build ZB.MOM.WW.OtOpcUa.slnx` succeeds with no warnings beyond the
known NuGetAuditSuppress + xUnit1051 warnings
- [ ] `dotnet test ZB.MOM.WW.OtOpcUa.slnx` shows the same 964/1 result
- [ ] `Get-Service aaGR, aaBootstrap` reports Running on the merger's box
- [ ] `docker ps --filter name=otopcua-mssql` shows the SQL container Up
- [ ] Admin UI boots (`dotnet run --project src/ZB.MOM.WW.OtOpcUa.Admin`); home page
renders at http://localhost:5123/; LDAP sign-in with GLAuth `readonly` /
`readonly123` succeeds
- [ ] Migration script dry-run: `powershell -File
scripts/migration/Migrate-AppSettings-To-DriverConfig.ps1 -DryRun` produces
a well-formed DriverConfig JSON
- [ ] Spot-read three commit messages to confirm the deferred-with-rationale items
are explicitly documented (`549cd36`, `a7126ba`, `7403b92` are the most
recent and most detailed)
## Follow-up tracking
PR 2 (next session) will execute Stream D Option B — archive `OtOpcUa.Tests` as
`OtOpcUa.Tests.v1Archive`, build the new `OtOpcUa.Driver.Galaxy.E2E` test project,
delete legacy `OtOpcUa.Host`, and run the parity-validation cycle. See
`docs/v2/implementation/stream-d-removal-procedure.md`.

View File

@@ -0,0 +1,69 @@
# PR 2 — Phase 2 Stream D Option B (archive v1 + E2E suite) → v2
**Source**: `phase-2-stream-d` (branched from `phase-1-configuration`)
**Target**: `v2`
**URL** (after push): https://gitea.dohertylan.com/dohertj2/lmxopcua/pulls/new/phase-2-stream-d
## Summary
Phase 2 Stream D Option B per `docs/v2/implementation/stream-d-removal-procedure.md`:
- **Archived the v1 surface** without deleting:
- `tests/ZB.MOM.WW.OtOpcUa.Tests/``tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/`
(`<AssemblyName>` kept as `ZB.MOM.WW.OtOpcUa.Tests` so v1 Host's `InternalsVisibleTo`
still matches; `<IsTestProject>false</IsTestProject>` so solution test runs skip it).
- `tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/``<IsTestProject>false</IsTestProject>`
+ archive comment.
- `src/ZB.MOM.WW.OtOpcUa.Host/` + `src/ZB.MOM.WW.OtOpcUa.Historian.Aveva/` — archive
PropertyGroup comments. Both still build (Historian plugin + 41 historian tests still
pass) so Phase 2 PR 3 can delete them in a focused, reviewable destructive change.
- **New `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/`** test project (.NET 10):
- `ParityFixture` spawns `OtOpcUa.Driver.Galaxy.Host.exe` (net48 x86) as a subprocess via
`Process.Start`, connects via real named pipe, exposes a connected `GalaxyProxyDriver`.
Skips when Galaxy ZB unreachable / Host EXE not built / Administrator shell.
- `HierarchyParityTests` (3) and `StabilityFindingsRegressionTests` (4) — one test per
2026-04-13 stability finding (phantom probe, cross-host quality clear, sync-over-async,
fire-and-forget alarm shutdown race).
- **`docs/v2/V1_ARCHIVE_STATUS.md`** — inventory + deletion plan for PR 3.
- **`docs/v2/implementation/exit-gate-phase-2-final.md`** — supersedes the two partial-exit
docs with the as-built state, adversarial review of PR 2 deltas (4 new findings), and the
recommended PR sequence (1 → 2 → 3 → 4).
## What's NOT in this PR
- Deletion of the v1 archive — saved for PR 3 with explicit operator review (destructive change).
- Wonderware Historian SDK plugin port — Task B.1.h, follow-up to enable real `HistoryRead`.
- MxAccess subscription push-frames — Task B.1.s, follow-up to enable real-time
data-change push from Host → Proxy.
## Tests
**`dotnet test ZB.MOM.WW.OtOpcUa.slnx`**: **470 pass / 7 skip / 1 pre-existing baseline**.
The 7 skips are the new E2E tests, all skipping with the documented reason
"PipeAcl denies Administrators on dev shells" — the production install runs as a non-admin
service account and these tests will execute there.
Run the archived v1 suites explicitly:
```powershell
dotnet test tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive # → 494 pass
dotnet test tests/ZB.MOM.WW.OtOpcUa.IntegrationTests # → 6 pass
```
## Test plan for reviewers
- [ ] `dotnet build ZB.MOM.WW.OtOpcUa.slnx` succeeds with no warnings beyond the known
NuGetAuditSuppress + NU1702 cross-FX
- [ ] `dotnet test ZB.MOM.WW.OtOpcUa.slnx` shows the 470/7-skip/1-baseline result
- [ ] Both archived suites pass when run explicitly
- [ ] Build the Galaxy.Host EXE (`dotnet build src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host`),
then run E2E tests on a non-admin shell — they should actually execute and pass
against live Galaxy ZB
- [ ] Spot-read `docs/v2/V1_ARCHIVE_STATUS.md` and confirm the deletion plan is acceptable
## Follow-up tracking
- **PR 3** (next session, when ready): execute the deletion plan in `V1_ARCHIVE_STATUS.md`.
4 projects removed, .slnx updated, full solution test confirms parity.
- **PR 4** (Phase 2 follow-up): port Historian plugin + wire MxAccess subscription pushes +
close the high/medium open findings from `exit-gate-phase-2-final.md`.

View File

@@ -0,0 +1,91 @@
# PR 4 — Phase 2 follow-up: close the 4 open MXAccess findings
**Source**: `phase-2-pr4-findings` (branched from `phase-2-stream-d`)
**Target**: `v2`
## Summary
Closes the 4 high/medium open findings carried forward in `exit-gate-phase-2-final.md`:
- **High 1 — `ReadAsync` subscription-leak on cancel.** One-shot read now wraps the
subscribe→first-OnDataChange→unsubscribe pattern in a `try/finally` so the per-tag
callback is always detached, and if the read installed the underlying MXAccess
subscription itself (no other caller had it), it tears it down on the way out.
- **High 2 — No reconnect loop on the MXAccess COM connection.** New
`MxAccessClientOptions { AutoReconnect, MonitorInterval, StaleThreshold }` + a background
`MonitorLoopAsync` that watches a stale-activity threshold + probes the proxy via a
no-op COM call, then reconnects-with-replay (re-Register, re-AddItem every active
subscription) when the proxy is dead. Liveness signal: every `OnDataChange` callback bumps
`_lastObservedActivityUtc`. Defaults match v1 monitor cadence (5s poll, 60s stale).
`ReconnectCount` exposed for diagnostics; `ConnectionStateChanged` event for downstream
consumers (the supervisor on the Proxy side already surfaces this through its
HeartbeatMonitor, but the Host-side event lets local logging/metrics hook in).
- **Medium 3 — `MxAccessGalaxyBackend.SubscribeAsync` doesn't push OnDataChange frames back to
the Proxy.** New `IGalaxyBackend.OnDataChange` / `OnAlarmEvent` / `OnHostStatusChanged`
events that the new `GalaxyFrameHandler.AttachConnection` subscribes per-connection and
forwards as outbound `OnDataChangeNotification` / `AlarmEvent` /
`RuntimeStatusChange` frames through the connection's `FrameWriter`. `MxAccessGalaxyBackend`
fans out per-tag value changes to every `SubscriptionId` that's listening to that tag
(multiple Proxy subs may share a Galaxy attribute — single COM subscription, multi-fan-out
on the wire). Stub + DbBacked backends declare the events with `#pragma warning disable
CS0067` (treat-warnings-as-errors would otherwise fail on never-raised events that exist
only to satisfy the interface).
- **Medium 4 — `WriteValuesAsync` doesn't await `OnWriteComplete`.** New
`WriteAsync(...)` overload returns `bool` after awaiting the OnWriteComplete callback via
the v1-style `TaskCompletionSource`-keyed-by-item-handle pattern in `_pendingWrites`.
`MxAccessGalaxyBackend.WriteValuesAsync` now reports per-tag `Bad_InternalError` when the
runtime rejected the write, instead of false-positive `Good`.
## Pipe server change
`IFrameHandler` gains `AttachConnection(FrameWriter writer): IDisposable` so the handler can
register backend event sinks on each accepted connection and detach them at disconnect. The
`PipeServer.RunOneConnectionAsync` calls it after the Hello handshake and disposes it in the
finally of the per-connection scope. `StubFrameHandler` returns `IFrameHandler.NoopAttachment.Instance`
(net48 doesn't support default interface methods, so the empty-attach lives as a public nested
class).
## Tests
**`dotnet test ZB.MOM.WW.OtOpcUa.slnx`**: **460 pass / 7 skip (E2E on admin shell) / 1
pre-existing baseline failure**. No regressions. The Driver.Galaxy.Host unit tests + 5 live
ZB smoke + 3 live MXAccess COM smoke all pass unchanged.
## Test plan for reviewers
- [ ] `dotnet build` clean
- [ ] `dotnet test` shows 460/7-skip/1-baseline
- [ ] Spot-check `MxAccessClient.MonitorLoopAsync` against v1's `MxAccessClient.Monitor`
partial (`src/ZB.MOM.WW.OtOpcUa.Host/MxAccess/MxAccessClient.Monitor.cs`) — same
polling cadence, same probe-then-reconnect-with-replay shape
- [ ] Read `GalaxyFrameHandler.ConnectionSink.Dispose` and confirm event handlers are
detached on connection close (no leaked invocation list refs)
- [ ] `WriteValuesAsync` returning `Bad_InternalError` on a runtime-rejected write is the
correct shape — confirm against the v1 `MxAccessClient.ReadWrite.cs` pattern
## What's NOT in this PR
- Wonderware Historian SDK plugin port (Task B.1.h) — separate PR, larger scope.
- Alarm subsystem wire-up (`MxAccessGalaxyBackend.SubscribeAlarmsAsync` is still a no-op).
`OnAlarmEvent` is declared on the backend interface and pushed by the frame handler when
raised; `MxAccessGalaxyBackend` just doesn't raise it yet (waits for the alarm-tracking
port from v1's `AlarmObjectFilter` + Galaxy alarm primitives).
- Host-status push (`OnHostStatusChanged`) — declared on the interface and pushed by the
frame handler; `MxAccessGalaxyBackend` doesn't raise it (the Galaxy.Host's
`HostConnectivityProbe` from v1 needs porting too, scoped under the Historian PR).
## Adversarial review
Quick pass over the PR 4 deltas. No new findings beyond:
- **Low 1** — `MonitorLoopAsync`'s `$Heartbeat` probe item-handle is leaked
(`AddItem` succeeds, never `RemoveItem`'d). Cosmetic — the probe item is internal to
the COM connection, dies with `Unregister` at disconnect/recycle. Worth a follow-up
to call `RemoveItem` after the probe succeeds.
- **Low 2** — Replay loop in `MonitorLoopAsync` swallows per-subscription failures. If
Galaxy permanently rejects a previously-valid reference (rare but possible after a
re-deploy), the user gets silent data loss for that one subscription. The stub-handler-
unaware operator wouldn't notice. Worth surfacing as a `ConnectionStateChanged(false)
→ ConnectionStateChanged(true)` payload that includes the replay-failures list.
Both are low-priority follow-ups, not PR 4 blockers.

View File

@@ -0,0 +1,103 @@
# Stream D — Legacy `OtOpcUa.Host` Removal Procedure
> Sequenced playbook for the next session that takes Phase 2 to its full exit gate.
> All Stream A/B/C work is committed. The blocker is structural: the 494 v1
> `OtOpcUa.Tests` instantiate v1 `Host` classes directly, so they must be
> retargeted (or archived) before the Host project can be deleted.
## Decision: Option A or Option B
### Option A — Rewrite the 494 v1 tests to use v2 topology
**Effort**: 3-5 days. Highest fidelity (full v1 test coverage carries forward).
**Steps**:
1. Build a `ProxyMxAccessClientAdapter` in a new `OtOpcUa.LegacyTestCompat/` project that
implements v1's `IMxAccessClient` by forwarding to `Driver.Galaxy.Proxy.GalaxyProxyDriver`.
Maps v1 `Vtq` ↔ v2 `DataValueSnapshot`, v1 `Quality` enum ↔ v2 `StatusCode` u32, the v1
`OnTagValueChanged` event ↔ v2 `ISubscribable.OnDataChange`.
2. Same idea for `IGalaxyRepository` — adapter that wraps v2's `Backend.Galaxy.GalaxyRepository`.
3. Replace `MxAccessClient` constructions in `OtOpcUa.Tests` test fixtures with the adapter.
Most tests use a single fixture so the change-set is concentrated.
4. For each test class: run; iterate on parity defects until green. Expected defect families:
timing-sensitive assertions (IPC adds ~5ms latency; widen tolerances), Quality enum vs
StatusCode mismatches, value-byte-encoding differences.
5. Once all 494 pass: proceed to deletion checklist below.
**When to pick A**: regulatory environments that need the full historical test suite green,
or when the v2 parity gate is itself a release-blocking artifact downstream consumers will
look for.
### Option B — Archive the 494 v1 tests, build a smaller v2 parity suite
**Effort**: 1-2 days. Faster to green; less coverage initially, accreted over time.
**Steps**:
1. Rename `tests/ZB.MOM.WW.OtOpcUa.Tests/``tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/`.
Add `<IsTestProject>false</IsTestProject>` so CI doesn't run them; mark every class with
`[Trait("Category", "v1Archive")]` so a future operator can opt in via `--filter`.
2. New `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/` project (.NET 10):
- `ParityFixture` spawns Galaxy.Host EXE per test class with `OTOPCUA_GALAXY_BACKEND=mxaccess`
pointing at the dev box's live Galaxy. Pattern from `HostSubprocessParityTests`.
- 10-20 representative tests covering the core paths: hierarchy shape, attribute count,
read-Manufacturer-Boolean, write-Operate-Float roundtrip, subscribe-receives-OnDataChange,
Bad-quality on disconnect, alarm-event-shape.
3. The four 2026-04-13 stability findings get individual regression tests in this project.
4. Once green: proceed to deletion checklist below.
**When to pick B**: typical dev velocity case. The v1 archive is reference, the new suite is
the live parity bar.
## Deletion checklist (after Option A or B is green)
Pre-conditions:
- [ ] Chosen-option test suite green (494 retargeted OR new E2E suite passing on this box)
- [ ] `phase-2-compliance.ps1` runs and exits 0
- [ ] `Get-Service aaGR, aaBootstrap` → Running
- [ ] `Driver.Galaxy.Host` x86 publish output verified at
`src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/bin/Release/net48/`
- [ ] Migration script tested: `scripts/migration/Migrate-AppSettings-To-DriverConfig.ps1
-AppSettingsPath src/ZB.MOM.WW.OtOpcUa.Host/appsettings.json -DryRun` produces a
well-formed DriverConfig
- [ ] Service installer scripts dry-run on a test box: `scripts/install/Install-Services.ps1
-InstallRoot C:\OtOpcUa -ServiceAccount LOCALHOST\testuser` registers both services
and they start
Steps:
1. Delete `src/ZB.MOM.WW.OtOpcUa.Host/` (the legacy in-process Host project).
2. Edit `ZB.MOM.WW.OtOpcUa.slnx` — remove the legacy Host `<Project>` line; keep all v2
project lines.
3. Migrate the dev `appsettings.json` Galaxy sections to `DriverConfig` JSON via the
migration script; insert into the Configuration DB for the dev cluster's Galaxy driver
instance.
4. Run the chosen test suite once more — confirm zero regressions from the deletion.
5. Build full solution (`dotnet build ZB.MOM.WW.OtOpcUa.slnx`) — confirm clean build with
no references to the deleted project.
6. Commit:
`git rm -r src/ZB.MOM.WW.OtOpcUa.Host` followed by the slnx + cleanup edits in one
atomic commit titled "Phase 2 Stream D — retire legacy OtOpcUa.Host".
7. Run `/codex:adversarial-review --base v2` on the merged Phase 2 diff.
8. Record `exit-gate-phase-2-final.md` with: Option chosen, deletion-commit SHA, parity
test count + duration, adversarial-review findings (each closed or deferred with link).
9. Open PR against `v2`, link the exit-gate doc + compliance script output + parity report.
10. Merge after one reviewer signoff.
## Rollback
If Stream D causes downstream consumer failures (ScadaBridge / Ignition / SystemPlatform IO
clients seeing different OPC UA behavior), the rollback is `git revert` of the deletion
commit — the whole v2 codebase keeps Galaxy.Proxy + Galaxy.Host installed alongside the
restored legacy Host. Production can run either topology. `OtOpcUa.Driver.Galaxy.Proxy`
becomes dormant until the next attempt.
## Why this can't one-shot in an autonomous session
- The parity-defect debug cycle is intrinsically interactive: each iteration requires running
the test suite against live Galaxy, inspecting the diff, deciding if the difference is a
legitimate v2 improvement or a regression, then either widening the assertion or fixing the
v2 code. That decision-making is the bottleneck, not the typing.
- The legacy-Host deletion is destructive — needs explicit operator authorization on a real
PR review, not unattended automation.
- The downstream consumer cutover (ScadaBridge, Ignition, AppServer) lives outside this repo
and on an integration-team track; "Phase 2 done" inside this repo is a precondition, not
the full release.

109
docs/v2/lmx-followups.md Normal file
View File

@@ -0,0 +1,109 @@
# LMX Galaxy bridge — remaining follow-ups
State after PR 19: the Galaxy driver is functionally at v1 parity through the
`IDriver` abstraction; the OPC UA server runs with LDAP-authenticated
Basic256Sha256 endpoints and alarms are observable through
`AlarmConditionState.ReportEvent`. The items below are what remains LMX-
specific before the stack can fully replace the v1 deployment, in
rough priority order.
## 1. Proxy-side `IHistoryProvider` for `ReadAtTime` / `ReadEvents`
**Status**: Host-side IPC shipped (PR 10 + PR 11). Proxy consumer not written.
PR 10 added `HistoryReadAtTimeRequest/Response` on the IPC wire and
`MxAccessGalaxyBackend.HistoryReadAtTimeAsync` delegates to
`HistorianDataSource.ReadAtTimeAsync`. PR 11 did the same for events
(`HistoryReadEventsRequest/Response` + `GalaxyHistoricalEvent`). The Proxy
side (`GalaxyProxyDriver`) doesn't call those yet — `Core.Abstractions.IHistoryProvider`
only exposes `ReadRawAsync` + `ReadProcessedAsync`.
**To do**:
- Extend `IHistoryProvider` with `ReadAtTimeAsync(string, DateTime[], …)` and
`ReadEventsAsync(string?, DateTime, DateTime, int, …)`.
- `GalaxyProxyDriver` calls the new IPC message kinds.
- `DriverNodeManager` wires the new capability methods onto `HistoryRead`
`AtTime` + `Events` service handlers.
- Integration test: OPC UA client calls `HistoryReadAtTime` / `HistoryReadEvents`,
value flows through IPC to the Host's `HistorianDataSource`, back to the client.
## 2. Write-gating by role — **DONE (PR 26)**
Landed in PR 26. `WriteAuthzPolicy` in `Server/Security/` maps
`SecurityClassification` → required role (`FreeAccess` → no role required,
`Operate`/`SecuredWrite``WriteOperate`, `Tune``WriteTune`,
`Configure`/`VerifiedWrite``WriteConfigure`, `ViewOnly` → deny regardless).
`DriverNodeManager` caches the classification per variable during discovery and
checks the session's roles (via `IRoleBearer`) in `OnWriteValue` before calling
`IWritable.WriteAsync`. Roles do not cascade — a session with `WriteOperate`
can't write a `Tune` attribute unless it also carries `WriteTune`.
See `feedback_acl_at_server_layer.md` in memory for the architectural directive
that authz stays at the server layer and never delegates to driver-specific auth.
## 3. Admin UI client-cert trust management — **DONE (PR 28)**
PR 28 shipped `/certificates` in the Admin UI. `CertTrustService` reads the OPC
UA server's PKI store root (`OpcUaServerOptions.PkiStoreRoot` — default
`%ProgramData%\OtOpcUa\pki`) and lists rejected + trusted certs by parsing the
`.der` files directly, so it has no `Opc.Ua` dependency and runs on any
Admin host that can reach the shared PKI directory.
Operator actions: Trust (moves `rejected/certs/*.der``trusted/certs/*.der`),
Delete rejected, Revoke trust. The OPC UA stack re-reads the trusted store on
each new client handshake, so no explicit reload signal is needed —
operators retry the rejected client's connection after trusting.
Deferred: flipping `AutoAcceptUntrustedClientCertificates` to `false` as the
deployment default. That's a production-hardening config change, not a code
gap — the Admin UI is now ready to be the trust gate.
## 4. Live-LDAP integration test
**Status**: PR 19 unit-tested the auth-flow shape; the live bind path is
exercised only by the pre-existing `Admin.Tests/LdapLiveBindTests.cs` which
uses the same Novell library against a running GLAuth at `localhost:3893`.
**To do**:
- Add `OpcUaServerIntegrationTests.Valid_username_authenticates_against_live_ldap`
with the same skip-when-unreachable guard.
- Assert `session.Identity` on the server side carries the expected role
after bind — requires exposing a test hook or reading identity from a
new `IHostConnectivityProbe`-style "whoami" variable in the address space.
## 5. Full Galaxy live-service smoke test against the merged v2 stack
**Status**: Individual pieces have live smoke tests (PR 5 MXAccess, PR 13
probe manager, PR 14 alarm tracker), but the full loop — OPC UA client →
`OtOpcUaServer``GalaxyProxyDriver` (in-process) → named-pipe to
Galaxy.Host subprocess → live MXAccess runtime → real Galaxy objects — has
no single end-to-end smoke test.
**To do**:
- Test that spawns the full topology, discovers a deployed Galaxy object,
subscribes to one of its attributes, writes a value back, and asserts the
write round-tripped through MXAccess. Skip when ArchestrA isn't running.
## 6. Second driver instance on the same server
**Status**: `DriverHost.RegisterAsync` supports multiple drivers; the OPC UA
server creates one `DriverNodeManager` per driver and isolates their
subtrees under distinct namespace URIs. Not proven with two active
`GalaxyProxyDriver` instances pointing at different Galaxies.
**To do**:
- Integration test that registers two driver instances, each with a distinct
`DriverInstanceId` + endpoint in its own session, asserts nodes from both
appear under the correct subtrees, alarm events land on the correct
instance's condition nodes.
## 7. Host-status per-AppEngine granularity → Admin UI dashboard
**Status**: PR 13 ships per-platform/per-AppEngine `ScanState` probing; PR 17
surfaces the resulting `OnHostStatusChanged` events through OPC UA. Admin
UI doesn't render a per-host dashboard yet.
**To do**:
- SignalR hub push of `HostStatusChangedEventArgs` to the Admin UI.
- Dashboard page showing each tracked host, current state, last transition
time, failure count.

103
docs/v2/modbus-test-plan.md Normal file
View File

@@ -0,0 +1,103 @@
# Modbus driver — test plan + device-quirk catalog
The Modbus TCP driver unit tests (PRs 2124) cover the protocol surface against an
in-memory fake transport. They validate the codec, state machine, and function-code
routing against a textbook Modbus server. That's necessary but not sufficient: real PLC
populations disagree with the spec in small, device-specific ways, and a driver that
passes textbook tests can still misbehave against actual equipment.
This doc is the harness-and-quirks playbook. It's what gets wired up in the
`tests/Driver.Modbus.IntegrationTests` project when we ship that (PR 26 candidate).
## Harness
**Chosen simulator: ModbusPal** (Java, scriptable). Rationale:
- Scriptable enough to mimic device-specific behaviors (non-standard register
layouts, custom exception codes, intentional response delays).
- Runs locally, no CI dependency. Tests skip when `localhost:502` (or the configured
simulator endpoint) isn't reachable.
- Free + long-maintained — physical PLC bench is unavailable in most dev
environments, and renting cloud PLCs isn't worth the per-test cost.
**Setup pattern** (not yet codified in a script — will land alongside the integration
test project):
1. Install ModbusPal, load the per-device `.xmpp` profile from
`tests/Driver.Modbus.IntegrationTests/ModbusPal/` (TBD directory).
2. Start the simulator listening on `localhost:502` (or override via
`MODBUS_SIM_ENDPOINT` env var).
3. `dotnet test` the integration project — tests auto-skip when the endpoint is
unreachable, so forgetting to start the simulator doesn't wedge CI.
## Per-device quirk catalog
### AutomationDirect DL205
First known target device. Quirks to document and cover with named tests (to be
filled in when user validates each behavior in ModbusPal with a DL205 profile):
- **Word order for 32-bit values**: _pending_ — confirm whether DL205 uses ABCD
(Modbus TCP standard) or CDAB (Siemens-style word-swap) for Int32/UInt32/Float32.
Test name: `DL205_Float32_word_order_is_CDAB` (or `ABCD`, whichever proves out).
- **Register-zero access**: _pending_ — some DL205 configurations reject FC03 at
register 0 with exception code 02 (illegal data address). If confirmed, the
integration test suite verifies `ModbusProbeOptions.ProbeAddress` default of 0
triggers the rejection and operators must override; test name:
`DL205_FC03_at_register_0_returns_IllegalDataAddress`.
- **Coil addressing base**: _pending_ — DL205 documentation sometimes uses 1-based
coil addresses; verify the driver's zero-based addressing matches the physical
PLC without an off-by-one adjustment.
- **Maximum registers per FC03**: _pending_ — Modbus spec caps at 125; some DL205
models enforce a lower limit (e.g., 64). Test name:
`DL205_FC03_beyond_max_registers_returns_IllegalDataValue`.
- **Response framing under sustained load**: _pending_ — the driver's
single-flight semaphore assumes the server pairs requests/responses by
transaction id; at least one DL205 firmware revision is reported to drop the
TxId under load. If reproduced in ModbusPal we add a retry + log-and-continue
path to `ModbusTcpTransport`.
- **Exception code on coil write to a protected bit**: _pending_ — some DL205
setups protect internal coils; the driver should surface the PLC's exception
PDU as `BadNotWritable` rather than `BadInternalError`.
_User action item_: as each quirk is validated in ModbusPal, replace the _pending_
marker with the confirmed behavior and file a named test in the integration suite.
### Future devices
One section per device class, same shape as DL205. Quirks that apply across
multiple devices (e.g., "all AB PLCs use CDAB") can be noted in the cross-device
patterns section below once we have enough data points.
## Cross-device patterns
Once multiple device catalogs accumulate, quirks that recur across two or more
vendors get promoted into driver defaults or opt-in options:
- _(empty — filled in as catalogs grow)_
## Test conventions
- **One named test per quirk.** `DL205_word_order_is_CDAB_for_Float32` is easier to
diagnose on failure than a generic `Float32_roundtrip`. The `DL205_` prefix makes
filtering by device class trivial (`--filter "DisplayName~DL205"`).
- **Skip with a clear SkipReason.** Follow the pattern from
`GalaxyRepositoryLiveSmokeTests`: check reachability in the fixture, capture
a `SkipReason` string, and have each test call `Assert.Skip(SkipReason)` when
it's set. Don't throw — skipped tests read cleanly in CI logs.
- **Use the real `ModbusTcpTransport`.** Integration tests exercise the wire
protocol end-to-end. The in-memory `FakeTransport` from the unit test suite is
deliberately not used here — its value is speed + determinism, which doesn't
help reproduce device-specific issues.
- **Don't depend on ModbusPal state between tests.** Each test resets the
simulator's register bank or uses a unique address range. Avoid relying on
"previous test left value at register 10" setups that flake when tests run in
parallel or re-order.
## Next concrete PRs
- **PR 26 — Integration test project + DL205 profile scaffold**: creates
`tests/Driver.Modbus.IntegrationTests`, imports the ModbusPal profile (or
generates it from JSON), adds the fixture with skip-when-unreachable, plus
one smoke test that reads a register. No DL205-specific assertions yet — that
waits for the user to validate each quirk.
- **PR 27+**: one PR per confirmed DL205 quirk, landing the named test + any
driver-side adjustment (e.g., retry on dropped TxId) needed to pass it.

View File

@@ -234,6 +234,8 @@ All of these stay in the Galaxy Host process (.NET 4.8 x86). The `GalaxyProxy` i
- Refactor is **incremental**: extract `IDriver` / `ISubscribable` / `ITagDiscovery` etc. against the existing `LmxNodeManager` first (still in-process on v2 branch), validate the system still runs, *then* move the implementation behind the IPC boundary into Galaxy.Host. Keeps the system runnable at each step and de-risks the out-of-process move.
- **Parity test**: run the existing v1 IntegrationTests suite against the v2 Galaxy driver (same Galaxy, same expectations) **plus** a scripted Client.CLI walkthrough (connect / browse / read / write / subscribe / history / alarms) on a dev Galaxy. Automated regression + human-observable behavior.
**Dev environment for the LmxOpcUa breakout:** the Phase 0/1 dev box (`DESKTOP-6JL3KKO`) hosts the full AVEVA stack required to execute Phase 2 Streams D + E — 27 ArchestrA / Wonderware / AVEVA services running including `aaBootstrap`, `aaGR` (Galaxy Repository), `aaLogger`, `aaUserValidator`, `aaPim`, `ArchestrADataStore`, `AsbServiceManager`; the full Historian set (`aahClientAccessPoint`, `aahGateway`, `aahInSight`, `aahSearchIndexer`, `InSQLStorage`, `InSQLConfiguration`, `InSQLEventSystem`, `InSQLIndexing`, `InSQLIOServer`, `HistorianSearch-x64`); SuiteLink (`slssvc`); MXAccess COM at `C:\Program Files (x86)\ArchestrA\Framework\bin\ArchestrA.MXAccess.dll`; and OI-Gateway at `C:\Program Files (x86)\Wonderware\OI-Server\OI-Gateway\` — so the Phase 1 Task E.10 AppServer-via-OI-Gateway smoke test (decision #142) is also runnable on the same box, no separate AVEVA test machine required. Inventory captured in `dev-environment.md`.
---
### 4. Configuration Model — Centralized MSSQL + Local Cache

View File

@@ -0,0 +1,102 @@
<#
.SYNOPSIS
Registers the two v2 Windows services on a node: OtOpcUa (main server, net10) and
OtOpcUaGalaxyHost (out-of-process Galaxy COM host, net48 x86).
.DESCRIPTION
Phase 2 Stream D.2 — replaces the v1 single-service install (TopShelf-based OtOpcUa.Host).
Installs both services with the correct service-account SID + per-process shared secret
provisioning per `driver-stability.md §"IPC Security"`. Galaxy.Host depends on OtOpcUa
(Galaxy.Host must be reachable when OtOpcUa starts; service dependency wiring + retry
handled by OtOpcUa.Server NodeBootstrap).
.PARAMETER InstallRoot
Where the binaries live (typically C:\Program Files\OtOpcUa).
.PARAMETER ServiceAccount
Service account SID or DOMAIN\name. Both services run under this account; the
Galaxy.Host pipe ACL only allows this SID to connect (decision #76).
.PARAMETER GalaxySharedSecret
Per-process secret passed to Galaxy.Host via env var. Generated freshly per install.
.PARAMETER ZbConnection
Galaxy ZB SQL connection string (passed to Galaxy.Host via env var).
.EXAMPLE
.\Install-Services.ps1 -InstallRoot 'C:\Program Files\OtOpcUa' -ServiceAccount 'OTOPCUA\svc-otopcua'
#>
[CmdletBinding()]
param(
[Parameter(Mandatory)] [string]$InstallRoot,
[Parameter(Mandatory)] [string]$ServiceAccount,
[string]$GalaxySharedSecret,
[string]$ZbConnection = 'Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;',
[string]$GalaxyClientName = 'OtOpcUa-Galaxy.Host',
[string]$GalaxyPipeName = 'OtOpcUaGalaxy'
)
$ErrorActionPreference = 'Stop'
if (-not (Test-Path "$InstallRoot\OtOpcUa.Server.exe")) {
Write-Error "OtOpcUa.Server.exe not found at $InstallRoot — copy the publish output first"
exit 1
}
if (-not (Test-Path "$InstallRoot\Galaxy\OtOpcUa.Driver.Galaxy.Host.exe")) {
Write-Error "OtOpcUa.Driver.Galaxy.Host.exe not found at $InstallRoot\Galaxy — copy the publish output first"
exit 1
}
# Generate a fresh shared secret per install if not supplied. Stored in DPAPI-protected file
# rather than the registry so the service account can read it but other local users cannot.
if (-not $GalaxySharedSecret) {
$bytes = New-Object byte[] 32
[System.Security.Cryptography.RandomNumberGenerator]::Create().GetBytes($bytes)
$GalaxySharedSecret = [Convert]::ToBase64String($bytes)
}
# Resolve the SID — the IPC ACL needs the SID, not the down-level name.
$sid = if ($ServiceAccount.StartsWith('S-1-')) {
$ServiceAccount
} else {
(New-Object System.Security.Principal.NTAccount $ServiceAccount).Translate([System.Security.Principal.SecurityIdentifier]).Value
}
# --- Install OtOpcUaGalaxyHost first (OtOpcUa starts after, depends on it being up).
$galaxyEnv = @(
"OTOPCUA_GALAXY_PIPE=$GalaxyPipeName"
"OTOPCUA_ALLOWED_SID=$sid"
"OTOPCUA_GALAXY_SECRET=$GalaxySharedSecret"
"OTOPCUA_GALAXY_BACKEND=mxaccess"
"OTOPCUA_GALAXY_ZB_CONN=$ZbConnection"
"OTOPCUA_GALAXY_CLIENT_NAME=$GalaxyClientName"
) -join "`0"
$galaxyEnv += "`0`0"
Write-Host "Installing OtOpcUaGalaxyHost..."
& sc.exe create OtOpcUaGalaxyHost binPath= "`"$InstallRoot\Galaxy\OtOpcUa.Driver.Galaxy.Host.exe`"" `
DisplayName= 'OtOpcUa Galaxy Host (out-of-process MXAccess)' `
start= auto `
obj= $ServiceAccount | Out-Null
# Set per-service environment variables via the registry — sc.exe doesn't expose them directly.
$svcKey = "HKLM:\SYSTEM\CurrentControlSet\Services\OtOpcUaGalaxyHost"
$envValue = $galaxyEnv.Split("`0") | Where-Object { $_ -ne '' }
Set-ItemProperty -Path $svcKey -Name 'Environment' -Type MultiString -Value $envValue
# --- Install OtOpcUa (depends on Galaxy host being installed; doesn't strictly require it
# started — OtOpcUa.Server NodeBootstrap retries on the IPC connect path).
Write-Host "Installing OtOpcUa..."
& sc.exe create OtOpcUa binPath= "`"$InstallRoot\OtOpcUa.Server.exe`"" `
DisplayName= 'OtOpcUa Server' `
start= auto `
depend= 'OtOpcUaGalaxyHost' `
obj= $ServiceAccount | Out-Null
Write-Host ""
Write-Host "Installed. Start with:"
Write-Host " sc.exe start OtOpcUaGalaxyHost"
Write-Host " sc.exe start OtOpcUa"
Write-Host ""
Write-Host "Galaxy shared secret (record this offline — required for service rebinding):"
Write-Host " $GalaxySharedSecret"

View File

@@ -0,0 +1,18 @@
<#
.SYNOPSIS
Stops + removes the two v2 services. Mirrors Install-Services.ps1.
#>
[CmdletBinding()] param()
$ErrorActionPreference = 'Continue'
foreach ($svc in 'OtOpcUa', 'OtOpcUaGalaxyHost') {
if (Get-Service $svc -ErrorAction SilentlyContinue) {
Write-Host "Stopping $svc..."
Stop-Service $svc -Force -ErrorAction SilentlyContinue
Write-Host "Removing $svc..."
& sc.exe delete $svc | Out-Null
} else {
Write-Host "$svc not installed — skipping"
}
}
Write-Host "Done."

View File

@@ -0,0 +1,107 @@
<#
.SYNOPSIS
Translates a v1 OtOpcUa.Host appsettings.json into a v2 DriverInstance.DriverConfig JSON
blob suitable for upserting into the central Configuration DB.
.DESCRIPTION
Phase 2 Stream D.3 — moves the legacy MxAccess + GalaxyRepository + Historian sections out
of node-local appsettings.json and into the central DB so each node only needs Cluster.NodeId
+ ClusterId + DB conn (per decision #18). Idempotent + dry-run-able.
Output shape matches the Galaxy DriverType schema in `docs/v2/plan.md` §"Galaxy DriverConfig":
{
"MxAccess": { "ClientName": "...", "RequestTimeoutSeconds": 30 },
"Database": { "ConnectionString": "...", "PollIntervalSeconds": 60 },
"Historian": { "Enabled": false }
}
.PARAMETER AppSettingsPath
Path to the v1 appsettings.json. Defaults to ../../src/ZB.MOM.WW.OtOpcUa.Host/appsettings.json
relative to the script.
.PARAMETER OutputPath
Where to write the generated DriverConfig JSON. Defaults to stdout.
.PARAMETER DryRun
Print what would be written without writing.
.EXAMPLE
pwsh ./Migrate-AppSettings-To-DriverConfig.ps1 -AppSettingsPath C:\OtOpcUa\appsettings.json -OutputPath C:\tmp\galaxy-driverconfig.json
#>
[CmdletBinding()]
param(
[string]$AppSettingsPath,
[string]$OutputPath,
[switch]$DryRun
)
$ErrorActionPreference = 'Stop'
if (-not $AppSettingsPath) {
$AppSettingsPath = Join-Path (Split-Path -Parent $PSScriptRoot) '..\src\ZB.MOM.WW.OtOpcUa.Host\appsettings.json'
}
if (-not (Test-Path $AppSettingsPath)) {
Write-Error "AppSettings file not found: $AppSettingsPath"
exit 1
}
$src = Get-Content -Raw $AppSettingsPath | ConvertFrom-Json
$mx = $src.MxAccess
$gr = $src.GalaxyRepository
$hi = $src.Historian
$driverConfig = [ordered]@{
MxAccess = [ordered]@{
ClientName = $mx.ClientName
NodeName = $mx.NodeName
GalaxyName = $mx.GalaxyName
RequestTimeoutSeconds = $mx.ReadTimeoutSeconds
WriteTimeoutSeconds = $mx.WriteTimeoutSeconds
MaxConcurrentOps = $mx.MaxConcurrentOperations
MonitorIntervalSec = $mx.MonitorIntervalSeconds
AutoReconnect = $mx.AutoReconnect
ProbeTag = $mx.ProbeTag
}
Database = [ordered]@{
ConnectionString = $gr.ConnectionString
ChangeDetectionIntervalSec = $gr.ChangeDetectionIntervalSeconds
CommandTimeoutSeconds = $gr.CommandTimeoutSeconds
ExtendedAttributes = $gr.ExtendedAttributes
Scope = $gr.Scope
PlatformName = $gr.PlatformName
}
Historian = [ordered]@{
Enabled = if ($null -ne $hi -and $null -ne $hi.Enabled) { $hi.Enabled } else { $false }
}
}
# Strip null-valued leaves so the resulting JSON is compact and round-trippable.
function Remove-Nulls($obj) {
$keys = @($obj.Keys)
foreach ($k in $keys) {
if ($null -eq $obj[$k]) { $obj.Remove($k) | Out-Null }
elseif ($obj[$k] -is [System.Collections.Specialized.OrderedDictionary]) { Remove-Nulls $obj[$k] }
}
}
Remove-Nulls $driverConfig
$json = $driverConfig | ConvertTo-Json -Depth 8
if ($DryRun) {
Write-Host "=== DriverConfig (dry-run, would write to $OutputPath) ==="
Write-Host $json
return
}
if ($OutputPath) {
$dir = Split-Path -Parent $OutputPath
if ($dir -and -not (Test-Path $dir)) { New-Item -ItemType Directory -Path $dir | Out-Null }
Set-Content -Path $OutputPath -Value $json -Encoding UTF8
Write-Host "Wrote DriverConfig to $OutputPath"
}
else {
$json
}

View File

@@ -0,0 +1,18 @@
@* Root Blazor component. *@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
<title>OtOpcUa Admin</title>
<base href="/"/>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css"/>
<link rel="stylesheet" href="app.css"/>
<HeadOutlet/>
</head>
<body>
<Routes/>
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/js/bootstrap.bundle.min.js"></script>
<script src="_framework/blazor.web.js"></script>
</body>
</html>

View File

@@ -0,0 +1,36 @@
@inherits LayoutComponentBase
<div class="d-flex" style="min-height: 100vh;">
<nav class="bg-dark text-light p-3" style="width: 220px;">
<h5 class="mb-4">OtOpcUa Admin</h5>
<ul class="nav flex-column">
<li class="nav-item"><a class="nav-link text-light" href="/">Overview</a></li>
<li class="nav-item"><a class="nav-link text-light" href="/fleet">Fleet status</a></li>
<li class="nav-item"><a class="nav-link text-light" href="/clusters">Clusters</a></li>
<li class="nav-item"><a class="nav-link text-light" href="/reservations">Reservations</a></li>
<li class="nav-item"><a class="nav-link text-light" href="/certificates">Certificates</a></li>
</ul>
<div class="mt-5">
<AuthorizeView>
<Authorized>
<div class="small text-light">
Signed in as <a class="text-light" href="/account"><strong>@context.User.Identity?.Name</strong></a>
</div>
<div class="small text-muted">
@string.Join(", ", context.User.Claims.Where(c => c.Type.EndsWith("/role")).Select(c => c.Value))
</div>
<form method="post" action="/auth/logout">
<button class="btn btn-sm btn-outline-light mt-2" type="submit">Sign out</button>
</form>
</Authorized>
<NotAuthorized>
<a class="btn btn-sm btn-outline-light" href="/login">Sign in</a>
</NotAuthorized>
</AuthorizeView>
</div>
</nav>
<main class="flex-grow-1 p-4">
@Body
</main>
</div>

View File

@@ -0,0 +1,129 @@
@page "/account"
@attribute [Microsoft.AspNetCore.Authorization.Authorize]
@using System.Security.Claims
@using ZB.MOM.WW.OtOpcUa.Admin.Services
<h1 class="mb-4">My account</h1>
<AuthorizeView>
<Authorized>
@{
var username = context.User.FindFirst(ClaimTypes.NameIdentifier)?.Value ?? "—";
var displayName = context.User.Identity?.Name ?? "—";
var roles = context.User.Claims
.Where(c => c.Type == ClaimTypes.Role).Select(c => c.Value).ToList();
var ldapGroups = context.User.Claims
.Where(c => c.Type == "ldap_group").Select(c => c.Value).ToList();
}
<div class="row g-4">
<div class="col-md-6">
<div class="card">
<div class="card-body">
<h5 class="card-title">Identity</h5>
<dl class="row mb-0">
<dt class="col-sm-4">Username</dt><dd class="col-sm-8"><code>@username</code></dd>
<dt class="col-sm-4">Display name</dt><dd class="col-sm-8">@displayName</dd>
</dl>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card">
<div class="card-body">
<h5 class="card-title">Admin roles</h5>
@if (roles.Count == 0)
{
<p class="text-muted mb-0">No Admin roles mapped — sign-in would have been blocked, so if you're seeing this, the session claim is likely stale.</p>
}
else
{
<div class="mb-2">
@foreach (var r in roles)
{
<span class="badge bg-primary me-1">@r</span>
}
</div>
<small class="text-muted">LDAP groups: @(ldapGroups.Count == 0 ? "(none surfaced)" : string.Join(", ", ldapGroups))</small>
}
</div>
</div>
</div>
<div class="col-12">
<div class="card">
<div class="card-body">
<h5 class="card-title">Capabilities</h5>
<p class="text-muted small">
Each Admin role grants a fixed capability set per <code>admin-ui.md</code> §Admin Roles.
Pages below reflect what this session can access; the route's <code>[Authorize]</code> guard
is the ground truth — this table mirrors it for readability.
</p>
<table class="table table-sm align-middle mb-0">
<thead>
<tr>
<th>Capability</th>
<th>Required role(s)</th>
<th class="text-end">You have it?</th>
</tr>
</thead>
<tbody>
@foreach (var cap in Capabilities)
{
var has = cap.RequiredRoles.Any(r => roles.Contains(r, StringComparer.OrdinalIgnoreCase));
<tr>
<td>@cap.Name<br /><small class="text-muted">@cap.Description</small></td>
<td>@string.Join(" or ", cap.RequiredRoles)</td>
<td class="text-end">
@if (has)
{
<span class="badge bg-success">Yes</span>
}
else
{
<span class="badge bg-secondary">No</span>
}
</td>
</tr>
}
</tbody>
</table>
</div>
</div>
</div>
</div>
<div class="mt-4">
<form method="post" action="/auth/logout">
<button class="btn btn-outline-danger" type="submit">Sign out</button>
</form>
</div>
</Authorized>
</AuthorizeView>
@code {
private sealed record Capability(string Name, string Description, string[] RequiredRoles);
// Kept in sync with Program.cs authorization policies + each page's [Authorize] attribute.
// When a new page or policy is added, extend this list so operators can self-service check
// whether their session has access without trial-and-error navigation.
private static readonly IReadOnlyList<Capability> Capabilities =
[
new("View clusters + fleet status",
"Read-only access to the cluster list, fleet dashboard, and generation history.",
[AdminRoles.ConfigViewer, AdminRoles.ConfigEditor, AdminRoles.FleetAdmin]),
new("Edit configuration drafts",
"Create and edit draft generations, manage namespace bindings and node ACLs. CanEdit policy.",
[AdminRoles.ConfigEditor, AdminRoles.FleetAdmin]),
new("Publish generations",
"Promote a draft to Published — triggers node roll-out. CanPublish policy.",
[AdminRoles.FleetAdmin]),
new("Manage certificate trust",
"Trust rejected client certs + revoke trust. FleetAdmin-only because the trust decision gates OPC UA client access.",
[AdminRoles.FleetAdmin]),
new("Manage external-ID reservations",
"Reserve / release external IDs that map into Galaxy contained names.",
[AdminRoles.ConfigEditor, AdminRoles.FleetAdmin]),
];
}

View File

@@ -0,0 +1,154 @@
@page "/certificates"
@attribute [Microsoft.AspNetCore.Authorization.Authorize(Roles = AdminRoles.FleetAdmin)]
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@inject CertTrustService Certs
@inject AuthenticationStateProvider AuthState
@inject ILogger<Certificates> Log
<h1 class="mb-4">Certificate trust</h1>
<div class="alert alert-info small mb-4">
PKI store root <code>@Certs.PkiStoreRoot</code>. Trusting a rejected cert moves the file into the trusted store — the OPC UA server picks up the change on the next client handshake, so operators should retry the rejected client's connection after trusting.
</div>
@if (_status is not null)
{
<div class="alert alert-@_statusKind alert-dismissible">
@_status
<button type="button" class="btn-close" @onclick="ClearStatus"></button>
</div>
}
<h2 class="h4">Rejected (@_rejected.Count)</h2>
@if (_rejected.Count == 0)
{
<p class="text-muted">No rejected certificates. Clients that fail to handshake with an untrusted cert land here.</p>
}
else
{
<table class="table table-sm align-middle">
<thead><tr><th>Subject</th><th>Issuer</th><th>Thumbprint</th><th>Valid</th><th class="text-end">Actions</th></tr></thead>
<tbody>
@foreach (var c in _rejected)
{
<tr>
<td>@c.Subject</td>
<td>@c.Issuer</td>
<td><code class="small">@c.Thumbprint</code></td>
<td class="small">@c.NotBefore.ToString("yyyy-MM-dd") → @c.NotAfter.ToString("yyyy-MM-dd")</td>
<td class="text-end">
<button class="btn btn-sm btn-success me-1" @onclick="() => TrustAsync(c)">Trust</button>
<button class="btn btn-sm btn-outline-danger" @onclick="() => DeleteRejectedAsync(c)">Delete</button>
</td>
</tr>
}
</tbody>
</table>
}
<h2 class="h4 mt-5">Trusted (@_trusted.Count)</h2>
@if (_trusted.Count == 0)
{
<p class="text-muted">No client certs have been explicitly trusted. The server's own application cert lives in <code>own/</code> and is not listed here.</p>
}
else
{
<table class="table table-sm align-middle">
<thead><tr><th>Subject</th><th>Issuer</th><th>Thumbprint</th><th>Valid</th><th class="text-end">Actions</th></tr></thead>
<tbody>
@foreach (var c in _trusted)
{
<tr>
<td>@c.Subject</td>
<td>@c.Issuer</td>
<td><code class="small">@c.Thumbprint</code></td>
<td class="small">@c.NotBefore.ToString("yyyy-MM-dd") → @c.NotAfter.ToString("yyyy-MM-dd")</td>
<td class="text-end">
<button class="btn btn-sm btn-outline-danger" @onclick="() => UntrustAsync(c)">Revoke</button>
</td>
</tr>
}
</tbody>
</table>
}
@code {
private IReadOnlyList<CertInfo> _rejected = [];
private IReadOnlyList<CertInfo> _trusted = [];
private string? _status;
private string _statusKind = "success";
protected override void OnInitialized() => Reload();
private void Reload()
{
_rejected = Certs.ListRejected();
_trusted = Certs.ListTrusted();
}
private async Task TrustAsync(CertInfo c)
{
if (Certs.TrustRejected(c.Thumbprint))
{
await LogActionAsync("cert.trust", c);
Set($"Trusted cert {c.Subject} ({Short(c.Thumbprint)}).", "success");
}
else
{
Set($"Could not trust {Short(c.Thumbprint)} — file missing; another admin may have already handled it.", "warning");
}
Reload();
}
private async Task DeleteRejectedAsync(CertInfo c)
{
if (Certs.DeleteRejected(c.Thumbprint))
{
await LogActionAsync("cert.delete.rejected", c);
Set($"Deleted rejected cert {c.Subject} ({Short(c.Thumbprint)}).", "success");
}
else
{
Set($"Could not delete {Short(c.Thumbprint)} — file missing.", "warning");
}
Reload();
}
private async Task UntrustAsync(CertInfo c)
{
if (Certs.UntrustCert(c.Thumbprint))
{
await LogActionAsync("cert.untrust", c);
Set($"Revoked trust for {c.Subject} ({Short(c.Thumbprint)}).", "success");
}
else
{
Set($"Could not revoke {Short(c.Thumbprint)} — file missing.", "warning");
}
Reload();
}
private async Task LogActionAsync(string action, CertInfo c)
{
// Cert trust changes are operator-initiated and security-sensitive — Serilog captures the
// user + thumbprint trail. CertTrustService also logs at Information on each filesystem
// move/delete; this line ties the action to the authenticated admin user so the two logs
// correlate. DB-level ConfigAuditLog persistence is deferred — its schema is
// cluster-scoped and cert actions are cluster-agnostic.
var state = await AuthState.GetAuthenticationStateAsync();
var user = state.User.Identity?.Name ?? "(anonymous)";
Log.LogInformation("Admin cert action: user={User} action={Action} thumbprint={Thumbprint} subject={Subject}",
user, action, c.Thumbprint, c.Subject);
}
private void Set(string message, string kind)
{
_status = message;
_statusKind = kind;
}
private void ClearStatus() => _status = null;
private static string Short(string thumbprint) =>
thumbprint.Length > 12 ? thumbprint[..12] + "…" : thumbprint;
}

View File

@@ -0,0 +1,126 @@
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@using ZB.MOM.WW.OtOpcUa.Configuration.Enums
@inject NodeAclService AclSvc
<div class="d-flex justify-content-between mb-3">
<h4>Access-control grants</h4>
<button class="btn btn-sm btn-primary" @onclick="() => _showForm = true">Add grant</button>
</div>
@if (_acls is null) { <p>Loading…</p> }
else if (_acls.Count == 0) { <p class="text-muted">No ACL grants in this draft. Publish will result in a cluster with no external access.</p> }
else
{
<table class="table table-sm">
<thead><tr><th>LDAP group</th><th>Scope</th><th>Scope ID</th><th>Permissions</th><th></th></tr></thead>
<tbody>
@foreach (var a in _acls)
{
<tr>
<td>@a.LdapGroup</td>
<td>@a.ScopeKind</td>
<td><code>@(a.ScopeId ?? "-")</code></td>
<td><code>@a.PermissionFlags</code></td>
<td><button class="btn btn-sm btn-outline-danger" @onclick="() => RevokeAsync(a.NodeAclRowId)">Revoke</button></td>
</tr>
}
</tbody>
</table>
}
@if (_showForm)
{
<div class="card">
<div class="card-body">
<div class="row g-3">
<div class="col-md-4">
<label class="form-label">LDAP group</label>
<input class="form-control" @bind="_group"/>
</div>
<div class="col-md-4">
<label class="form-label">Scope kind</label>
<select class="form-select" @bind="_scopeKind">
@foreach (var k in Enum.GetValues<NodeAclScopeKind>()) { <option value="@k">@k</option> }
</select>
</div>
<div class="col-md-4">
<label class="form-label">Scope ID (empty for Cluster-wide)</label>
<input class="form-control" @bind="_scopeId"/>
</div>
<div class="col-12">
<label class="form-label">Permissions (bundled presets — per-flag editor in v2.1)</label>
<select class="form-select" @bind="_preset">
<option value="Read">Read (Browse + Read)</option>
<option value="WriteOperate">Read + Write Operate</option>
<option value="Engineer">Read + Write Tune + Write Configure</option>
<option value="AlarmAck">Read + Alarm Ack</option>
<option value="Full">Full (every flag)</option>
</select>
</div>
</div>
@if (_error is not null) { <div class="alert alert-danger mt-3">@_error</div> }
<div class="mt-3">
<button class="btn btn-sm btn-primary" @onclick="SaveAsync">Save</button>
<button class="btn btn-sm btn-secondary ms-2" @onclick="() => _showForm = false">Cancel</button>
</div>
</div>
</div>
}
@code {
[Parameter] public long GenerationId { get; set; }
[Parameter] public string ClusterId { get; set; } = string.Empty;
private List<NodeAcl>? _acls;
private bool _showForm;
private string _group = string.Empty;
private NodeAclScopeKind _scopeKind = NodeAclScopeKind.Cluster;
private string _scopeId = string.Empty;
private string _preset = "Read";
private string? _error;
protected override async Task OnParametersSetAsync() =>
_acls = await AclSvc.ListAsync(GenerationId, CancellationToken.None);
private NodePermissions ResolvePreset() => _preset switch
{
"Read" => NodePermissions.Browse | NodePermissions.Read,
"WriteOperate" => NodePermissions.Browse | NodePermissions.Read | NodePermissions.WriteOperate,
"Engineer" => NodePermissions.Browse | NodePermissions.Read | NodePermissions.WriteTune | NodePermissions.WriteConfigure,
"AlarmAck" => NodePermissions.Browse | NodePermissions.Read | NodePermissions.AlarmRead | NodePermissions.AlarmAcknowledge,
"Full" => unchecked((NodePermissions)(-1)),
_ => NodePermissions.Browse | NodePermissions.Read,
};
private async Task SaveAsync()
{
_error = null;
if (string.IsNullOrWhiteSpace(_group)) { _error = "LDAP group is required"; return; }
var scopeId = _scopeKind == NodeAclScopeKind.Cluster ? null
: string.IsNullOrWhiteSpace(_scopeId) ? null : _scopeId;
if (_scopeKind != NodeAclScopeKind.Cluster && scopeId is null)
{
_error = $"ScopeId required for {_scopeKind}";
return;
}
try
{
await AclSvc.GrantAsync(GenerationId, ClusterId, _group, _scopeKind, scopeId,
ResolvePreset(), notes: null, CancellationToken.None);
_group = string.Empty; _scopeId = string.Empty;
_showForm = false;
_acls = await AclSvc.ListAsync(GenerationId, CancellationToken.None);
}
catch (Exception ex) { _error = ex.Message; }
}
private async Task RevokeAsync(Guid rowId)
{
await AclSvc.RevokeAsync(rowId, CancellationToken.None);
_acls = await AclSvc.ListAsync(GenerationId, CancellationToken.None);
}
}

View File

@@ -0,0 +1,35 @@
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@inject AuditLogService AuditSvc
<h4>Recent audit log</h4>
@if (_entries is null) { <p>Loading…</p> }
else if (_entries.Count == 0) { <p class="text-muted">No audit entries for this cluster yet.</p> }
else
{
<table class="table table-sm">
<thead><tr><th>When</th><th>Principal</th><th>Event</th><th>Node</th><th>Generation</th><th>Details</th></tr></thead>
<tbody>
@foreach (var a in _entries)
{
<tr>
<td>@a.Timestamp.ToString("u")</td>
<td>@a.Principal</td>
<td><code>@a.EventType</code></td>
<td>@a.NodeId</td>
<td>@a.GenerationId</td>
<td><small class="text-muted">@a.DetailsJson</small></td>
</tr>
}
</tbody>
</table>
}
@code {
[Parameter] public string ClusterId { get; set; } = string.Empty;
private List<ConfigAuditLog>? _entries;
protected override async Task OnParametersSetAsync() =>
_entries = await AuditSvc.ListRecentAsync(ClusterId, limit: 100, CancellationToken.None);
}

View File

@@ -0,0 +1,165 @@
@page "/clusters/{ClusterId}"
@using Microsoft.AspNetCore.Components.Web
@using Microsoft.AspNetCore.SignalR.Client
@using ZB.MOM.WW.OtOpcUa.Admin.Hubs
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@using ZB.MOM.WW.OtOpcUa.Configuration.Enums
@implements IAsyncDisposable
@rendermode RenderMode.InteractiveServer
@inject ClusterService ClusterSvc
@inject GenerationService GenerationSvc
@inject NavigationManager Nav
@if (_cluster is null)
{
<p>Loading…</p>
}
else
{
@if (_liveBanner is not null)
{
<div class="alert alert-info py-2 small">
<strong>Live update:</strong> @_liveBanner
<button type="button" class="btn-close float-end" @onclick="() => _liveBanner = null"></button>
</div>
}
<div class="d-flex justify-content-between align-items-center mb-3">
<div>
<h1 class="mb-0">@_cluster.Name</h1>
<code class="text-muted">@_cluster.ClusterId</code>
@if (!_cluster.Enabled) { <span class="badge bg-secondary ms-2">Disabled</span> }
</div>
<div>
@if (_currentDraft is not null)
{
<a href="/clusters/@ClusterId/draft/@_currentDraft.GenerationId" class="btn btn-outline-primary">
Edit current draft (gen @_currentDraft.GenerationId)
</a>
}
else
{
<button class="btn btn-primary" @onclick="CreateDraftAsync" disabled="@_busy">New draft</button>
}
</div>
</div>
<ul class="nav nav-tabs mb-3">
<li class="nav-item"><button class="nav-link @Tab("overview")" @onclick='() => _tab = "overview"'>Overview</button></li>
<li class="nav-item"><button class="nav-link @Tab("generations")" @onclick='() => _tab = "generations"'>Generations</button></li>
<li class="nav-item"><button class="nav-link @Tab("equipment")" @onclick='() => _tab = "equipment"'>Equipment</button></li>
<li class="nav-item"><button class="nav-link @Tab("uns")" @onclick='() => _tab = "uns"'>UNS Structure</button></li>
<li class="nav-item"><button class="nav-link @Tab("namespaces")" @onclick='() => _tab = "namespaces"'>Namespaces</button></li>
<li class="nav-item"><button class="nav-link @Tab("drivers")" @onclick='() => _tab = "drivers"'>Drivers</button></li>
<li class="nav-item"><button class="nav-link @Tab("acls")" @onclick='() => _tab = "acls"'>ACLs</button></li>
<li class="nav-item"><button class="nav-link @Tab("audit")" @onclick='() => _tab = "audit"'>Audit</button></li>
</ul>
@if (_tab == "overview")
{
<dl class="row">
<dt class="col-sm-3">Enterprise / Site</dt><dd class="col-sm-9">@_cluster.Enterprise / @_cluster.Site</dd>
<dt class="col-sm-3">Redundancy</dt><dd class="col-sm-9">@_cluster.RedundancyMode (@_cluster.NodeCount node@(_cluster.NodeCount == 1 ? "" : "s"))</dd>
<dt class="col-sm-3">Current published</dt>
<dd class="col-sm-9">
@if (_currentPublished is not null) { <span>@_currentPublished.GenerationId (@_currentPublished.PublishedAt?.ToString("u"))</span> }
else { <span class="text-muted">none published yet</span> }
</dd>
<dt class="col-sm-3">Created</dt><dd class="col-sm-9">@_cluster.CreatedAt.ToString("u") by @_cluster.CreatedBy</dd>
</dl>
}
else if (_tab == "generations")
{
<Generations ClusterId="@ClusterId"/>
}
else if (_tab == "equipment" && _currentDraft is not null)
{
<EquipmentTab GenerationId="@_currentDraft.GenerationId"/>
}
else if (_tab == "uns" && _currentDraft is not null)
{
<UnsTab GenerationId="@_currentDraft.GenerationId" ClusterId="@ClusterId"/>
}
else if (_tab == "namespaces" && _currentDraft is not null)
{
<NamespacesTab GenerationId="@_currentDraft.GenerationId" ClusterId="@ClusterId"/>
}
else if (_tab == "drivers" && _currentDraft is not null)
{
<DriversTab GenerationId="@_currentDraft.GenerationId" ClusterId="@ClusterId"/>
}
else if (_tab == "acls" && _currentDraft is not null)
{
<AclsTab GenerationId="@_currentDraft.GenerationId" ClusterId="@ClusterId"/>
}
else if (_tab == "audit")
{
<AuditTab ClusterId="@ClusterId"/>
}
else
{
<p class="text-muted">Open a draft to edit this cluster's content.</p>
}
}
@code {
[Parameter] public string ClusterId { get; set; } = string.Empty;
private ServerCluster? _cluster;
private ConfigGeneration? _currentDraft;
private ConfigGeneration? _currentPublished;
private string _tab = "overview";
private bool _busy;
private HubConnection? _hub;
private string? _liveBanner;
private string Tab(string key) => _tab == key ? "active" : string.Empty;
protected override async Task OnInitializedAsync()
{
await LoadAsync();
await ConnectHubAsync();
}
private async Task LoadAsync()
{
_cluster = await ClusterSvc.FindAsync(ClusterId, CancellationToken.None);
var gens = await GenerationSvc.ListRecentAsync(ClusterId, 50, CancellationToken.None);
_currentDraft = gens.FirstOrDefault(g => g.Status == GenerationStatus.Draft);
_currentPublished = gens.FirstOrDefault(g => g.Status == GenerationStatus.Published);
}
private async Task ConnectHubAsync()
{
_hub = new HubConnectionBuilder()
.WithUrl(Nav.ToAbsoluteUri("/hubs/fleet"))
.WithAutomaticReconnect()
.Build();
_hub.On<NodeStateChangedMessage>("NodeStateChanged", async msg =>
{
if (msg.ClusterId != ClusterId) return;
_liveBanner = $"Node {msg.NodeId}: {msg.LastAppliedStatus ?? "seen"} at {msg.LastAppliedAt?.ToString("u") ?? msg.LastSeenAt?.ToString("u") ?? "-"}";
await LoadAsync();
await InvokeAsync(StateHasChanged);
});
await _hub.StartAsync();
await _hub.SendAsync("SubscribeCluster", ClusterId);
}
private async Task CreateDraftAsync()
{
_busy = true;
try
{
var draft = await GenerationSvc.CreateDraftAsync(ClusterId, createdBy: "admin-ui", CancellationToken.None);
Nav.NavigateTo($"/clusters/{ClusterId}/draft/{draft.GenerationId}");
}
finally { _busy = false; }
}
public async ValueTask DisposeAsync()
{
if (_hub is not null) await _hub.DisposeAsync();
}
}

View File

@@ -0,0 +1,56 @@
@page "/clusters"
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@inject ClusterService ClusterSvc
<div class="d-flex justify-content-between align-items-center mb-4">
<h1>Clusters</h1>
<a href="/clusters/new" class="btn btn-primary">New cluster</a>
</div>
@if (_clusters is null)
{
<p>Loading…</p>
}
else if (_clusters.Count == 0)
{
<p class="text-muted">No clusters yet. Create the first one.</p>
}
else
{
<table class="table table-hover">
<thead>
<tr>
<th>ClusterId</th><th>Name</th><th>Enterprise</th><th>Site</th>
<th>RedundancyMode</th><th>NodeCount</th><th>Enabled</th><th></th>
</tr>
</thead>
<tbody>
@foreach (var c in _clusters)
{
<tr>
<td><code>@c.ClusterId</code></td>
<td>@c.Name</td>
<td>@c.Enterprise</td>
<td>@c.Site</td>
<td>@c.RedundancyMode</td>
<td>@c.NodeCount</td>
<td>
@if (c.Enabled) { <span class="badge bg-success">Active</span> }
else { <span class="badge bg-secondary">Disabled</span> }
</td>
<td><a href="/clusters/@c.ClusterId" class="btn btn-sm btn-outline-primary">Open</a></td>
</tr>
}
</tbody>
</table>
}
@code {
private List<ServerCluster>? _clusters;
protected override async Task OnInitializedAsync()
{
_clusters = await ClusterSvc.ListAsync(CancellationToken.None);
}
}

View File

@@ -0,0 +1,73 @@
@page "/clusters/{ClusterId}/draft/{GenerationId:long}/diff"
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@using ZB.MOM.WW.OtOpcUa.Configuration.Enums
@inject GenerationService GenerationSvc
<div class="d-flex justify-content-between align-items-center mb-3">
<div>
<h1 class="mb-0">Draft diff</h1>
<small class="text-muted">
Cluster <code>@ClusterId</code> — from last published (@(_fromLabel)) → to draft @GenerationId
</small>
</div>
<a class="btn btn-outline-secondary" href="/clusters/@ClusterId/draft/@GenerationId">Back to editor</a>
</div>
@if (_rows is null)
{
<p>Computing diff…</p>
}
else if (_error is not null)
{
<div class="alert alert-danger">@_error</div>
}
else if (_rows.Count == 0)
{
<p class="text-muted">No differences — draft is structurally identical to the last published generation.</p>
}
else
{
<table class="table table-hover table-sm">
<thead><tr><th>Table</th><th>LogicalId</th><th>ChangeKind</th></tr></thead>
<tbody>
@foreach (var r in _rows)
{
<tr>
<td>@r.TableName</td>
<td><code>@r.LogicalId</code></td>
<td>
@switch (r.ChangeKind)
{
case "Added": <span class="badge bg-success">@r.ChangeKind</span> break;
case "Removed": <span class="badge bg-danger">@r.ChangeKind</span> break;
case "Modified": <span class="badge bg-warning text-dark">@r.ChangeKind</span> break;
default: <span class="badge bg-secondary">@r.ChangeKind</span> break;
}
</td>
</tr>
}
</tbody>
</table>
}
@code {
[Parameter] public string ClusterId { get; set; } = string.Empty;
[Parameter] public long GenerationId { get; set; }
private List<DiffRow>? _rows;
private string _fromLabel = "(empty)";
private string? _error;
protected override async Task OnParametersSetAsync()
{
try
{
var all = await GenerationSvc.ListRecentAsync(ClusterId, 50, CancellationToken.None);
var from = all.FirstOrDefault(g => g.Status == GenerationStatus.Published);
_fromLabel = from is null ? "(empty)" : $"gen {from.GenerationId}";
_rows = await GenerationSvc.ComputeDiffAsync(from?.GenerationId ?? 0, GenerationId, CancellationToken.None);
}
catch (Exception ex) { _error = ex.Message; }
}
}

View File

@@ -0,0 +1,103 @@
@page "/clusters/{ClusterId}/draft/{GenerationId:long}"
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Validation
@inject GenerationService GenerationSvc
@inject DraftValidationService ValidationSvc
@inject NavigationManager Nav
<div class="d-flex justify-content-between align-items-center mb-3">
<div>
<h1 class="mb-0">Draft editor</h1>
<small class="text-muted">Cluster <code>@ClusterId</code> · generation @GenerationId</small>
</div>
<div>
<a class="btn btn-outline-secondary" href="/clusters/@ClusterId">Back to cluster</a>
<a class="btn btn-outline-primary ms-2" href="/clusters/@ClusterId/draft/@GenerationId/diff">View diff</a>
<button class="btn btn-primary ms-2" disabled="@(_errors.Count != 0 || _busy)" @onclick="PublishAsync">Publish</button>
</div>
</div>
<ul class="nav nav-tabs mb-3">
<li class="nav-item"><button class="nav-link @Active("equipment")" @onclick='() => _tab = "equipment"'>Equipment</button></li>
<li class="nav-item"><button class="nav-link @Active("uns")" @onclick='() => _tab = "uns"'>UNS</button></li>
<li class="nav-item"><button class="nav-link @Active("namespaces")" @onclick='() => _tab = "namespaces"'>Namespaces</button></li>
<li class="nav-item"><button class="nav-link @Active("drivers")" @onclick='() => _tab = "drivers"'>Drivers</button></li>
<li class="nav-item"><button class="nav-link @Active("acls")" @onclick='() => _tab = "acls"'>ACLs</button></li>
</ul>
<div class="row">
<div class="col-md-8">
@if (_tab == "equipment") { <EquipmentTab GenerationId="@GenerationId"/> }
else if (_tab == "uns") { <UnsTab GenerationId="@GenerationId" ClusterId="@ClusterId"/> }
else if (_tab == "namespaces") { <NamespacesTab GenerationId="@GenerationId" ClusterId="@ClusterId"/> }
else if (_tab == "drivers") { <DriversTab GenerationId="@GenerationId" ClusterId="@ClusterId"/> }
else if (_tab == "acls") { <AclsTab GenerationId="@GenerationId" ClusterId="@ClusterId"/> }
</div>
<div class="col-md-4">
<div class="card sticky-top">
<div class="card-header d-flex justify-content-between align-items-center">
<strong>Validation</strong>
<button class="btn btn-sm btn-outline-secondary" @onclick="RevalidateAsync">Re-run</button>
</div>
<div class="card-body">
@if (_validating) { <p class="text-muted">Checking…</p> }
else if (_errors.Count == 0) { <div class="alert alert-success mb-0">No validation errors — safe to publish.</div> }
else
{
<div class="alert alert-danger mb-2">@_errors.Count error@(_errors.Count == 1 ? "" : "s")</div>
<ul class="list-unstyled">
@foreach (var e in _errors)
{
<li class="mb-2">
<span class="badge bg-danger me-1">@e.Code</span>
<small>@e.Message</small>
@if (!string.IsNullOrEmpty(e.Context)) { <div class="text-muted"><code>@e.Context</code></div> }
</li>
}
</ul>
}
</div>
</div>
@if (_publishError is not null) { <div class="alert alert-danger mt-3">@_publishError</div> }
</div>
</div>
@code {
[Parameter] public string ClusterId { get; set; } = string.Empty;
[Parameter] public long GenerationId { get; set; }
private string _tab = "equipment";
private List<ValidationError> _errors = [];
private bool _validating;
private bool _busy;
private string? _publishError;
private string Active(string k) => _tab == k ? "active" : string.Empty;
protected override async Task OnParametersSetAsync() => await RevalidateAsync();
private async Task RevalidateAsync()
{
_validating = true;
try
{
var errors = await ValidationSvc.ValidateAsync(GenerationId, CancellationToken.None);
_errors = errors.ToList();
}
finally { _validating = false; }
}
private async Task PublishAsync()
{
_busy = true;
_publishError = null;
try
{
await GenerationSvc.PublishAsync(ClusterId, GenerationId, notes: "Published via Admin UI", CancellationToken.None);
Nav.NavigateTo($"/clusters/{ClusterId}");
}
catch (Exception ex) { _publishError = ex.Message; }
finally { _busy = false; }
}
}

View File

@@ -0,0 +1,107 @@
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@inject DriverInstanceService DriverSvc
@inject NamespaceService NsSvc
<div class="d-flex justify-content-between mb-3">
<h4>DriverInstances</h4>
<button class="btn btn-sm btn-primary" @onclick="() => _showForm = true">Add driver</button>
</div>
@if (_drivers is null) { <p>Loading…</p> }
else if (_drivers.Count == 0) { <p class="text-muted">No drivers configured in this draft.</p> }
else
{
<table class="table table-sm">
<thead><tr><th>DriverInstanceId</th><th>Name</th><th>Type</th><th>Namespace</th></tr></thead>
<tbody>
@foreach (var d in _drivers)
{
<tr><td><code>@d.DriverInstanceId</code></td><td>@d.Name</td><td>@d.DriverType</td><td><code>@d.NamespaceId</code></td></tr>
}
</tbody>
</table>
}
@if (_showForm && _namespaces is not null)
{
<div class="card">
<div class="card-body">
<div class="row g-3">
<div class="col-md-3">
<label class="form-label">Name</label>
<input class="form-control" @bind="_name"/>
</div>
<div class="col-md-3">
<label class="form-label">DriverType</label>
<select class="form-select" @bind="_type">
<option>Galaxy</option>
<option>ModbusTcp</option>
<option>AbCip</option>
<option>AbLegacy</option>
<option>S7</option>
<option>Focas</option>
<option>OpcUaClient</option>
</select>
</div>
<div class="col-md-6">
<label class="form-label">Namespace</label>
<select class="form-select" @bind="_nsId">
@foreach (var n in _namespaces) { <option value="@n.NamespaceId">@n.Kind — @n.NamespaceUri</option> }
</select>
</div>
<div class="col-12">
<label class="form-label">DriverConfig JSON (schemaless per driver type)</label>
<textarea class="form-control font-monospace" rows="6" @bind="_config"></textarea>
<div class="form-text">Phase 1: generic JSON editor — per-driver schema validation arrives in each driver's phase (decision #94).</div>
</div>
</div>
@if (_error is not null) { <div class="alert alert-danger mt-3">@_error</div> }
<div class="mt-3">
<button class="btn btn-sm btn-primary" @onclick="SaveAsync">Save</button>
<button class="btn btn-sm btn-secondary ms-2" @onclick="() => _showForm = false">Cancel</button>
</div>
</div>
</div>
}
@code {
[Parameter] public long GenerationId { get; set; }
[Parameter] public string ClusterId { get; set; } = string.Empty;
private List<DriverInstance>? _drivers;
private List<Namespace>? _namespaces;
private bool _showForm;
private string _name = string.Empty;
private string _type = "ModbusTcp";
private string _nsId = string.Empty;
private string _config = "{}";
private string? _error;
protected override async Task OnParametersSetAsync() => await ReloadAsync();
private async Task ReloadAsync()
{
_drivers = await DriverSvc.ListAsync(GenerationId, CancellationToken.None);
_namespaces = await NsSvc.ListAsync(GenerationId, CancellationToken.None);
_nsId = _namespaces.FirstOrDefault()?.NamespaceId ?? string.Empty;
}
private async Task SaveAsync()
{
_error = null;
if (string.IsNullOrWhiteSpace(_name) || string.IsNullOrWhiteSpace(_nsId))
{
_error = "Name and Namespace are required";
return;
}
try
{
await DriverSvc.AddAsync(GenerationId, ClusterId, _nsId, _name, _type, _config, CancellationToken.None);
_name = string.Empty; _config = "{}";
_showForm = false;
await ReloadAsync();
}
catch (Exception ex) { _error = ex.Message; }
}
}

View File

@@ -0,0 +1,152 @@
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@using ZB.MOM.WW.OtOpcUa.Configuration.Validation
@inject EquipmentService EquipmentSvc
<div class="d-flex justify-content-between mb-3">
<h4>Equipment (draft gen @GenerationId)</h4>
<button class="btn btn-primary btn-sm" @onclick="StartAdd">Add equipment</button>
</div>
@if (_equipment is null)
{
<p>Loading…</p>
}
else if (_equipment.Count == 0 && !_showForm)
{
<p class="text-muted">No equipment in this draft yet.</p>
}
else if (_equipment.Count > 0)
{
<table class="table table-sm table-hover">
<thead>
<tr>
<th>EquipmentId</th><th>Name</th><th>MachineCode</th><th>ZTag</th><th>SAPID</th>
<th>Manufacturer / Model</th><th>Serial</th><th></th>
</tr>
</thead>
<tbody>
@foreach (var e in _equipment)
{
<tr>
<td><code>@e.EquipmentId</code></td>
<td>@e.Name</td>
<td>@e.MachineCode</td>
<td>@e.ZTag</td>
<td>@e.SAPID</td>
<td>@e.Manufacturer / @e.Model</td>
<td>@e.SerialNumber</td>
<td><button class="btn btn-sm btn-outline-danger" @onclick="() => DeleteAsync(e.EquipmentRowId)">Remove</button></td>
</tr>
}
</tbody>
</table>
}
@if (_showForm)
{
<div class="card mt-3">
<div class="card-body">
<h5>New equipment</h5>
<EditForm Model="_draft" OnValidSubmit="SaveAsync" FormName="new-equipment">
<DataAnnotationsValidator/>
<div class="row g-3">
<div class="col-md-4">
<label class="form-label">Name (UNS segment)</label>
<InputText @bind-Value="_draft.Name" class="form-control"/>
<ValidationMessage For="() => _draft.Name"/>
</div>
<div class="col-md-4">
<label class="form-label">MachineCode</label>
<InputText @bind-Value="_draft.MachineCode" class="form-control"/>
</div>
<div class="col-md-4">
<label class="form-label">DriverInstanceId</label>
<InputText @bind-Value="_draft.DriverInstanceId" class="form-control"/>
</div>
<div class="col-md-4">
<label class="form-label">UnsLineId</label>
<InputText @bind-Value="_draft.UnsLineId" class="form-control"/>
</div>
<div class="col-md-4">
<label class="form-label">ZTag</label>
<InputText @bind-Value="_draft.ZTag" class="form-control"/>
</div>
<div class="col-md-4">
<label class="form-label">SAPID</label>
<InputText @bind-Value="_draft.SAPID" class="form-control"/>
</div>
</div>
<h6 class="mt-4">OPC 40010 Identification</h6>
<div class="row g-3">
<div class="col-md-4"><label class="form-label">Manufacturer</label><InputText @bind-Value="_draft.Manufacturer" class="form-control"/></div>
<div class="col-md-4"><label class="form-label">Model</label><InputText @bind-Value="_draft.Model" class="form-control"/></div>
<div class="col-md-4"><label class="form-label">Serial number</label><InputText @bind-Value="_draft.SerialNumber" class="form-control"/></div>
<div class="col-md-4"><label class="form-label">Hardware rev</label><InputText @bind-Value="_draft.HardwareRevision" class="form-control"/></div>
<div class="col-md-4"><label class="form-label">Software rev</label><InputText @bind-Value="_draft.SoftwareRevision" class="form-control"/></div>
<div class="col-md-4">
<label class="form-label">Year of construction</label>
<InputNumber @bind-Value="_draft.YearOfConstruction" class="form-control"/>
</div>
</div>
@if (_error is not null) { <div class="alert alert-danger mt-3">@_error</div> }
<div class="mt-3">
<button type="submit" class="btn btn-primary btn-sm">Save</button>
<button type="button" class="btn btn-secondary btn-sm ms-2" @onclick="() => _showForm = false">Cancel</button>
</div>
</EditForm>
</div>
</div>
}
@code {
[Parameter] public long GenerationId { get; set; }
private List<Equipment>? _equipment;
private bool _showForm;
private Equipment _draft = NewBlankDraft();
private string? _error;
private static Equipment NewBlankDraft() => new()
{
EquipmentId = string.Empty, DriverInstanceId = string.Empty,
UnsLineId = string.Empty, Name = string.Empty, MachineCode = string.Empty,
};
protected override async Task OnParametersSetAsync() => await ReloadAsync();
private async Task ReloadAsync()
{
_equipment = await EquipmentSvc.ListAsync(GenerationId, CancellationToken.None);
}
private void StartAdd()
{
_draft = NewBlankDraft();
_error = null;
_showForm = true;
}
private async Task SaveAsync()
{
_error = null;
_draft.EquipmentUuid = Guid.NewGuid();
_draft.EquipmentId = DraftValidator.DeriveEquipmentId(_draft.EquipmentUuid);
_draft.GenerationId = GenerationId;
try
{
await EquipmentSvc.CreateAsync(GenerationId, _draft, CancellationToken.None);
_showForm = false;
await ReloadAsync();
}
catch (Exception ex) { _error = ex.Message; }
}
private async Task DeleteAsync(Guid id)
{
await EquipmentSvc.DeleteAsync(id, CancellationToken.None);
await ReloadAsync();
}
}

View File

@@ -0,0 +1,73 @@
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@using ZB.MOM.WW.OtOpcUa.Configuration.Enums
@inject GenerationService GenerationSvc
@inject NavigationManager Nav
<h4>Generations</h4>
@if (_generations is null) { <p>Loading…</p> }
else if (_generations.Count == 0) { <p class="text-muted">No generations in this cluster yet.</p> }
else
{
<table class="table table-sm">
<thead>
<tr><th>ID</th><th>Status</th><th>Created</th><th>Published</th><th>PublishedBy</th><th>Notes</th><th></th></tr>
</thead>
<tbody>
@foreach (var g in _generations)
{
<tr>
<td><code>@g.GenerationId</code></td>
<td>@StatusBadge(g.Status)</td>
<td><small>@g.CreatedAt.ToString("u") by @g.CreatedBy</small></td>
<td><small>@(g.PublishedAt?.ToString("u") ?? "-")</small></td>
<td><small>@g.PublishedBy</small></td>
<td><small>@g.Notes</small></td>
<td>
@if (g.Status == GenerationStatus.Draft)
{
<a class="btn btn-sm btn-primary" href="/clusters/@ClusterId/draft/@g.GenerationId">Open</a>
}
else if (g.Status is GenerationStatus.Published or GenerationStatus.Superseded)
{
<button class="btn btn-sm btn-outline-warning" @onclick="() => RollbackAsync(g.GenerationId)">Roll back to this</button>
}
</td>
</tr>
}
</tbody>
</table>
}
@if (_error is not null) { <div class="alert alert-danger">@_error</div> }
@code {
[Parameter] public string ClusterId { get; set; } = string.Empty;
private List<ConfigGeneration>? _generations;
private string? _error;
protected override async Task OnParametersSetAsync() => await ReloadAsync();
private async Task ReloadAsync() =>
_generations = await GenerationSvc.ListRecentAsync(ClusterId, 100, CancellationToken.None);
private async Task RollbackAsync(long targetId)
{
_error = null;
try
{
await GenerationSvc.RollbackAsync(ClusterId, targetId, notes: $"Rollback via Admin UI", CancellationToken.None);
await ReloadAsync();
}
catch (Exception ex) { _error = ex.Message; }
}
private static MarkupString StatusBadge(GenerationStatus s) => s switch
{
GenerationStatus.Draft => new MarkupString("<span class='badge bg-info'>Draft</span>"),
GenerationStatus.Published => new MarkupString("<span class='badge bg-success'>Published</span>"),
GenerationStatus.Superseded => new MarkupString("<span class='badge bg-secondary'>Superseded</span>"),
_ => new MarkupString($"<span class='badge bg-light text-dark'>{s}</span>"),
};
}

View File

@@ -0,0 +1,69 @@
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@using ZB.MOM.WW.OtOpcUa.Configuration.Enums
@inject NamespaceService NsSvc
<div class="d-flex justify-content-between mb-3">
<h4>Namespaces</h4>
<button class="btn btn-sm btn-primary" @onclick="() => _showForm = true">Add namespace</button>
</div>
@if (_namespaces is null) { <p>Loading…</p> }
else if (_namespaces.Count == 0) { <p class="text-muted">No namespaces defined in this draft.</p> }
else
{
<table class="table table-sm">
<thead><tr><th>NamespaceId</th><th>Kind</th><th>URI</th><th>Enabled</th></tr></thead>
<tbody>
@foreach (var n in _namespaces)
{
<tr><td><code>@n.NamespaceId</code></td><td>@n.Kind</td><td>@n.NamespaceUri</td><td>@(n.Enabled ? "yes" : "no")</td></tr>
}
</tbody>
</table>
}
@if (_showForm)
{
<div class="card">
<div class="card-body">
<div class="row g-3">
<div class="col-md-6"><label class="form-label">NamespaceUri</label><input class="form-control" @bind="_uri"/></div>
<div class="col-md-6">
<label class="form-label">Kind</label>
<select class="form-select" @bind="_kind">
<option value="@NamespaceKind.Equipment">Equipment</option>
<option value="@NamespaceKind.SystemPlatform">SystemPlatform (Galaxy)</option>
</select>
</div>
</div>
<div class="mt-3">
<button class="btn btn-sm btn-primary" @onclick="SaveAsync">Save</button>
<button class="btn btn-sm btn-secondary ms-2" @onclick="() => _showForm = false">Cancel</button>
</div>
</div>
</div>
}
@code {
[Parameter] public long GenerationId { get; set; }
[Parameter] public string ClusterId { get; set; } = string.Empty;
private List<Namespace>? _namespaces;
private bool _showForm;
private string _uri = string.Empty;
private NamespaceKind _kind = NamespaceKind.Equipment;
protected override async Task OnParametersSetAsync() => await ReloadAsync();
private async Task ReloadAsync() =>
_namespaces = await NsSvc.ListAsync(GenerationId, CancellationToken.None);
private async Task SaveAsync()
{
if (string.IsNullOrWhiteSpace(_uri)) return;
await NsSvc.AddAsync(GenerationId, ClusterId, _uri, _kind, CancellationToken.None);
_uri = string.Empty;
_showForm = false;
await ReloadAsync();
}
}

View File

@@ -0,0 +1,104 @@
@page "/clusters/new"
@using System.ComponentModel.DataAnnotations
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@using ZB.MOM.WW.OtOpcUa.Configuration.Enums
@inject ClusterService ClusterSvc
@inject GenerationService GenerationSvc
@inject NavigationManager Nav
<h1 class="mb-4">New cluster</h1>
<EditForm Model="_input" OnValidSubmit="CreateAsync" FormName="new-cluster">
<DataAnnotationsValidator/>
<div class="row g-3">
<div class="col-md-6">
<label class="form-label">ClusterId <span class="text-danger">*</span></label>
<InputText @bind-Value="_input.ClusterId" class="form-control"/>
<div class="form-text">Stable internal ID. Lowercase alphanumeric + hyphens; ≤ 64 chars.</div>
<ValidationMessage For="() => _input.ClusterId"/>
</div>
<div class="col-md-6">
<label class="form-label">Display name <span class="text-danger">*</span></label>
<InputText @bind-Value="_input.Name" class="form-control"/>
<ValidationMessage For="() => _input.Name"/>
</div>
<div class="col-md-4">
<label class="form-label">Enterprise</label>
<InputText @bind-Value="_input.Enterprise" class="form-control"/>
</div>
<div class="col-md-4">
<label class="form-label">Site</label>
<InputText @bind-Value="_input.Site" class="form-control"/>
</div>
<div class="col-md-4">
<label class="form-label">Redundancy</label>
<InputSelect @bind-Value="_input.RedundancyMode" class="form-select">
<option value="@RedundancyMode.None">None (single node)</option>
<option value="@RedundancyMode.Warm">Warm (2 nodes)</option>
<option value="@RedundancyMode.Hot">Hot (2 nodes)</option>
</InputSelect>
</div>
</div>
@if (!string.IsNullOrEmpty(_error))
{
<div class="alert alert-danger mt-3">@_error</div>
}
<div class="mt-4">
<button type="submit" class="btn btn-primary" disabled="@_submitting">Create cluster</button>
<a href="/clusters" class="btn btn-secondary ms-2">Cancel</a>
</div>
</EditForm>
@code {
private sealed class Input
{
[Required, RegularExpression("^[a-z0-9-]{1,64}$", ErrorMessage = "Lowercase alphanumeric + hyphens only")]
public string ClusterId { get; set; } = string.Empty;
[Required, StringLength(128)]
public string Name { get; set; } = string.Empty;
[StringLength(32)] public string Enterprise { get; set; } = "zb";
[StringLength(32)] public string Site { get; set; } = "dev";
public RedundancyMode RedundancyMode { get; set; } = RedundancyMode.None;
}
private Input _input = new();
private bool _submitting;
private string? _error;
private async Task CreateAsync()
{
_submitting = true;
_error = null;
try
{
var cluster = new ServerCluster
{
ClusterId = _input.ClusterId,
Name = _input.Name,
Enterprise = _input.Enterprise,
Site = _input.Site,
RedundancyMode = _input.RedundancyMode,
NodeCount = _input.RedundancyMode == RedundancyMode.None ? (byte)1 : (byte)2,
Enabled = true,
CreatedBy = "admin-ui",
};
await ClusterSvc.CreateAsync(cluster, createdBy: "admin-ui", CancellationToken.None);
await GenerationSvc.CreateDraftAsync(cluster.ClusterId, createdBy: "admin-ui", CancellationToken.None);
Nav.NavigateTo($"/clusters/{cluster.ClusterId}");
}
catch (Exception ex)
{
_error = ex.Message;
}
finally { _submitting = false; }
}
}

View File

@@ -0,0 +1,115 @@
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@inject UnsService UnsSvc
<div class="row">
<div class="col-md-6">
<div class="d-flex justify-content-between mb-2">
<h4>UNS Areas</h4>
<button class="btn btn-sm btn-primary" @onclick="() => _showAreaForm = true">Add area</button>
</div>
@if (_areas is null) { <p>Loading…</p> }
else if (_areas.Count == 0) { <p class="text-muted">No areas yet.</p> }
else
{
<table class="table table-sm">
<thead><tr><th>AreaId</th><th>Name</th></tr></thead>
<tbody>
@foreach (var a in _areas)
{
<tr><td><code>@a.UnsAreaId</code></td><td>@a.Name</td></tr>
}
</tbody>
</table>
}
@if (_showAreaForm)
{
<div class="card">
<div class="card-body">
<div class="mb-2"><label class="form-label">Name (lowercase segment)</label><input class="form-control" @bind="_newAreaName"/></div>
<button class="btn btn-sm btn-primary" @onclick="AddAreaAsync">Save</button>
<button class="btn btn-sm btn-secondary ms-2" @onclick="() => _showAreaForm = false">Cancel</button>
</div>
</div>
}
</div>
<div class="col-md-6">
<div class="d-flex justify-content-between mb-2">
<h4>UNS Lines</h4>
<button class="btn btn-sm btn-primary" @onclick="() => _showLineForm = true" disabled="@(_areas is null || _areas.Count == 0)">Add line</button>
</div>
@if (_lines is null) { <p>Loading…</p> }
else if (_lines.Count == 0) { <p class="text-muted">No lines yet.</p> }
else
{
<table class="table table-sm">
<thead><tr><th>LineId</th><th>Area</th><th>Name</th></tr></thead>
<tbody>
@foreach (var l in _lines)
{
<tr><td><code>@l.UnsLineId</code></td><td><code>@l.UnsAreaId</code></td><td>@l.Name</td></tr>
}
</tbody>
</table>
}
@if (_showLineForm && _areas is not null)
{
<div class="card">
<div class="card-body">
<div class="mb-2">
<label class="form-label">Area</label>
<select class="form-select" @bind="_newLineAreaId">
@foreach (var a in _areas) { <option value="@a.UnsAreaId">@a.Name (@a.UnsAreaId)</option> }
</select>
</div>
<div class="mb-2"><label class="form-label">Name</label><input class="form-control" @bind="_newLineName"/></div>
<button class="btn btn-sm btn-primary" @onclick="AddLineAsync">Save</button>
<button class="btn btn-sm btn-secondary ms-2" @onclick="() => _showLineForm = false">Cancel</button>
</div>
</div>
}
</div>
</div>
@code {
[Parameter] public long GenerationId { get; set; }
[Parameter] public string ClusterId { get; set; } = string.Empty;
private List<UnsArea>? _areas;
private List<UnsLine>? _lines;
private bool _showAreaForm;
private bool _showLineForm;
private string _newAreaName = string.Empty;
private string _newLineName = string.Empty;
private string _newLineAreaId = string.Empty;
protected override async Task OnParametersSetAsync() => await ReloadAsync();
private async Task ReloadAsync()
{
_areas = await UnsSvc.ListAreasAsync(GenerationId, CancellationToken.None);
_lines = await UnsSvc.ListLinesAsync(GenerationId, CancellationToken.None);
}
private async Task AddAreaAsync()
{
if (string.IsNullOrWhiteSpace(_newAreaName)) return;
await UnsSvc.AddAreaAsync(GenerationId, ClusterId, _newAreaName, notes: null, CancellationToken.None);
_newAreaName = string.Empty;
_showAreaForm = false;
await ReloadAsync();
}
private async Task AddLineAsync()
{
if (string.IsNullOrWhiteSpace(_newLineName) || string.IsNullOrWhiteSpace(_newLineAreaId)) return;
await UnsSvc.AddLineAsync(GenerationId, _newLineAreaId, _newLineName, notes: null, CancellationToken.None);
_newLineName = string.Empty;
_showLineForm = false;
await ReloadAsync();
}
}

View File

@@ -0,0 +1,172 @@
@page "/fleet"
@using Microsoft.EntityFrameworkCore
@using ZB.MOM.WW.OtOpcUa.Configuration
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@inject IServiceScopeFactory ScopeFactory
@implements IDisposable
<h1 class="mb-4">Fleet status</h1>
<div class="d-flex align-items-center mb-3 gap-2">
<button class="btn btn-sm btn-outline-primary" @onclick="RefreshAsync" disabled="@_refreshing">
@if (_refreshing) { <span class="spinner-border spinner-border-sm me-1" /> }
Refresh
</button>
<span class="text-muted small">
Auto-refresh every @RefreshIntervalSeconds s. Last updated: @(_lastRefreshUtc?.ToString("HH:mm:ss 'UTC'") ?? "—")
</span>
</div>
@if (_rows is null)
{
<p>Loading…</p>
}
else if (_rows.Count == 0)
{
<div class="alert alert-info">
No node state recorded yet. Nodes publish their state to the central DB on each poll; if
this list is empty, either no nodes have been registered or the poller hasn't run yet.
</div>
}
else
{
<div class="row g-3 mb-4">
<div class="col-md-3">
<div class="card"><div class="card-body">
<h6 class="text-muted mb-1">Nodes</h6>
<div class="fs-3">@_rows.Count</div>
</div></div>
</div>
<div class="col-md-3">
<div class="card border-success"><div class="card-body">
<h6 class="text-muted mb-1">Applied</h6>
<div class="fs-3 text-success">@_rows.Count(r => r.Status == "Applied")</div>
</div></div>
</div>
<div class="col-md-3">
<div class="card border-warning"><div class="card-body">
<h6 class="text-muted mb-1">Stale</h6>
<div class="fs-3 text-warning">@_rows.Count(r => IsStale(r))</div>
</div></div>
</div>
<div class="col-md-3">
<div class="card border-danger"><div class="card-body">
<h6 class="text-muted mb-1">Failed</h6>
<div class="fs-3 text-danger">@_rows.Count(r => r.Status == "Failed")</div>
</div></div>
</div>
</div>
<table class="table table-hover align-middle">
<thead>
<tr>
<th>Node</th>
<th>Cluster</th>
<th>Generation</th>
<th>Status</th>
<th>Last applied</th>
<th>Last seen</th>
<th>Error</th>
</tr>
</thead>
<tbody>
@foreach (var r in _rows)
{
<tr class="@RowClass(r)">
<td><code>@r.NodeId</code></td>
<td>@r.ClusterId</td>
<td>@(r.GenerationId?.ToString() ?? "—")</td>
<td>
<span class="badge @StatusBadge(r.Status)">@(r.Status ?? "—")</span>
</td>
<td>@FormatAge(r.AppliedAt)</td>
<td class="@(IsStale(r) ? "text-warning" : "")">@FormatAge(r.SeenAt)</td>
<td class="text-truncate" style="max-width: 320px;" title="@r.Error">@r.Error</td>
</tr>
}
</tbody>
</table>
}
@code {
// Refresh cadence. 5s matches FleetStatusPoller's poll interval — the dashboard always sees
// the most recent published state without polling ahead of the broadcaster.
private const int RefreshIntervalSeconds = 5;
private List<FleetNodeRow>? _rows;
private bool _refreshing;
private DateTime? _lastRefreshUtc;
private Timer? _timer;
protected override async Task OnInitializedAsync()
{
await RefreshAsync();
_timer = new Timer(async _ => await InvokeAsync(RefreshAsync),
state: null,
dueTime: TimeSpan.FromSeconds(RefreshIntervalSeconds),
period: TimeSpan.FromSeconds(RefreshIntervalSeconds));
}
private async Task RefreshAsync()
{
if (_refreshing) return;
_refreshing = true;
try
{
using var scope = ScopeFactory.CreateScope();
var db = scope.ServiceProvider.GetRequiredService<OtOpcUaConfigDbContext>();
var rows = await db.ClusterNodeGenerationStates.AsNoTracking()
.Join(db.ClusterNodes.AsNoTracking(), s => s.NodeId, n => n.NodeId, (s, n) => new FleetNodeRow(
s.NodeId, n.ClusterId, s.CurrentGenerationId,
s.LastAppliedStatus != null ? s.LastAppliedStatus.ToString() : null,
s.LastAppliedError, s.LastAppliedAt, s.LastSeenAt))
.OrderBy(r => r.ClusterId)
.ThenBy(r => r.NodeId)
.ToListAsync();
_rows = rows;
_lastRefreshUtc = DateTime.UtcNow;
}
finally
{
_refreshing = false;
StateHasChanged();
}
}
private static bool IsStale(FleetNodeRow r)
{
if (r.SeenAt is null) return true;
return (DateTime.UtcNow - r.SeenAt.Value) > TimeSpan.FromSeconds(30);
}
private static string RowClass(FleetNodeRow r) => r.Status switch
{
"Failed" => "table-danger",
_ when IsStale(r) => "table-warning",
_ => "",
};
private static string StatusBadge(string? status) => status switch
{
"Applied" => "bg-success",
"Failed" => "bg-danger",
"Applying" => "bg-info",
_ => "bg-secondary",
};
private static string FormatAge(DateTime? t)
{
if (t is null) return "—";
var age = DateTime.UtcNow - t.Value;
if (age.TotalSeconds < 60) return $"{(int)age.TotalSeconds}s ago";
if (age.TotalMinutes < 60) return $"{(int)age.TotalMinutes}m ago";
if (age.TotalHours < 24) return $"{(int)age.TotalHours}h ago";
return t.Value.ToString("yyyy-MM-dd HH:mm 'UTC'");
}
public void Dispose() => _timer?.Dispose();
internal sealed record FleetNodeRow(
string NodeId, string ClusterId, long? GenerationId,
string? Status, string? Error, DateTime? AppliedAt, DateTime? SeenAt);
}

View File

@@ -0,0 +1,72 @@
@page "/"
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@inject ClusterService ClusterSvc
@inject GenerationService GenerationSvc
@inject NavigationManager Nav
<h1 class="mb-4">Fleet overview</h1>
@if (_clusters is null)
{
<p>Loading…</p>
}
else if (_clusters.Count == 0)
{
<div class="alert alert-info">
No clusters configured yet. <a href="/clusters/new">Create the first cluster</a>.
</div>
}
else
{
<div class="row g-3 mb-4">
<div class="col-md-3">
<div class="card"><div class="card-body"><h6 class="text-muted">Clusters</h6><div class="fs-2">@_clusters.Count</div></div></div>
</div>
<div class="col-md-3">
<div class="card"><div class="card-body"><h6 class="text-muted">Active drafts</h6><div class="fs-2">@_activeDraftCount</div></div></div>
</div>
<div class="col-md-3">
<div class="card"><div class="card-body"><h6 class="text-muted">Published generations</h6><div class="fs-2">@_publishedCount</div></div></div>
</div>
<div class="col-md-3">
<div class="card"><div class="card-body"><h6 class="text-muted">Disabled clusters</h6><div class="fs-2">@_clusters.Count(c => !c.Enabled)</div></div></div>
</div>
</div>
<h4 class="mt-4 mb-3">Clusters</h4>
<table class="table table-hover">
<thead><tr><th>ClusterId</th><th>Name</th><th>Enterprise / Site</th><th>Redundancy</th><th>Enabled</th><th></th></tr></thead>
<tbody>
@foreach (var c in _clusters)
{
<tr style="cursor: pointer;">
<td><code>@c.ClusterId</code></td>
<td>@c.Name</td>
<td>@c.Enterprise / @c.Site</td>
<td>@c.RedundancyMode</td>
<td>@(c.Enabled ? "Yes" : "No")</td>
<td><a href="/clusters/@c.ClusterId" class="btn btn-sm btn-outline-primary">Open</a></td>
</tr>
}
</tbody>
</table>
}
@code {
private List<ServerCluster>? _clusters;
private int _activeDraftCount;
private int _publishedCount;
protected override async Task OnInitializedAsync()
{
_clusters = await ClusterSvc.ListAsync(CancellationToken.None);
foreach (var c in _clusters)
{
var gens = await GenerationSvc.ListRecentAsync(c.ClusterId, 50, CancellationToken.None);
_activeDraftCount += gens.Count(g => g.Status.ToString() == "Draft");
_publishedCount += gens.Count(g => g.Status.ToString() == "Published");
}
}
}

View File

@@ -0,0 +1,100 @@
@page "/login"
@using System.Security.Claims
@using Microsoft.AspNetCore.Authentication
@using Microsoft.AspNetCore.Authentication.Cookies
@using ZB.MOM.WW.OtOpcUa.Admin.Security
@inject IHttpContextAccessor Http
@inject ILdapAuthService LdapAuth
@inject NavigationManager Nav
<div class="row justify-content-center mt-5">
<div class="col-md-5">
<div class="card">
<div class="card-body">
<h4 class="mb-4">OtOpcUa Admin — sign in</h4>
<EditForm Model="_input" OnValidSubmit="SignInAsync" FormName="login">
<div class="mb-3">
<label class="form-label">Username</label>
<InputText @bind-Value="_input.Username" class="form-control" autocomplete="username"/>
</div>
<div class="mb-3">
<label class="form-label">Password</label>
<InputText type="password" @bind-Value="_input.Password" class="form-control" autocomplete="current-password"/>
</div>
@if (_error is not null) { <div class="alert alert-danger">@_error</div> }
<button class="btn btn-primary w-100" type="submit" disabled="@_busy">
@(_busy ? "Signing in…" : "Sign in")
</button>
</EditForm>
<hr/>
<small class="text-muted">
LDAP bind against the configured directory. Dev defaults to GLAuth on
<code>localhost:3893</code>.
</small>
</div>
</div>
</div>
</div>
@code {
private sealed class Input
{
public string Username { get; set; } = string.Empty;
public string Password { get; set; } = string.Empty;
}
private Input _input = new();
private string? _error;
private bool _busy;
private async Task SignInAsync()
{
_error = null;
_busy = true;
try
{
if (string.IsNullOrWhiteSpace(_input.Username) || string.IsNullOrWhiteSpace(_input.Password))
{
_error = "Username and password are required";
return;
}
var result = await LdapAuth.AuthenticateAsync(_input.Username, _input.Password, CancellationToken.None);
if (!result.Success)
{
_error = result.Error ?? "Sign-in failed";
return;
}
if (result.Roles.Count == 0)
{
_error = "Sign-in succeeded but no Admin roles mapped for your LDAP groups. Contact your administrator.";
return;
}
var ctx = Http.HttpContext
?? throw new InvalidOperationException("HttpContext unavailable at sign-in");
var claims = new List<Claim>
{
new(ClaimTypes.Name, result.DisplayName ?? result.Username ?? _input.Username),
new(ClaimTypes.NameIdentifier, _input.Username),
};
foreach (var role in result.Roles)
claims.Add(new Claim(ClaimTypes.Role, role));
foreach (var group in result.Groups)
claims.Add(new Claim("ldap_group", group));
var identity = new ClaimsIdentity(claims, CookieAuthenticationDefaults.AuthenticationScheme);
await ctx.SignInAsync(CookieAuthenticationDefaults.AuthenticationScheme,
new ClaimsPrincipal(identity));
ctx.Response.Redirect("/");
}
finally { _busy = false; }
}
}

View File

@@ -0,0 +1,114 @@
@page "/reservations"
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@using Microsoft.AspNetCore.Authorization
@attribute [Authorize(Policy = "CanPublish")]
@inject ReservationService ReservationSvc
<h1 class="mb-4">External-ID reservations</h1>
<p class="text-muted">
Fleet-wide ZTag + SAPID reservation state (decision #124). Releasing a reservation is a
FleetAdmin-only audit-logged action — only release when the physical asset is permanently
retired and its ID needs to be reused by a different equipment.
</p>
<h4 class="mt-4">Active</h4>
@if (_active is null) { <p>Loading…</p> }
else if (_active.Count == 0) { <p class="text-muted">No active reservations.</p> }
else
{
<table class="table table-sm">
<thead><tr><th>Kind</th><th>Value</th><th>EquipmentUuid</th><th>Cluster</th><th>First published</th><th>Last published</th><th></th></tr></thead>
<tbody>
@foreach (var r in _active)
{
<tr>
<td><code>@r.Kind</code></td>
<td><code>@r.Value</code></td>
<td><code>@r.EquipmentUuid</code></td>
<td>@r.ClusterId</td>
<td><small>@r.FirstPublishedAt.ToString("u") by @r.FirstPublishedBy</small></td>
<td><small>@r.LastPublishedAt.ToString("u")</small></td>
<td><button class="btn btn-sm btn-outline-danger" @onclick='() => OpenReleaseDialog(r)'>Release…</button></td>
</tr>
}
</tbody>
</table>
}
<h4 class="mt-4">Released (most recent 100)</h4>
@if (_released is null) { <p>Loading…</p> }
else if (_released.Count == 0) { <p class="text-muted">No released reservations yet.</p> }
else
{
<table class="table table-sm">
<thead><tr><th>Kind</th><th>Value</th><th>Released at</th><th>By</th><th>Reason</th></tr></thead>
<tbody>
@foreach (var r in _released)
{
<tr><td><code>@r.Kind</code></td><td><code>@r.Value</code></td><td>@r.ReleasedAt?.ToString("u")</td><td>@r.ReleasedBy</td><td>@r.ReleaseReason</td></tr>
}
</tbody>
</table>
}
@if (_releasing is not null)
{
<div class="modal show d-block" tabindex="-1" style="background-color: rgba(0,0,0,0.5);">
<div class="modal-dialog">
<div class="modal-content">
<div class="modal-header">
<h5 class="modal-title">Release reservation <code>@_releasing.Kind</code> = <code>@_releasing.Value</code></h5>
</div>
<div class="modal-body">
<p>This makes the (Kind, Value) pair available for a different EquipmentUuid in a future publish. Audit-logged.</p>
<label class="form-label">Reason (required)</label>
<textarea class="form-control" rows="3" @bind="_reason"></textarea>
@if (_error is not null) { <div class="alert alert-danger mt-2">@_error</div> }
</div>
<div class="modal-footer">
<button class="btn btn-secondary" @onclick='() => _releasing = null'>Cancel</button>
<button class="btn btn-danger" @onclick="ReleaseAsync" disabled="@_busy">Release</button>
</div>
</div>
</div>
</div>
}
@code {
private List<ExternalIdReservation>? _active;
private List<ExternalIdReservation>? _released;
private ExternalIdReservation? _releasing;
private string _reason = string.Empty;
private bool _busy;
private string? _error;
protected override async Task OnInitializedAsync() => await ReloadAsync();
private async Task ReloadAsync()
{
_active = await ReservationSvc.ListActiveAsync(CancellationToken.None);
_released = await ReservationSvc.ListReleasedAsync(CancellationToken.None);
}
private void OpenReleaseDialog(ExternalIdReservation r)
{
_releasing = r;
_reason = string.Empty;
_error = null;
}
private async Task ReleaseAsync()
{
if (_releasing is null || string.IsNullOrWhiteSpace(_reason)) { _error = "Reason is required"; return; }
_busy = true;
try
{
await ReservationSvc.ReleaseAsync(_releasing.Kind.ToString(), _releasing.Value, _reason, CancellationToken.None);
_releasing = null;
await ReloadAsync();
}
catch (Exception ex) { _error = ex.Message; }
finally { _busy = false; }
}
}

View File

@@ -0,0 +1,11 @@
@using Microsoft.AspNetCore.Components.Routing
@using ZB.MOM.WW.OtOpcUa.Admin.Components.Layout
<Router AppAssembly="@typeof(Program).Assembly">
<Found Context="routeData">
<RouteView RouteData="@routeData" DefaultLayout="@typeof(MainLayout)"/>
</Found>
<NotFound>
<LayoutView Layout="@typeof(MainLayout)"><p>Not found.</p></LayoutView>
</NotFound>
</Router>

View File

@@ -0,0 +1,14 @@
@using System.Net.Http
@using Microsoft.AspNetCore.Components
@using Microsoft.AspNetCore.Components.Forms
@using Microsoft.AspNetCore.Components.Routing
@using Microsoft.AspNetCore.Components.Web
@using Microsoft.AspNetCore.Components.Web.Virtualization
@using Microsoft.AspNetCore.Components.Authorization
@using Microsoft.AspNetCore.Http
@using Microsoft.JSInterop
@using ZB.MOM.WW.OtOpcUa.Admin
@using ZB.MOM.WW.OtOpcUa.Admin.Components
@using ZB.MOM.WW.OtOpcUa.Admin.Components.Layout
@using ZB.MOM.WW.OtOpcUa.Admin.Components.Pages
@using ZB.MOM.WW.OtOpcUa.Admin.Components.Pages.Clusters

View File

@@ -0,0 +1,31 @@
using Microsoft.AspNetCore.SignalR;
namespace ZB.MOM.WW.OtOpcUa.Admin.Hubs;
/// <summary>
/// Pushes sticky alerts (crash-loop circuit trips, failed applies, reservation-release
/// anomalies) to subscribed admin clients. Alerts don't auto-clear — the operator acks them
/// from the UI via <see cref="AcknowledgeAsync"/>.
/// </summary>
public sealed class AlertHub : Hub
{
public const string AllAlertsGroup = "__alerts__";
public override async Task OnConnectedAsync()
{
await Groups.AddToGroupAsync(Context.ConnectionId, AllAlertsGroup);
await base.OnConnectedAsync();
}
/// <summary>Client-initiated ack. The server side of ack persistence is deferred — v2.1.</summary>
public Task AcknowledgeAsync(string alertId) => Task.CompletedTask;
}
public sealed record AlertMessage(
string AlertId,
string Severity,
string Title,
string Detail,
DateTime RaisedAtUtc,
string? ClusterId,
string? NodeId);

View File

@@ -0,0 +1,39 @@
using Microsoft.AspNetCore.SignalR;
namespace ZB.MOM.WW.OtOpcUa.Admin.Hubs;
/// <summary>
/// Pushes per-node generation-apply state changes (<c>ClusterNodeGenerationState</c>) to
/// subscribed browser clients. Clients call <c>SubscribeCluster(clusterId)</c> on connect to
/// scope notifications; the server sends <c>NodeStateChanged</c> messages whenever the poller
/// observes a delta.
/// </summary>
public sealed class FleetStatusHub : Hub
{
public Task SubscribeCluster(string clusterId)
{
if (string.IsNullOrWhiteSpace(clusterId)) return Task.CompletedTask;
return Groups.AddToGroupAsync(Context.ConnectionId, GroupName(clusterId));
}
public Task UnsubscribeCluster(string clusterId)
{
if (string.IsNullOrWhiteSpace(clusterId)) return Task.CompletedTask;
return Groups.RemoveFromGroupAsync(Context.ConnectionId, GroupName(clusterId));
}
/// <summary>Clients call this once to also receive fleet-wide status — used by the dashboard.</summary>
public Task SubscribeFleet() => Groups.AddToGroupAsync(Context.ConnectionId, FleetGroup);
public const string FleetGroup = "__fleet__";
public static string GroupName(string clusterId) => $"cluster:{clusterId}";
}
public sealed record NodeStateChangedMessage(
string NodeId,
string ClusterId,
long? CurrentGenerationId,
string? LastAppliedStatus,
string? LastAppliedError,
DateTime? LastAppliedAt,
DateTime? LastSeenAt);

View File

@@ -0,0 +1,93 @@
using Microsoft.AspNetCore.SignalR;
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
namespace ZB.MOM.WW.OtOpcUa.Admin.Hubs;
/// <summary>
/// Polls <c>ClusterNodeGenerationState</c> every <see cref="PollInterval"/> and publishes
/// per-node deltas to <see cref="FleetStatusHub"/>. Also raises sticky
/// <see cref="AlertMessage"/>s on transitions into <c>Failed</c>.
/// </summary>
public sealed class FleetStatusPoller(
IServiceScopeFactory scopeFactory,
IHubContext<FleetStatusHub> fleetHub,
IHubContext<AlertHub> alertHub,
ILogger<FleetStatusPoller> logger) : BackgroundService
{
public TimeSpan PollInterval { get; init; } = TimeSpan.FromSeconds(5);
private readonly Dictionary<string, NodeStateSnapshot> _last = new();
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
logger.LogInformation("FleetStatusPoller starting — interval {Interval}s", PollInterval.TotalSeconds);
while (!stoppingToken.IsCancellationRequested)
{
try { await PollOnceAsync(stoppingToken); }
catch (Exception ex) when (ex is not OperationCanceledException)
{
logger.LogWarning(ex, "FleetStatusPoller tick failed");
}
try { await Task.Delay(PollInterval, stoppingToken); }
catch (OperationCanceledException) { break; }
}
}
internal async Task PollOnceAsync(CancellationToken ct)
{
using var scope = scopeFactory.CreateScope();
var db = scope.ServiceProvider.GetRequiredService<OtOpcUaConfigDbContext>();
var rows = await db.ClusterNodeGenerationStates.AsNoTracking()
.Join(db.ClusterNodes.AsNoTracking(), s => s.NodeId, n => n.NodeId, (s, n) => new { s, n.ClusterId })
.ToListAsync(ct);
foreach (var r in rows)
{
var snapshot = new NodeStateSnapshot(
r.s.NodeId, r.ClusterId, r.s.CurrentGenerationId,
r.s.LastAppliedStatus?.ToString(), r.s.LastAppliedError,
r.s.LastAppliedAt, r.s.LastSeenAt);
var hadPrior = _last.TryGetValue(r.s.NodeId, out var prior);
if (!hadPrior || prior != snapshot)
{
_last[r.s.NodeId] = snapshot;
var msg = new NodeStateChangedMessage(
snapshot.NodeId, snapshot.ClusterId, snapshot.GenerationId,
snapshot.Status, snapshot.Error, snapshot.AppliedAt, snapshot.SeenAt);
await fleetHub.Clients.Group(FleetStatusHub.GroupName(snapshot.ClusterId))
.SendAsync("NodeStateChanged", msg, ct);
await fleetHub.Clients.Group(FleetStatusHub.FleetGroup)
.SendAsync("NodeStateChanged", msg, ct);
if (snapshot.Status == "Failed" && (!hadPrior || prior.Status != "Failed"))
{
var alert = new AlertMessage(
AlertId: $"{snapshot.NodeId}:apply-failed",
Severity: "error",
Title: $"Apply failed on {snapshot.NodeId}",
Detail: snapshot.Error ?? "(no detail)",
RaisedAtUtc: DateTime.UtcNow,
ClusterId: snapshot.ClusterId,
NodeId: snapshot.NodeId);
await alertHub.Clients.Group(AlertHub.AllAlertsGroup)
.SendAsync("AlertRaised", alert, ct);
}
}
}
}
/// <summary>Exposed for tests — forces a snapshot reset so stub data re-seeds.</summary>
internal void ResetCache() => _last.Clear();
private readonly record struct NodeStateSnapshot(
string NodeId, string ClusterId, long? GenerationId,
string? Status, string? Error, DateTime? AppliedAt, DateTime? SeenAt);
}

View File

@@ -0,0 +1,86 @@
using Microsoft.AspNetCore.Authentication;
using Microsoft.AspNetCore.Authentication.Cookies;
using Microsoft.EntityFrameworkCore;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Admin.Components;
using ZB.MOM.WW.OtOpcUa.Admin.Hubs;
using ZB.MOM.WW.OtOpcUa.Admin.Security;
using ZB.MOM.WW.OtOpcUa.Admin.Services;
using ZB.MOM.WW.OtOpcUa.Configuration;
var builder = WebApplication.CreateBuilder(args);
builder.Host.UseSerilog((ctx, cfg) => cfg
.MinimumLevel.Information()
.WriteTo.Console()
.WriteTo.File("logs/otopcua-admin-.log", rollingInterval: RollingInterval.Day));
builder.Services.AddRazorComponents().AddInteractiveServerComponents();
builder.Services.AddHttpContextAccessor();
builder.Services.AddSignalR();
builder.Services.AddAuthentication(CookieAuthenticationDefaults.AuthenticationScheme)
.AddCookie(o =>
{
o.Cookie.Name = "OtOpcUa.Admin";
o.LoginPath = "/login";
o.ExpireTimeSpan = TimeSpan.FromHours(8);
});
builder.Services.AddAuthorizationBuilder()
.AddPolicy("CanEdit", p => p.RequireRole(AdminRoles.ConfigEditor, AdminRoles.FleetAdmin))
.AddPolicy("CanPublish", p => p.RequireRole(AdminRoles.FleetAdmin));
builder.Services.AddCascadingAuthenticationState();
builder.Services.AddDbContext<OtOpcUaConfigDbContext>(opt =>
opt.UseSqlServer(builder.Configuration.GetConnectionString("ConfigDb")
?? throw new InvalidOperationException("ConnectionStrings:ConfigDb not configured")));
builder.Services.AddScoped<ClusterService>();
builder.Services.AddScoped<GenerationService>();
builder.Services.AddScoped<EquipmentService>();
builder.Services.AddScoped<UnsService>();
builder.Services.AddScoped<NamespaceService>();
builder.Services.AddScoped<DriverInstanceService>();
builder.Services.AddScoped<NodeAclService>();
builder.Services.AddScoped<ReservationService>();
builder.Services.AddScoped<DraftValidationService>();
builder.Services.AddScoped<AuditLogService>();
// Cert-trust management — reads the OPC UA server's PKI store root so rejected client certs
// can be promoted to trusted via the Admin UI. Singleton: no per-request state, just
// filesystem operations.
builder.Services.Configure<CertTrustOptions>(builder.Configuration.GetSection(CertTrustOptions.SectionName));
builder.Services.AddSingleton<CertTrustService>();
// LDAP auth — parity with ScadaLink's LdapAuthService (decision #102).
builder.Services.Configure<LdapOptions>(
builder.Configuration.GetSection("Authentication:Ldap"));
builder.Services.AddScoped<ILdapAuthService, LdapAuthService>();
// SignalR real-time fleet status + alerts (admin-ui.md §"Real-Time Updates").
builder.Services.AddHostedService<FleetStatusPoller>();
var app = builder.Build();
app.UseSerilogRequestLogging();
app.UseStaticFiles();
app.UseAuthentication();
app.UseAuthorization();
app.UseAntiforgery();
app.MapPost("/auth/logout", async (HttpContext ctx) =>
{
await ctx.SignOutAsync(CookieAuthenticationDefaults.AuthenticationScheme);
ctx.Response.Redirect("/");
});
app.MapHub<FleetStatusHub>("/hubs/fleet");
app.MapHub<AlertHub>("/hubs/alerts");
app.MapRazorComponents<App>().AddInteractiveServerRenderMode();
await app.RunAsync();
public partial class Program;

View File

@@ -0,0 +1,6 @@
namespace ZB.MOM.WW.OtOpcUa.Admin.Security;
public interface ILdapAuthService
{
Task<LdapAuthResult> AuthenticateAsync(string username, string password, CancellationToken ct = default);
}

View File

@@ -0,0 +1,10 @@
namespace ZB.MOM.WW.OtOpcUa.Admin.Security;
/// <summary>Outcome of an LDAP bind attempt. <see cref="Roles"/> is the mapped-set of Admin roles.</summary>
public sealed record LdapAuthResult(
bool Success,
string? DisplayName,
string? Username,
IReadOnlyList<string> Groups,
IReadOnlyList<string> Roles,
string? Error);

View File

@@ -0,0 +1,160 @@
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using Novell.Directory.Ldap;
namespace ZB.MOM.WW.OtOpcUa.Admin.Security;
/// <summary>
/// LDAP bind-and-search authentication mirrored from ScadaLink's <c>LdapAuthService</c>
/// (CLAUDE.md memory: <c>scadalink_reference.md</c>) — same bind semantics, TLS guard, and
/// service-account search-then-bind path. Adapted for the Admin app's role-mapping shape
/// (LDAP group names → Admin roles via <see cref="LdapOptions.GroupToRole"/>).
/// </summary>
public sealed class LdapAuthService(IOptions<LdapOptions> options, ILogger<LdapAuthService> logger)
: ILdapAuthService
{
private readonly LdapOptions _options = options.Value;
public async Task<LdapAuthResult> AuthenticateAsync(string username, string password, CancellationToken ct = default)
{
if (string.IsNullOrWhiteSpace(username))
return new(false, null, null, [], [], "Username is required");
if (string.IsNullOrWhiteSpace(password))
return new(false, null, null, [], [], "Password is required");
if (!_options.UseTls && !_options.AllowInsecureLdap)
return new(false, null, username, [], [],
"Insecure LDAP is disabled. Enable UseTls or set AllowInsecureLdap for dev/test.");
try
{
using var conn = new LdapConnection();
if (_options.UseTls) conn.SecureSocketLayer = true;
await Task.Run(() => conn.Connect(_options.Server, _options.Port), ct);
var bindDn = await ResolveUserDnAsync(conn, username, ct);
await Task.Run(() => conn.Bind(bindDn, password), ct);
if (!string.IsNullOrWhiteSpace(_options.ServiceAccountDn))
await Task.Run(() => conn.Bind(_options.ServiceAccountDn, _options.ServiceAccountPassword), ct);
var displayName = username;
var groups = new List<string>();
try
{
var filter = $"(cn={EscapeLdapFilter(username)})";
var results = await Task.Run(() =>
conn.Search(_options.SearchBase, LdapConnection.ScopeSub, filter,
attrs: null, // request ALL attributes so we can inspect memberOf + dn-derived group
typesOnly: false), ct);
while (results.HasMore())
{
try
{
var entry = results.Next();
var name = entry.GetAttribute(_options.DisplayNameAttribute);
if (name is not null) displayName = name.StringValue;
var groupAttr = entry.GetAttribute(_options.GroupAttribute);
if (groupAttr is not null)
{
foreach (var groupDn in groupAttr.StringValueArray)
groups.Add(ExtractFirstRdnValue(groupDn));
}
// Fallback: GLAuth places users under ou=PrimaryGroup,baseDN. When the
// directory doesn't populate memberOf (or populates it differently), the
// user's primary group name is recoverable from the second RDN of the DN.
if (groups.Count == 0 && !string.IsNullOrEmpty(entry.Dn))
{
var primary = ExtractOuSegment(entry.Dn);
if (primary is not null) groups.Add(primary);
}
}
catch (LdapException) { break; } // no-more-entries signalled by exception
}
}
catch (LdapException ex)
{
logger.LogWarning(ex, "LDAP attribute lookup failed for {User}", username);
}
conn.Disconnect();
var roles = RoleMapper.Map(groups, _options.GroupToRole);
return new(true, displayName, username, groups, roles, null);
}
catch (LdapException ex)
{
logger.LogWarning(ex, "LDAP bind failed for {User}", username);
return new(false, null, username, [], [], "Invalid username or password");
}
catch (Exception ex) when (ex is not OperationCanceledException)
{
logger.LogError(ex, "Unexpected LDAP error for {User}", username);
return new(false, null, username, [], [], "Unexpected authentication error");
}
}
private async Task<string> ResolveUserDnAsync(LdapConnection conn, string username, CancellationToken ct)
{
if (username.Contains('=')) return username; // already a DN
if (!string.IsNullOrWhiteSpace(_options.ServiceAccountDn))
{
await Task.Run(() =>
conn.Bind(_options.ServiceAccountDn, _options.ServiceAccountPassword), ct);
var filter = $"(uid={EscapeLdapFilter(username)})";
var results = await Task.Run(() =>
conn.Search(_options.SearchBase, LdapConnection.ScopeSub, filter, ["dn"], false), ct);
if (results.HasMore())
return results.Next().Dn;
throw new LdapException("User not found", LdapException.NoSuchObject,
$"No entry for uid={username}");
}
return string.IsNullOrWhiteSpace(_options.SearchBase)
? $"cn={username}"
: $"cn={username},{_options.SearchBase}";
}
internal static string EscapeLdapFilter(string input) =>
input.Replace("\\", "\\5c")
.Replace("*", "\\2a")
.Replace("(", "\\28")
.Replace(")", "\\29")
.Replace("\0", "\\00");
/// <summary>
/// Pulls the first <c>ou=Value</c> segment from a DN. GLAuth encodes a user's primary
/// group as an <c>ou=</c> RDN immediately above the user's <c>cn=</c>, so this recovers
/// the group name when <see cref="LdapOptions.GroupAttribute"/> is absent from the entry.
/// </summary>
internal static string? ExtractOuSegment(string dn)
{
var segments = dn.Split(',');
foreach (var segment in segments)
{
var trimmed = segment.Trim();
if (trimmed.StartsWith("ou=", StringComparison.OrdinalIgnoreCase))
return trimmed[3..];
}
return null;
}
internal static string ExtractFirstRdnValue(string dn)
{
var equalsIdx = dn.IndexOf('=');
if (equalsIdx < 0) return dn;
var valueStart = equalsIdx + 1;
var commaIdx = dn.IndexOf(',', valueStart);
return commaIdx > valueStart ? dn[valueStart..commaIdx] : dn[valueStart..];
}
}

View File

@@ -0,0 +1,38 @@
namespace ZB.MOM.WW.OtOpcUa.Admin.Security;
/// <summary>
/// LDAP + role-mapping configuration for the Admin UI. Bound from <c>appsettings.json</c>
/// <c>Authentication:Ldap</c> section. Defaults point at the local GLAuth dev instance (see
/// <c>C:\publish\glauth\auth.md</c>).
/// </summary>
public sealed class LdapOptions
{
public const string SectionName = "Authentication:Ldap";
public bool Enabled { get; set; } = true;
public string Server { get; set; } = "localhost";
public int Port { get; set; } = 3893;
public bool UseTls { get; set; }
/// <summary>Dev-only escape hatch — must be <c>false</c> in production.</summary>
public bool AllowInsecureLdap { get; set; }
public string SearchBase { get; set; } = "dc=lmxopcua,dc=local";
/// <summary>
/// Service-account DN used for search-then-bind. When empty, a direct-bind with
/// <c>cn={user},{SearchBase}</c> is attempted.
/// </summary>
public string ServiceAccountDn { get; set; } = string.Empty;
public string ServiceAccountPassword { get; set; } = string.Empty;
public string DisplayNameAttribute { get; set; } = "cn";
public string GroupAttribute { get; set; } = "memberOf";
/// <summary>
/// Maps LDAP group name → Admin role. Group match is case-insensitive. A user gets every
/// role whose source group is in their membership list. Example dev mapping:
/// <code>"ReadOnly":"ConfigViewer","ReadWrite":"ConfigEditor","AlarmAck":"FleetAdmin"</code>
/// </summary>
public Dictionary<string, string> GroupToRole { get; set; } = new(StringComparer.OrdinalIgnoreCase);
}

View File

@@ -0,0 +1,23 @@
namespace ZB.MOM.WW.OtOpcUa.Admin.Security;
/// <summary>
/// Deterministic LDAP-group-to-Admin-role mapper driven by <see cref="LdapOptions.GroupToRole"/>.
/// Every returned role corresponds to a group the user actually holds; no inference.
/// </summary>
public static class RoleMapper
{
public static IReadOnlyList<string> Map(
IReadOnlyCollection<string> ldapGroups,
IReadOnlyDictionary<string, string> groupToRole)
{
if (groupToRole.Count == 0) return [];
var roles = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
foreach (var group in ldapGroups)
{
if (groupToRole.TryGetValue(group, out var role))
roles.Add(role);
}
return [.. roles];
}
}

View File

@@ -0,0 +1,16 @@
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// The three admin roles per <c>admin-ui.md</c> §"Admin Roles" — mapped from LDAP groups at
/// sign-in. Each role has a fixed set of capabilities (cluster CRUD, draft → publish, fleet
/// admin). The ACL-driven runtime permissions (<c>NodePermissions</c>) govern OPC UA clients;
/// these roles govern the Admin UI itself.
/// </summary>
public static class AdminRoles
{
public const string ConfigViewer = "ConfigViewer";
public const string ConfigEditor = "ConfigEditor";
public const string FleetAdmin = "FleetAdmin";
public static IReadOnlyList<string> All => [ConfigViewer, ConfigEditor, FleetAdmin];
}

View File

@@ -0,0 +1,15 @@
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
public sealed class AuditLogService(OtOpcUaConfigDbContext db)
{
public Task<List<ConfigAuditLog>> ListRecentAsync(string? clusterId, int limit, CancellationToken ct)
{
var q = db.ConfigAuditLogs.AsNoTracking();
if (clusterId is not null) q = q.Where(a => a.ClusterId == clusterId);
return q.OrderByDescending(a => a.Timestamp).Take(limit).ToListAsync(ct);
}
}

View File

@@ -0,0 +1,22 @@
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// Points the Admin UI at the OPC UA Server's PKI store root so
/// <see cref="CertTrustService"/> can list and move certs between the
/// <c>rejected/</c> and <c>trusted/</c> directories the server maintains. Must match the
/// <c>OpcUaServer:PkiStoreRoot</c> the Server process is configured with.
/// </summary>
public sealed class CertTrustOptions
{
public const string SectionName = "CertTrust";
/// <summary>
/// Absolute path to the PKI root. Defaults to
/// <c>%ProgramData%\OtOpcUa\pki</c> — matches <c>OpcUaServerOptions.PkiStoreRoot</c>'s
/// default so a standard side-by-side install needs no override.
/// </summary>
public string PkiStoreRoot { get; init; } =
Path.Combine(
Environment.GetFolderPath(Environment.SpecialFolder.CommonApplicationData),
"OtOpcUa", "pki");
}

View File

@@ -0,0 +1,135 @@
using System.Security.Cryptography.X509Certificates;
using Microsoft.Extensions.Options;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// Metadata for a certificate file found in one of the OPC UA server's PKI stores. The
/// <see cref="FilePath"/> is the absolute path of the DER/CRT file the stack created when it
/// rejected the cert (for <see cref="CertStoreKind.Rejected"/>) or when an operator trusted
/// it (for <see cref="CertStoreKind.Trusted"/>).
/// </summary>
public sealed record CertInfo(
string Thumbprint,
string Subject,
string Issuer,
DateTime NotBefore,
DateTime NotAfter,
string FilePath,
CertStoreKind Store);
public enum CertStoreKind
{
Rejected,
Trusted,
}
/// <summary>
/// Filesystem-backed view over the OPC UA server's PKI store. The Opc.Ua stack uses a
/// Directory-typed store — each cert is a <c>.der</c> file under <c>{root}/{store}/certs/</c>
/// with a filename derived from subject + thumbprint. This service exposes operators for the
/// Admin UI: list rejected, list trusted, trust a rejected cert (move to trusted), remove a
/// rejected cert (delete), untrust a previously trusted cert (delete from trusted).
/// </summary>
/// <remarks>
/// The Admin process is separate from the Server process; this service deliberately has no
/// Opc.Ua dependency — it works on the on-disk layout directly so it can run on the Admin
/// host even when the Server isn't installed locally, as long as the PKI root is reachable
/// (typical deployment has Admin + Server side-by-side on the same machine).
///
/// Trust/untrust requires the Server to re-read its trust list. The Opc.Ua stack re-reads
/// the Directory store on each new incoming connection, so there's no explicit signal
/// needed — the next client handshake picks up the change. Operators should retry the
/// rejected client's connection after trusting.
/// </remarks>
public sealed class CertTrustService
{
private readonly CertTrustOptions _options;
private readonly ILogger<CertTrustService> _logger;
public CertTrustService(IOptions<CertTrustOptions> options, ILogger<CertTrustService> logger)
{
_options = options.Value;
_logger = logger;
}
public string PkiStoreRoot => _options.PkiStoreRoot;
public IReadOnlyList<CertInfo> ListRejected() => ListStore(CertStoreKind.Rejected);
public IReadOnlyList<CertInfo> ListTrusted() => ListStore(CertStoreKind.Trusted);
/// <summary>
/// Move the cert with <paramref name="thumbprint"/> from the rejected store to the
/// trusted store. No-op returns false if the rejected file doesn't exist (already moved
/// by another operator, or thumbprint mismatch). Overwrites an existing trusted copy
/// silently — idempotent.
/// </summary>
public bool TrustRejected(string thumbprint)
{
var cert = FindInStore(CertStoreKind.Rejected, thumbprint);
if (cert is null) return false;
var trustedDir = CertsDir(CertStoreKind.Trusted);
Directory.CreateDirectory(trustedDir);
var destPath = Path.Combine(trustedDir, Path.GetFileName(cert.FilePath));
File.Move(cert.FilePath, destPath, overwrite: true);
_logger.LogInformation("Trusted cert {Thumbprint} (subject={Subject}) — moved {From} → {To}",
cert.Thumbprint, cert.Subject, cert.FilePath, destPath);
return true;
}
public bool DeleteRejected(string thumbprint) => DeleteFromStore(CertStoreKind.Rejected, thumbprint);
public bool UntrustCert(string thumbprint) => DeleteFromStore(CertStoreKind.Trusted, thumbprint);
private bool DeleteFromStore(CertStoreKind store, string thumbprint)
{
var cert = FindInStore(store, thumbprint);
if (cert is null) return false;
File.Delete(cert.FilePath);
_logger.LogInformation("Deleted cert {Thumbprint} (subject={Subject}) from {Store} store",
cert.Thumbprint, cert.Subject, store);
return true;
}
private CertInfo? FindInStore(CertStoreKind store, string thumbprint) =>
ListStore(store).FirstOrDefault(c =>
string.Equals(c.Thumbprint, thumbprint, StringComparison.OrdinalIgnoreCase));
private IReadOnlyList<CertInfo> ListStore(CertStoreKind store)
{
var dir = CertsDir(store);
if (!Directory.Exists(dir)) return [];
var results = new List<CertInfo>();
foreach (var path in Directory.EnumerateFiles(dir))
{
// Skip CRL sidecars + private-key files — trust operations only concern public certs.
var ext = Path.GetExtension(path);
if (!ext.Equals(".der", StringComparison.OrdinalIgnoreCase) &&
!ext.Equals(".crt", StringComparison.OrdinalIgnoreCase) &&
!ext.Equals(".cer", StringComparison.OrdinalIgnoreCase))
{
continue;
}
try
{
var cert = X509CertificateLoader.LoadCertificateFromFile(path);
results.Add(new CertInfo(
cert.Thumbprint, cert.Subject, cert.Issuer,
cert.NotBefore.ToUniversalTime(), cert.NotAfter.ToUniversalTime(),
path, store));
}
catch (Exception ex)
{
// A malformed file in the store shouldn't take down the page. Surface it in logs
// but skip — operators see the other certs and can clean the bad file manually.
_logger.LogWarning(ex, "Failed to parse cert at {Path} — skipping", path);
}
}
return results;
}
private string CertsDir(CertStoreKind store) =>
Path.Combine(_options.PkiStoreRoot, store == CertStoreKind.Rejected ? "rejected" : "trusted", "certs");
}

View File

@@ -0,0 +1,28 @@
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// Cluster CRUD surface used by the Blazor pages. Writes go through stored procs in later
/// phases; Phase 1 reads via EF Core directly (DENY SELECT on <c>dbo</c> schema means this
/// service connects as a DB owner during dev — production swaps in a read-only view grant).
/// </summary>
public sealed class ClusterService(OtOpcUaConfigDbContext db)
{
public Task<List<ServerCluster>> ListAsync(CancellationToken ct) =>
db.ServerClusters.AsNoTracking().OrderBy(c => c.ClusterId).ToListAsync(ct);
public Task<ServerCluster?> FindAsync(string clusterId, CancellationToken ct) =>
db.ServerClusters.AsNoTracking().FirstOrDefaultAsync(c => c.ClusterId == clusterId, ct);
public async Task<ServerCluster> CreateAsync(ServerCluster cluster, string createdBy, CancellationToken ct)
{
cluster.CreatedAt = DateTime.UtcNow;
cluster.CreatedBy = createdBy;
db.ServerClusters.Add(cluster);
await db.SaveChangesAsync(ct);
return cluster;
}
}

View File

@@ -0,0 +1,45 @@
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Validation;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// Runs the managed <see cref="DraftValidator"/> against a draft's snapshot loaded from the
/// Configuration DB. Used by the draft editor's inline validation panel and by the publish
/// dialog's pre-check. Structural-only SQL checks live in <c>sp_ValidateDraft</c>; this layer
/// owns the content / cross-generation / regex rules.
/// </summary>
public sealed class DraftValidationService(OtOpcUaConfigDbContext db)
{
public async Task<IReadOnlyList<ValidationError>> ValidateAsync(long draftId, CancellationToken ct)
{
var draft = await db.ConfigGenerations.AsNoTracking()
.FirstOrDefaultAsync(g => g.GenerationId == draftId, ct)
?? throw new InvalidOperationException($"Draft {draftId} not found");
var snapshot = new DraftSnapshot
{
GenerationId = draft.GenerationId,
ClusterId = draft.ClusterId,
Namespaces = await db.Namespaces.AsNoTracking().Where(n => n.GenerationId == draftId).ToListAsync(ct),
DriverInstances = await db.DriverInstances.AsNoTracking().Where(d => d.GenerationId == draftId).ToListAsync(ct),
Devices = await db.Devices.AsNoTracking().Where(d => d.GenerationId == draftId).ToListAsync(ct),
UnsAreas = await db.UnsAreas.AsNoTracking().Where(a => a.GenerationId == draftId).ToListAsync(ct),
UnsLines = await db.UnsLines.AsNoTracking().Where(l => l.GenerationId == draftId).ToListAsync(ct),
Equipment = await db.Equipment.AsNoTracking().Where(e => e.GenerationId == draftId).ToListAsync(ct),
Tags = await db.Tags.AsNoTracking().Where(t => t.GenerationId == draftId).ToListAsync(ct),
PollGroups = await db.PollGroups.AsNoTracking().Where(p => p.GenerationId == draftId).ToListAsync(ct),
PriorEquipment = await db.Equipment.AsNoTracking()
.Where(e => e.GenerationId != draftId
&& db.ConfigGenerations.Any(g => g.GenerationId == e.GenerationId && g.ClusterId == draft.ClusterId))
.ToListAsync(ct),
ActiveReservations = await db.ExternalIdReservations.AsNoTracking()
.Where(r => r.ReleasedAt == null)
.ToListAsync(ct),
};
return DraftValidator.Validate(snapshot);
}
}

View File

@@ -0,0 +1,33 @@
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
public sealed class DriverInstanceService(OtOpcUaConfigDbContext db)
{
public Task<List<DriverInstance>> ListAsync(long generationId, CancellationToken ct) =>
db.DriverInstances.AsNoTracking()
.Where(d => d.GenerationId == generationId)
.OrderBy(d => d.DriverInstanceId)
.ToListAsync(ct);
public async Task<DriverInstance> AddAsync(
long draftId, string clusterId, string namespaceId, string name, string driverType,
string driverConfigJson, CancellationToken ct)
{
var di = new DriverInstance
{
GenerationId = draftId,
DriverInstanceId = $"drv-{Guid.NewGuid():N}"[..20],
ClusterId = clusterId,
NamespaceId = namespaceId,
Name = name,
DriverType = driverType,
DriverConfig = driverConfigJson,
};
db.DriverInstances.Add(di);
await db.SaveChangesAsync(ct);
return di;
}
}

View File

@@ -0,0 +1,75 @@
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
using ZB.MOM.WW.OtOpcUa.Configuration.Validation;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// Equipment CRUD scoped to a generation. The Admin app writes against Draft generations only;
/// Published generations are read-only (to create changes, clone to a new draft via
/// <see cref="GenerationService.CreateDraftAsync"/>).
/// </summary>
public sealed class EquipmentService(OtOpcUaConfigDbContext db)
{
public Task<List<Equipment>> ListAsync(long generationId, CancellationToken ct) =>
db.Equipment.AsNoTracking()
.Where(e => e.GenerationId == generationId)
.OrderBy(e => e.Name)
.ToListAsync(ct);
public Task<Equipment?> FindAsync(long generationId, string equipmentId, CancellationToken ct) =>
db.Equipment.AsNoTracking()
.FirstOrDefaultAsync(e => e.GenerationId == generationId && e.EquipmentId == equipmentId, ct);
/// <summary>
/// Creates a new equipment row in the given draft. The EquipmentId is auto-derived from
/// a fresh EquipmentUuid per decision #125; operator-supplied IDs are rejected upstream.
/// </summary>
public async Task<Equipment> CreateAsync(long draftId, Equipment input, CancellationToken ct)
{
input.GenerationId = draftId;
input.EquipmentUuid = input.EquipmentUuid == Guid.Empty ? Guid.NewGuid() : input.EquipmentUuid;
input.EquipmentId = DraftValidator.DeriveEquipmentId(input.EquipmentUuid);
db.Equipment.Add(input);
await db.SaveChangesAsync(ct);
return input;
}
public async Task UpdateAsync(Equipment updated, CancellationToken ct)
{
// Only editable fields are persisted; EquipmentId + EquipmentUuid are immutable once set.
var existing = await db.Equipment
.FirstOrDefaultAsync(e => e.EquipmentRowId == updated.EquipmentRowId, ct)
?? throw new InvalidOperationException($"Equipment row {updated.EquipmentRowId} not found");
existing.Name = updated.Name;
existing.MachineCode = updated.MachineCode;
existing.ZTag = updated.ZTag;
existing.SAPID = updated.SAPID;
existing.Manufacturer = updated.Manufacturer;
existing.Model = updated.Model;
existing.SerialNumber = updated.SerialNumber;
existing.HardwareRevision = updated.HardwareRevision;
existing.SoftwareRevision = updated.SoftwareRevision;
existing.YearOfConstruction = updated.YearOfConstruction;
existing.AssetLocation = updated.AssetLocation;
existing.ManufacturerUri = updated.ManufacturerUri;
existing.DeviceManualUri = updated.DeviceManualUri;
existing.DriverInstanceId = updated.DriverInstanceId;
existing.DeviceId = updated.DeviceId;
existing.UnsLineId = updated.UnsLineId;
existing.EquipmentClassRef = updated.EquipmentClassRef;
existing.Enabled = updated.Enabled;
await db.SaveChangesAsync(ct);
}
public async Task DeleteAsync(Guid equipmentRowId, CancellationToken ct)
{
var row = await db.Equipment.FirstOrDefaultAsync(e => e.EquipmentRowId == equipmentRowId, ct);
if (row is null) return;
db.Equipment.Remove(row);
await db.SaveChangesAsync(ct);
}
}

View File

@@ -0,0 +1,71 @@
using Microsoft.Data.SqlClient;
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// Owns the draft → diff → publish workflow (decision #89). Publish + rollback call into the
/// stored procedures; diff queries <c>sp_ComputeGenerationDiff</c>.
/// </summary>
public sealed class GenerationService(OtOpcUaConfigDbContext db)
{
public async Task<ConfigGeneration> CreateDraftAsync(string clusterId, string createdBy, CancellationToken ct)
{
var gen = new ConfigGeneration
{
ClusterId = clusterId,
Status = GenerationStatus.Draft,
CreatedBy = createdBy,
CreatedAt = DateTime.UtcNow,
};
db.ConfigGenerations.Add(gen);
await db.SaveChangesAsync(ct);
return gen;
}
public Task<List<ConfigGeneration>> ListRecentAsync(string clusterId, int limit, CancellationToken ct) =>
db.ConfigGenerations.AsNoTracking()
.Where(g => g.ClusterId == clusterId)
.OrderByDescending(g => g.GenerationId)
.Take(limit)
.ToListAsync(ct);
public async Task PublishAsync(string clusterId, long draftGenerationId, string? notes, CancellationToken ct)
{
await db.Database.ExecuteSqlRawAsync(
"EXEC dbo.sp_PublishGeneration @ClusterId = {0}, @DraftGenerationId = {1}, @Notes = {2}",
[clusterId, draftGenerationId, (object?)notes ?? DBNull.Value],
ct);
}
public async Task RollbackAsync(string clusterId, long targetGenerationId, string? notes, CancellationToken ct)
{
await db.Database.ExecuteSqlRawAsync(
"EXEC dbo.sp_RollbackToGeneration @ClusterId = {0}, @TargetGenerationId = {1}, @Notes = {2}",
[clusterId, targetGenerationId, (object?)notes ?? DBNull.Value],
ct);
}
public async Task<List<DiffRow>> ComputeDiffAsync(long from, long to, CancellationToken ct)
{
var results = new List<DiffRow>();
await using var conn = (SqlConnection)db.Database.GetDbConnection();
if (conn.State != System.Data.ConnectionState.Open) await conn.OpenAsync(ct);
await using var cmd = conn.CreateCommand();
cmd.CommandText = "EXEC dbo.sp_ComputeGenerationDiff @FromGenerationId = @f, @ToGenerationId = @t";
cmd.Parameters.AddWithValue("@f", from);
cmd.Parameters.AddWithValue("@t", to);
await using var reader = await cmd.ExecuteReaderAsync(ct);
while (await reader.ReadAsync(ct))
results.Add(new DiffRow(reader.GetString(0), reader.GetString(1), reader.GetString(2)));
return results;
}
}
public sealed record DiffRow(string TableName, string LogicalId, string ChangeKind);

View File

@@ -0,0 +1,31 @@
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
public sealed class NamespaceService(OtOpcUaConfigDbContext db)
{
public Task<List<Namespace>> ListAsync(long generationId, CancellationToken ct) =>
db.Namespaces.AsNoTracking()
.Where(n => n.GenerationId == generationId)
.OrderBy(n => n.NamespaceId)
.ToListAsync(ct);
public async Task<Namespace> AddAsync(
long draftId, string clusterId, string namespaceUri, NamespaceKind kind, CancellationToken ct)
{
var ns = new Namespace
{
GenerationId = draftId,
NamespaceId = $"ns-{Guid.NewGuid():N}"[..20],
ClusterId = clusterId,
NamespaceUri = namespaceUri,
Kind = kind,
};
db.Namespaces.Add(ns);
await db.SaveChangesAsync(ct);
return ns;
}
}

View File

@@ -0,0 +1,44 @@
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
public sealed class NodeAclService(OtOpcUaConfigDbContext db)
{
public Task<List<NodeAcl>> ListAsync(long generationId, CancellationToken ct) =>
db.NodeAcls.AsNoTracking()
.Where(a => a.GenerationId == generationId)
.OrderBy(a => a.LdapGroup)
.ThenBy(a => a.ScopeKind)
.ToListAsync(ct);
public async Task<NodeAcl> GrantAsync(
long draftId, string clusterId, string ldapGroup, NodeAclScopeKind scopeKind, string? scopeId,
NodePermissions permissions, string? notes, CancellationToken ct)
{
var acl = new NodeAcl
{
GenerationId = draftId,
NodeAclId = $"acl-{Guid.NewGuid():N}"[..20],
ClusterId = clusterId,
LdapGroup = ldapGroup,
ScopeKind = scopeKind,
ScopeId = scopeId,
PermissionFlags = permissions,
Notes = notes,
};
db.NodeAcls.Add(acl);
await db.SaveChangesAsync(ct);
return acl;
}
public async Task RevokeAsync(Guid nodeAclRowId, CancellationToken ct)
{
var row = await db.NodeAcls.FirstOrDefaultAsync(a => a.NodeAclRowId == nodeAclRowId, ct);
if (row is null) return;
db.NodeAcls.Remove(row);
await db.SaveChangesAsync(ct);
}
}

View File

@@ -0,0 +1,38 @@
using Microsoft.Data.SqlClient;
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// Fleet-wide external-ID reservation inspector + FleetAdmin-only release flow per
/// <c>admin-ui.md §"Release an external-ID reservation"</c>. Release is audit-logged
/// (<see cref="ConfigAuditLog"/>) via <c>sp_ReleaseExternalIdReservation</c>.
/// </summary>
public sealed class ReservationService(OtOpcUaConfigDbContext db)
{
public Task<List<ExternalIdReservation>> ListActiveAsync(CancellationToken ct) =>
db.ExternalIdReservations.AsNoTracking()
.Where(r => r.ReleasedAt == null)
.OrderBy(r => r.Kind).ThenBy(r => r.Value)
.ToListAsync(ct);
public Task<List<ExternalIdReservation>> ListReleasedAsync(CancellationToken ct) =>
db.ExternalIdReservations.AsNoTracking()
.Where(r => r.ReleasedAt != null)
.OrderByDescending(r => r.ReleasedAt)
.Take(100)
.ToListAsync(ct);
public async Task ReleaseAsync(string kind, string value, string reason, CancellationToken ct)
{
if (string.IsNullOrWhiteSpace(reason))
throw new ArgumentException("ReleaseReason is required (audit invariant)", nameof(reason));
await db.Database.ExecuteSqlRawAsync(
"EXEC dbo.sp_ReleaseExternalIdReservation @Kind = {0}, @Value = {1}, @ReleaseReason = {2}",
[kind, value, reason],
ct);
}
}

View File

@@ -0,0 +1,50 @@
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
public sealed class UnsService(OtOpcUaConfigDbContext db)
{
public Task<List<UnsArea>> ListAreasAsync(long generationId, CancellationToken ct) =>
db.UnsAreas.AsNoTracking()
.Where(a => a.GenerationId == generationId)
.OrderBy(a => a.Name)
.ToListAsync(ct);
public Task<List<UnsLine>> ListLinesAsync(long generationId, CancellationToken ct) =>
db.UnsLines.AsNoTracking()
.Where(l => l.GenerationId == generationId)
.OrderBy(l => l.Name)
.ToListAsync(ct);
public async Task<UnsArea> AddAreaAsync(long draftId, string clusterId, string name, string? notes, CancellationToken ct)
{
var area = new UnsArea
{
GenerationId = draftId,
UnsAreaId = $"area-{Guid.NewGuid():N}"[..20],
ClusterId = clusterId,
Name = name,
Notes = notes,
};
db.UnsAreas.Add(area);
await db.SaveChangesAsync(ct);
return area;
}
public async Task<UnsLine> AddLineAsync(long draftId, string unsAreaId, string name, string? notes, CancellationToken ct)
{
var line = new UnsLine
{
GenerationId = draftId,
UnsLineId = $"line-{Guid.NewGuid():N}"[..20],
UnsAreaId = unsAreaId,
Name = name,
Notes = notes,
};
db.UnsLines.Add(line);
await db.SaveChangesAsync(ct);
return line;
}
}

View File

@@ -0,0 +1,34 @@
<Project Sdk="Microsoft.NET.Sdk.Web">
<PropertyGroup>
<TargetFramework>net10.0</TargetFramework>
<Nullable>enable</Nullable>
<ImplicitUsings>enable</ImplicitUsings>
<LangVersion>latest</LangVersion>
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
<NoWarn>$(NoWarn);CS1591</NoWarn>
<RootNamespace>ZB.MOM.WW.OtOpcUa.Admin</RootNamespace>
<AssemblyName>OtOpcUa.Admin</AssemblyName>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" Version="10.0.0"/>
<PackageReference Include="Novell.Directory.Ldap.NETStandard" Version="3.6.0"/>
<PackageReference Include="Microsoft.AspNetCore.SignalR.Client" Version="10.0.0"/>
<PackageReference Include="Serilog.AspNetCore" Version="9.0.0"/>
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Configuration\ZB.MOM.WW.OtOpcUa.Configuration.csproj"/>
</ItemGroup>
<ItemGroup>
<InternalsVisibleTo Include="ZB.MOM.WW.OtOpcUa.Admin.Tests"/>
</ItemGroup>
<ItemGroup>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>
</ItemGroup>
</Project>

View File

@@ -0,0 +1,27 @@
{
"ConnectionStrings": {
"ConfigDb": "Server=localhost,14330;Database=OtOpcUaConfig;User Id=sa;Password=OtOpcUaDev_2026!;TrustServerCertificate=True;Encrypt=False;"
},
"Authentication": {
"Ldap": {
"Enabled": true,
"Server": "localhost",
"Port": 3893,
"UseTls": false,
"AllowInsecureLdap": true,
"SearchBase": "dc=lmxopcua,dc=local",
"ServiceAccountDn": "cn=serviceaccount,ou=svcaccts,dc=lmxopcua,dc=local",
"ServiceAccountPassword": "serviceaccount123",
"DisplayNameAttribute": "cn",
"GroupAttribute": "memberOf",
"GroupToRole": {
"ReadOnly": "ConfigViewer",
"ReadWrite": "ConfigEditor",
"AlarmAck": "FleetAdmin"
}
}
},
"Serilog": {
"MinimumLevel": "Information"
}
}

View File

@@ -0,0 +1,3 @@
/* OtOpcUa Admin — ScadaLink-parity palette. Keep it minimal here; lean on Bootstrap 5. */
body { background-color: #f5f6fa; }
.nav-link.active { background-color: rgba(255,255,255,0.1); border-radius: 4px; }

View File

@@ -0,0 +1,19 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Apply;
/// <summary>
/// Host-supplied callbacks invoked as the applier walks the diff. Callbacks are idempotent on
/// retry (the applier may re-invoke with the same inputs if a later stage fails — nodes
/// register-applied to the central DB only after success). Order: namespace → driver → device →
/// equipment → poll group → tag, with Removed before Added/Modified.
/// </summary>
public sealed class ApplyCallbacks
{
public Func<EntityChange<Namespace>, CancellationToken, Task>? OnNamespace { get; init; }
public Func<EntityChange<DriverInstance>, CancellationToken, Task>? OnDriver { get; init; }
public Func<EntityChange<Device>, CancellationToken, Task>? OnDevice { get; init; }
public Func<EntityChange<Equipment>, CancellationToken, Task>? OnEquipment { get; init; }
public Func<EntityChange<PollGroup>, CancellationToken, Task>? OnPollGroup { get; init; }
public Func<EntityChange<Tag>, CancellationToken, Task>? OnTag { get; init; }
}

View File

@@ -0,0 +1,8 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Apply;
public enum ChangeKind
{
Added,
Removed,
Modified,
}

View File

@@ -0,0 +1,48 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Validation;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Apply;
public sealed class GenerationApplier(ApplyCallbacks callbacks) : IGenerationApplier
{
public async Task<ApplyResult> ApplyAsync(DraftSnapshot? from, DraftSnapshot to, CancellationToken ct)
{
var diff = GenerationDiffer.Compute(from, to);
var errors = new List<string>();
// Removed first, then Added/Modified — prevents FK dangling while cascades settle.
await ApplyPass(diff.Tags, ChangeKind.Removed, callbacks.OnTag, errors, ct);
await ApplyPass(diff.PollGroups, ChangeKind.Removed, callbacks.OnPollGroup, errors, ct);
await ApplyPass(diff.Equipment, ChangeKind.Removed, callbacks.OnEquipment, errors, ct);
await ApplyPass(diff.Devices, ChangeKind.Removed, callbacks.OnDevice, errors, ct);
await ApplyPass(diff.Drivers, ChangeKind.Removed, callbacks.OnDriver, errors, ct);
await ApplyPass(diff.Namespaces, ChangeKind.Removed, callbacks.OnNamespace, errors, ct);
foreach (var kind in new[] { ChangeKind.Added, ChangeKind.Modified })
{
await ApplyPass(diff.Namespaces, kind, callbacks.OnNamespace, errors, ct);
await ApplyPass(diff.Drivers, kind, callbacks.OnDriver, errors, ct);
await ApplyPass(diff.Devices, kind, callbacks.OnDevice, errors, ct);
await ApplyPass(diff.Equipment, kind, callbacks.OnEquipment, errors, ct);
await ApplyPass(diff.PollGroups, kind, callbacks.OnPollGroup, errors, ct);
await ApplyPass(diff.Tags, kind, callbacks.OnTag, errors, ct);
}
return errors.Count == 0 ? ApplyResult.Ok(diff) : ApplyResult.Fail(diff, errors);
}
private static async Task ApplyPass<T>(
IReadOnlyList<EntityChange<T>> changes,
ChangeKind kind,
Func<EntityChange<T>, CancellationToken, Task>? callback,
List<string> errors,
CancellationToken ct)
{
if (callback is null) return;
foreach (var change in changes.Where(c => c.Kind == kind))
{
try { await callback(change, ct); }
catch (Exception ex) { errors.Add($"{typeof(T).Name} {change.Kind} '{change.LogicalId}': {ex.Message}"); }
}
}
}

View File

@@ -0,0 +1,70 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
using ZB.MOM.WW.OtOpcUa.Configuration.Validation;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Apply;
/// <summary>
/// Per-entity diff computed locally on the node. The enumerable order matches the dependency
/// order expected by <see cref="IGenerationApplier"/>: namespace → driver → device → equipment →
/// poll group → tag → ACL, with Removed processed before Added inside each bucket so cascades
/// settle before new rows appear.
/// </summary>
public sealed record GenerationDiff(
IReadOnlyList<EntityChange<Namespace>> Namespaces,
IReadOnlyList<EntityChange<DriverInstance>> Drivers,
IReadOnlyList<EntityChange<Device>> Devices,
IReadOnlyList<EntityChange<Equipment>> Equipment,
IReadOnlyList<EntityChange<PollGroup>> PollGroups,
IReadOnlyList<EntityChange<Tag>> Tags);
public sealed record EntityChange<T>(ChangeKind Kind, string LogicalId, T? From, T? To);
public static class GenerationDiffer
{
public static GenerationDiff Compute(DraftSnapshot? from, DraftSnapshot to)
{
from ??= new DraftSnapshot { GenerationId = 0, ClusterId = to.ClusterId };
return new GenerationDiff(
Namespaces: DiffById(from.Namespaces, to.Namespaces, x => x.NamespaceId,
(a, b) => (a.ClusterId, a.NamespaceUri, a.Kind, a.Enabled, a.Notes)
== (b.ClusterId, b.NamespaceUri, b.Kind, b.Enabled, b.Notes)),
Drivers: DiffById(from.DriverInstances, to.DriverInstances, x => x.DriverInstanceId,
(a, b) => (a.ClusterId, a.NamespaceId, a.Name, a.DriverType, a.Enabled, a.DriverConfig)
== (b.ClusterId, b.NamespaceId, b.Name, b.DriverType, b.Enabled, b.DriverConfig)),
Devices: DiffById(from.Devices, to.Devices, x => x.DeviceId,
(a, b) => (a.DriverInstanceId, a.Name, a.Enabled, a.DeviceConfig)
== (b.DriverInstanceId, b.Name, b.Enabled, b.DeviceConfig)),
Equipment: DiffById(from.Equipment, to.Equipment, x => x.EquipmentId,
(a, b) => (a.EquipmentUuid, a.DriverInstanceId, a.UnsLineId, a.Name, a.MachineCode, a.ZTag, a.SAPID, a.Enabled)
== (b.EquipmentUuid, b.DriverInstanceId, b.UnsLineId, b.Name, b.MachineCode, b.ZTag, b.SAPID, b.Enabled)),
PollGroups: DiffById(from.PollGroups, to.PollGroups, x => x.PollGroupId,
(a, b) => (a.DriverInstanceId, a.Name, a.IntervalMs)
== (b.DriverInstanceId, b.Name, b.IntervalMs)),
Tags: DiffById(from.Tags, to.Tags, x => x.TagId,
(a, b) => (a.DriverInstanceId, a.DeviceId, a.EquipmentId, a.PollGroupId, a.FolderPath, a.Name, a.DataType, a.AccessLevel, a.WriteIdempotent, a.TagConfig)
== (b.DriverInstanceId, b.DeviceId, b.EquipmentId, b.PollGroupId, b.FolderPath, b.Name, b.DataType, b.AccessLevel, b.WriteIdempotent, b.TagConfig)));
}
private static List<EntityChange<T>> DiffById<T>(
IReadOnlyList<T> from, IReadOnlyList<T> to,
Func<T, string> id, Func<T, T, bool> equal)
{
var fromById = from.ToDictionary(id);
var toById = to.ToDictionary(id);
var result = new List<EntityChange<T>>();
foreach (var (logicalId, src) in fromById.Where(kv => !toById.ContainsKey(kv.Key)))
result.Add(new(ChangeKind.Removed, logicalId, src, default));
foreach (var (logicalId, dst) in toById)
{
if (!fromById.TryGetValue(logicalId, out var src))
result.Add(new(ChangeKind.Added, logicalId, default, dst));
else if (!equal(src, dst))
result.Add(new(ChangeKind.Modified, logicalId, src, dst));
}
return result;
}
}

View File

@@ -0,0 +1,23 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Validation;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Apply;
/// <summary>
/// Applies a <see cref="GenerationDiff"/> to whatever backing runtime the node owns: the OPC UA
/// address space, driver subscriptions, the local cache, etc. The Core project wires concrete
/// callbacks into this via <see cref="ApplyCallbacks"/> so the Configuration project stays free
/// of a Core/Server dependency (interface independence per decision #59).
/// </summary>
public interface IGenerationApplier
{
Task<ApplyResult> ApplyAsync(DraftSnapshot? from, DraftSnapshot to, CancellationToken ct);
}
public sealed record ApplyResult(
bool Succeeded,
GenerationDiff Diff,
IReadOnlyList<string> Errors)
{
public static ApplyResult Ok(GenerationDiff diff) => new(true, diff, []);
public static ApplyResult Fail(GenerationDiff diff, IReadOnlyList<string> errors) => new(false, diff, errors);
}

View File

@@ -0,0 +1,28 @@
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Design;
namespace ZB.MOM.WW.OtOpcUa.Configuration;
/// <summary>
/// Used by <c>dotnet ef</c> at design time (migrations, scaffolding). Reads the connection string
/// from the <c>OTOPCUA_CONFIG_CONNECTION</c> environment variable, falling back to the local dev
/// container on <c>localhost:1433</c>.
/// </summary>
public sealed class DesignTimeDbContextFactory : IDesignTimeDbContextFactory<OtOpcUaConfigDbContext>
{
// Host-port 14330 avoids collision with the native MSSQL14 instance on 1433 (Galaxy "ZB" DB).
private const string DefaultConnectionString =
"Server=localhost,14330;Database=OtOpcUaConfig;User Id=sa;Password=OtOpcUaDev_2026!;TrustServerCertificate=True;Encrypt=False;";
public OtOpcUaConfigDbContext CreateDbContext(string[] args)
{
var connection = Environment.GetEnvironmentVariable("OTOPCUA_CONFIG_CONNECTION")
?? DefaultConnectionString;
var options = new DbContextOptionsBuilder<OtOpcUaConfigDbContext>()
.UseSqlServer(connection, sql => sql.MigrationsAssembly(typeof(OtOpcUaConfigDbContext).Assembly.FullName))
.Options;
return new OtOpcUaConfigDbContext(options);
}
}

View File

@@ -0,0 +1,51 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>Physical OPC UA server node within a <see cref="ServerCluster"/>.</summary>
public sealed class ClusterNode
{
/// <summary>Stable per-machine logical ID, e.g. "LINE3-OPCUA-A".</summary>
public required string NodeId { get; set; }
public required string ClusterId { get; set; }
public required RedundancyRole RedundancyRole { get; set; }
/// <summary>Machine hostname / IP.</summary>
public required string Host { get; set; }
public int OpcUaPort { get; set; } = 4840;
public int DashboardPort { get; set; } = 8081;
/// <summary>
/// OPC UA <c>ApplicationUri</c> — MUST be unique per node per OPC UA spec. Clients pin trust here.
/// Fleet-wide unique index enforces no two nodes share a value (decision #86).
/// Stored explicitly, NOT derived from <see cref="Host"/> at runtime — silent rewrite on
/// hostname change would break all client trust.
/// </summary>
public required string ApplicationUri { get; set; }
/// <summary>Primary = 200, Secondary = 150 by default.</summary>
public byte ServiceLevelBase { get; set; } = 200;
/// <summary>
/// Per-node override JSON keyed by DriverInstanceId, merged onto cluster-level DriverConfig
/// at apply time. Minimal by intent (decision #81). Nullable when no overrides exist.
/// </summary>
public string? DriverConfigOverridesJson { get; set; }
public bool Enabled { get; set; } = true;
public DateTime? LastSeenAt { get; set; }
public DateTime CreatedAt { get; set; } = DateTime.UtcNow;
public required string CreatedBy { get; set; }
// Navigation
public ServerCluster? Cluster { get; set; }
public ICollection<ClusterNodeCredential> Credentials { get; set; } = [];
public ClusterNodeGenerationState? GenerationState { get; set; }
}

View File

@@ -0,0 +1,29 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>
/// Authenticates a <see cref="ClusterNode"/> to the central config DB.
/// Per decision #83 — credentials bind to NodeId, not ClusterId.
/// </summary>
public sealed class ClusterNodeCredential
{
public Guid CredentialId { get; set; }
public required string NodeId { get; set; }
public required CredentialKind Kind { get; set; }
/// <summary>Login name / cert thumbprint / SID / gMSA name.</summary>
public required string Value { get; set; }
public bool Enabled { get; set; } = true;
public DateTime? RotatedAt { get; set; }
public DateTime CreatedAt { get; set; } = DateTime.UtcNow;
public required string CreatedBy { get; set; }
public ClusterNode? Node { get; set; }
}

View File

@@ -0,0 +1,26 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>
/// Tracks which generation each node has applied. Per-node (not per-cluster) — both nodes of a
/// 2-node cluster track independently per decision #84.
/// </summary>
public sealed class ClusterNodeGenerationState
{
public required string NodeId { get; set; }
public long? CurrentGenerationId { get; set; }
public DateTime? LastAppliedAt { get; set; }
public NodeApplyStatus? LastAppliedStatus { get; set; }
public string? LastAppliedError { get; set; }
/// <summary>Updated on every poll for liveness detection.</summary>
public DateTime? LastSeenAt { get; set; }
public ClusterNode? Node { get; set; }
public ConfigGeneration? CurrentGeneration { get; set; }
}

View File

@@ -0,0 +1,25 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>
/// Append-only audit log for every config write + authorization-check event. Grants revoked for
/// UPDATE / DELETE on all principals (enforced by the authorization migration in B.3).
/// </summary>
public sealed class ConfigAuditLog
{
public long AuditId { get; set; }
public DateTime Timestamp { get; set; } = DateTime.UtcNow;
public required string Principal { get; set; }
/// <summary>DraftCreated | DraftEdited | Published | RolledBack | NodeApplied | CredentialAdded | CredentialDisabled | ClusterCreated | NodeAdded | ExternalIdReleased | CrossClusterNamespaceAttempt | OpcUaAccessDenied | …</summary>
public required string EventType { get; set; }
public string? ClusterId { get; set; }
public string? NodeId { get; set; }
public long? GenerationId { get; set; }
public string? DetailsJson { get; set; }
}

View File

@@ -0,0 +1,32 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>
/// Atomic, immutable snapshot of one cluster's configuration.
/// Per decision #82 — cluster-scoped, not fleet-scoped.
/// </summary>
public sealed class ConfigGeneration
{
/// <summary>Monotonically increasing ID, generated by <c>IDENTITY(1, 1)</c>.</summary>
public long GenerationId { get; set; }
public required string ClusterId { get; set; }
public required GenerationStatus Status { get; set; }
public long? ParentGenerationId { get; set; }
public DateTime? PublishedAt { get; set; }
public string? PublishedBy { get; set; }
public string? Notes { get; set; }
public DateTime CreatedAt { get; set; } = DateTime.UtcNow;
public required string CreatedBy { get; set; }
public ServerCluster? Cluster { get; set; }
public ConfigGeneration? Parent { get; set; }
}

View File

@@ -0,0 +1,23 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>Per-device row for multi-device drivers (Modbus, AB CIP). Optional for single-device drivers.</summary>
public sealed class Device
{
public Guid DeviceRowId { get; set; }
public long GenerationId { get; set; }
public required string DeviceId { get; set; }
/// <summary>Logical FK to <see cref="DriverInstance.DriverInstanceId"/>.</summary>
public required string DriverInstanceId { get; set; }
public required string Name { get; set; }
public bool Enabled { get; set; } = true;
/// <summary>Schemaless per-driver-type device config (host, port, unit ID, slot, etc.).</summary>
public required string DeviceConfig { get; set; }
public ConfigGeneration? Generation { get; set; }
}

View File

@@ -0,0 +1,32 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>One driver instance in a cluster's generation. JSON config is schemaless per-driver-type.</summary>
public sealed class DriverInstance
{
public Guid DriverInstanceRowId { get; set; }
public long GenerationId { get; set; }
public required string DriverInstanceId { get; set; }
public required string ClusterId { get; set; }
/// <summary>
/// Logical FK to <see cref="Namespace.NamespaceId"/>. Same-cluster binding enforced by
/// <c>sp_ValidateDraft</c> per decision #122: Namespace.ClusterId must equal DriverInstance.ClusterId.
/// </summary>
public required string NamespaceId { get; set; }
public required string Name { get; set; }
/// <summary>Galaxy | ModbusTcp | AbCip | AbLegacy | S7 | TwinCat | Focas | OpcUaClient</summary>
public required string DriverType { get; set; }
public bool Enabled { get; set; } = true;
/// <summary>Schemaless per-driver-type JSON config. Validated against registered JSON schema at draft-publish time (decision #91).</summary>
public required string DriverConfig { get; set; }
public ConfigGeneration? Generation { get; set; }
public ServerCluster? Cluster { get; set; }
}

View File

@@ -0,0 +1,64 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>
/// UNS level-5 entity. Only for drivers in Equipment-kind namespaces.
/// Per decisions #109 (first-class), #116 (5-identifier model), #125 (system-generated EquipmentId),
/// #138139 (OPC 40010 Identification fields as first-class columns).
/// </summary>
public sealed class Equipment
{
public Guid EquipmentRowId { get; set; }
public long GenerationId { get; set; }
/// <summary>
/// System-generated stable internal logical ID. Format: <c>'EQ-' + first 12 hex chars of EquipmentUuid</c>.
/// NEVER operator-supplied, NEVER in CSV imports, NEVER editable in Admin UI (decision #125).
/// </summary>
public required string EquipmentId { get; set; }
/// <summary>UUIDv4, IMMUTABLE across all generations of the same EquipmentId. Downstream-consumer join key.</summary>
public Guid EquipmentUuid { get; set; }
/// <summary>Logical FK to the driver providing data for this equipment.</summary>
public required string DriverInstanceId { get; set; }
/// <summary>Optional logical FK to a multi-device driver's device.</summary>
public string? DeviceId { get; set; }
/// <summary>Logical FK to <see cref="UnsLine.UnsLineId"/>.</summary>
public required string UnsLineId { get; set; }
/// <summary>UNS level 5 segment, matches <c>^[a-z0-9-]{1,32}$</c>.</summary>
public required string Name { get; set; }
// Operator-facing / external-system identifiers (decision #116)
/// <summary>Operator colloquial id (e.g. "machine_001"). Unique within cluster. Required.</summary>
public required string MachineCode { get; set; }
/// <summary>ERP equipment id. Unique fleet-wide via <see cref="ExternalIdReservation"/>. Primary browse identifier in Admin UI.</summary>
public string? ZTag { get; set; }
/// <summary>SAP PM equipment id. Unique fleet-wide via <see cref="ExternalIdReservation"/>.</summary>
public string? SAPID { get; set; }
// OPC UA Companion Spec OPC 40010 Machinery Identification fields (decision #139).
// All nullable so equipment can be added before identity is fully captured.
public string? Manufacturer { get; set; }
public string? Model { get; set; }
public string? SerialNumber { get; set; }
public string? HardwareRevision { get; set; }
public string? SoftwareRevision { get; set; }
public short? YearOfConstruction { get; set; }
public string? AssetLocation { get; set; }
public string? ManufacturerUri { get; set; }
public string? DeviceManualUri { get; set; }
/// <summary>Nullable hook for future schemas-repo template ID (decision #112).</summary>
public string? EquipmentClassRef { get; set; }
public bool Enabled { get; set; } = true;
public ConfigGeneration? Generation { get; set; }
}

View File

@@ -0,0 +1,36 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>
/// Fleet-wide rollback-safe reservation of ZTag and SAPID. Per decision #124 — NOT generation-versioned.
/// Exists outside generation flow specifically because old generations and disabled equipment can
/// still hold the same external IDs; per-generation uniqueness indexes fail under rollback/re-enable.
/// </summary>
public sealed class ExternalIdReservation
{
public Guid ReservationId { get; set; }
public required ReservationKind Kind { get; set; }
public required string Value { get; set; }
/// <summary>The equipment that owns this reservation. Stays bound even when equipment is disabled.</summary>
public Guid EquipmentUuid { get; set; }
/// <summary>First cluster to publish this reservation.</summary>
public required string ClusterId { get; set; }
public DateTime FirstPublishedAt { get; set; } = DateTime.UtcNow;
public required string FirstPublishedBy { get; set; }
public DateTime LastPublishedAt { get; set; } = DateTime.UtcNow;
/// <summary>Non-null when explicitly released by FleetAdmin (audit-logged, requires reason).</summary>
public DateTime? ReleasedAt { get; set; }
public string? ReleasedBy { get; set; }
public string? ReleaseReason { get; set; }
}

View File

@@ -0,0 +1,31 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>
/// OPC UA namespace served by a cluster. Generation-versioned per decision #123 —
/// namespaces are content (affect what consumers see at the endpoint), not topology.
/// </summary>
public sealed class Namespace
{
public Guid NamespaceRowId { get; set; }
public long GenerationId { get; set; }
/// <summary>Stable logical ID across generations, e.g. "LINE3-OPCUA-equipment".</summary>
public required string NamespaceId { get; set; }
public required string ClusterId { get; set; }
public required NamespaceKind Kind { get; set; }
/// <summary>E.g. "urn:zb:warsaw-west:equipment". Unique fleet-wide per generation.</summary>
public required string NamespaceUri { get; set; }
public bool Enabled { get; set; } = true;
public string? Notes { get; set; }
public ConfigGeneration? Generation { get; set; }
public ServerCluster? Cluster { get; set; }
}

View File

@@ -0,0 +1,32 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>
/// One ACL grant: an LDAP group gets a set of <see cref="NodePermissions"/> at a specific scope.
/// Generation-versioned per decision #130. See <c>acl-design.md</c> for evaluation algorithm.
/// </summary>
public sealed class NodeAcl
{
public Guid NodeAclRowId { get; set; }
public long GenerationId { get; set; }
public required string NodeAclId { get; set; }
public required string ClusterId { get; set; }
public required string LdapGroup { get; set; }
public required NodeAclScopeKind ScopeKind { get; set; }
/// <summary>NULL when <see cref="ScopeKind"/> = <see cref="NodeAclScopeKind.Cluster"/>; otherwise the scoped entity's logical ID.</summary>
public string? ScopeId { get; set; }
/// <summary>Bitmask of <see cref="NodePermissions"/>. Stored as int in SQL.</summary>
public required NodePermissions PermissionFlags { get; set; }
public string? Notes { get; set; }
public ConfigGeneration? Generation { get; set; }
}

View File

@@ -0,0 +1,19 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>Driver-scoped polling group. Tags reference it via <see cref="Tag.PollGroupId"/>.</summary>
public sealed class PollGroup
{
public Guid PollGroupRowId { get; set; }
public long GenerationId { get; set; }
public required string PollGroupId { get; set; }
public required string DriverInstanceId { get; set; }
public required string Name { get; set; }
public int IntervalMs { get; set; }
public ConfigGeneration? Generation { get; set; }
}

View File

@@ -0,0 +1,42 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>
/// Top-level deployment unit. 1 or 2 <see cref="ClusterNode"/> members.
/// Per <c>config-db-schema.md</c> ServerCluster table.
/// </summary>
public sealed class ServerCluster
{
/// <summary>Stable logical ID, e.g. "LINE3-OPCUA".</summary>
public required string ClusterId { get; set; }
public required string Name { get; set; }
/// <summary>UNS level 1. Canonical org value: "zb" per decision #140.</summary>
public required string Enterprise { get; set; }
/// <summary>UNS level 2, e.g. "warsaw-west".</summary>
public required string Site { get; set; }
public byte NodeCount { get; set; }
public required RedundancyMode RedundancyMode { get; set; }
public bool Enabled { get; set; } = true;
public string? Notes { get; set; }
public DateTime CreatedAt { get; set; } = DateTime.UtcNow;
public required string CreatedBy { get; set; }
public DateTime? ModifiedAt { get; set; }
public string? ModifiedBy { get; set; }
// Navigation
public ICollection<ClusterNode> Nodes { get; set; } = [];
public ICollection<Namespace> Namespaces { get; set; } = [];
public ICollection<ConfigGeneration> Generations { get; set; } = [];
}

View File

@@ -0,0 +1,47 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>
/// One canonical tag (signal) in a cluster's generation. Per decision #110:
/// <see cref="EquipmentId"/> is REQUIRED when the driver is in an Equipment-kind namespace
/// and NULL when in SystemPlatform-kind namespace (Galaxy hierarchy preserved).
/// </summary>
public sealed class Tag
{
public Guid TagRowId { get; set; }
public long GenerationId { get; set; }
public required string TagId { get; set; }
public required string DriverInstanceId { get; set; }
public string? DeviceId { get; set; }
/// <summary>
/// Required when driver is in Equipment-kind namespace; NULL when in SystemPlatform-kind.
/// Cross-table invariant enforced by sp_ValidateDraft (decision #110).
/// </summary>
public string? EquipmentId { get; set; }
public required string Name { get; set; }
/// <summary>Only used when <see cref="EquipmentId"/> is NULL (SystemPlatform namespace).</summary>
public string? FolderPath { get; set; }
/// <summary>OPC UA built-in type name (Boolean / Int32 / Float / etc.).</summary>
public required string DataType { get; set; }
public required TagAccessLevel AccessLevel { get; set; }
/// <summary>Per decisions #4445 — opt-in for write retry eligibility.</summary>
public bool WriteIdempotent { get; set; }
public string? PollGroupId { get; set; }
/// <summary>Register address / scaling / poll group / byte-order / etc. — schemaless per driver type.</summary>
public required string TagConfig { get; set; }
public ConfigGeneration? Generation { get; set; }
}

View File

@@ -0,0 +1,21 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>UNS level-3 segment. Generation-versioned per decision #115.</summary>
public sealed class UnsArea
{
public Guid UnsAreaRowId { get; set; }
public long GenerationId { get; set; }
public required string UnsAreaId { get; set; }
public required string ClusterId { get; set; }
/// <summary>UNS level 3 segment: matches <c>^[a-z0-9-]{1,32}$</c> OR equals literal <c>_default</c>.</summary>
public required string Name { get; set; }
public string? Notes { get; set; }
public ConfigGeneration? Generation { get; set; }
public ServerCluster? Cluster { get; set; }
}

View File

@@ -0,0 +1,21 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>UNS level-4 segment. Generation-versioned per decision #115.</summary>
public sealed class UnsLine
{
public Guid UnsLineRowId { get; set; }
public long GenerationId { get; set; }
public required string UnsLineId { get; set; }
/// <summary>Logical FK to <see cref="UnsArea.UnsAreaId"/>; resolved within the same generation.</summary>
public required string UnsAreaId { get; set; }
/// <summary>UNS level 4 segment: matches <c>^[a-z0-9-]{1,32}$</c> OR equals literal <c>_default</c>.</summary>
public required string Name { get; set; }
public string? Notes { get; set; }
public ConfigGeneration? Generation { get; set; }
}

View File

@@ -0,0 +1,10 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Enums;
/// <summary>Credential kind for <see cref="Entities.ClusterNodeCredential"/>. Per decision #83.</summary>
public enum CredentialKind
{
SqlLogin,
ClientCertThumbprint,
ADPrincipal,
gMSA,
}

View File

@@ -0,0 +1,10 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Enums;
/// <summary>Generation lifecycle state. Draft → Published → Superseded | RolledBack.</summary>
public enum GenerationStatus
{
Draft,
Published,
Superseded,
RolledBack,
}

View File

@@ -0,0 +1,25 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Enums;
/// <summary>OPC UA namespace kind per decision #107. One of each kind per cluster per generation.</summary>
public enum NamespaceKind
{
/// <summary>
/// Equipment namespace — raw signals from native-protocol drivers (Modbus, AB CIP, AB Legacy,
/// S7, TwinCAT, FOCAS, and OpcUaClient when gatewaying raw equipment). UNS 5-level hierarchy
/// applies.
/// </summary>
Equipment,
/// <summary>
/// System Platform namespace — Galaxy / MXAccess processed data (v1 LmxOpcUa folded in).
/// UNS rules do NOT apply; Galaxy hierarchy preserved as v1 expressed it.
/// </summary>
SystemPlatform,
/// <summary>
/// Reserved for future replay driver per handoff §"Digital Twin Touchpoints" — not populated
/// in v2.0 but enum value reserved so the schema does not need to change when the replay
/// driver lands.
/// </summary>
Simulated,
}

View File

@@ -0,0 +1,12 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Enums;
/// <summary>ACL scope level. Per <c>acl-design.md</c> §"Scope Hierarchy".</summary>
public enum NodeAclScopeKind
{
Cluster,
Namespace,
UnsArea,
UnsLine,
Equipment,
Tag,
}

View File

@@ -0,0 +1,10 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Enums;
/// <summary>Status tracked per node in <see cref="Entities.ClusterNodeGenerationState"/>.</summary>
public enum NodeApplyStatus
{
Applied,
RolledBack,
Failed,
InProgress,
}

View File

@@ -0,0 +1,37 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Enums;
/// <summary>
/// OPC UA client data-path permissions per <c>acl-design.md</c>.
/// Stored as <c>int</c> bitmask in <see cref="Entities.NodeAcl.PermissionFlags"/>.
/// </summary>
[Flags]
public enum NodePermissions : uint
{
None = 0,
// Read-side
Browse = 1 << 0,
Read = 1 << 1,
Subscribe = 1 << 2,
HistoryRead = 1 << 3,
// Write-side (mirrors v1 SecurityClassification model)
WriteOperate = 1 << 4,
WriteTune = 1 << 5,
WriteConfigure = 1 << 6,
// Alarm-side
AlarmRead = 1 << 7,
AlarmAcknowledge = 1 << 8,
AlarmConfirm = 1 << 9,
AlarmShelve = 1 << 10,
// OPC UA Part 4 §5.11
MethodCall = 1 << 11,
// Bundles (one-click grants in Admin UI)
ReadOnly = Browse | Read | Subscribe | HistoryRead | AlarmRead,
Operator = ReadOnly | WriteOperate | AlarmAcknowledge | AlarmConfirm,
Engineer = Operator | WriteTune | AlarmShelve,
Admin = Engineer | WriteConfigure | MethodCall,
}

View File

@@ -0,0 +1,17 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Enums;
/// <summary>
/// Cluster redundancy mode per OPC UA Part 5 §6.5. Persisted as string in
/// <c>ServerCluster.RedundancyMode</c> with a CHECK constraint coupling to <c>NodeCount</c>.
/// </summary>
public enum RedundancyMode
{
/// <summary>Single-node cluster. Required when <c>NodeCount = 1</c>.</summary>
None,
/// <summary>Warm redundancy (non-transparent). Two-node cluster.</summary>
Warm,
/// <summary>Hot redundancy (non-transparent). Two-node cluster.</summary>
Hot,
}

View File

@@ -0,0 +1,9 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Enums;
/// <summary>Per-node redundancy role within a cluster. Per decision #84.</summary>
public enum RedundancyRole
{
Primary,
Secondary,
Standalone,
}

View File

@@ -0,0 +1,8 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Enums;
/// <summary>External-ID reservation kind. Per decision #124.</summary>
public enum ReservationKind
{
ZTag,
SAPID,
}

View File

@@ -0,0 +1,8 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Enums;
/// <summary>Tag-level OPC UA access level baseline. Further narrowed per-user by NodeAcl grants.</summary>
public enum TagAccessLevel
{
Read,
ReadWrite,
}

View File

@@ -0,0 +1,15 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.LocalCache;
/// <summary>
/// A self-contained snapshot of one generation — enough to rebuild the address space on a node
/// that has lost DB connectivity. The payload is the JSON-serialized <c>sp_GetGenerationContent</c>
/// result; the local cache doesn't inspect the shape, it just round-trips bytes.
/// </summary>
public sealed class GenerationSnapshot
{
public int Id { get; set; } // LiteDB auto-ID
public required string ClusterId { get; set; }
public required long GenerationId { get; set; }
public required DateTime CachedAt { get; set; }
public required string PayloadJson { get; set; }
}

View File

@@ -0,0 +1,12 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.LocalCache;
/// <summary>
/// Per-node local cache of the most-recently-applied generation(s). Used to bootstrap the
/// address space when the central DB is unreachable (decision #79 — degraded-but-running).
/// </summary>
public interface ILocalConfigCache
{
Task<GenerationSnapshot?> GetMostRecentAsync(string clusterId, CancellationToken ct = default);
Task PutAsync(GenerationSnapshot snapshot, CancellationToken ct = default);
Task PruneOldGenerationsAsync(string clusterId, int keepLatest = 10, CancellationToken ct = default);
}

View File

@@ -0,0 +1,89 @@
using LiteDB;
namespace ZB.MOM.WW.OtOpcUa.Configuration.LocalCache;
/// <summary>
/// LiteDB-backed <see cref="ILocalConfigCache"/>. One file per node (default
/// <c>config_cache.db</c>), one collection per snapshot. Corruption surfaces as
/// <see cref="LocalConfigCacheCorruptException"/> on construction or read — callers should
/// delete and re-fetch from the central DB (decision #80).
/// </summary>
public sealed class LiteDbConfigCache : ILocalConfigCache, IDisposable
{
private const string CollectionName = "generations";
private readonly LiteDatabase _db;
private readonly ILiteCollection<GenerationSnapshot> _col;
public LiteDbConfigCache(string dbPath)
{
// LiteDB can be tolerant of header-only corruption at construction time (it may overwrite
// the header and "recover"), so we force a write + read probe to fail fast on real corruption.
try
{
_db = new LiteDatabase(new ConnectionString { Filename = dbPath, Upgrade = true });
_col = _db.GetCollection<GenerationSnapshot>(CollectionName);
_col.EnsureIndex(s => s.ClusterId);
_col.EnsureIndex(s => s.GenerationId);
_ = _col.Count();
}
catch (Exception ex) when (ex is LiteException or InvalidDataException or IOException
or NotSupportedException or UnauthorizedAccessException
or ArgumentOutOfRangeException or FormatException)
{
_db?.Dispose();
throw new LocalConfigCacheCorruptException(
$"LiteDB cache at '{dbPath}' is corrupt or unreadable — delete the file and refetch from the central DB.",
ex);
}
}
public Task<GenerationSnapshot?> GetMostRecentAsync(string clusterId, CancellationToken ct = default)
{
ct.ThrowIfCancellationRequested();
var snapshot = _col
.Find(s => s.ClusterId == clusterId)
.OrderByDescending(s => s.GenerationId)
.FirstOrDefault();
return Task.FromResult<GenerationSnapshot?>(snapshot);
}
public Task PutAsync(GenerationSnapshot snapshot, CancellationToken ct = default)
{
ct.ThrowIfCancellationRequested();
// upsert by (ClusterId, GenerationId) — replace in place if already cached
var existing = _col
.Find(s => s.ClusterId == snapshot.ClusterId && s.GenerationId == snapshot.GenerationId)
.FirstOrDefault();
if (existing is null)
_col.Insert(snapshot);
else
{
snapshot.Id = existing.Id;
_col.Update(snapshot);
}
return Task.CompletedTask;
}
public Task PruneOldGenerationsAsync(string clusterId, int keepLatest = 10, CancellationToken ct = default)
{
ct.ThrowIfCancellationRequested();
var doomed = _col
.Find(s => s.ClusterId == clusterId)
.OrderByDescending(s => s.GenerationId)
.Skip(keepLatest)
.Select(s => s.Id)
.ToList();
foreach (var id in doomed)
_col.Delete(id);
return Task.CompletedTask;
}
public void Dispose() => _db.Dispose();
}
public sealed class LocalConfigCacheCorruptException(string message, Exception inner)
: Exception(message, inner);

File diff suppressed because it is too large Load Diff

Some files were not shown because too many files have changed in this diff Show More