fe91d42927b3e8cea1d105a5321032fbd87d84a1
16 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
fe91d42927 |
PR 7.2 — Retire legacy Galaxy projects + service
Matrix-gate satisfied (14 passed / 1 skipped / 0 failed on 2026-04-30 per docs/v2/Galaxy.ParityMatrix.md). Galaxy access flows through the in-process GalaxyDriver → mxaccessgw exclusively. Legacy infrastructure deleted in this commit: Source projects (6): - src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host (.NET 4.8 x86 + MXAccess COM) - src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy (in-process pipe client) - src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared (pipe-IPC contracts) - tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests - tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests - tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests Test projects with no consumer after legacy retired (3): - tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E (drove Galaxy.Host EXE) - tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests (drove both backends) - tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport (only consumed by Host/Proxy tests) Edits: - ZB.MOM.WW.OtOpcUa.slnx: drop nine project entries - Server.csproj: drop Driver.Galaxy.Proxy ProjectReference - Server/Program.cs: drop GalaxyProxyDriverFactoryExtensions.Register + the parallel-registration comment block; only GalaxyDriverFactoryExtensions registers now under DriverType "GalaxyMxGateway" - Install-Services.ps1: rewrite to drop OtOpcUaGalaxyHost service install + the GalaxySharedSecret/ZbConnection/GalaxyClientName/GalaxyPipeName/ AvevaServiceDependencies/MxAccessInitialConnect* parameters that only applied to the legacy host. Adds a closing note pointing operators at the separate mxaccessgw install - Uninstall-Services.ps1: keep OtOpcUaGalaxyHost in the cleanup loop so pre-7.2 rigs upgrade-uninstall cleanly, plus add OtOpcUaWonderwareHistorian - scripts/e2e/test-galaxy.ps1: deleted (drove the legacy E2E) - scripts/e2e/e2e-config.sample.json: rewrite the galaxy section comment to reflect the GalaxyMxGateway-only path - scripts/e2e/README.md: drop OtOpcUaGalaxyHost references - scripts/compliance/phase-7-compliance.ps1: drop Galaxy.Shared HistorianAlarms* checks (those contracts moved to Driver.Historian.Wonderware.Client in PR 3.4) Live state: OtOpcUaGalaxyHost Windows service stopped + removed via NSSM before this commit. The dev box's Galaxy access is now exclusively through the running mxaccessgw (separate repo). Stays out of scope for PR 7.2 (PR 7.3 territory): - CLAUDE.md Galaxy section rewrite - mxaccess_documentation.md deletion - Memory entries for the now-retired Galaxy.Host service Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
21cac4c8c4 |
PR 4.W — Galaxy:Backend wiring + server-side factory registration
- GalaxyDriver.InitializeAsync now builds the production gw runtime (MxGatewayClient, GalaxyMxSession, GatewayGalaxySubscriber, GatewayGalaxyDataWriter, ReconnectSupervisor, HostConnectivityForwarder, PerPlatformProbeWatcher) when no test seams are pre-injected; Dispose tears the chain down in order. - GetHealth surfaces supervisor.IsDegraded as DriverState.Degraded so a transport drop is observable without polling the supervisor directly. - DiscoverAsync now refreshes the per-platform probe watcher's membership against $WinPlatform / $AppEngine objects after every discovery pass. - OnPumpDataChange routes ScanState changes through the probe watcher in addition to fanning out OnDataChange to ISubscribable consumers. - Server registers GalaxyDriver under "GalaxyMxGateway" alongside the legacy "Galaxy" GalaxyProxyDriver factory so DriverInstance rows can opt in. - Bumped Server.Tests' Microsoft.Extensions.Logging.Abstractions to 10.0.7 to resolve the downgrade pulled in transitively via MxGateway.Client. - Lifecycle factory tests switched to the internal seam-injection ctor so they no longer attempt a real gRPC connect during InitializeAsync. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
854827090a |
PR 3.W — Phase 3 wire-up: Wonderware sidecar DI registration
Solution + DI plumbing to complete Phase 3. With this PR the .NET 10 server
can boot with the Wonderware historian sidecar in the loop, gated by config
so existing deployments are unaffected.
slnx: registers Driver.Historian.Wonderware (net48 sidecar),
Driver.Historian.Wonderware.Client (net10 client), and both test projects.
Server.csproj: adds ProjectReference to the .NET 10 client.
Program.cs: reads Historian:Wonderware:* configuration. When Enabled=true,
constructs a WonderwareHistorianClient singleton and:
- Registers it as IAlarmHistorianWriter so the SqliteStoreAndForwardSink
drain (task #248) can pick it up.
- Registers a WonderwareHistorianBootstrap hosted service that, on
StartAsync, calls IHistoryRouter.Register(prefix, client) under the
configured DriverInstancePrefix (default "galaxy") — lets the
HistoryRead* dispatch in DriverNodeManager find the sidecar via
longest-prefix-match resolution.
When Enabled=false (the default), DriverNodeManager keeps using its
internal LegacyDriverHistoryAdapter for the read path and the existing
NullAlarmHistorianSink stays in place — drop-in compatible with every
deployment that hasn't moved off Galaxy.Host yet.
42 server integration tests + 10 client tests pass. Full solution build
clean (0/0).
Note: scripts/install/Install-Services.ps1 and
src/.../Server/appsettings.json carry intermixed user WIP and are NOT
committed in this PR. Equivalent edits applied locally:
Install-Services.ps1: new -InstallWonderwareHistorian switch installs the
OtOpcUaWonderwareHistorian service alongside OtOpcUaGalaxyHost;
generates a fresh historian shared secret; OtOpcUa service depends on
both when historian sidecar is installed.
Server/appsettings.json: new Historian.Wonderware section with
Enabled=false default, PipeName/SharedSecret/PeerName/
DriverInstancePrefix/ConnectTimeoutSeconds/CallTimeoutSeconds keys.
Both pieces should land in a follow-up commit once the user's WIP on those
files clears.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
de77d42eab |
Phase 6.3 Stream B — peer-probe HostedServices populating PeerReachabilityTracker
Closes task #116 (GA hardening backlog). Before this commit the RedundancyStatePublisher saw PeerReachability.Unknown for every peer because the tracker had no writers — every healthy peer got degraded to the Isolated-Primary band (230) even when fully reachable. Not release-blocking (safe default), but not the full non-transparent- redundancy UX either. Two-layer probe model per docs/v2/implementation/phase-6-3-redundancy-runtime.md §Stream B: - PeerHttpProbeLoop (Stream B.1) — fast-fail layer at 2 s / 1 s timeout. Hits each peer's http://{Host}:{DashboardPort}/healthz via an injected IHttpClientFactory. Writes the HTTP bit of PeerReachability while preserving the UA bit from the last UA probe so a transient HTTP blip doesn't clobber the authoritative UA reading. - PeerUaProbeLoop (Stream B.2) — authoritative layer at 10 s / 5 s timeout. Calls DiscoveryClient.GetEndpoints against opc.tcp://{Host}: {OpcUaPort} — cheap compared to a full Session.Create, no cert trust required. Short-circuits when the HTTP probe last reported the peer unhealthy (no wasted handshakes on a known-dead endpoint), clearing the stale UaHealthy bit in that case. Both inherit from BackgroundService, follow the tick/delay/catch pattern RedundancyPublisherHostedService + ResilienceStatusPublisherHostedService established, and expose TickAsync() as internal for test drive-through. New PeerProbeOptions class carries the four intervals/timeouts so operators can tune cadence per site. Registered as singleton in Program.cs; HTTP client registered by name so the OtOpcUa handler chain (Serilog enrichers, potential future OpenTelemetry instrumentation) isn't bypassed. Tests — 9 new unit tests across PeerHttpProbeLoopTests (5) and PeerUaProbeLoopTests (4). All pass. Server.Tests total 243 → 252. Full solution build clean. Docs: v2-release-readiness.md Phase 6.3 follow-ups list marks the peer-probe bullet struck-through with a close-out note. Still deferred in Phase 6.3: - OPC UA variable-node binding (task #117 — ServiceLevel + ServerUriArray) - sp_PublishGeneration lease wrap (task #118) - Client interop matrix (task #119) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
69e0d02c72 |
task-galaxy-e2e branch — non-FOCAS work-in-progress snapshot
Catch-all commit for pending work on the task-galaxy-e2e branch that
wasn't part of the FOCAS migration. Grouping by topic so future per-topic
commits can be cherry-picked if needed.
TwinCAT
- src/.../Driver.TwinCAT/AdsTwinCATClient.cs + TwinCATDriverFactoryExtensions.cs:
factory-registration extensions + ADS client refinements.
- src/.../Driver.TwinCAT.Cli/Commands/BrowseCommand.cs: new browse command
for the TwinCAT test-client CLI.
- tests/.../Driver.TwinCAT.IntegrationTests/TwinCAT3SmokeTests.cs + TwinCatProject/:
fixture scaffold with a minimal POU + README pointing at the TCBSD/ESXi
VM for e2e.
- docs/Driver.TwinCAT.Cli.md + docs/drivers/TwinCAT-Test-Fixture.md:
documentation for the above.
- docs/v3/twincat-backlog.md: forward-looking backlog seed.
Admin UI + fleet status
- src/.../Admin/Components/Pages/Clusters/DriversTab.razor + Hosts.razor:
UI refresh for fleet-status rendering.
- src/.../Admin/Hubs/FleetStatusHub.cs + FleetStatusPoller.cs +
Admin/Program.cs: SignalR hub + poller plumbing for live fleet data.
- tests/.../Admin.Tests/FleetStatusPollerTests.cs: poller coverage.
Server + redundancy runtime (Phase 6.3 follow-ups)
- src/.../Server/Hosting/RedundancyPublisherHostedService.cs: HostedService
that owns the RedundancyStatePublisher lifecycle + wires peer reachability.
- src/.../Server/Redundancy/ServerRedundancyNodeWriter.cs: OPC UA
variable-node writer binding ServiceLevel + ServerUriArray to the
publisher's events.
- src/.../Server/Program.cs + Server.csproj: hosted-service registration.
- tests/.../Server.Tests/ServerRedundancyNodeWriterTests.cs +
Server.Tests.csproj: coverage for the above.
Configuration
- src/.../Configuration/Validation/DraftValidator.cs +
tests/.../Configuration.Tests/DraftValidatorTests.cs: draft-validation
refinements.
E2E scripts (shared infrastructure)
- scripts/e2e/README.md + _common.ps1 + test-all.ps1: shared helpers + the
all-drivers test-all runner.
- scripts/e2e/test-opcuaclient.ps1: OPC UA Client e2e runner.
Docs
- docs/v2/implementation/phase-6-{1,2,3,4}*.md + exit-gate-phase-{3,7}.md:
phase-gate + implementation doc updates.
- docs/v2/plan.md: top-level plan refresh.
- docs/v2/redundancy-interop-playbook.md: client interop playbook for the
Phase 6.3 redundancy-runtime work.
Two orphan FOCAS docs remain on disk but deliberately unstaged —
docs/v2/focas-deployment.md and docs/v2/implementation/focas-simulator-plan.md
describe the now-retired Tier-C topology and should either be rewritten
or deleted in a follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
7ba783de77 |
Tasks #211 #212 #213 — AbCip / S7 / AbLegacy server-side factories + seed SQL
Parent: #209. Follow-up to #210 (Modbus). Registers the remaining three non-Galaxy driver factories so a Config DB `DriverType` in {`AbCip`, `S7`, `AbLegacy`} actually boots a live driver instead of being silently skipped by DriverInstanceBootstrapper. Each factory follows the same shape as ModbusDriverFactoryExtensions + the existing Galaxy + FOCAS patterns: - Static `Register(DriverFactoryRegistry)` entry point. - Internal `CreateInstance(driverInstanceId, driverConfigJson)` — deserialises a DTO, strict-parses enum fields (fail-fast with an explicit "expected one of" list), composes the driver's options object, returns a new driver. - DriverType keys: `"AbCip"`, `"S7"`, `"AbLegacy"` (case-insensitive at the registry layer). DTO surfaces cover every option the respective driver's Options class exposes — devices, tags, probe, timeouts, per-driver quirks (AbCip `EnableControllerBrowse` / `EnableAlarmProjection`, S7 Rack/Slot/ CpuType, AbLegacy PlcFamily). Seed SQL (mirrors `seed-modbus-smoke.sql` shape): - `seed-abcip-smoke.sql` — `abcip-smoke` cluster + ControlLogix device + `TestDINT:DInt` tag, pointing at the ab_server compose fixture (`ab://127.0.0.1:44818/1,0`). - `seed-s7-smoke.sql` — `s7-smoke` cluster + S71500 CPU + `DB1.DBW0:Int16` tag at the python-snap7 fixture (`127.0.0.1:1102`, non-priv port). - `seed-ablegacy-smoke.sql` — `ablegacy-smoke` cluster + SLC 500 + `N7:5` tag. Hardware-gated per #222; placeholder gateway to be replaced with real SLC/MicroLogix/PLC-5/RSEmulate before running. Build plumbing: - Each driver project now ProjectReferences `Core` (was `Core.Abstractions`-only). `DriverFactoryRegistry` lives in `Core.Hosting` so the factory extensions can't compile without it. Matches the FOCAS + Galaxy.Proxy reference shape. - `Server.csproj` adds the three new driver ProjectReferences so Program.cs resolves the symbols at compile-time + ships the assemblies at runtime. Full-solution build: 0 errors, 334 pre-existing xUnit1051 warnings only. Live boot verification of all four (Modbus + these three) happens in the exit-gate PR — factories + seeds are pre-conditions and are being shipped first so the exit-gate PR can scope to "does the server publish the expected NodeIds + does the e2e script pass." Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
55245a962e |
Task #210 — Modbus server-side factory + seed SQL (closes first of #209 umbrella)
Parent: #209. Adds the server-side wiring so a Config DB `DriverType='Modbus'` row actually boots a Modbus driver instance + publishes its tags under OPC UA NodeIds, instead of being silently skipped by DriverInstanceBootstrapper. Changes: - `ModbusDriverFactoryExtensions` (new) — mirrors `GalaxyProxyDriverFactoryExtensions` + `FocasDriverFactoryExtensions`. `DriverTypeName="Modbus"`, `CreateInstance` deserialises `ModbusDriverConfigDto` (Host/Port/UnitId/TimeoutMs/Probe/Tags) to a full `ModbusDriverOptions` and hands back a `ModbusDriver`. Strict enum parsing (Region / DataType / ByteOrder / StringByteOrder) — unknown values fail fast with an explicit "expected one of" error rather than at first read. - `Program.cs` — register the factory after Galaxy + FOCAS. - `Driver.Modbus.csproj` — add `Core` project reference (the DI-free factory needs `DriverFactoryRegistry` from `Core.Hosting`). Matches the FOCAS driver's reference shape. - `Server.csproj` — add the `Driver.Modbus` ProjectReference so the Program.cs registration compiles against the same assembly the server loads at runtime. - `scripts/smoke/seed-modbus-smoke.sql` (new) — one-cluster smoke seed modelled on `seed-phase-7-smoke.sql`. Creates a `modbus-smoke` cluster + `modbus-smoke-node` + Draft generation + Namespace + UnsArea/UnsLine/ Equipment + one Modbus `DriverInstance` pointing at the pymodbus standard fixture (`127.0.0.1:5020`) + one Tag at `HR[200]:UInt16`, ending in `EXEC sp_PublishGeneration`. HR[100] is deliberately *not* used because pymodbus `standard.json` runs an auto-increment action on that register. Full-solution build: 0 errors, only the pre-existing xUnit1051 warnings. AB CIP / S7 / AB Legacy factories follow in their own PRs per #211 / #212 / #213. Live boot verification happens in the exit-gate PR once all four factories are in place. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
dfe3731c73 |
Task #220 — Wire FOCAS into DriverFactoryRegistry bootstrap pipeline
Closes the non-hardware gap surfaced in the #220 audit: FOCAS had full Tier-C architecture (Driver.FOCAS + Driver.FOCAS.Host + Driver.FOCAS.Shared, supervisor, post-mortem MMF, NSSM scripts, 239 tests) but no factory registration, so config-DB DriverInstance rows of type "FOCAS" would fail at bootstrap with "unknown driver type". Hardware-gated FwlibHostedBackend (real Fwlib32 P/Invoke inside the Host process) stays deferred under #222 lab-rig. Ships: - FocasDriverFactoryExtensions.Register(registry) mirroring the Galaxy pattern. JSON schema selects backend via "Backend" field: "ipc" (default) — IpcFocasClientFactory → named-pipe FocasIpcClient → Driver.FOCAS.Host process (Tier-C isolation) "fwlib" — direct in-process FwlibFocasClientFactory (P/Invoke) "unimplemented" — UnimplementedFocasClientFactory (fail-fast on use — useful for staging DriverInstance rows pre-Host-deploy) - Devices / Tags / Probe / Timeout / Series feed into FocasDriverOptions. Series validated eagerly at top-level so typos fail at bootstrap, not first read. Tag DataType + Series enum values surface clear errors listing valid options. - Program.cs adds FocasDriverFactoryExtensions.Register alongside Galaxy. - Driver.FOCAS.csproj references Core (for DriverFactoryRegistry). - Server.csproj adds Driver.FOCAS ProjectReference so the factory type is reachable from Program.cs. Tests: 13 new FocasDriverFactoryExtensionsTests covering: registry entry, case-insensitive lookup, ipc backend with full config, ipc defaults, missing PipeName/SharedSecret errors, fwlib backend short-path, unimplemented backend, unknown-backend error, unknown-Series error, tag missing DataType, null/ws args, duplicate-register throws. Regression: 202 FOCAS + 13 FOCAS.Host + 24 FOCAS.Shared + 239 Server all pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
3d78033ea4 |
Driver-instance bootstrap pipeline (#248) — DriverInstance rows materialise as live IDriver instances
Closes the gap surfaced by Phase 7 live smoke (#240): DriverInstance rows in the central config DB had no path to materialise as live IDriver instances in DriverHost, so virtual-tag scripts read BadNodeIdUnknown for every tag. ## DriverFactoryRegistry (Core.Hosting) Process-singleton type-name → factory map. Each driver project's static Register call pre-loads its factory at Program.cs startup; the bootstrapper looks up by DriverInstance.DriverType + invokes with (DriverInstanceId, DriverConfig JSON). Case-insensitive; duplicate-type registration throws. ## GalaxyProxyDriverFactoryExtensions.Register (Driver.Galaxy.Proxy) Static helper — no Microsoft.Extensions.DependencyInjection dep, keeps the driver project free of DI machinery. Parses DriverConfig JSON for PipeName + SharedSecret + ConnectTimeoutMs. DriverInstanceId from the row wins over JSON per the schema's UX_DriverInstance_Generation_LogicalId. ## DriverInstanceBootstrapper (Server) After NodeBootstrap loads the published generation: queries DriverInstance rows scoped to that generation, looks up the factory per row, constructs + DriverHost.RegisterAsync (which calls InitializeAsync). Per plan decision #12 (driver isolation), failure of one driver doesn't prevent others — logs ERR + continues + returns the count actually registered. Unknown DriverType (factory not registered) logs WRN + skips so a missing-assembly deployment doesn't take down the whole server. ## Wired into OpcUaServerService.ExecuteAsync After NodeBootstrap.LoadCurrentGenerationAsync, before PopulateEquipmentContentAsync + Phase7Composer.PrepareAsync. The Phase 7 chain now sees a populated DriverHost so CachedTagUpstreamSource has an upstream feed. ## Live evidence on the dev box Re-ran the Phase 7 smoke from task #240. Pre-#248 vs post-#248: Equipment namespace snapshots loaded for 0/0 driver(s) ← before Equipment namespace snapshots loaded for 1/1 driver(s) ← after Galaxy.Host pipe ACL denied our SID (env-config issue documented in docs/ServiceHosting.md, NOT a code issue) — the bootstrapper logged it as "failed to initialize, driver state will reflect Faulted" and continued past the failure exactly per plan #12. The rest of the pipeline (Equipment walker + Phase 7 composer) ran to completion. ## Tests — 5 new DriverFactoryRegistryTests Register + TryGet round-trip, case-insensitive lookup, duplicate-type throws, null-arg guards, RegisteredTypes snapshot. Pure functions; no DI/DB needed. The bootstrapper's DB-query path is exercised by the live smoke (#240) which operators run before each release. |
||
|
|
f64a8049d8 |
Phase 7 follow-up #243 — CachedTagUpstreamSource + Phase7EngineComposer
Ships the composition kernel that maps Config DB rows (Script / VirtualTag /
ScriptedAlarm) to the runtime definitions VirtualTagEngine + ScriptedAlarmEngine
consume, builds the engine instances, and wires OnEvent → historian-sink routing.
## src/ZB.MOM.WW.OtOpcUa.Server/Phase7/
- CachedTagUpstreamSource — implements both Core.VirtualTags.ITagUpstreamSource and
Core.ScriptedAlarms.ITagUpstreamSource (identical shape, distinct namespaces) on one
concrete type so the composer can hand one instance to both engines. Thread-safe
ConcurrentDictionary value cache with synchronous ReadTag + fire-on-write
Push(path, snapshot) that fans out to every observer registered via SubscribeTag.
Unknown-path reads return a BadNodeIdUnknown-quality snapshot (status 0x80340000)
so scripts branch on quality naturally.
- Phase7EngineComposer.Compose(scripts, virtualTags, scriptedAlarms, upstream,
alarmStateStore, historianSink, rootScriptLogger, loggerFactory) — single static
entry point that:
* Indexes scripts by ScriptId, resolves VirtualTag.ScriptId + ScriptedAlarm.PredicateScriptId
to full SourceCode
* Projects DB rows to VirtualTagDefinition + ScriptedAlarmDefinition (mapping
DataType string → DriverDataType enum, AlarmType string → AlarmKind enum,
Severity 1..1000 → AlarmSeverity bucket matching the OPC UA Part 9 bands
that AbCipAlarmProjection + OpcUaClient MapSeverity already use)
* Constructs VirtualTagEngine + loads definitions (throws InvalidOperationException
with the list of scripts that failed to compile — aggregated like Streams B+C)
* Constructs ScriptedAlarmEngine + loads definitions + wires OnEvent →
IAlarmHistorianSink.EnqueueAsync using ScriptedAlarmEvent.Emission as the event
kind + Condition.LastAckUser/LastAckComment for audit fields
* Returns Phase7ComposedSources with Disposables list the caller owns
Empty Phase 7 config returns Phase7ComposedSources.Empty so deployments without
scripts / alarms behave exactly as pre-Phase-7. Non-null sources flow into
OpcUaApplicationHost's virtualReadable / scriptedAlarmReadable plumbing landed by
task #239 — DriverNodeManager then dispatches reads by NodeSourceKind per PR #186.
## Tests — 12/12
CachedTagUpstreamSourceTests (6):
- Unknown-path read returns BadNodeIdUnknown-quality snapshot
- Push-then-Read returns cached value
- Push fans out to subscribers in registration order
- Push to one path doesn't fire another path's observer
- Dispose of subscription handle stops fan-out
- Satisfies both Core.VirtualTags + Core.ScriptedAlarms ITagUpstreamSource interfaces
Phase7EngineComposerTests (6):
- Empty rows → Phase7ComposedSources.Empty (both sources null)
- VirtualTag rows → VirtualReadable non-null + Disposables populated
- Missing script reference throws InvalidOperationException with the missing ScriptId
in the message
- Disabled VirtualTag row skipped by projection
- TimerIntervalMs → TimeSpan.FromMilliseconds round-trip
- Severity 1..1000 maps to Low/Medium/High/Critical at 250/500/750 boundaries
(matches AbCipAlarmProjection + OpcUaClient.MapSeverity banding)
## Scope — what this PR does NOT do
The composition kernel is the tricky part; the remaining wiring is three narrower
follow-ups that each build on this PR:
- task #244 — driver-bridge feed that populates CachedTagUpstreamSource from live
driver subscriptions. Without this, ctx.GetTag returns BadNodeIdUnknown even when
the driver has a fresh value.
- task #245 — ScriptedAlarmReadable adapter exposing each alarm's current Active
state as IReadable. Phase7EngineComposer.Compose currently returns
ScriptedAlarmReadable=null so reads on Source=ScriptedAlarm variables return
BadNotFound per the ADR-002 "misconfiguration not silent fallback" signal.
- task #246 — Program.cs call to Phase7EngineComposer.Compose with config rows
loaded from the sealed-cache DB read, plus SqliteStoreAndForwardSink lifecycle
wiring at %ProgramData%/OtOpcUa/alarm-historian-queue.db with the Galaxy.Host
IPC writer from Stream D.
Task #240 (live OPC UA E2E smoke) depends on all three follow-ups landing.
|
||
|
|
5c0d3154c1 |
Roslyn analyzer — detect unwrapped driver-capability calls (OTOPCUA0001). Closes task #200. New netstandard2.0 analyzer project src/ZB.MOM.WW.OtOpcUa.Analyzers registered as an <Analyzer>-item ProjectReference from the Server csproj so the warning fires at every Server compile. First (and only so far) rule OTOPCUA0001 — "Driver capability call must be wrapped in CapabilityInvoker" — walks every InvocationOperation in the AST + trips when (a) the target method implements one of the seven guarded capability interfaces (IReadable / IWritable / ITagDiscovery / ISubscribable / IHostConnectivityProbe / IAlarmSource / IHistoryProvider) AND (b) the method's return type is Task, Task<T>, ValueTask, or ValueTask<T> — the async-wire-call constraint narrows the rule to the surfaces the Phase 6.1 pipeline actually wraps + sidesteps pure in-memory accessors like IHostConnectivityProbe.GetHostStatuses() which would trigger false positives AND (c) the call does NOT sit inside a lambda argument passed to CapabilityInvoker.ExecuteAsync / ExecuteWriteAsync / AlarmSurfaceInvoker.*. The wrapper detection walks up the syntax tree from the call site, finds any enclosing InvocationExpressionSyntax whose method's containing type is one of the wrapper classes, + verifies the call lives transitively inside that invocation's AnonymousFunctionExpressionSyntax argument — a sibling "result = await driver.ReadAsync(...)" followed by a separate invoker.ExecuteAsync(...) call does NOT satisfy the wrapping rule + the analyzer flags it (regression guard in the 5th test). Five xunit-v3 + Shouldly tests at tests/ZB.MOM.WW.OtOpcUa.Analyzers.Tests: direct ReadAsync in server namespace trips; wrapped ReadAsync inside CapabilityInvoker.ExecuteAsync lambda passes; direct WriteAsync trips; direct DiscoverAsync trips; sneaky pattern — read outside the lambda + ExecuteAsync with unrelated lambda nearby — still trips. Hand-rolled test harness compiles a stub-plus-user snippet via CSharpCompilation.WithAnalyzers + runs GetAnalyzerDiagnosticsAsync directly, deliberately avoiding Microsoft.CodeAnalysis.CSharp.Analyzer.Testing.XUnit because that package pins to xunit v2 + this repo is on xunit.v3 everywhere else. RS2008 release-tracking noise suppressed by adding AnalyzerReleases.Shipped.md + AnalyzerReleases.Unshipped.md as AdditionalFiles, which is the canonical Roslyn-analyzer hygiene path. Analyzer DLL referenced from Server.csproj via ProjectReference with OutputItemType=Analyzer + ReferenceOutputAssembly=false — the DLL ships as a compiler plugin, not a runtime dependency. Server build validates clean: the analyzer activates on every Server file but finds zero violations, which confirms the Phase 6.1 wrapping work done in prior PRs is complete + the analyzer is now the regression guard preventing the next new capability surface from being added raw. slnx updated with both the src + tests project entries. Full solution build clean, analyzer suite 5/5 passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9dd5e4e745 |
Phase 6.1 Stream C — health endpoints on :4841 + LogContextEnricher + Serilog JSON sink + CapabilityInvoker enrichment
Closes Stream C per docs/v2/implementation/phase-6-1-resilience-and-observability.md. Core.Observability (new namespace): - DriverHealthReport — pure-function aggregation over DriverHealthSnapshot list. Empty fleet = Healthy. Any Faulted = Faulted. Any Unknown/Initializing (no Faulted) = NotReady. Any Degraded or Reconnecting (no Faulted, no NotReady) = Degraded. Else Healthy. HttpStatus(verdict) maps to the Stream C.1 state matrix: Healthy/Degraded → 200, NotReady/Faulted → 503. - LogContextEnricher — Serilog LogContext wrapper. Push(id, type, capability, correlationId) returns an IDisposable scope; inner log calls carry DriverInstanceId / DriverType / CapabilityName / CorrelationId structured properties automatically. NewCorrelationId = 12-hex-char GUID slice for cases where no OPC UA RequestHeader.RequestHandle is in flight. CapabilityInvoker — now threads LogContextEnricher around every ExecuteAsync / ExecuteWriteAsync call site. OtOpcUaServer passes driver.DriverType through so logs correlate to the driver type too. Every capability call emits structured fields per the Stream C.4 compliance check. Server.Observability: - HealthEndpointsHost — standalone HttpListener on http://localhost:4841/ (loopback avoids Windows URL-ACL elevation; remote probing via reverse proxy or explicit netsh urlacl grant). Routes: /healthz → 200 when (configDbReachable OR usingStaleConfig); 503 otherwise. Body: status, uptimeSeconds, configDbReachable, usingStaleConfig. /readyz → DriverHealthReport.Aggregate + HttpStatus mapping. Body: verdict, drivers[], degradedDrivers[], uptimeSeconds. anything else → 404. Disposal cooperative with the HttpListener shutdown. - OpcUaApplicationHost starts the health host after the OPC UA server comes up and disposes it on shutdown. New OpcUaServerOptions knobs: HealthEndpointsEnabled (default true), HealthEndpointsPrefix (default http://localhost:4841/). Program.cs: - Serilog pipeline adds Enrich.FromLogContext + opt-in JSON file sink via `Serilog:WriteJson = true` appsetting. Uses Serilog.Formatting.Compact's CompactJsonFormatter (one JSON object per line — SIEMs like Splunk, Datadog, Graylog ingest without a regex parser). Server.Tests: - Existing 3 OpcUaApplicationHost integration tests now set HealthEndpointsEnabled=false to avoid port :4841 collisions under parallel execution. - New HealthEndpointsHostTests (9): /healthz healthy empty fleet; stale-config returns 200 with flag; unreachable+no-cache returns 503; /readyz empty/ Healthy/Faulted/Degraded/Initializing drivers return correct status and bodies; unknown path → 404. Uses ephemeral ports via Interlocked counter. Core.Tests: - DriverHealthReportTests (8): empty fleet, all-healthy, any-Faulted trumps, any-NotReady without Faulted, Degraded without Faulted/NotReady, HttpStatus per-verdict theory. - LogContextEnricherTests (8): all 4 properties attach; scope disposes cleanly; NewCorrelationId shape; null/whitespace driverInstanceId throws. - CapabilityInvokerEnrichmentTests (2): inner logs carry structured properties; no context leak outside the call site. Full solution dotnet test: 1016 passing (baseline 906, +110 for Phase 6.1 so far across Streams A+B+C). Pre-existing Client.CLI Subscribe flake unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
ef2a810b2d |
Phase 3 PR 34 — Host-status publisher (Server) + /hosts drill-down page (Admin). Closes LMX follow-up #7 by wiring together the data layer from PR 33. Server.HostStatusPublisher is a BackgroundService that walks every driver registered in DriverHost every 10 seconds, skips drivers that don't implement IHostConnectivityProbe, calls GetHostStatuses() on each probe-capable driver, and upserts one DriverHostStatus row per (NodeId, DriverInstanceId, HostName) into the central config DB. Upsert path: SingleOrDefaultAsync on the composite PK; if no row exists, Add a new one; if a row exists, LastSeenUtc advances unconditionally (heartbeat) and State + StateChangedUtc update only on transitions so Admin UI can distinguish 'still reporting, still Running' from 'freshly transitioned to Running'. MapState translates Core.Abstractions.HostState to Configuration.Enums.DriverHostState (intentional duplicate enum — Configuration project stays free of driver-runtime deps per PR 33's choice). If a driver's GetHostStatuses throws, log warning and skip that driver this tick — never take down the Server on a publisher failure. If the DB is unreachable, log warning + retry next heartbeat (no buffering — next tick's current-state snapshot is more useful than replaying stale transitions after a long outage). 2-second startup delay so NodeBootstrap's RegisterAsync calls land before the first publish tick, then tick runs immediately so a freshly-started Server surfaces its host topology in the Admin UI without waiting a full interval.
Polling chosen over event-driven for initial scope: simpler, matches Admin UI consumer cadence, avoids DriverHost lifecycle-event plumbing that doesn't exist today. Event-driven push for sub-heartbeat latency is a straightforward follow-up. Admin.Services.HostStatusService left-joins DriverHostStatus against ClusterNode on NodeId so rows persist even when the ClusterNode entry doesn't exist yet (first-boot bootstrap case). StaleThreshold = 30s — covers one missed publisher heartbeat plus a generous buffer for clock skew and GC pauses. Admin Components/Pages/Hosts.razor — FleetAdmin-visible page grouped by cluster (handles the '(unassigned)' case for rows without a matching ClusterNode). Four summary cards (Hosts / Running / Stale / Faulted); per-cluster table with Node / Driver / Host / State + Stale-badge / Last-transition / Last-seen / Detail columns; 10s auto-refresh via IServiceScopeFactory timer pattern matching FleetStatusPoller + Fleet dashboard (PR 27). Row-class highlighting: Faulted → table-danger, Stale → table-warning, else default. State badge maps DriverHostState enum to bootstrap color classes. Sidebar link added between 'Fleet status' and 'Clusters'. Server csproj adds Microsoft.EntityFrameworkCore.SqlServer 10.0.0 + registers OtOpcUaConfigDbContext in Program.cs scoped via NodeOptions.ConfigDbConnectionString (no Admin-style manual SQL raw — the DbContext is the only access path, keeps migrations owner-of-record). Tests — HostStatusPublisherTests (4 new Integration cases, uses per-run throwaway DB matching the FleetStatusPollerTests pattern): publisher upserts one row per host from each probe-capable driver and skips non-probe drivers; second tick advances LastSeenUtc without creating duplicate rows (upsert pattern verified end-to-end); state change between ticks updates State AND StateChangedUtc (datetime2(3) rounds to millisecond precision so comparison uses 1ms tolerance — documented inline); MapState translates every HostState enum member. Server.Tests Integration: 4 new tests pass. Admin build clean, Admin.Tests Unit still 23 / 0. docs/v2/lmx-followups.md item #7 marked DONE with three explicit deferred items (event-driven push, failure-count column, SignalR fan-out). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
22d3b0d23c |
Phase 3 PR 19 — LDAP user identity + Basic256Sha256 security profile. Replaces the anonymous-only endpoint with a configurable security profile and an LDAP-backed UserName token validator. New IUserAuthenticator abstraction in Backend/Security/: LdapUserAuthenticator binds to the configured directory (reuses the pattern from Admin.Security.LdapAuthService without the cross-app dependency — Novell.Directory.Ldap.NETStandard 3.6.0 package ref added to Server alongside the existing OPCFoundation packages) and maps group membership to OPC UA roles via LdapOptions.GroupToRole (case-insensitive). DenyAllUserAuthenticator is the default when Ldap.Enabled=false so UserName token attempts return a clean BadUserAccessDenied rather than hanging on a localhost:3893 bind attempt. OpcUaSecurityProfile enum + LdapOptions nested record on OpcUaServerOptions. Profile=None keeps the PR 17 shape (SecurityPolicies.None + Anonymous token only) so existing integration tests stay green; Profile=Basic256Sha256SignAndEncrypt adds a second ServerSecurityPolicy (Basic256Sha256 + SignAndEncrypt) to the collection and, when Ldap.Enabled=true, adds a UserName token policy scoped to SecurityPolicies.Basic256Sha256 only — passwords must ride an encrypted channel, the stack rejects UserName over None. OtOpcUaServer.OnServerStarted hooks SessionManager.ImpersonateUser: AnonymousIdentityToken passes through; UserNameIdentityToken delegates to IUserAuthenticator.AuthenticateAsync — rejected identities throw ServiceResultException(BadUserAccessDenied); accepted identities get a RoleBasedIdentity that carries the resolved roles through session.Identity so future PRs can gate writes by role. OpcUaApplicationHost + OtOpcUaServer constructors take IUserAuthenticator as a dependency. Program.cs binds the new OpcUaServer:Ldap section from appsettings (Enabled defaults false, GroupToRole parsed as Dictionary<string,string>), registers IUserAuthenticator as LdapUserAuthenticator when enabled or DenyAllUserAuthenticator otherwise. PR 17 integration test updated to pass DenyAllUserAuthenticator so it keeps exercising the anonymous-only path unchanged. Tests — SecurityConfigurationTests (new, 13 cases): DenyAllAuthenticator rejects every credential; LdapAuthenticator rejects blank creds without hitting the server; rejects when Enabled=false; rejects plaintext when both UseTls=false AND AllowInsecureLdap=false (safety guard matching the Admin service); EscapeLdapFilter theory (4 rows: plain passthrough, parens/asterisk/backslash → hex escape) — regression guard against LDAP injection; ExtractOuSegment theory (3 rows: finds ou=, returns null when absent, handles multiple ou segments by returning first); ExtractFirstRdnValue theory (3 rows: strips cn= prefix, handles single-segment DN, returns plain string unchanged when no =). OpcUaServerOptions_default_is_anonymous_only asserts the default posture preserves PR 17 behavior. InternalsVisibleTo('ZB.MOM.WW.OtOpcUa.Server.Tests') added to Server csproj so ExtractOuSegment and siblings are reachable from the tests. Full solution: 0 errors, 180 tests pass (8 Core + 14 Proxy + 24 Configuration + 6 Shared + 91 Galaxy.Host + 19 Server (17 unit + 2 integration) + 18 Admin). Live-LDAP integration test (connect via Basic256Sha256 endpoint with a real user from GLAuth, assert the session.Identity carries the mapped role) is deferred to a follow-up — it requires the GLAuth dev instance to be running at localhost:3893 which is dev-machine-specific, and the test harness for that also needs a fresh client-side certificate provisioned by the live server's trusted store.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
f53c39a598 |
Phase 3 PR 16 — concrete OPC UA server scaffolding with AlarmConditionState materialization. Introduces the OPCFoundation.NetStandard.Opc.Ua.Server package (v1.5.374.126, same version the v1 stack already uses) and two new server-side classes: DriverNodeManager : CustomNodeManager2 is the concrete realization of PR 15's IAddressSpaceBuilder contract — Folder() creates FolderState nodes under an Organizes hierarchy rooted at ObjectsFolder > DriverInstanceId; Variable() creates BaseDataVariableState with DataType mapped from DriverDataType (Boolean/Int32/Float/Double/String/DateTime) + ValueRank (Scalar or OneDimension) + AccessLevel CurrentReadOrWrite; AddProperty() creates PropertyState with HasProperty reference. Read hook wires OnReadValue per variable to route to IReadable.ReadAsync; Write hook wires OnWriteValue to route to IWritable.WriteAsync and surface per-tag StatusCode. MarkAsAlarmCondition() materializes an OPC UA AlarmConditionState child of the variable, seeded from AlarmConditionInfo (SourceName, InitialSeverity → UA severity via Low=250/Medium=500/High=700/Critical=900, InitialDescription), initial state Enabled + Acknowledged + Inactive + Retain=false. Returns an IAlarmConditionSink whose OnTransition updates alarm.Severity/Time/Message and switches state per AlarmType string ('Active' → SetActiveState(true) + SetAcknowledgedState(false) + Retain=true; 'Acknowledged' → SetAcknowledgedState(true); 'Inactive' → SetActiveState(false) + Retain=false if already Acked) then calls alarm.ReportEvent to emit the OPC UA event to subscribed clients. Galaxy's GalaxyAlarmTracker (PR 14) now lands at a concrete AlarmConditionState node instead of just raising an unobserved C# event. OtOpcUaServer : StandardServer wires one DriverNodeManager per DriverHost.GetDriver during CreateMasterNodeManager — anonymous endpoint, no security profile (minimum-viable; LDAP + security-profile wire-up is the next PR). DriverHost gains public GetDriver(instanceId) so the server can enumerate drivers at startup. NestedBuilder inner class in DriverNodeManager implements IAddressSpaceBuilder by temporarily retargeting the parent's _currentFolder during each call so Folder→Variable→AddProperty land under the correct subtree — not thread-safe if discovery ran concurrently, but GenericDriverNodeManager.BuildAddressSpaceAsync is sequential per driver so this is safe by construction. NuGet audit suppress for GHSA-h958-fxgg-g7w3 (moderate-severity in OPCFoundation.NetStandard.Opc.Ua.Core 1.5.374.126; v1 stack already accepts this risk on the same package version). PR 16 is scoped as scaffolding — the actual server startup (ApplicationInstance, certificate config, endpoint binding, session management wiring into OpcUaServerService.ExecuteAsync) is deferred to a follow-up PR because it needs ApplicationConfiguration XML + optional-cert-store logic that depends on per-deployment policy decisions. The materialization shape is complete: a subsequent PR adds 100 LOC to start the server and all the already-written IAddressSpaceBuilder + alarm-condition + read/write wire-up activates end-to-end. Full solution: 0 errors, 152 unit tests pass (no new tests this PR — DriverNodeManager unit testing needs an IServerInternal mock which is heavyweight; live-endpoint integration tests land alongside the server-startup PR).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
01fd90c178 |
Phase 1 Streams B–E scaffold + Phase 2 Streams A–C scaffold — 8 new projects with ~70 new tests, all green alongside the 494 v1 IntegrationTests baseline (parity preserved: no v1 tests broken; legacy OtOpcUa.Host untouched). Phase 1 finish: Configuration project (16 entities + 10 enums + DbContext + DesignTimeDbContextFactory + InitialSchema/StoredProcedures/AuthorizationGrants migrations — 8 procs including sp_PublishGeneration with MERGE on ExternalIdReservation per decision #124, sp_RollbackToGeneration cloning rows into a new published generation, sp_ValidateDraft with cross-cluster-namespace + EquipmentUuid-immutability + ZTag/SAPID reservation pre-flight, sp_ComputeGenerationDiff with CHECKSUM-based row signature — plus OtOpcUaNode/OtOpcUaAdmin SQL roles with EXECUTE grants scoped to per-principal-class proc sets and DENY UPDATE/DELETE/INSERT/SELECT on dbo schema); managed DraftValidator covering UNS segment regex, path length, EquipmentUuid immutability across generations, same-cluster namespace binding (decision #122), reservation pre-flight, EquipmentId derivation (decision #125), driver↔namespace compatibility — returning every failing rule in one pass; LiteDB local cache with round-trip + ring pruning + corruption-fast-fail; GenerationApplier with per-entity Added/Removed/Modified diff and dependency-ordered callbacks (namespace → driver → device → equipment → poll-group → tag, Removed before Added); Core project with GenericDriverNodeManager (scaffold for the Phase 2 Galaxy port) and DriverHost lifecycle registry; Server project using Microsoft.Extensions.Hosting BackgroundService replacing TopShelf, with NodeBootstrap that falls back to LiteDB cache when the central DB is unreachable (decision #79); Admin project scaffolded as Blazor Server with Bootstrap 5 sidebar layout, cookie auth, three admin roles (ConfigViewer/ConfigEditor/FleetAdmin), Cluster + Generation services fronting the stored procs. Phase 2 scaffold: Driver.Galaxy.Shared (netstandard2.0) with full MessagePack IPC contract surface — Hello version negotiation, Open/CloseSession, Heartbeat, DiscoverHierarchy + GalaxyObjectInfo/GalaxyAttributeInfo, Read/WriteValues, Subscribe/Unsubscribe/OnDataChange, AlarmSubscribe/Event/Ack, HistoryRead, HostConnectivityStatus, Recycle — plus length-prefixed framing (decision #28) with a 16 MiB cap and thread-safe FrameWriter/FrameReader; Driver.Galaxy.Host (net48) implementing the Tier C cross-cutting protections from driver-stability.md — strict PipeAcl (allow configured server SID only, explicit deny on LocalSystem + Administrators), PipeServer with caller-SID verification via pipe.RunAsClient + WindowsIdentity.GetCurrent and per-process shared-secret Hello, Galaxy-specific MemoryWatchdog (warn at max(1.5×baseline, +200 MB), soft-recycle at max(2×baseline, +200 MB), hard ceiling 1.5 GB, slope ≥5 MB/min over 30-min rolling window), RecyclePolicy (1 soft recycle per hour cap + 03:00 local daily scheduled), PostMortemMmf (1000-entry ring buffer in %ProgramData%\OtOpcUa\driver-postmortem\galaxy.mmf, survives hard crash, readable cross-process), MxAccessHandle : SafeHandle (ReleaseHandle loops Marshal.ReleaseComObject until refcount=0 then calls optional unregister callback), StaPump with responsiveness probe (BlockingCollection dispatcher for Phase 1 — real Win32 GetMessage/DispatchMessage pump slots in with the same semantics when the Galaxy code lift happens), IsExternalInit shim for init setters on .NET 4.8; Driver.Galaxy.Proxy (net10) implementing IDriver + ITagDiscovery forwarding over the IPC channel with MX data-type and security-classification mapping, plus Supervisor pieces — Backoff (5s → 15s → 60s capped, reset-on-stable-run), CircuitBreaker (3 crashes per 5 min opens; 1h → 4h → manual cooldown escalation; sticky alert doesn't auto-clear), HeartbeatMonitor (2s cadence, 3 consecutive misses = host dead per driver-stability.md). Infrastructure: docker SQL Server remapped to host port 14330 to coexist with the native MSSQL14 Galaxy ZB DB instance on 1433; NuGetAuditSuppress applied per-project for two System.Security.Cryptography.Xml advisories that only reach via EF Core Design with PrivateAssets=all (fix ships in 11.0.0-preview); .slnx gains 14 project registrations. Deferred with explicit TODOs in docs/v2/implementation/phase-2-partial-exit-evidence.md: Phase 1 Stream E Admin UI pages (Generations listing + draft-diff-publish, Equipment CRUD with OPC 40010 fields, UNS Areas/Lines tabs, ACLs + permission simulator, Generic JSON config editor, SignalR real-time, Release-Reservation + Merge-Equipment workflows, LDAP login page, AppServer smoke test per decision #142), Phase 2 Stream D (Galaxy MXAccess code lift out of legacy OtOpcUa.Host, dual-service installer, appsettings → DriverConfig migration script, legacy Host deletion — blocked by parity), Phase 2 Stream E (v1 IntegrationTests against v2 topology, Client.CLI walkthrough diff, four 2026-04-13 stability findings regression tests, adversarial review — requires live MXAccess runtime).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |