Compare commits

...

65 Commits

Author SHA1 Message Date
Joseph Doherty 837172ab39 PR 5.8 — Per-platform ScanState probe parity scenarios
Closes Phase 5 scenario coverage. Both
GalaxyRuntimeProbeManager (legacy) and PerPlatformProbeWatcher (PR 4.7)
must surface the same per-host status stream:

- GetHostStatuses_emits_same_host_set_after_Discover — drives Discover
  on both backends, waits 1.5s for the probe watcher's first push, then
  asserts the platform-host set agrees (transport-entry names differ
  by design — legacy uses the Galaxy.Host process identity, mxgw uses
  MxAccess.ClientName, so we strip those before comparing).
- GetHostStatuses_state_per_platform_matches_across_backends — for
  every overlapping platform host, the HostState must be identical.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:31:09 -04:00
Joseph Doherty 80a0ca2651 PR 5.7 — Reconnect / disruption parity scenarios
- Reinitialize_returns_both_backends_to_Healthy — drives
  ReinitializeAsync on each backend, asserts DriverState.Healthy
  afterwards, then re-reads a 3-tag sample to confirm the runtime
  surface is back. Recovery latency isn't pinned tightly (legacy = pipe
  + MxAccess COM client, mxgw = re-Register gw session — different
  cadences are expected).
- Health_state_diverges_only_when_one_backend_is_in_recovery — soft
  pin that both backends sit in Healthy or Degraded after init.

A tighter fault-injection scenario (toxiproxy-style) is the 5.7
follow-up — landed when the parity rig grows that capability.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:29:44 -04:00
Joseph Doherty 8d042c631b PR 5.6 — History-read parity scenarios
Galaxy history reads route through the server-owned HistoryRouter
(Phase 1, PR 1.3) — neither Galaxy backend implements IHistoryProvider
directly. Parity surface here is the routing decision:

- Discover_emits_same_historized_attribute_set_for_both_backends — the
  IsHistorized attribute set must agree symmetric-set-wise; that's what
  HistoryRouter consumes when deciding whether to route a HistoryRead to
  the Wonderware historian sidecar.
- Neither_Galaxy_backend_implements_IHistoryProvider_directly — pins
  the architectural decision so a regression that re-introduces a
  per-driver history path fires.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:29:01 -04:00
Joseph Doherty bbdbdf8afb PR 5.5 — Alarm transition parity scenarios
- Discover_emits_same_AlarmConditionInfo_per_alarm_attribute — both
  backends produce the same alarm-condition source-node-id set, with
  matching SourceName / InitialSeverity / InAlarmRef / DescAttrNameRef
  per condition. Skips when the rig's Galaxy carries no alarm-marked
  attributes.
- Discover_marks_at_least_one_alarm_attribute_when_dev_Galaxy_has_alarms
  — IsAlarm-marked variable count parity, soft-pinned (count must
  match across backends but doesn't have to be non-zero).

Alarm-event persistence (the SQLite store-and-forward → Wonderware
historian event store path) is exercised in PR 5.6 against the
historian sidecar.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:28:13 -04:00
Joseph Doherty 982771df9a PR 5.4 — Write-by-classification parity scenarios
Both backends route a write through the same path keyed off the attribute's
SecurityClassification, so a single write request must produce the same
StatusCode on each:

- FreeAccess_or_Operate_write_returns_same_StatusCode_on_both_backends
  picks the first numeric FreeAccess/Operate attribute and writes 0.0.
- Configure_class_write_routes_through_secured_path_on_both_backends
  picks a Configure/Tune attribute, writes through the secured path,
  asserts StatusCode parity (the test doesn't care whether the write
  succeeds — only that both backends produce the same outcome).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:26:57 -04:00
Joseph Doherty 9db6da9c20 PR 5.3 — Subscribe + event-rate parity scenarios
- Subscribe_returns_a_handle_for_each_backend — both backends accept
  the same full-reference list and return a non-null handle, with
  symmetric Unsubscribe cleanup.
- Subscribe_event_rate_within_tolerance_for_a_3s_window — counts
  OnDataChange invocations on each backend across a 3s window and
  asserts the mxgw/legacy ratio sits in [0.5, 1.5]. Skips when the
  sampled tags don't change in the window (configuration-only Galaxy).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:25:42 -04:00
Joseph Doherty 71443ecbf3 PR 5.2 — Browse + read parity scenarios
Three scenarios using ParityHarness.RequireBoth:

- Discover_emits_same_variable_set_for_both_backends — symmetric set diff
  on the full-reference set must be empty.
- Discover_emits_same_DataType_and_SecurityClass_per_attribute — meta
  triple (DriverDataType, SecurityClass, IsHistorized) must match per
  attribute.
- Read_returns_same_value_and_status_for_a_sampled_attribute — samples
  the first 5 discovered variables, reads through both backends, asserts
  StatusCode equality and value-CLR-type equality (raw values may drift
  between the two reads on a live Galaxy).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:24:36 -04:00
Joseph Doherty 82cdf460c5 PR 5.1 — Driver.Galaxy.ParityTests project shell + ParityHarness
Side-by-side fixture that boots both backends against the same dev Galaxy:

- Legacy GalaxyProxyDriver against an out-of-process Galaxy.Host EXE
  (skipped when ZB SQL on localhost:1433 isn't reachable or when the EXE
  hasn't been built).
- New in-process GalaxyDriver against an mxaccessgw gateway at
  http://localhost:5120 by default (skipped when the gateway isn't
  reachable). Endpoint, API key, and client name are env-var overridable
  for the central parity host.

Per-backend availability is independent — each scenario decides whether
to RequireBoth, GetDriver(specific), or use RunOnAvailableAsync to drive
both with the same closure and diff snapshots. PR 5.2–5.8 land scenarios
on top of this shell.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:22:04 -04:00
Joseph Doherty 21cac4c8c4 PR 4.W — Galaxy:Backend wiring + server-side factory registration
- GalaxyDriver.InitializeAsync now builds the production gw runtime (MxGatewayClient,
  GalaxyMxSession, GatewayGalaxySubscriber, GatewayGalaxyDataWriter,
  ReconnectSupervisor, HostConnectivityForwarder, PerPlatformProbeWatcher) when no
  test seams are pre-injected; Dispose tears the chain down in order.
- GetHealth surfaces supervisor.IsDegraded as DriverState.Degraded so a transport
  drop is observable without polling the supervisor directly.
- DiscoverAsync now refreshes the per-platform probe watcher's membership against
  $WinPlatform / $AppEngine objects after every discovery pass.
- OnPumpDataChange routes ScanState changes through the probe watcher in addition
  to fanning out OnDataChange to ISubscribable consumers.
- Server registers GalaxyDriver under "GalaxyMxGateway" alongside the legacy
  "Galaxy" GalaxyProxyDriver factory so DriverInstance rows can opt in.
- Bumped Server.Tests' Microsoft.Extensions.Logging.Abstractions to 10.0.7 to
  resolve the downgrade pulled in transitively via MxGateway.Client.
- Lifecycle factory tests switched to the internal seam-injection ctor so they
  no longer attempt a real gRPC connect during InitializeAsync.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:10:31 -04:00
Joseph Doherty dae520b9c0 PR 4.7 — Host-connectivity probes (IHostConnectivityProbe scaffold)
HostStatusAggregator merges transport + per-platform host entries with
change-event diffing (re-asserting same state is a no-op so a stable
ScanState=Running burst doesn't fan out duplicates). PerPlatformProbeWatcher
ports the legacy GalaxyRuntimeProbeManager state machine onto the gw
subscription path: SubscribeBulk for `<tag>.ScanState`, idempotent
SyncPlatformsAsync (subscribe new, unsubscribe dropped), and a
DecodeState helper pinning bool/int/string ScanState values + bad-quality
fallback. HostConnectivityForwarder is the skeleton for the gw-6
StreamSessionHealth signal — until that mxaccessgw RPC ships, PR 4.5's
ReconnectSupervisor pushes transport state by calling SetTransport on
session connect/disconnect.

GalaxyDriver wiring (implement IHostConnectivityProbe, route OnDataChange
to PerPlatformProbeWatcher, expose GetHostStatuses() / OnHostStatusChanged,
push transport from supervisor) is deferred to PR 4.W to avoid conflict
with the rest of the Phase 4 deferred wiring (4.5 supervisor + 4.6
DeployWatcher).

Tests: 19 new
- HostStatusAggregatorTests (9): empty snapshot, new-host change with
  Unknown predecessor, same-state silence, transition diff, snapshot
  reflects every host, case-insensitive host names, Remove returns true
  for tracked, Remove false for unknown, concurrent updates don't corrupt.
- HostConnectivityForwarderTests (5): SetTransport routes under client
  name, transitions fire change, repeated same-state silent, empty client
  name throws, post-dispose throws.
- PerPlatformProbeWatcherTests (5 + theory pinning DecodeState's full
  truth table): subscribe N platforms, idempotent re-sync, removed
  platforms unsubscribed + dropped from aggregator, OnProbeValueChanged
  routing for Running/Stopped/bad-quality/foreign-ref, Dispose
  unsubscribes everything.

NOTE: build is currently broken because mxaccessgw/clients/dotnet/ has
been removed from C:\Users\dohertj2\Desktop\mxaccessgw — this PR's source
is internally consistent and isolated from the missing dependency, but the
existing Driver.Galaxy code (PRs 4.1–4.6) can't compile until the .NET
client is restored. Once it is, expect 116 + 19 = 135 tests in the
Driver.Galaxy.Tests project.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:47:13 -04:00
Joseph Doherty 123e3e48b9 PR 4.5 — ReconnectSupervisor
State machine that drives GalaxyDriver's recovery from gw transport
failure. Healthy → TransportLost → Reopening → Replaying → Healthy. Drivers
report failure signals; the supervisor runs reopen + replay with capped
exponential backoff (default 500ms → 30s) until both succeed.

Files:
- Runtime/ReconnectSupervisor.cs — state machine with snapshot, change
  event, last-error tracking, and a one-attempt-at-a-time recovery loop.
  Idempotent ReportTransportFailure: repeated failure reports during an
  in-flight recovery do not spawn parallel loops. Reopen + replay are
  caller-supplied callbacks (the driver injects them in the wire-up PR);
  reopen re-Registers the gw session, replay re-establishes every active
  subscription via gw's ReplaySubscriptionsCommand (mxaccessgw issue gw-3)
  or the SubscribeBulk fallback. Dispose cancels the loop cleanly.
- Public StateTransition record + IsDegraded predicate the driver maps
  to DriverState.Degraded for health snapshots.

Wiring (GalaxyDriver subscribes the supervisor to its EventPump's
transport-failure signal, exposes IsDegraded through GetHealth(), routes
reopen/replay callbacks through GalaxyMxSession + SubscriptionRegistry)
lands in PR 4.W to avoid conflict with the parallel host-probe track
(PR 4.7) and align the wire-up with the rest of Phase 4's plumbing.

9 supervisor tests (full state-machine traversal, retry-until-success on
both reopen and replay failures, idempotent failure reports, last-error
propagation, Dispose mid-recovery, post-dispose throws, fast-path Healthy
WaitForHealthy).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:39:21 -04:00
Joseph Doherty 7922e573b1 PR 4.6 — DeployWatcher (IRediscoverable scaffold)
DeployWatcher consumes GalaxyRepositoryClient.WatchDeployEventsAsync,
suppresses the bootstrap event, and raises RediscoveryEventArgs whenever
time_of_last_deploy actually changes. Reconnect-on-error with capped
exponential backoff. GalaxyDriver wiring (IRediscoverable.OnRediscoveryNeeded
event + StartAsync inside InitializeAsync) lands in a follow-up so this PR
doesn't conflict with the parallel runtime track.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:33:37 -04:00
Joseph Doherty ce004c80ab PR 4.4 — ISubscribable + EventPump
Subscription path online. GalaxyDriver implements ISubscribable; subscribes
batches via gw SubscribeBulkAsync, runs a single shared EventPump consumer
of StreamEventsAsync, fans out OnDataChange events to every driver
subscription that observes the changed gw item handle.

Files:
- Runtime/GalaxySubscriptionHandle.cs — record implementing ISubscriptionHandle.
- Runtime/SubscriptionRegistry.cs — bookkeeping with forward (subscriptionId
  → bindings) and reverse (itemHandle → list of subscriptionIds) maps. The
  reverse map is the fan-out index so a single OnDataChange dispatches to
  every subscription that observes the changed handle.
- Runtime/IGalaxySubscriber.cs — driver-side seam: SubscribeBulk +
  UnsubscribeBulk + StreamEventsAsync. Production wraps GalaxyMxSession;
  tests substitute a fake driving synthetic MxEvents.
- Runtime/GatewayGalaxySubscriber.cs — production. Forwards to
  MxGatewaySession; bufferedUpdateIntervalMs is captured for now and
  becomes a SetBufferedUpdateInterval call once gw issue #102 / gw-9 lands
  (PR 6.3 picks this up).
- Runtime/EventPump.cs — long-running background consumer of
  StreamEventsAsync. Decodes MxValue + maps quality byte/MxStatusProxy via
  StatusCodeMap. Fan-out per subscriber resolves through the registry; bad
  handler exceptions are caught + logged, never break the dispatch loop.
  Filters out non-OnDataChange families (write-complete and operation-
  complete come back via InvokeAsync's reply path, not the event stream).

GalaxyDriver:
- Adds ISubscribable. SubscribeAsync allocates a subscription id,
  SubscribeBulks, builds the binding list (failed gw entries get
  ItemHandle=0 + a per-tag warn log), registers, and returns the handle.
  EventPump is started lazily on first subscribe; one pump per driver
  shared across all subscriptions.
- UnsubscribeAsync removes from the registry first (so stale events are
  filtered immediately) then calls UnsubscribeBulk best-effort. Foreign
  handles throw ArgumentException.
- ReadAsync NotSupportedException message updated: PR 4.4 no longer the
  pointer (deferred to a small follow-up that wraps the pump as a
  one-shot reader).
- Dispose tears down the pump first, then the repository client, then
  clears state.
- Internal ctor extended with optional subscriber parameter.

Tests (15 new, 109 Galaxy total):
- SubscriptionRegistryTests: monotonic id allocation, single+multi
  subscription fan-out, failed-handle exclusion, removal isolation, count
  invariants.
- GalaxyDriverSubscribeTests: handle allocation + value-change dispatch,
  multi-subscription fan-out, failed-tag silence, unsubscribe drops gw
  handle and stops dispatch, foreign handle throws, no-subscriber throws,
  empty-tag-list returns handle without calling gw.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:33:27 -04:00
Joseph Doherty a617086da1 PR 4.3 — IWritable + secured-write routing
Write path online. GalaxyDriver implements IWritable; routes by
SecurityClassification — SecuredWrite / VerifiedWrite tags go through
MxCommandKind.WriteSecured, everything else through MxGatewaySession.
WriteAsync. Per-tag classifications are captured during ITagDiscovery via a
SecurityCapturingBuilder wrapper that intercepts Variable() calls without
the discoverer needing to know about the driver's internal state.

Files:
- Runtime/MxValueEncoder.cs — boxed CLR value → MxValue. Covers seven Galaxy
  scalar types (bool/int8-32/uint8-32 → Int32, int64/uint64 → Int64, float,
  double, string, DateTime/DateTimeOffset → Timestamp) and 1-D array
  variants. Inverse of MxValueDecoder; round-trip pinned by tests.
  DateTime.Local converts to UTC; unsupported types throw ArgumentException.
- Runtime/IGalaxyDataWriter.cs — driver-side seam. Tests inject a fake to
  capture routing decisions; production path uses GatewayGalaxyDataWriter.
- Runtime/GatewayGalaxyDataWriter.cs — production. Lazy-AddItem caches
  itemHandles, encodes value, routes Write vs WriteSecured, translates
  MxCommandReply (ProtocolStatus → BadCommunicationError; first
  MxStatusProxy in statuses[] via StatusCodeMap.FromMxStatus). Per-tag
  exception isolation: one bad write doesn't fail the batch.
- GalaxyDriver: now implements IWritable. Discovery wraps the supplied
  IAddressSpaceBuilder in SecurityCapturingBuilder which records each
  attribute's SecurityClass into _securityByFullRef before delegating.
  WriteAsync resolves classification per tag (FreeAccess default for
  unknown tags — matches the legacy backend), routes through the injected
  writer. Throws NotSupportedException with PR 4.4 pointer when no writer
  is wired (production path requires GalaxyMxSession.Connect from PR 4.4).

Tests (32 new, 94 Galaxy total):
- MxValueEncoder: every scalar type, narrowing checks (sbyte/short/byte/
  ushort fit Int32; uint within Int32 range; ulong within Int64),
  DateTime.Local → UTC conversion, array variants for bool/double/string/
  DateTime, Dimensions populated, unsupported-type throws ArgumentException,
  encoder/decoder round-trip pin.
- GalaxyDriverWriteTests: WriteAsync routes through fake writer with
  values intact; theory exercises every SecurityClassification value through
  the discovery-then-write path; unknown-tag defaults to FreeAccess; empty-
  request short-circuit; no-writer fail-loud; post-dispose throws.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:24:22 -04:00
Joseph Doherty 85bdf0d58b PR 4.2 — IReadable abstraction + StatusCodeMap + MxValueDecoder
Read path scaffold + the byte→uint quality mapping table that the parity
matrix (PR 5.x) pins. PR 4.4 supplies the production GW-backed reader; this
PR ships the abstraction and the supporting infrastructure so 4.4 just
plugs the implementation in.

Files:
- Runtime/StatusCodeMap.cs — explicit OPC DA quality byte → OPC UA
  StatusCode uint mapping. Extends the legacy Galaxy.Host
  HistorianQualityMapper with named constants (Good / GoodLocalOverride,
  Uncertain + 4 substatuses, Bad + 7 substatuses, BadInternalError) and an
  MxStatusProxy → uint helper that honors success flag → detail byte →
  detected_by transport-error fallback. Unknown bytes fall back to category
  bucket with a once-per-session diagnostic log so field captures can
  extend the table.
- Runtime/MxValueDecoder.cs — gateway MxValue → boxed CLR value for the
  seven Galaxy data types (Boolean, Int32, Int64, Float32, Float64, String,
  DateTime) plus their array variants. Honors MxValue.IsNull and
  RawValue passthrough.
- Runtime/IGalaxyDataReader.cs — driver-side seam for one-shot reads. PR
  4.4 ships the production wrapper around MxGatewaySession.SubscribeBulk +
  StreamEvents + UnsubscribeBulk; this PR exposes the contract so
  GalaxyDriver.ReadAsync wires through it.
- Runtime/GalaxyMxSession.cs — wrapper around MxGatewaySession that owns
  the Register handle. ConnectAsync opens session + Register; AttachForTests
  lets tests bypass real gw construction. PR 4.3/4.4/4.5 add write,
  subscribe, and reconnect surfaces.

GalaxyDriver:
- Implements IReadable. ReadAsync routes through the injected
  IGalaxyDataReader (test seam) when present; production path throws
  NotSupportedException pointing at PR 4.4 — protects deployments running
  this PR from silent wrong reads while signaling that the legacy-host
  backend (Galaxy:Backend=legacy-host) handles reads in the meantime.
- Internal ctor extended with optional dataReader parameter (default null,
  preserves PR 4.0/4.1 callers).

Tests: 42 new — exhaustive byte→uint table for StatusCodeMap (15 known
codes + category-bucket fallback for unknowns + MxStatusProxy precedence
rules + OPC UA top-byte invariants), every MxValue oneof case for the
decoder (bool/int32/int64/float/double/string/timestamp/3 array variants/
raw bytes/null), GalaxyDriver IReadable wiring (route-through, empty-
request, no-reader-throws, post-dispose-throws, status-code preservation).
62 Galaxy tests total pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:15:42 -04:00
Joseph Doherty ecba5cedf9 PR 4.1 — ITagDiscovery via GalaxyRepositoryClient + AlarmRefBuilder
Browse path online. GalaxyDriver now implements ITagDiscovery against the
gateway's GalaxyRepositoryClient (PR 0.1's mxaccessgw browse RPC) and feeds
the address-space builder one folder per gobject + one variable per dynamic
attribute, with alarm-bearing attributes carrying all five sub-attribute refs
the server-level AlarmConditionService (PR 2.2) needs.

Files:
- Browse/IGalaxyHierarchySource.cs — driver-side seam between the discoverer
  and the gateway. Test fakes return canned hierarchies so the discoverer's
  translation logic is exercised without a real gRPC channel.
- Browse/GatewayGalaxyHierarchySource.cs — production wrapper around
  GalaxyRepositoryClient.DiscoverHierarchyAsync (paged internally).
- Browse/GalaxyDiscoverer.cs — translates GalaxyObject → IAddressSpaceBuilder
  calls. Browse name = contained_name (falls back to tag_name); full
  reference = attr.full_tag_reference when set, else tag_name + "." +
  attribute_name. Skips objects/attributes with empty identity.
- Browse/DataTypeMap.cs — mx_data_type → DriverDataType (port from legacy
  GalaxyProxyDriver.MapDataType, same fallback to String for unknown codes).
- Browse/SecurityMap.cs — security_classification → SecurityClassification
  (port from legacy GalaxyProxyDriver.MapSecurity).
- Browse/AlarmRefBuilder.cs — populates the five sub-attribute refs by
  Galaxy convention (.InAlarm/.Priority/.DescAttrName/.Acked/.AckMsg). The
  same convention the legacy GalaxyAlarmTracker hard-coded; concentrated
  here so PR 2.2's service receives complete AlarmConditionInfo rows.

GalaxyDriver:
- Added internal ctor accepting IGalaxyHierarchySource? for test injection.
  Default lazily builds GatewayGalaxyHierarchySource around a
  GalaxyRepositoryClient constructed from options on first DiscoverAsync.
- Owned GalaxyRepositoryClient disposed in Dispose.
- ApiKey resolution is currently a passthrough of ApiKeySecretRef — PR 4.W
  (or follow-up) wires DPAPI-backed secret resolution.

csproj: path-based ProjectReference to mxaccessgw (the user is shipping
that repo on a parallel track; both repos sit side-by-side on the dev box).
Tests project also references MxGateway.Contracts directly to construct
GalaxyObject / GalaxyAttribute fixtures.

Tests: 10 new in Browse/GalaxyDiscovererTests.cs covering folder-per-object,
variable-per-attribute, full-ref defaulting + gw-supplied override, browse-
name fallback, every metadata field propagation, alarm sub-attribute ref
population, non-alarm rows skip MarkAsAlarmCondition, empty-identity skips,
empty-attribute-name skips, end-to-end through GalaxyDriver.DiscoverAsync.
20 total Galaxy tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:06:02 -04:00
Joseph Doherty f6a4f919e2 PR 4.0 — Driver.Galaxy project skeleton + factory
New in-process .NET 10 driver project at
src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/. The Tier-A replacement for
Driver.Galaxy.Host + Driver.Galaxy.Proxy. PR 4.0 ships only the IDriver
shape + factory + options; capability bodies (browse, read, write,
subscribe, deploy-watch, host probes) land in PRs 4.1–4.7.

Files:
- Driver.Galaxy.csproj — net10 x64, AnyCPU+x64 platforms, references
  Core.Abstractions + Core. No MxGatewayClient ProjectReference yet — that
  comes in PR 4.2 once the gw NuGet package is wired (the user is
  shipping mxaccessgw on a parallel track).
- Config/GalaxyDriverOptions.cs — nested record hierarchy
  (Gateway/MxAccess/Repository/Reconnect) mirroring the JSON shape spelled
  out in lmx_mxgw_impl.md PR 4.0 acceptance section.
- GalaxyDriver.cs — minimal IDriver impl. Initialize/Shutdown toggle
  DriverHealth between Healthy/Unknown; Reinitialize bumps the timestamp;
  GetMemoryFootprint=0 (PR 4.4 wires SubscriptionRegistry size);
  FlushOptionalCachesAsync no-op. Logs intent on lifecycle calls so
  partial deployments are diagnosable.
- GalaxyDriverFactoryExtensions.cs — JSON parser, default fill-ins,
  validation throw on missing required fields. Driver type name
  "GalaxyMxGateway" intentionally distinct from legacy "Galaxy" so both
  factories coexist during parity testing (Phase 5). PR 4.W's
  Galaxy:Backend switch picks one or the other.

Tests:
- 10 tests in Driver.Galaxy.Tests covering minimal-config defaults, full
  override path, three required-field error cases, factory registration
  via DriverFactoryRegistry.TryGet, lifecycle health transitions
  (Init → Shutdown → Reinit), Dispose idempotency, and post-disposal
  ObjectDisposedException.

slnx: registers the new Driver.Galaxy + Driver.Galaxy.Tests projects.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:57:31 -04:00
Joseph Doherty 854827090a PR 3.W — Phase 3 wire-up: Wonderware sidecar DI registration
Solution + DI plumbing to complete Phase 3. With this PR the .NET 10 server
can boot with the Wonderware historian sidecar in the loop, gated by config
so existing deployments are unaffected.

slnx: registers Driver.Historian.Wonderware (net48 sidecar),
Driver.Historian.Wonderware.Client (net10 client), and both test projects.

Server.csproj: adds ProjectReference to the .NET 10 client.

Program.cs: reads Historian:Wonderware:* configuration. When Enabled=true,
constructs a WonderwareHistorianClient singleton and:
  - Registers it as IAlarmHistorianWriter so the SqliteStoreAndForwardSink
    drain (task #248) can pick it up.
  - Registers a WonderwareHistorianBootstrap hosted service that, on
    StartAsync, calls IHistoryRouter.Register(prefix, client) under the
    configured DriverInstancePrefix (default "galaxy") — lets the
    HistoryRead* dispatch in DriverNodeManager find the sidecar via
    longest-prefix-match resolution.

When Enabled=false (the default), DriverNodeManager keeps using its
internal LegacyDriverHistoryAdapter for the read path and the existing
NullAlarmHistorianSink stays in place — drop-in compatible with every
deployment that hasn't moved off Galaxy.Host yet.

42 server integration tests + 10 client tests pass. Full solution build
clean (0/0).

Note: scripts/install/Install-Services.ps1 and
src/.../Server/appsettings.json carry intermixed user WIP and are NOT
committed in this PR. Equivalent edits applied locally:

  Install-Services.ps1: new -InstallWonderwareHistorian switch installs the
  OtOpcUaWonderwareHistorian service alongside OtOpcUaGalaxyHost;
  generates a fresh historian shared secret; OtOpcUa service depends on
  both when historian sidecar is installed.

  Server/appsettings.json: new Historian.Wonderware section with
  Enabled=false default, PipeName/SharedSecret/PeerName/
  DriverInstancePrefix/ConnectTimeoutSeconds/CallTimeoutSeconds keys.

Both pieces should land in a follow-up commit once the user's WIP on those
files clears.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:48:47 -04:00
Joseph Doherty 14947fde51 PR 3.4 — Wonderware historian sidecar .NET 10 client
New project Driver.Historian.Wonderware.Client (net10 x64) implements both
Core.Abstractions.IHistorianDataSource (read paths consumed by the server's
IHistoryRouter) and Core.AlarmHistorian.IAlarmHistorianWriter (alarm-event
drain consumed by SqliteStoreAndForwardSink) against the sidecar's PR 3.3
pipe protocol.

Wire-format files (Framing/MessageKind, Hello, Contracts, FrameReader,
FrameWriter) are byte-identical mirrors of the sidecar's net48 originals —
the sidecar can't be referenced as a ProjectReference because of the
runtime/bitness gap, so we duplicate and pin the wire bytes via tests.

PipeChannel owns one bidirectional NamedPipeClientStream + Hello handshake +
serializes calls. Single in-flight at a time (semaphore); transport failures
trigger one in-flight reconnect-and-retry before propagating. Connect is
abstracted behind a Func<CancellationToken, Task<Stream>> so tests inject
in-process pipes.

WonderwareHistorianClient maps:
- HistorianSampleDto.Quality (raw OPC DA byte) → OPC UA StatusCode uint via
  QualityMapper (port of HistorianQualityMapper from sidecar).
- HistorianAggregateSampleDto.Value=null → BadNoData (0x800E0000).
- WriteAlarmEventsReply.PerEventOk[i]=true → Ack, false → RetryPlease.
  Whole-call failure or transport exception → RetryPlease for every event in
  the batch (drain worker handles backoff).
- AlarmHistorianEvent → AlarmHistorianEventDto with severity bucketed via
  AlarmSeverity-to-ushort mapping (Low=250, Medium=500, High=700, Crit=900).

GetHealthSnapshot tracks transport success + sidecar-reported failure
separately; ConsecutiveFailures rises on operation-level errors, not just
transport drops.

10 round-trip tests via FakeSidecarServer (in-process net10 fake using the
client's own framing): byte→uint quality mapping, null-bucket BadNoData,
at-time order preservation, event-field round-trip, sidecar error surfacing,
WriteBatch per-event status, whole-call retry-please mapping, Hello
shared-secret rejection, transport-drop reconnect-and-retry, health snapshot
counters.

PR 3.W will register this client as IHistorianDataSource + IAlarmHistorianWriter
in OpcUaServerService DI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:40:56 -04:00
Joseph Doherty 9f7a4ac769 PR 3.3 — Wonderware sidecar pipe protocol + dispatcher
Sidecar now serves a length-prefixed, kind-tagged MessagePack pipe protocol
mirroring Galaxy.Host's: 4-byte BE length + 1-byte MessageKind + body, 16 MiB
cap. Hello handshake validates per-process shared secret + protocol major
version + caller SID via ImpersonateNamedPipeClient before any work frame
runs.

Five contract pairs ship in this PR:

  ReadRawRequest          ↔ ReadRawReply
  ReadProcessedRequest    ↔ ReadProcessedReply
  ReadAtTimeRequest       ↔ ReadAtTimeReply
  ReadEventsRequest       ↔ ReadEventsReply
  WriteAlarmEventsRequest ↔ WriteAlarmEventsReply

Timestamps cross the wire as DateTime ticks (long) to dodge MessagePack's
DateTime kind/timezone quirks; both sides convert with DateTime(ticks, Utc).
Sample values cross as MessagePack-serialized byte[] so the .NET 10 client
(PR 3.4) deserializes per the tag's mx_data_type without the sidecar needing
to know OPC UA types.

HistorianFrameHandler dispatches by MessageKind to IHistorianDataSource (the
PR 3.2 lifted interface) for reads, and to a new IAlarmEventWriter strategy
for the alarm-event persistence path. Per-call exceptions surface as
Success=false replies so a single bad request doesn't kill the connection.
WriteAlarmEvents replies carry per-event success flags; the SQLite
store-and-forward sink retries failed slots on the next drain tick.

Program.cs spins the pipe server when OTOPCUA_HISTORIAN_ENABLED=true. Pipe-
only mode (default false) preserves PR 3.1's smoke-test behaviour: the host
still validates env vars and waits for Ctrl-C, but doesn't initialize the
Wonderware SDK.

Sidecar test project gains 8 round-trip tests (37 total now): every contract
pair round-trips through FrameReader/FrameWriter via in-memory streams, the
handler surfaces historian exceptions cleanly, WriteAlarmEvents per-event
status flows through, and the no-writer-configured path returns a clean
error reply.

Added MessagePack 2.5.187 to the sidecar csproj.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:27:17 -04:00
Joseph Doherty bc7ec746c5 PR 1+2.W — Wire HistoryRouter + AlarmConditionService into DI
Server-side singletons threaded through OpcUaApplicationHost → OtOpcUaServer
→ DriverNodeManager construction. New ctor parameters are last-position
optional with null defaults so every existing test construction site
(OpcUaServerIntegrationTests, AlarmSubscribeIntegrationTests, etc.) keeps
working unchanged.

Program.cs:
  AddSingleton<IHistoryRouter, HistoryRouter>();
  AddSingleton<AlarmConditionService>();

The router stays empty after this PR. DriverNodeManager's internal
LegacyDriverHistoryAdapter handles every driver that still implements
IHistoryProvider; PR 3.W will register the Wonderware sidecar as a router
source; PR 7.2 retires the legacy fallback entirely.

44 alarm + history + integration tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:13:51 -04:00
Joseph Doherty 9365beb966 PR 3.2 — Lift Wonderware Historian SDK code to sidecar
Move all historian implementation files from Driver.Galaxy.Host/Backend/Historian/
to Driver.Historian.Wonderware/Backend/. Sidecar now owns the aahClientManaged /
aahClientCommon SDK references; Galaxy.Host project-references the sidecar so
MxAccessGalaxyBackend keeps building until PR 7.2 retires Galaxy.Host entirely.

10 source files moved (preserving git history via git mv):
  IHistorianDataSource, HistorianDataSource, HistorianClusterEndpointPicker,
  HistorianClusterNodeState, HistorianConfiguration, HistorianEventDto,
  HistorianHealthSnapshot, HistorianQualityMapper, HistorianSample,
  IHistorianConnectionFactory.

2 historian tests moved alongside (HistorianClusterEndpointPickerTests,
HistorianQualityMapperTests). Sidecar test project now hosts 29 tests (1 PR 3.1
smoke + 28 moved historian tests, all passing).

Galaxy.Host's remaining 6 historian-flavored tests (HistorianWiringTests,
HistoryReadAtTimeTests, HistoryReadEventsTests, HistoryReadProcessedTests)
keep passing via the project reference — using directives updated to reach
the new namespace.

Sidecar deliberately speaks no Core.Abstractions — its surface is the legacy
List<HistorianSample> shape; PR 3.4's .NET 10 client translates to the
Core.Abstractions shapes added in PR 1.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:13:13 -04:00
Joseph Doherty ef22a61c39 v2 mxgw migration — Phase 1+2+3.1 wiring (7 PRs)
Foundational PRs from lmx_mxgw_impl.md, all green. Bodies only — DI/wiring
deferred to PR 1+2.W (combined wire-up) and PR 3.W.

PR 1.1 — IHistorianDataSource lifted to Core.Abstractions/Historian/
  Reuses existing DataValueSnapshot + HistoricalEvent shapes; sidecar (PR
  3.4) translates byte-quality → uint StatusCode internally.

PR 1.2 — IHistoryRouter + HistoryRouter on the server
  Longest-prefix-match resolution, case-insensitive, ObjectDisposed-guarded,
  swallow-on-shutdown disposal of misbehaving sources.

PR 1.3 — DriverNodeManager.HistoryRead* dispatch through IHistoryRouter
  Per-tag resolution with LegacyDriverHistoryAdapter wrapping
  `_driver as IHistoryProvider` so existing tests + drivers keep working
  until PR 7.2 retires the fallback.

PR 2.1 — AlarmConditionInfo extended with five sub-attribute refs
  InAlarmRef / PriorityRef / DescAttrNameRef / AckedRef / AckMsgWriteRef.
  Optional defaulted parameters preserve all existing 3-arg call sites.

PR 2.2 — AlarmConditionService state machine in Server/Alarms/
  Driver-agnostic port of GalaxyAlarmTracker. Sub-attribute refs come from
  AlarmConditionInfo, values arrive as DataValueSnapshot, ack writes route
  through IAlarmAcknowledger. State machine preserves Active/Acknowledged/
  Inactive transitions, Acked-on-active reset, post-disposal silence.

PR 2.3 — DriverNodeManager wires AlarmConditionService
  MarkAsAlarmCondition registers each alarm-bearing variable with the
  service; DriverWritableAcknowledger routes ack-message writes through
  the driver's IWritable + CapabilityInvoker. Service-raised transitions
  route via OnAlarmServiceTransition → matching ConditionSink. Legacy
  IAlarmSource path unchanged for null service.

PR 3.1 — Driver.Historian.Wonderware shell project (net48 x86)
  Console host shell + smoke test; SDK references + code lift come in
  PR 3.2.

Tests: 9 (PR 1.1) + 5 (PR 2.1) + 10 (PR 1.2) + 19 (PR 2.2) + 1 (PR 3.1)
all pass. Existing AlarmSubscribeIntegrationTests + HistoryReadIntegrationTests
unchanged.

Plan + audit docs (lmx_backend.md, lmx_mxgw.md, lmx_mxgw_impl.md)
included so parallel subagent worktrees can read them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:03:36 -04:00
Joseph Doherty 012c42a846 Task #156 — TagsTab: per-tag advanced Modbus fields (Deadband, UnitId, CoalesceProhibited)
#155 wired the basic tag form (Name / Driver / Equipment / DataType / Access /
WriteIdempotent + ModbusAddressEditor for the address). The per-tag knobs added
across #141 / #142 / #143 still required operators to hand-edit TagConfig JSON.
This commit exposes them through an "Advanced" expander.

UI changes (TagsTab.razor):

- Collapsible "▶ Advanced (Deadband / UnitId override / CoalesceProhibited)"
  button below the address editor, visible only when the selected driver is
  Modbus. Collapsed by default — basic form covers the typical edit workflow.
- Three numeric / checkbox inputs with inline help text explaining each knob's
  purpose and when to use it.
- _showAdvanced auto-opens on Edit when any of the advanced fields are present
  in the existing TagConfig — operators see immediately what's been configured.

Save-side serialization:

- New RefreshTagConfigJson serializes the address + advanced fields into a
  structured JSON object using a Dictionary<string, object?>. Fields with
  default / empty values are omitted to keep diffs in the existing draft-diff
  viewer minimal — a tag with only an address still produces
  `{"addressString":"40001:F"}` and not a full superset object with nulls.
- OnAddressChanged + OnAdvancedChanged both delegate to RefreshTagConfigJson
  so any input change keeps TagConfig in sync.

Read-side hydration:

- New HydrateModbusFromTagConfig parses an existing TagConfig JSON and
  populates _modbusAddress + the three advanced fields. Falls back to empty
  defaults on malformed JSON. ResetAdvanced is called before hydration on
  every form open so leftover state from a previous edit doesn't leak.

ResetAdvanced helper introduced + called from StartAdd so a fresh "New tag"
form starts with everything cleared.

Tests (1 new in TagServiceTests):
- TagConfig_With_Advanced_Modbus_Fields_RoundTrips_Through_Factory — creates a
  tag whose TagConfig carries addressString + deadband + unitId +
  coalesceProhibited, persists via TagService, reloads, asserts every field
  survives. Then constructs a wrapping driver-config JSON and feeds it to
  ModbusDriverFactoryExtensions.CreateInstance — confirms the field NAMES the
  UI emits match what BuildTag's DTO consumes. If the UI's JSON shape ever
  drifts from the factory's expected DTO, this test catches it before users do.

119 + 1 = 120 Admin tests green. Solution build clean.
2026-04-25 04:22:50 -04:00
Joseph Doherty ec57df1009 Task #155 — TagService + TagsTab CRUD UI for Modbus tags
Closes the remaining loop on user-visible Modbus tag editing. Pre-#155 tags
arrived only via SQL seeding or runtime ITagDiscovery; the Admin UI had no
interactive surface for creating / editing / deleting tag rows.

Changes:

- TagService.cs (Admin/Services/) — CRUD wrapper around OtOpcUaConfigDbContext.Tags.
  ListAsync supports optional driver / equipment filters; CreateAsync auto-derives
  TagId; UpdateAsync persists editable fields; DeleteAsync removes the row. Mirrors
  the EquipmentService shape.
- TagsTab.razor (Components/Pages/Clusters/) — list + filter + add/edit/remove form.
  The address/config editor is conditional: when the selected DriverInstance is
  Modbus, ModbusAddressEditor (#145) renders with live-parse preview; otherwise a
  generic JSON textarea (matches the DriversTab pattern from #147). Save-side
  serializes the address-string into TagConfig as `{"addressString":"..."}` JSON.
- ClusterDetail.razor — new "Tags" tab in the cluster-detail nav strip + the routing
  switch.
- Program.cs — TagService registered as a scoped DI service.

Drive-by fix: ModbusDriverFactoryExtensions.CreateInstance promoted from internal
to public — Admin.Tests was using it via reflection-friendly internal access that
broke under the #153 logger overload addition. Public is the right access modifier
anyway since the Server-side bootstrapper calls it from a different assembly.

Drive-by fix #2: ModbusDriverConfigDto was missing MaxReadGap (#143) — surfaced by
the #147 round-trip test that flips MaxReadGap=12 in the view model and asserts
it lands on the resolved options. Added the field + binding line. Confirms #143's
DriverConfig JSON binding was incomplete since the original commit; no production
deployment configured this knob through JSON until now so the gap stayed hidden.

Tests (4 new TagServiceTests):
- Create_And_List_Surfaces_The_Tag — CreateAsync auto-assigns TagId; list returns
  the row.
- List_Filters_By_DriverInstance — driver-scoped filter works.
- Update_Persists_Editable_Fields — Name / DataType / AccessLevel / TagConfig all
  persist through Update.
- Delete_Removes_The_Row — basic delete verification.

113 + 4 (TagService) + 2 (DriversTab round-trip restored after compile fix) = 119
Admin tests green. Solution build clean.

Caveat: bUnit-style render tests for TagsTab still aren't included — Admin.Tests
doesn't have bUnit set up. The TagService logic is fully covered; the razor
component's parser/save glue is exercised by hand at runtime for now.
2026-04-25 01:51:02 -04:00
Joseph Doherty 802366c2c6 Task #154 — driver-diagnostics RPC: HTTP endpoint + Admin client
Foundation for surfacing per-driver runtime state from the Server process to
the Admin UI. #152 shipped GetAutoProhibitedRanges() as an in-process
accessor; #154 makes it reachable across processes.

Server side (HealthEndpointsHost):
- New URL family: /diagnostics/drivers/{driverInstanceId}/{driverType}/{topic}
- First wired topic: /diagnostics/drivers/{id}/modbus/auto-prohibited
- Driver-agnostic at the URL level — future driver types add their own
  segments[3] cases (e.g. /diagnostics/drivers/{id}/s7/dropped-pdus).
- 404 when the driver instance doesn't exist; 400 when the driver exists
  but isn't a Modbus driver (the per-type endpoint is wrong for this row).
- Response shape is flat JSON (unitId / region / startAddress / endAddress /
  lastProbedUtc / bisectionPending) so consumers don't have to reference the
  Driver.Modbus assembly's ModbusAutoProhibition record.
- Re-uses the existing HttpListener bound to localhost:4841 — same auth /
  reachability story as /healthz and /readyz.

Admin side:
- DriverDiagnosticsClient (Services/) — HttpClient wrapper that fetches the
  per-driver Modbus prohibition list. Returns null on 404/400 (driver
  missing or wrong type); throws on transport failures.
- ModbusAutoProhibitionsResponse + ModbusAutoProhibitionRow flat DTOs —
  client doesn't take a dep on Driver.Modbus.
- ModbusDiagnostics.razor at /modbus/diagnostics/{driverInstanceId} —
  table view with BISECTING (warning yellow) / ISOLATED (danger red)
  badges, relative timestamps (e.g. "5m ago"), Refresh button. Errors
  surface inline rather than swallowing.
- HttpClient registration in Program.cs reads
  DriverDiagnostics:ServerBaseUrl from appsettings.json (default
  http://localhost:4841/ for same-host deployments).

Tests (3 new in HealthEndpointsHostTests):
- Diagnostics_ReturnsModbusAutoProhibitions_ForLiveDriver — registers a
  Modbus driver with a programmable transport that protects register 102,
  records the prohibition via a coalesced ReadAsync, hits the endpoint,
  asserts the returned JSON matches (unitId / region / start / end / pending).
- Diagnostics_404_When_Driver_Not_Found
- Diagnostics_400_When_Driver_Is_Wrong_Type

Architecture note: the Admin-side bUnit-style component test isn't included
because Admin.Tests doesn't have bUnit set up. The DriverDiagnosticsClient
is unit-testable on its own with a mock HandlerStub if needed — left as a
follow-up alongside the broader bUnit setup task.

The diagnostic page is now reachable at /modbus/diagnostics/{driverId} from
any Admin instance pointing at a Server endpoint URL. Future driver types
(S7, AbCip) plug into the same channel by adding their own URL segments
in HealthEndpointsHost.WriteDriverDiagnosticsAsync.
2026-04-25 01:32:21 -04:00
Joseph Doherty 8004394892 Task #153 — ModbusDriver: inject ILogger so prohibition events reach a sink
#152 left a hook for structured logging when an auto-prohibition first
fires; this commit completes the wiring.

Changes:
- ModbusDriver constructor takes an optional ILogger<ModbusDriver> (defaults
  to NullLogger). Existing standalone callers stay compile-clean.
- RecordAutoProhibition logs LogWarning on first-fire only (re-fires of the
  same range stay quiet via the existing isNew de-dupe). Format includes
  DriverInstanceId, UnitId, Region, Start, End, Span — log aggregators can
  filter / count by any field.
- New LogProhibitionCleared helper called by both StraightReprobeAsync (when
  the re-probe succeeds on a single-register range) and BisectAndReprobeAsync
  (per-half clearing + a single combined line when both halves succeed).
- ModbusDriverFactoryExtensions.Register accepts an optional ILoggerFactory.
  Captured at registration time and used in the factory closure to construct
  a per-driver logger. Server bootstrap code that already has an ILoggerFactory
  in DI threads it through with a single argument addition; old call sites
  (Register(registry)) keep working with a null logger.

Tests (2 new ModbusLoggerInjectionTests):
- First_Failure_Emits_Single_Warning_Subsequent_Refire_Stays_Quiet — pins
  the de-dupe behaviour. First scan logs one warning with the expected
  structured fields; second scan with the same prohibition stays silent.
- Reprobe_Clearing_Prohibition_Emits_Information_Log — protected register
  unlocked between record and re-probe; re-probe success emits an info log
  containing "cleared".

CapturingLogger test harness is purpose-built (xUnit doesn't ship a logger
mock by default and adding Moq is overkill for two tests).

240 + 2 = 242 unit tests green.
2026-04-25 01:26:20 -04:00
Joseph Doherty b8df230eb8 Task #152 — Modbus coalescing: surface auto-prohibitions through diagnostics
Auto-prohibited ranges (#148) were previously visible only through an
internal AutoProhibitedRangeCount accessor used by tests. Production
operators had no way to see what the planner had learned without pulling
logs or inspecting driver state.

Changes:

- New public record `ModbusAutoProhibition(UnitId, Region, StartAddress,
  EndAddress, LastProbedUtc, BisectionPending)` — operator-facing snapshot
  shape. Lives in the addressing assembly's logical namespace alongside
  the other public types.
- `ModbusDriver.GetAutoProhibitedRanges()` returns
  `IReadOnlyList<ModbusAutoProhibition>` — a copy of the live prohibition
  map. Lock-protected snapshot so consumers don't race with the re-probe
  loop.
- RecordAutoProhibition tracks first-fire vs re-fire via the dictionary
  insert path, leaving a hook to add structured logging once an ILogger
  is plumbed through (currently elided to keep the constructor minimal
  for testability — a future change can wire ILogger and emit a single
  warning per first-fire).

Tests (1 new, additive to the 6 in ModbusCoalescingAutoRecoveryTests):
- GetAutoProhibitedRanges_Surfaces_Operator_Visible_Snapshot — confirms
  the snapshot shape: empty before any failure, populated with correct
  UnitId/Region/Start/End/BisectionPending after a failed coalesced read,
  LastProbedUtc within the recent past.

Docs:
- docs/v2/modbus-addressing.md — new "Coalescing auto-recovery" subsection
  consolidates the #148/#150/#151/#152 surface in one place. Documents
  the diagnostic accessor + flags the in-process consumption pattern
  (Server health endpoints today; Admin UI when an RPC channel exists).

239 + 1 = 240 unit tests green.

Caveat: the Admin UI surfacing (table render, "clear all prohibitions"
button) is intentionally NOT shipped here. Admin can't reach a live
ModbusDriver instance without a driver-diagnostics RPC channel that
doesn't exist yet — that's a larger architectural piece. For now the
data is queryable in-process by the Server's health endpoints; once an
RPC channel lands, Admin can wire the existing GetAutoProhibitedRanges
into a Blazor table without further driver changes.
2026-04-25 01:19:10 -04:00
Joseph Doherty f823c81c96 Task #150 — Modbus coalescing: bisection-style range narrowing
Pre-#150 a coalesced read failure recorded the FULL failed range as
permanently prohibited. Healthy registers around the actual protected
register stayed in per-tag mode forever (until ReinitializeAsync). The
re-probe loop shipped in #151 retried the whole range as a single block,
which would either succeed (clearing everything) or fail (changing
nothing).

Post-#150 the re-probe loop bisects multi-register prohibitions:

- _autoProhibited refactored from Dictionary<key, DateTime> to
  Dictionary<key, ProhibitionState> where ProhibitionState carries
  LastProbedUtc + SplitPending. Multi-register prohibitions enter with
  SplitPending=true; single-register prohibitions enter with
  SplitPending=false (already minimal).
- ReprobeLoopAsync delegates the per-pass work to
  RunReprobeOnceForTestAsync (also exposed for synchronous test driving).
  Each entry routes to BisectAndReprobeAsync (split-pending + multi-reg)
  or StraightReprobeAsync (single-reg / non-split-pending).
- Bisection: split (start, end) at mid = (start+end)/2. Try (start, mid)
  and (mid+1, end) as separate coalesced reads. Each FAILED half re-enters
  the prohibition map with SplitPending = (its end > its start). SUCCEEDED
  halves vanish, freeing the planner to coalesce across them on the next
  scan.
- Convergence: log2(span) re-probe ticks pin the prohibition to the
  actual single offending register(s). For a 100-register block with one
  protected address that's ~7 ticks.

Tests (3 new ModbusCoalescingBisectionTests):
- Bisection_Narrows_Multi_Register_Prohibition_Per_Reprobe — 11 tags
  100..110 with protected address 105. After 4 re-probe passes the
  prohibition collapses from (100..110) → (100..105) → (103..105) →
  (105..105).
- Bisection_Clears_When_Both_Halves_Are_Healthy — transient failure
  scenario; protection lifted before re-probe; both bisection halves
  succeed and the parent vanishes entirely.
- Bisection_Splits_Into_Two_When_Both_Halves_Still_Fail — TwoHoleTransport
  with protected addresses 102 + 108 in the same coalesced range. After
  bisection both halves still fail (each contains one of the protected
  addresses); the prohibition map grows to 2 entries.

236 + 3 = 239 unit tests green. Solution build clean.
2026-04-25 01:16:09 -04:00
Joseph Doherty 9e4aae350b Task #151 — Modbus coalescing: periodic re-probe of auto-prohibitions
#148 introduced auto-prohibited coalesced ranges that persist for the
driver lifetime. Long-running deployments with transient PLC permission
changes (firmware update unlocking a previously-protected register,
operator reconfiguring the device) had no recovery short of operator
restart.

Adds an opt-in background loop that re-probes each prohibition periodically:

- ModbusDriverOptions.AutoProhibitReprobeInterval (TimeSpan?, default null
  = disabled). Set to e.g. TimeSpan.FromHours(1) to opt in.
- _autoProhibited refactored from HashSet<key> to Dictionary<key, DateTime>
  so each entry tracks its last failure / last re-probe timestamp.
- ReprobeLoopAsync runs on the same Task.Run pattern as ProbeLoopAsync;
  cancelled by ShutdownAsync. Each tick snapshots the prohibition set
  and issues a one-shot coalesced read per range. Successful re-probes
  drop the prohibition; failed ones bump the timestamp + leave the
  prohibition in place.
- Communication failures during re-probe (transport-level) are treated
  the same as PLC-exception failures — the prohibition stays, but isn't
  upgraded to "permanent" since transports recover. The driver-instance
  health surface picks up the failure separately.
- ShutdownAsync explicitly clears the prohibition set so a manual restart
  via ReinitializeAsync starts with a clean slate (matches the old
  "restart to clear" semantics).
- Factory DTO + JSON binding extended with AutoProhibitReprobeMs field.

Tests (2 new, additive to the 3 in ModbusCoalescingAutoRecoveryTests):
- Reprobe_Clears_Prohibition_When_Range_Becomes_Healthy — protected
  register at 102 records prohibition; clearing the simulated protection
  + invoking the re-probe drops the prohibition.
- Reprobe_Leaves_Prohibition_When_Range_Is_Still_Bad — re-probe on a
  still-failing range keeps the prohibition in place.

Tests use a new internal RunReprobeOnceForTestAsync helper to fire one
re-probe pass synchronously, so the suite doesn't have to wait on the
background timer (the loop's timer behaviour is exercised implicitly via
the InitializeAsync wire-up + the synchronous helper sharing the actual
re-probe code path).

234 + 2 = 236 unit tests green.
2026-04-25 01:12:48 -04:00
Joseph Doherty 8de152df4f Task #149 — Modbus address-preview page + ImportEquipment help
The original task scope assumed a per-tag editor lived in EquipmentTab.razor
or a similar surface. Reading the codebase confirmed that's not the case:
tags are seeded via SQL (scripts/smoke/*) or arrive at runtime through
ITagDiscovery; the Admin UI has no per-tag CRUD page today. Equipment
import is for equipment metadata (Name / MachineCode / ZTag / SAPID /
Identification) — not tag rows.

Adjusted scope:

1. ModbusAddressPreview.razor — new standalone page at /modbus/address-preview.
   Hosts the ModbusAddressEditor component shipped in #145 + the family
   selector + a copy-pasteable grammar reference. Operators can sanity-check
   address-string syntax (40001:F:CDAB / HR1:I / V2000:F / D100:I etc.)
   without committing it to a config row first.

2. ImportEquipment.razor — appended a secondary alert banner clarifying
   that Modbus per-tag addressing isn't part of equipment import; points
   users at the Drivers tab + the new preview tool.

Builds clean against the existing Admin app. The actual per-tag CRUD UI is
still a separate piece of work — when it ships, it can drop in
ModbusAddressEditor directly. The preview page acts as the canonical
demonstration of how to use the component.

Razor caveat: the grammar reference uses literal `<...>` syntax tokens
that the Razor parser interprets as malformed elements when inlined in a
<pre> block. Held as a string field (_grammarReference) and rendered
through @ binding to sidestep the parser conflict.
2026-04-25 01:09:24 -04:00
Joseph Doherty 3b0e093002 Task #148 — Modbus block-coalescing: auto-recover from protected register holes
Pre-#148 behaviour: a coalesced FC03/FC04 read that crossed a write-only or
PLC-fault register marked every member tag Bad until the operator manually
flagged the offending tag with CoalesceProhibited. Healthy tags around the
hole stayed broken indefinitely.

Post-#148: two-stage recovery, no operator intervention needed.

1. Same-scan fallback: when a coalesced read fails with a Modbus exception
   (IllegalDataAddress, SlaveDeviceFailure, etc.), the planner does NOT
   mark members handled. The per-tag fallback in the same scan reads each
   member individually — non-protected members surface Good values
   immediately, and only the actual protected register stays Bad.

2. Cross-scan prohibition: the failed range (Unit, Region, Start, End) is
   recorded in a per-driver `_autoProhibited` set. On subsequent scans the
   planner checks each candidate merge against the set and refuses to
   re-form any block that overlaps a known-bad range. Net effect: after one
   scan with a failure, the protected range goes "per-tag mode" indefinitely
   while ranges around it keep coalescing normally.

Communication failures (timeouts, socket drops) are NOT auto-prohibited —
they're transport-level, not structural. The same coalesced read can succeed
once the transport recovers; recording it as "permanently bad" would defeat
coalescing for the whole driver instance.

Auto-prohibition state lives for the driver lifetime and clears on
ReinitializeAsync (operator restart). A periodic re-probe is a follow-up if
deployments need it without a restart.

Implementation:
- Added `_autoProhibited` HashSet<(byte, ModbusRegion, ushort, ushort)> +
  `_autoProhibitedLock` on ModbusDriver.
- `RangeIsAutoProhibited(unit, region, start, end)` overlap check called
  from the planner when forming blocks.
- `RecordAutoProhibition(...)` called from the catch (ModbusException)
  branch.
- The catch (Exception) branch (non-Modbus failures) keeps the pre-#148
  "mark all Bad in this scan, don't auto-prohibit" behaviour.
- Internal `AutoProhibitedRangeCount` accessor for tests.

Tests (3 new ModbusCoalescingAutoRecoveryTests):
- First_Failure_Falls_Back_To_PerTag_Same_Scan — three tags around a
  protected register at 102: T100 + T104 surface Good values via the
  per-tag fallback in the SAME scan; T102 surfaces the exception.
- Second_Scan_Skips_Coalesced_Read_Of_Prohibited_Range — confirms scan 2
  doesn't re-attempt the failed merge (no FC03 with quantity > 1 at the
  prohibited start).
- Tags_Outside_Prohibited_Range_Still_Coalesce — separate cluster at HR
  200..202 keeps coalescing normally even after the 100..104 cluster is
  prohibited.

234/234 unit tests green.

Follow-ups intentionally NOT shipped (smaller, independent changes):
- Bisection-style range narrowing — currently the prohibition range is the
  full failed block; the planner doesn't try to find the exact protected
  register. Operator-visible diagnostic + prohibition stays correct.
- Periodic re-probe to clear stale prohibitions.
- Surface auto-prohibited ranges through GetHostStatuses or a new
  diagnostic so the Admin UI can show what's been auto-isolated.
2026-04-25 01:01:42 -04:00
Joseph Doherty 0b7653d3b2 Task #147 — wire ModbusOptionsEditor into DriversTab
Branches the DriversTab driver-add form on driver type:
- For DriverType=Modbus, render the typed <ModbusOptionsEditor> component
  shipped in #145 instead of the generic JSON textarea.
- For other driver types, the existing textarea stays (other drivers ship
  their own typed editors per decision #94).

On Save, when type is Modbus, the form serialises ModbusOptionsViewModel
into the JSON DTO shape ModbusDriverFactoryExtensions consumes (host /
port / unitId / family / keepAlive / reconnect / max*** / writeOnChangeOnly
/ etc.). Other types still pass the textarea contents verbatim.

Drive-by fix: the DriverType dropdown listed "ModbusTcp" but the actual
factory-registered name is "Modbus" — DriverInstanceBootstrapper would
silently skip a row created with the old label because the factory lookup
would miss. Renamed to match.

Tests (2 new in ModbusOptionsViewModelTests):
- DriversTab_Serialized_Defaults_RoundTrip_Through_Factory — unedited
  view-model serializes to a JSON the factory accepts; resulting
  ModbusDriverOptions matches the form defaults bit-for-bit.
- DriversTab_Serializes_Edited_Values_Correctly — flipping Host / Port /
  UnitId / Family / MaxReadGap / WriteOnChangeOnly in the view model
  surfaces in the constructed driver's options.

The serializer in the test mirrors DriversTab.razor's SerializeModbusOptions
helper. If the form's serialization shape drifts, both must be updated
together; that's the cost of testing through the JSON DTO without bUnit.

Follow-up still open: the per-tag editor (ModbusAddressEditor wiring into
EquipmentTab.razor + the bulk-import help-text update) — that's a separate
surface that touches the equipment-row CRUD flow; covered as a follow-up
when the equipment tag editor surface is next touched.
2026-04-25 00:58:03 -04:00
Joseph Doherty dfd027ebca Task #146 — Modbus addressing: align type codes with Wonderware DASMBTCP + Ignition
Web verification (2026-04-25) against current vendor docs surfaced concrete
grammar conflicts in the v1 suffix grammar shipped in #137. Hard cutover
before the Admin UI rolls out widely so users don't paste `:I` from a
Wonderware spreadsheet and silently get wrong-typed reads.

Sources:
- Wonderware DASMBTCP user guide
  https://cdn.logic-control.com/media/DASMBTCP.pdf
- Ignition Modbus addressing (8.1)
  https://www.docs.inductiveautomation.com/docs/8.1/ignition-modules/opc-ua/opc-ua-drivers/modbus/modbus-addressing

Type-code changes:

| Code   | Pre-#146 | Post-#146  | Vendor reference            |
|--------|----------|------------|------------------------------|
| `:S`   | (n/a)    | Int16      | Wonderware DASMBTCP `S`      |
| `:US`  | (n/a)    | UInt16     | Ignition `HRUS`              |
| `:I`   | Int16    | **Int32**  | Wonderware `I` + Ignition `HRI` |
| `:UI`  | UInt16   | **UInt32** | Ignition `HRUI`              |
| `:I_64`  | (n/a)  | Int64      | Ignition `HRI_64`            |
| `:UI_64` | (n/a)  | UInt64     | Ignition `HRUI_64`           |
| `:BCD_32`| (n/a)  | BCD32      | Ignition `HRBCD_32`          |

Codes REMOVED (no clear vendor precedent + conflict with the new mapping):
`:DI`, `:L`, `:UDI`, `:UL`, `:LI`, `:ULI`, `:LBCD`. Pre-#146 configs that
use them get an "Unknown type code" diagnostic at parse time so users get
a fast surface-level error rather than silent wrong-typed reads.

Codes UNCHANGED (already vendor-aligned): `:BOOL`, `:F`, `:D`, `:BCD`,
`:STR<n>`. Modicon 5/6-digit + mnemonic regions (HR/IR/C/DI) + bit suffix
`.N` are also unchanged.

Defaults:
- Coils / DiscreteInputs → `BOOL` (unchanged)
- HoldingRegisters / InputRegisters with no explicit type → Int16 (matches
  Ignition's bare `HR` default)

Byte-order mnemonics (`:ABCD` / `:CDAB` / `:BADC` / `:DCBA`) are kept but
documented as OtOpcUa-specific — they aren't in any major vendor's per-tag
address string. Ignition uses a `-R` suffix per prefix; Wonderware
configures word-order at the topic level.

Tests:
- 12 Type_Codes_Parse rows updated to assert the new mappings.
- New Removed_Aliases_Are_Rejected (×7) confirms each pre-#146 alias now
  fails fast with "Unknown type code".
- Worked_Example_Int16_Array uses the new `:S` code.
- New Worked_Example_Int32_Array_Via_I_Code documents the `:I = Int32`
  vendor-alignment intent so a future "fix" doesn't accidentally regress.
- Unknown_Type_Code_Rejected_With_Catalog updated to match the new error
  message ("Valid: BOOL, S, US, I, ...").

Docs:
- docs/v2/modbus-addressing.md — table replaced with the post-#146 codes,
  each row cites its Wonderware / Ignition reference. New "Codes removed
  in #146" subsection documents the cutover.
- docs/Driver.Modbus.Cli.md — example grammar list updated; explicit
  type-code reminder appended.

114 addressing tests + 231 driver tests still green. Solution build clean.
2026-04-25 00:51:50 -04:00
Joseph Doherty 5ea57d2d70 Task #138 — Modbus addressing grammar docs + e2e
Closes the docs/e2e end of the Modbus addressing line shipped across
#136-#145.

Docs:

- docs/v2/modbus-addressing.md (new) — full grammar reference.
  Region+offset (Modicon 5-digit / 6-digit / mnemonic), bit suffix,
  type codes (BOOL / I / UI / DI / UDI / LI / ULI / F / D / BCD / LBCD /
  STR<n>), all four byte-order mnemonics (ABCD / CDAB / BADC / DCBA),
  array-count semantics, family-native syntax (DL205 V/Y/C/X/SP and
  MELSEC D/M/X/Y with hex-vs-octal sub-family selection), driver-instance
  options (KeepAlive / Reconnect / IdleDisconnect, MaxCoilsPerRead and
  FC15/16 forcing, Deadband + WriteOnChangeOnly, MaxReadGap +
  CoalesceProhibited, multi-unit IPerCallHostResolver). Includes a worked
  JSON DTO example mixing AddressString + structured tag forms.

- docs/Driver.Modbus.Cli.md — appended a "v2 addressing grammar" section
  pointing users at the full reference, with quick-reference examples.

- Vendor-compatibility caveat documented: type codes and byte-order
  mnemonics were synthesised from training-era vendor docs (Wonderware
  DASMBTCP, Kepware KEPServerEX, Ignition, Matrikon, OAS) and should be
  verified against current vendor manuals before locking for production.

E2E tests (4 new AddressingGrammarTests in IntegrationTests):
- Modicon 5-digit and 6-digit forms map to identical wire offsets.
- Float32 + WordSwap (CDAB) round-trips end-to-end through the
  pymodbus simulator.
- Int16[5] array round-trips as a typed short[] surface.
- Block-read coalescing produces a wire-acceptable PDU when MaxReadGap=5
  bridges three nearby tags.

All tests skip gracefully when the pymodbus simulator at localhost:5020
is unreachable (matches the existing ModbusSimulatorFixture pattern).

Final test count across the Modbus addressing surface:
- 107 ModbusAddressing.Tests (parser + family + Modicon)
- 231 Driver.Modbus.Tests (driver, byte order, array, multi-unit, coalescing,
  protocol, subscribe, connection options)
- 110 Admin.Tests (incl. ModbusOptionsViewModel defaults pinning)
- 4 new AddressingGrammar integration tests (skip when sim down)
2026-04-25 00:32:27 -04:00
Joseph Doherty 858f300a61 Task #145 — Admin UI: expose new Modbus driver config
Two new Blazor components surface every Modbus knob added by #136-#144 so
users can configure the driver without hand-editing DriverConfig JSON.

ModbusAddressEditor.razor (live address-string parser preview):
- Bound to a string AddressString + a Family / MelsecSubFamily hint.
- On every input keystroke, runs ModbusAddressParser.TryParse and surfaces
  the resolved breakdown (Region, Offset, DataType, Bit, ByteOrder,
  ArrayCount, StringLength) inline as a green badge.
- On parse error, shows the parser's diagnostic in red.
- Re-uses the SAME parser the wire driver uses — grammar drift is
  impossible by construction.

ModbusOptionsEditor.razor (driver-instance options panel):
- Connection group (Host / Port / UnitId).
- Family group (#144) with conditional MelsecSubFamily dropdown.
- Keep-alive group (#139): Enabled / Time / Interval / RetryCount.
- Reconnect group (#139): InitialDelay / MaxDelay / BackoffMultiplier.
- Protocol group (#140): MaxRegistersPerRead / Write / Coils / ReadGap.
- Behaviour toggles (#140 + #141): UseFC15 / UseFC16 / WriteOnChangeOnly.
- Bound to ModbusOptionsViewModel — defaults match ModbusDriverOptions
  defaults so unedited rows produce the historical wire output verbatim.

Architecture:
- Admin project gains a ProjectReference to Driver.Modbus.Addressing
  (the shared parser assembly extracted in #136). Admin does NOT take a
  dep on Driver.Modbus itself — the addressing concerns are cleanly
  separated from the wire driver.
- Same-namespace shared assembly means components reference
  ModbusAddressParser / ModbusFamily / etc. without prefix gymnastics.

Tests:
- ModbusOptionsViewModelTests (1 test) — pins every default in the view
  model against the corresponding ModbusDriverOptions default. A
  regression that flips an unedited row to a non-default value gets
  caught here. (Test references both Admin and Driver.Modbus to make the
  cross-assembly comparison.)
- Live Blazor component testing requires bUnit, which isn't currently
  in the test setup; the parser logic the component wraps is fully
  covered by the 91 ModbusAddressParser tests in the addressing project,
  so the glue layer's behaviour is verifiable end-to-end already.

Caveat: the wiring into the existing DriverInstance edit page lives in
DriversTab.razor — that integration is left as a follow-up because it
touches the cluster-edit workflow specifically and the components in
this commit are framework-agnostic enough to drop in. The components
build clean against the existing Admin project; no behavioural change
to other tabs.
2026-04-25 00:26:43 -04:00
Joseph Doherty 366212417c Task #143 — Modbus block-read coalescing (with max-gap knob)
Adds a coalescing read planner that merges nearby tags into single FC03/FC04
PDUs, opt-in via ModbusDriverOptions.MaxReadGap. Default 0 = no coalescing
(every tag gets its own PDU — preserves pre-#143 wire output).

Worked example with MaxReadGap=10:
  T1 @ HR 100 (Int16, 1 reg)
  T2 @ HR 102 (Int16, 1 reg, gap 1 → joins block)
  T3 @ HR 110 (Float32, 2 regs, gap 7 → joins block)
  T4 @ HR 200 (Int16, 1 reg, gap 89 → splits, separate read)
  → 2 PDUs total: FC03 start=100 quantity=12 + FC03 start=200 quantity=1.

Planner:
- Eligible tags: known + register region (HR/IR) + scalar + not String /
  BitInRegister / array + not CoalesceProhibited.
- Groups by (UnitId, Region) — never coalesces across slaves or regions.
- Sorts by start address; merges when (next.start - last.end - 1) ≤ MaxReadGap
  AND the resulting span ≤ MaxRegistersPerRead. Otherwise opens a new block.
- Single-tag blocks are deferred to the per-tag path so WriteOnChange cache
  semantics stay correct without duplication.
- Per-block failure marks every member tag Bad and degrades health — same
  semantics the per-tag path has, but at the block granularity.

Per-tag escape hatch ModbusTagDefinition.CoalesceProhibited (bool, default
false) — when true, the tag is read in isolation regardless of MaxReadGap.
For PLCs with protected register holes between adjacent tags.

Tests (7 new ModbusCoalescingTests):
- MaxReadGap=0 keeps the per-tag behavior (2 reads for 2 tags).
- MaxReadGap=2 merges 3 tags within 5 registers into 1 read of qty=5.
- MaxReadGap=10 splits T1+T2 from T3 when the gap exceeds the threshold.
- CoalesceProhibited tag reads alone even when neighbours are eligible.
- Coalescing never crosses UnitId boundaries (multi-slave gateway safety).
- MaxRegistersPerRead caps a would-be block; planner falls back to separate
  reads when the merged span would exceed the cap.
- Per-tag values surface independently after coalescing (slice-math sanity).

Existing 220 unit tests still green; total 224 pass with the new file (tests
are additive, no regressions).

Follow-up: auto-split-on-protected-hole isn't shipped — a coalesced read
that hits an Illegal Data Address right now marks every member Bad until
the operator sets CoalesceProhibited on the offending tag. Tracked
implicitly by #138's e2e drill against a pymodbus profile with a protected
hole mid-block.
2026-04-25 00:21:18 -04:00
Joseph Doherty ad7d811f69 Task #142 — Modbus multi-unit-ID per TCP connection (gateway support)
Lifts the previous "one driver = one slave" assumption so a single Modbus
driver instance can front N RTU slaves behind one Ethernet gateway (Anybus,
ProSoft, Lantronix style). Each tag carries an optional UnitId that drives
the MBAP unit-id byte per-PDU, and the IPerCallHostResolver contract surfaces
per-slave host strings so per-PLC circuit breakers fire per-slave (matches
the AB CIP template documented in docs/v2/multi-host-dispatch.md).

Changes:

- ModbusTagDefinition gains optional UnitId (byte?). Null = use driver-level
  ModbusDriverOptions.UnitId (preserves single-slave deployments verbatim).
- ResolveUnitId(tag) helper computed once per ReadOneAsync / WriteOneAsync
  call; passed through ReadRegisterBlockAsync / ReadBitBlockAsync /
  ReadRegisterBlockChunkedAsync / ReadBitBlockChunkedAsync explicitly. The
  probe loop continues using driver-level UnitId (the probe is a
  connection-health check, not slave-specific).
- ModbusDriver implements IPerCallHostResolver. ResolveHost(fullReference)
  returns "host:port/unitN" — distinct strings per slave so the resilience
  pipeline keys breakers on the right granularity. Unknown references fall
  back to the bare HostName (single-slave behaviour).
- BitInRegister RMW path also threads the per-tag UnitId through both the
  read and write halves so a multi-slave deployment stays correct under bit-
  level writes.
- Factory DTO + JSON binding extended with the per-tag UnitId field.

Tests (4 new ModbusMultiUnitTests):
- Per-tag UnitId routes to the correct slave in the MBAP header (driver-level
  UnitId=99 must NOT appear when both tags override).
- Tag without override falls back to driver-level UnitId.
- IPerCallHostResolver returns distinct "host:port/unitN" strings per slave.
- Unknown reference returns the bare HostName fallback.

Existing 220 unit tests + 107 addressing tests still green. Per-PLC breaker
isolation under simulated dead slaves is verifiable via the existing AB CIP
test infra; live coverage lands as an integration test in the #138 docs/e2e
refresh.
2026-04-25 00:16:41 -04:00
Joseph Doherty 4cf0b4eb73 Task #144 — Modbus family-native parser branch (DL205 / MELSEC)
Promotes DirectLogicAddress + MelsecAddress from "utility helpers an engineer
calls manually" to "first-class branch of ModbusAddressParser." Users can now
paste DL205-native (V2000, Y0, C100, X17, SP10) and MELSEC-native (D100, M50,
X20 hex/octal, Y0) addresses directly into TagConfig and the parser handles
the PLC-native → Modbus PDU translation.

Changes:

- Both helper files moved into the shared Driver.Modbus.Addressing assembly
  (same namespace, zero-churn for callers). Required because the parser
  needs to call them and the dependency direction is parser→helpers, not
  the other way.
- New ModbusFamily enum (Generic / DL205 / MELSEC) on
  ModbusDriverOptions.Family. Generic preserves pre-#144 behaviour exactly.
- ModbusDriverOptions.MelsecSubFamily picks the X/Y notation (Q_L_iQR hex
  vs F_iQF octal). Default Q_L_iQR.
- ModbusAddressParser.Parse now takes optional family + sub-family hints.
  When non-Generic, family-native parsing runs FIRST; on miss falls back to
  Modicon / mnemonic. Cross-family ambiguity (C100 = Modicon coil under
  Generic, DL205 control relay under DL205) is unambiguous within one
  driver instance.
- Suffix grammar composes with native addresses: V2000:F:CDAB:5 parses
  end-to-end as DL205 V-memory at PDU 1024 + Float32 + word-swap + array of 5.
- Bit suffix composes too: V2000.7 parses as bit 7 of HR[1024].
- Factory DTO fields Family / MelsecSubFamily flow through to BuildTag so
  the JSON binding can drive everything per-driver.

Tests: 16 new ModbusFamilyParserTests covering DL205 V/Y/C/X/SP, MELSEC
D/M/X/Y, sub-family hex-vs-octal disambiguation, cross-family C100 ambiguity,
fallback to Modicon when native misses, and grammar composition with bit/
byte-order/array modifiers. Existing 91 parser tests still green; 220 driver
tests still green.

Caveat: bank-base offsets for MELSEC X/Y/M default to 0 in the grammar
string. Sites with non-zero "Modbus Device Assignment Parameter" bases must
use the structured tag form to override — addressed in the docs refresh
(#138).
2026-04-25 00:10:43 -04:00
Joseph Doherty 4bffe879c5 Task #141 — Modbus subscribe-side knobs (deadband + write-on-change)
Two driver-side filters that ≥5 of 6 surveyed vendors expose:

1. Per-tag Deadband (double?, on ModbusTagDefinition) — when set, the
   PollGroupEngine onChange callback suppresses publishes whose distance
   from the last-published value is below the threshold. Reduces wire
   traffic to OPC UA clients on noisy analog signals (flow meters,
   temperatures). Numeric scalar types only — Bool / BitInRegister / String
   / array tags publish unconditionally.

2. WriteOnChangeOnly (bool, on ModbusDriverOptions) — when true, the driver
   short-circuits writes whose value matches the most recent successful
   write to that tag. Saves PLC bandwidth on clients that re-publish the
   same setpoint every scan. Cache invalidates on any read that returns a
   different value, so HMI-side changes don't get masked.

Both default off so existing deployments see no behaviour change.

Implementation:
- ShouldPublish guard wraps the existing OnDataChange invocation. First sample
  always passes through (no baseline); subsequent samples compare via
  Convert.ToDouble for the cross-numeric-type math.
- IsRedundantWrite check at the top of WriteAsync; on success the cache is
  populated. Object.Equals handles boxed-numeric equality; arrays are
  excluded (reference-equality would never match anyway).
- ReadAsync invalidates the WriteOnChangeOnly cache when the new value
  differs from the cached last-written value.

Tests (5 new ModbusSubscribeOptionsTests):
- Deadband suppresses sub-threshold changes (100 → 102 → 106 → 107 with
  deadband=5 publishes 100 and 106 only).
- Deadband=null still publishes every change.
- WriteOnChangeOnly suppresses 3 identical 42 writes (only first hits wire).
- WriteOnChangeOnly default false hits the wire every time.
- Read-divergence cache invalidation: external panel write to 99, our
  client's re-write of 42 must NOT be suppressed.

220/220 unit tests green; existing ProtocolOptions tests hardened against
probe-loop noise by disabling the probe in their fixtures.
2026-04-25 00:05:25 -04:00
Joseph Doherty 55f4044a69 Task #140 — Modbus protocol-behavior knobs
Adds ModbusDriverOptions knobs that ≥4 of 6 surveyed vendors expose:

1. MaxCoilsPerRead (ushort, default 2000) — separate from MaxRegistersPerRead
   because coil packing (1 bit per coil) and register packing (16 bits each)
   have different spec ceilings. Coil-array reads above the cap auto-chunk
   the same way register reads have always done. New ReadBitBlockChunkedAsync
   re-assembles per-chunk LSB-first bitmaps into one logical bitmap.

2. UseFC15ForSingleCoilWrites (default false) — forces FC15 (Write Multiple
   Coils with quantity=1) for single-coil writes instead of the default FC05
   (Write Single Coil). Safety / audit PLCs that only accept the multi-write
   codes need this.

3. UseFC16ForSingleRegisterWrites (default false) — same idea for FC16 vs
   FC06 on single holding-register writes.

4. DisableFC23 (default false) — placeholder no-op for the future block-read
   coalescing (#143) work that may opt into FC23 (Read/Write Multiple
   Registers). Lets deployments pre-disable FC23 for PLCs that won't accept
   it, before we ship the optimisation that emits it.

Defaults preserve the historical wire output bit-for-bit (FC05/FC06 for
singles, no chunking under 2000 coils, no FC23). Factory DTO + JSON-binding
extended with parallel fields.

6 new ModbusProtocolOptionsTests covering: defaults, FC05→FC15 forcing,
FC06→FC16 forcing, MaxCoilsPerRead chunking math (2500 coils / 2000 cap →
2 reads of 2000 + 500). Existing 209 unit tests still green.
2026-04-24 23:59:04 -04:00
Joseph Doherty 6cf20131fe Task #139 — Modbus connection-layer config knobs (keep-alive / idle / reconnect)
Promotes the previously hardcoded transport-layer settings to ModbusDriverOptions
so users can tune them through DriverConfig JSON without recompiling.

Three new option groups:

1. KeepAlive (ModbusKeepAliveOptions): Enabled / Time / Interval / RetryCount.
   Defaults preserve the historical PR 53 wire output exactly (Enabled=true,
   Time=30s, Interval=10s, RetryCount=3). Set Enabled=false for PLCs that
   reject SO_KEEPALIVE.

2. IdleDisconnectTimeout (TimeSpan?): when set, the transport tracks last-PDU-
   success and proactively closes + reconnects on the next request after the
   threshold. Defends against silent NAT / firewall socket reaping. Default
   null = disabled (no behaviour change).

3. Reconnect (ModbusReconnectOptions): InitialDelay / MaxDelay /
   BackoffMultiplier for the post-drop reconnect loop. Defaults
   (InitialDelay=0, MaxDelay=30s, Multiplier=2.0) preserve the historical
   immediate-retry behaviour for the first attempt and add geometric backoff
   only if the reconnect itself fails. Capped at 10 attempts before propagating.

ModbusTcpTransport ctor extended with optional keepAlive / idleDisconnect /
reconnect parameters; existing 4-arg call sites continue to compile. Factory
DTO gains parallel KeepAlive / IdleDisconnectMs / Reconnect fields with
default-aware binding.

5 new ModbusConnectionOptionsTests covering the default-preservation contract
(every default field matches pre-#139) and the JSON-binding round-trip for
each knob group. Existing 204 unit tests still green.
2026-04-24 23:53:26 -04:00
Joseph Doherty 850b816873 Task #137 — Modbus per-tag suffix grammar (type / bit / byte-order / array)
Adds the full Wonderware/Kepware/Ignition-style address suffix grammar so
users paste tag spreadsheets without per-tag manual translation:

  <region><offset>[.<bit>][:<type>[<len>]][:<order>][:<count>]

Examples that now parse end-to-end:
  40001                          HoldingRegisters[0], Int16
  400001                         same, 6-digit form
  40001.5                        bit 5 of HR[0]
  40001:F                        Float32 (HR[0..1])
  40001:F:CDAB                   word-swapped Float32
  40001:STR20                    20-char ASCII string
  HR1:DI                         Int32 via mnemonic region
  C100                           Coils[99] (mnemonic)
  40001:F:5                      Float32[5] array (3-field shorthand)
  40001:I:CDAB:10                Int16[10] word-swapped (4-field strict)

Driver-side plumbing:
- ModbusAddressParser + ParsedModbusAddress in the shared Addressing
  assembly. 91 parser tests (every grammar variant + malformed shapes).
- ModbusDataType / ModbusByteOrder moved to shared (with the same namespace
  so callers compile unchanged). ModbusByteOrder gains ByteSwap (BADC) and
  FullReverse (DCBA) alongside the existing BigEndian (ABCD) and WordSwap
  (CDAB).
- NormalizeWordOrder extended to honor all four orders for both 4-byte and
  8-byte values. Old WordSwap behavior preserved bit-for-bit.
- ModbusTagDefinition gains optional ArrayCount.
- ReadOneAsync / WriteOneAsync handle array fan-out: one FC03/04 read covers
  N consecutive register-typed elements, decoded into a typed array (short[],
  float[], etc.). Coil arrays use FC01 reads + FC15 writes (FakeTransport
  in tests gains FC15 support to match).
- DriverAttributeInfo IsArray / ArrayDim flow from ArrayCount so the OPC UA
  address space surfaces ValueRank=1 + ArrayDimensions to clients.
- ModbusDriverFactoryExtensions gains AddressString DTO field. When
  present, the parser drives Region/Address/DataType/ByteOrder/Bit/
  StringLength/ArrayCount; structured fields (Writable, WriteIdempotent,
  StringByteOrder) still come from the DTO. Existing structured tag rows
  keep working unchanged.

Tests: 91 parser unit tests (Driver.Modbus.Addressing.Tests, all green) +
204 driver tests including new ModbusByteOrderTests (BADC/DCBA roundtrips
across Int32/Float32/Float64) and ModbusArrayTests (Int16[5], Float32[3]
CDAB, Coil[10], length-mismatch error, IsArray/ArrayDim discovery).
Solution-wide build clean.

Caveat: grammar names (type codes, byte-order mnemonics, the :count
shorthand) were synthesized from training-era vendor docs. Verify against
current Kepware Modbus Ethernet Driver Help and Ignition Modbus Addressing
manuals before freezing for production deployments — naming may need a
back-compat layer if vendor wording has shifted.
2026-04-24 23:49:22 -04:00
Joseph Doherty 501d8f494b Task #136 — Modicon address-string parser (5/6-digit) + shared addressing assembly
Foundation for the Modbus addressing-grammar work tracked in #137-#145. Adds
ModbusModiconAddress.Parse / TryParse that turns classic Modicon strings
(40001 / 400001 / 30001 / 00001 / 10001) into (Region, ushort PduOffset).

Also extracts ModbusRegion to a new Driver.Modbus.Addressing assembly so the
Admin UI (#145) can reference the addressing surface without taking a dep on
the wire driver. The new assembly intentionally extends the same
ZB.MOM.WW.OtOpcUa.Driver.Modbus namespace as the driver — callers see the
type as if it lived in one place; only the project layout changes. No
existing call site needed editing (zero-churn move).

Behaviour:
- Single leading digit selects region (0=Coils, 1=DiscreteInputs,
  3=InputRegisters, 4=HoldingRegisters).
- 5-digit form: trailing 4 digits are 1-based register, supports 1..9999.
- 6-digit form: trailing 5 digits are 1-based register, supports 1..65536
  (full PDU address space).
- Strict 5-or-6 length check; whitespace trimmed; clear FormatException
  diagnostics for every malformed shape (wrong length, non-digit body,
  illegal leading digit, register zero, register overflow).

29/29 new unit tests pass. Full Driver.Modbus suite (182 tests) and the
solution-wide build still green after the ModbusRegion move.
2026-04-24 23:34:18 -04:00
Joseph Doherty fb760bc465 Task #135 — update integration-test NodeIds for path-based scheme
7 integration tests in Server.Tests were left behind by the path-based
NodeId rename (#134). Each was constructing test NodeIds in the old
"FullReference" shape ("TestFolder.Var1", "raw.var", "AlphaFolder.Var1",
"plcaddr-temperature"), which the node manager no longer mints — the new
shape is `{driverId}/{folder-path}/{browseName}` per OPC UA Part 3 §5.2.2
NodeId immutability.

Fixed by re-deriving each test NodeId from the actual browse path the test
fixture's driver registers:

- OpcUaServerIntegrationTests: "TestFolder.Var1" → "fake/TestFolder/Var1"
- HistoryReadIntegrationTests (4 tests): "raw.var" → "history-driver/raw",
  "proc.var" → "history-driver/proc" (×2), "atTime.var" → "history-driver/atTime"
- MultipleDriverInstancesIntegrationTests: "AlphaFolder.Var1" →
  "alpha/AlphaFolder/Var1"; "BetaFolder.Var1" → "beta/BetaFolder/Var1"
- OpcUaEquipmentWalkerIntegrationTests: "plcaddr-temperature" →
  "galaxy-prod/warsaw/line-a/oven-3/Temperature" (the walker uses Tag.Name
  as the browseName; the FullReference lives in TagConfig but no longer
  surfaces in the NodeId path)

Server.Tests now 277/277 green excluding LiveLdap. Clears the regression
flagged during the #124 verification run.
2026-04-24 22:03:03 -04:00
Joseph Doherty 75c07149d4 Task #124 — Phase 6.2 multi-user authz interop matrix + close LdapGroups gap
The Phase 6.2 evaluator was wired but received no input in production:
RoleBasedIdentity (the IUserIdentity our LDAP path produces) implemented
IRoleBearer but not ILdapGroupsBearer, so AuthorizationGate.BuildSessionState
always returned null and the gate lax-mode-allowed every request. UserAuthResult
also never carried the resolved LDAP groups, only the role-mapped strings.

Closing the gap so the evaluator gets real data:

- UserAuthResult adds Groups alongside Roles. LdapUserAuthenticator now
  surfaces the raw RDN values (ReadOnly / WriteOperate / ...) it already
  collected during the directory query. Roles stay separate per decision #150
  (control-plane Admin role mapping vs data-plane NodeAcl key).
- RoleBasedIdentity implements ILdapGroupsBearer so AuthorizationGate sees
  the groups via the same seam unit tests already use.

ThreeUserInteropMatrixTests drives the closure end-to-end against the live
GLAuth dev directory:

- 5 distinct group memberships (readonly / writeop / writetune /
  writeconfig / alarmack) plus the multi-group admin user
- Each is bound through the real LdapUserAuthenticator
- Resolved groups feed an LdapBoundIdentity that goes through the strict-mode
  AuthorizationGate against a seeded TriePermissionEvaluator
- 31 InlineData rows assert the role × operation matrix; failures pinpoint
  the exact (user, op) cell

The remaining wire-level leg of #124 — a real OPC UA client driving UserName
tokens through an encrypted endpoint policy — still needs a deployment knob
and stays a manual cross-vendor smoke (#119 / #124 manual scope). The doc
audit note in admin-ui-phase-6-status.md is updated to reflect what's now
auto'd vs what stays manual.

33/33 new tests pass against live GLAuth; existing 270 non-LiveLdap tests
in Server.Tests still pass; Core.Tests 205/205, Admin.Tests 109/109. The 7
integration-test failures observed during this run pre-exist this commit
(NodeId-scheme regression from #134) and are tracked separately as #135.
2026-04-24 20:40:07 -04:00
Joseph Doherty d11d160395 Admin UI Phase 6 audit — close #128–#131 as already-shipped
Task-by-task audit of the Admin UI quartet shows every page listed in
the task descriptions is already built, routed, DI-wired, SignalR-live,
and covered by Admin.Tests (112/112 green):

- #128 /hosts — Hosts.razor 233 LOC with ConsecutiveFailures +
  LastCircuitBreakerOpenUtc + Stale/Faulted/Running cards
- #129 RoleGrants + AclsTab + Probe — RoleGrants.razor (192 LOC),
  AclsTab.razor (279 LOC) with the embedded Probe form at line 38
- #130 RedundancyTab — RedundancyTab.razor 175 LOC with peer
  reachability / ServiceLevel / apply-lease / failover button
- #131 Draft/Publish/Diff/Identification — DraftEditor (105 LOC) +
  Generations (73 LOC) + DiffViewer (87 LOC) + IdentificationFields
  (49 LOC), all wired to GenerationService / DraftValidationService

Shipping docs/v2/implementation/admin-ui-phase-6-status.md as the
canonical reference. Each task's required features are listed with the
exact file / LOC / routing + DI injection so future auditors don't
need to re-derive the status.

No code change in this commit — doc-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 19:07:05 -04:00
Joseph Doherty e5d1c9c9b9 Phase 6.1 multi-host dispatch — document shipped contract + per-driver status
Task #127 / decision #144. The resilience infrastructure for per-PLC
circuit breakers is shipped and fully tested — the task description's
"current pipeline keys on DriverInstanceId only" was stale. The actual
state:

- `DriverResiliencePipelineBuilder` keys on
  `(DriverInstanceId, HostName, DriverCapability)`.
- `CapabilityInvoker.ExecuteAsync` takes `hostName` per call.
- `IPerCallHostResolver` is the driver-side hook; AB CIP implements it.
- `PerCallHostResolverDispatchTests.DeadPlc_DoesNotOpenBreaker_For_HealthyPlc_With_Resolver`
  proves the end-to-end isolation.

Remaining work is per-driver adoption, not shared infrastructure:
- AB CIP: live + tested
- Galaxy / FOCAS / OPC UA Client / AB Legacy: 1 device per instance by
  design, trivially isolated
- Modbus / S7 / TwinCAT: single-device today; multi-device refactor is
  per-driver surgery (Device row + options + resolver + transport
  fan-out), not a shared-infra change

Shipping docs/v2/multi-host-dispatch.md as the canonical reference:
contract + driver-author checklist + current fleet-wide status table.
Future driver authors follow the AB CIP template.

No code change in this commit — doc-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 19:01:47 -04:00
Joseph Doherty bd6568bcbd Phase 6.1 Stream B.4 — wire ScheduledRecycleHostedService into bootstrap
Task #125 / #137. The hosted service + scheduler classes already shipped;
this commit connects them to the published-generation driver list so a
Tier C driver with `RecycleIntervalSeconds` in its `ResilienceConfig`
actually gets an armed scheduler at bootstrap.

Wiring:

- `DriverFactoryRegistry.Register` gains an optional `DriverTier`
  parameter (default Tier.A). Existing call sites unchanged —
  `GalaxyProxyDriverFactoryExtensions.Register` explicitly passes
  Tier.C so the bootstrapper can identify out-of-process drivers
  without a per-driver-type allow-list.
- `DriverResilienceOptions` + parser grow `RecycleIntervalSeconds`.
  Tier A/B values are rejected with a diagnostic (decision #74 —
  recycling an in-process driver would kill every OPC UA session).
  Non-positive values are rejected the same way.
- `DriverInstanceBootstrapper` auto-arms a `ScheduledRecycleScheduler`
  after a successful driver register when: (1) the registered tier is
  C, (2) the row's ResilienceConfig carries a positive recycle interval,
  (3) DI has an `IDriverSupervisor` keyed by that `DriverInstanceId`.
  Missing supervisor → warn + skip (no crash). That keeps the wiring
  harmless by default: no driver ships a supervisor today, so the
  hosted service runs with zero schedulers out of the box.
- `Program.cs` registers `ScheduledRecycleHostedService` as singleton
  (shared with `DriverInstanceBootstrapper`) + hosted service (drives
  the tick loop). Constructor changes on the bootstrapper ripple into
  DI resolution automatically.

Tests: 4 new parser tests covering RecycleIntervalSeconds on Tier C
happy path, null default, Tier A/B rejection, non-positive rejection.
Existing 283 Server.Tests + 200 Core.Tests all still green.

No behavioural change for existing deployments: Galaxy driver + any
future Tier C driver gain the opt-in automatically; Tier A/B drivers
(FOCAS, Modbus, S7, AB CIP, AB Legacy, TwinCAT) are structurally
excluded.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 18:58:13 -04:00
Joseph Doherty a52086efc5 Refresh phase-7-e2e-smoke.md to match current wiring
The runbook shipped at phase-7 close (2026-04-20) described the original
`Doubled = Source × 2` virtual tag, Float64 seed, and flat TagId-shaped
NodeIds. Four commits later the wiring has moved:

- Seed now targets `TestMachine_001.TestHistoryValue` (Int32, writable,
  historized) — no placeholder to fill in for the dev box.
- VirtualTag is `MachineStatus` (Boolean, `Source > 0`, historized).
- NodeIds are path-based per OPC UA Part 3 §5.2.2
  (`{driverId}/{folder-path}/{browseName}`).
- Seed inserts the ClusterNodeCredential row — without it the Server
  bootstrap fails `Unauthorized: caller X is not bound to NodeId`.

Changes:

1. Step 3 — replace "edit the placeholder" instructions with the ZB
   Galaxy-Repository query that finds writable historized attributes
   (dpc CTE + HistoryExtension EXISTS + `security_classification > 0`).
2. New step 4a — LDAP + `SecurityProfile = Basic256Sha256-Sign` recipe
   for the reverse-bridge + alarm-fires stages. Anonymous sessions are
   denied writes against `Operate`-classified attributes (PR 26 gate);
   `writeop / writeop123` against the dev-box GLAuth clears it.
3. Step 6 validation commands updated to the new NodeIds + reference
   the path-based scheme's Part-3 rationale.
4. Drive-the-alarm snippet now calls `otopcua-cli write … -U writeop`
   so operators see the explicit auth step.
5. Acceptance checklist updated for the new tag names + the
   test-galaxy.ps1 `-Username` invocation.
6. Added a 2026-04-24 second-run evidence section alongside the original
   — documents the 3/7 anonymous ceiling and what's needed to reach 7/7.

No code or seed changes in this commit — doc-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 18:13:27 -04:00
Joseph Doherty ec1a5905bf Galaxy E2E — point at live writable historized attribute + MachineStatus
Pick a Galaxy attribute that actually exercises the full driver stack:
TestMachine_001.TestHistoryValue. Verified against the live dev-box ZB:
it's Int32, writable (security_classification = Operate), and historized
(HistoryExtension primitive). The query lives in
`gr/queries/attributes_extended.sql` — swap to any other writable
historized attribute via the same shape
(`WHERE is_historized = 1 AND security_classification > 0`).

Seed changes:
- Tag row: FullName = TestMachine_001.TestHistoryValue (Int32 / ReadWrite)
- VirtualTag renamed: `Doubled` → `MachineStatus` (Boolean), script returns
  `Source > 0`. Historized, so the write/subscribe exercise doubles as a
  historian-sink check once the alarm/write stages are enabled.
- Scripted alarm predicate reads the same Source and fires on `> 50`.
- Added ClusterNodeCredential(sa → p7-smoke-node) row so
  sp_GetCurrentGenerationForCluster's caller-binding check passes. Without
  this the server bootstrap fails with
  `Unauthorized: caller sa is not bound to NodeId p7-smoke-node`.

E2E script:
- Path-based NodeId defaults updated to match the new MachineStatus
  virtual tag.
- Added optional `-Username / -Password` parameters. Anonymous sessions
  still get denied against Operate-classified attributes (PR 26 /
  docs/Security.md); supplying `-Username writeop -Password writeop123`
  against the dev-box GLAuth exercises the reverse-bridge stage.
- Wired those credentials into every Invoke-Cli / Start-Process CLI
  invocation the script drives.

Anonymous smoke remains 3/7 pass (probe + source read + reverse-bridge
marked acl-expected INFO). A fuller run with
`-Username writeop -Password writeop123` requires also enabling LDAP +
a SecurityProfile that carries a UserName UserTokenPolicy — separate
config step tracked alongside #124 (3-user authz matrix).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 18:04:39 -04:00
Joseph Doherty 69e1d320ac Cold-start guard for script engines — skip evaluation with empty upstream
Both VirtualTagEngine and ScriptedAlarmEngine share a pattern: the
BuildReadCache helper iterates the script's declared input set, reading
from _valueCache with a fallback to _upstream.ReadTag. When an upstream
tag hasn't yet delivered its first subscription push, ReadTag returns a
DataValueSnapshot with a null Value and BadNotConnected quality. User
scripts then cast `(double)ctx.GetTag(path).Value` unconditionally and
throw NullReferenceException — once per evaluation tick until the cache
fills, spamming the log with identical stack traces. The existing catch
block recovered (kept the prior state) but didn't silence the churn.

Add AreInputsReady(cache) to both engines: return true only when every
entry has a non-null Value and a non-Bad StatusCode (Good + Uncertain
are both considered ready). Skip script evaluation when the check
returns false — the engine holds the prior state (alarm) or the prior
snapshot (virtual tag) until upstream delivers. Eliminates the cold-
start NRE spam at root without changing the script-engine contract.

Also: fix $changeLines.Count in test-galaxy.ps1 — PowerShell's
Set-StrictMode -Version 3.0 errors on .Count when Where-Object returns
0 or 1 items. Wrap in `@(...)` to force an array; same pattern the
sibling _common.ps1 already uses in Write-Summary for the same reason.

Task #112 — the Galaxy live E2E now passes 3/7 stages (probe + source
read + reverse-bridge-ACL). The remaining 4 stages (virtual-tag,
subscribe-sees-change, alarm-fires, history-read) are deployment-
specific: MoveInBatchID is idle in this Galaxy + its AccessLevel blocks
writes + it's not historized. Cold-start behaviour is now correct, so
once the seed points at a live attribute those stages should light up.

Tests: 36/36 VirtualTags.Tests + 47/47 ScriptedAlarms.Tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 17:43:48 -04:00
Joseph Doherty 8be82e02c2 Path-based NodeIds — decouple client contract from driver address
The pre-refactor design minted OPC UA NodeIds directly from the driver's
FullReference (the native-address string). That had three long-term
problems:

1. OPC UA Part 3 §5.2.2 requires NodeIds to be immutable across a node's
   lifetime. A rename of the underlying device address — Galaxy attribute,
   S7 tag, Modbus register alias — changed the NodeId and broke every
   client that had pinned the previous identifier.
2. Two drivers with coincidentally-matching native addresses (e.g. `temp`
   in Modbus and `temp` in S7 under different Equipment rows) collided on
   the NodeId identifier.
3. TagConfig was being placed verbatim on the wire; for drivers whose
   TagConfig is JSON (every driver shipped today, per the
   CK_Tag_TagConfig_IsJson check constraint), clients saw the raw JSON
   blob as the NodeId string.

Refactor:

* DriverNodeManager.Variable now mints a stable path-based NodeId
  `{driverId}/{folder-path}/{browseName}` and records the driver-side
  FullReference in a new _fullRefByNodeId map. OnReadValue / OnWriteValue
  / ResolveFullRef look the FullReference up via that map instead of
  casting NodeId.Identifier. The old cast path is preserved as a
  fallback so any test fixture that still registers variables with
  FullRef-shaped NodeIds keeps working.

* EquipmentNodeWalker.AddTagVariable now extracts the cross-driver
  `FullName` field from Tag.TagConfig before handing the address to
  DriverAttributeInfo. Every shipped driver stores the wire reference in
  TagConfig[FullName]; falling back to the raw string covers any future
  driver that wants an opaque non-JSON address. ExtractFullName is
  exposed internal for unit coverage.

* scripts/e2e/test-galaxy.ps1 defaults updated to the new path-based
  NodeIds. Verified live against p7-smoke-galaxy on the dev box:
  `ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/Source` reads
  return Status=0x00000000 with a real Galaxy byte-array value.

Test suite: 195/195 Core.Tests + 283/283 Server.Tests green. Five new
ExtractFullName / FullName-passthrough tests added.

Task #112 GA-3 — golden-path read verified end-to-end; remaining E2E
script stages still blocked on pre-existing issues (ScriptedAlarm
predicate NRE on empty upstream cache, PowerShell $changeLines.Count
guard), tracked separately.
Task #134 — complete.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 16:57:20 -04:00
Joseph Doherty d11dd0520b Galaxy IPC unblock — live dev-box E2E path
Three root-cause fixes to get an elevated dev-box shell past session open
through to real MXAccess reads:

1. PipeAcl — drop BUILTIN\Administrators deny ACE. UAC's filtered token
   carries the Admins SID as deny-only, so the deny fired even from
   non-elevated admin-account shells. The per-connection SID check in
   PipeServer.VerifyCaller remains the real authorization boundary.

2. PipeServer — swap the Hello-read / VerifyCaller order. ImpersonateNamedPipeClient
   returns ERROR_CANNOT_IMPERSONATE until at least one frame has been read
   from the pipe; reading Hello first satisfies that rule. Previously the
   ACL deny-first path masked this race — removing the deny ACE exposed it.

3. GalaxyIpcClient — add a background reader + single pending-response
   slot. A RuntimeStatusChange event between OpenSessionRequest and
   OpenSessionResponse used to satisfy the caller's single ReadFrameAsync
   and fail CallAsync with "Expected OpenSessionResponse, got
   RuntimeStatusChange". The reader now routes response kinds (and
   ErrorResponse) to the pending TCS and everything else to a handler the
   driver registers in InitializeAsync. The Proxy was already set up to
   raise managed events from RaiseDataChange / RaiseAlarmEvent /
   OnHostConnectivityUpdate — those helpers had no caller until now.

4. RedundancyPublisherHostedService — swallow BadServerHalted while
   polling host.Server.CurrentInstance. StandardServer throws that code
   during startup rather than returning null, so the first poll attempt
   crashed the BackgroundService (and the host) before OnServerStarted
   ran. This race was latent behind the Galaxy init failure above.

Updates docs that described the Admins deny ACE + mandatory non-elevated
shells, and drops the admin-skip guards from every Galaxy integration +
E2E fixture that had them (IpcHandshakeIntegrationTests, EndToEndIpcTests,
ParityFixture, LiveStackFixture, HostSubprocessParityTests).

Adds GalaxyIpcClientRoutingTests covering the router's
request/response match, ErrorResponse, event-between-call, idle event,
and peer-close paths.

Verified live on the dev box against the p7-smoke cluster (gen 6):
driver registered=1 failedInit=0, Phase 7 bridge subscribed, OPC UA
server up on 4840, MXAccess read round-trip returns real data with
Status=0x00000000.

Task #112 — partial: Galaxy live stack is functional end-to-end. The
supplied test-galaxy.ps1 script still fails because the UNS walker
encodes TagConfig JSON as the tag's NodeId instead of the seeded TagId
(pre-existing; separate issue from this commit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 16:30:16 -04:00
Joseph Doherty fb6dd3478d Phase 6.2 Stream C wiring — AuthorizationBootstrap + OpcUaApplicationHost.SetAuthorization
Closes task #133 — the "authz gate is inert in production" blocker
surfaced during task #123. Before this commit, every ACL check on the
six dispatch surfaces (Read, Write, HistoryRead, Browse,
CreateMonitoredItems, Call) short-circuited to allow because Program.cs
constructed OpcUaApplicationHost without passing authzGate or
scopeResolver.

New pieces:

- `AuthorizationOptions` — bound to `Node:Authorization` in
  appsettings.json. `Enabled` (default false) is the master switch;
  `StrictMode` (default false) controls the anonymous / no-LDAP-groups
  fallback behaviour.
- `AuthorizationBootstrap` — singleton service that loads `NodeAcl`
  rows for the published generation, builds a `PermissionTrieCache` +
  `AuthorizationGate`, merges every registered driver's
  `EquipmentNamespaceContent` through `ScopePathIndexBuilder` into one
  full-path `NodeScopeResolver`. Returns `(null, null)` when disabled
  or when no generation is Published yet.
- `DriverEquipmentContentRegistry.Snapshot()` — new method returning a
  defensive copy of the driver → content map so the bootstrap can
  iterate without holding the lock.
- `OpcUaApplicationHost.SetAuthorization(gate, resolver)` — late-bind
  method matching the existing `SetPhase7Sources` pattern. Must run
  before `StartAsync`; rejects post-start rebinding with
  InvalidOperationException.
- `OpcUaServerService.ExecuteAsync` calls `AuthorizationBootstrap.BuildAsync`
  after `PopulateEquipmentContentAsync` and before `applicationHost.StartAsync`,
  in the same window that `SetPhase7Sources` runs.

Behaviour change
- Default (Enabled=false): no behaviour change — the gate stays null,
  all six dispatch surfaces run unchanged. Safe for any existing
  deployment on upgrade.
- Enabled=true with StrictMode=false: identities carrying LDAP groups
  are evaluated against the trie; anonymous / no-groups identities
  pass through (v1 legacy-client compatibility).
- Enabled=true with StrictMode=true: everything evaluates. Anonymous
  or no-groups identities are denied.

Follow-up not covered here: rebind the gate+resolver on generation
refresh (the `GenerationRefreshHostedService` that shipped earlier in
this session). Today the gate only reflects the bootstrap generation
— operators publishing new ACL changes need a process restart to see
them. Matches the current driver-hot-reload limitation and is tracked
in the existing 6.3 follow-up bullet.

Docs: v2-release-readiness.md Phase 6.2 Stream C.12 bullet flipped to
Closed with operator-facing config pointer (`Node:Authorization:Enabled`).

All 283/283 Server.Tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:35:46 -04:00
Joseph Doherty 1be0fb5a29 Phase 6.2 Stream C.12 — lock in ScopePathIndexBuilder semantics with tests
Closes task #123 (partial — builder semantics unit-tested; production
wiring is the new task #133).

ScopePathIndexBuilder + NodeScopeResolver indexed mode already exist —
they produce a full Cluster → Namespace → UnsArea → UnsLine → Equipment
→ Tag scope from the published generation's config rows. What was
missing: unit coverage of the Build semantics (the only consumers were
compile-time references) + explicit acknowledgement in the readiness
doc that the gate/resolver aren't yet wired into Program.cs.

Tests — 6 cases in ScopePathIndexBuilderTests.cs:
- Well-formed content emits full hierarchy.
- Tags with null EquipmentId skipped (SystemPlatform-namespace fallback).
- Tags with broken Equipment FK skipped (publish-time validation
  should have caught; builder is defensive).
- Equipment with broken Line FK skipped.
- Duplicate TagConfig throws InvalidOperationException.
- Resolver with index returns full-path scope; un-indexed ref falls
  through to cluster-only scope (pre-ADR-001 behaviour preserved).

Server.Tests 277 → 283.

Critical follow-up (task #133): Program.cs still constructs
OpcUaApplicationHost WITHOUT authzGate or scopeResolver, so all six
dispatch-layer gates (Read, Write, HistoryRead, Browse,
CreateMonitoredItems, Call) are currently inert in production. Wiring
them up — load NodeAcl + EquipmentNamespaceContent at bootstrap,
construct gate + resolver, pass into OpcUaApplicationHost, rebind on
generation refresh — is the last Phase 6.2 GA blocker.

Docs: v2-release-readiness.md Phase 6.2 Stream C hardening list marks
the scope-resolution bullet struck-through with a close-out note that
calls out the gate-inert-in-production gap + task #133.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:28:19 -04:00
Joseph Doherty ded292ecd7 Phase 6.2 Stream C — Call + Alarm Acknowledge/Confirm gating
Closes task #122 (Acknowledge + Confirm + generic Call — Shelve stays as
a follow-up pending per-instance method-NodeId resolution).

Before this commit any session with a connected channel could invoke
method nodes on driver-materialized equipment — including alarm
Acknowledge / Confirm. Combined with the Browse + CreateMonitoredItems
gates that landed earlier in Stream C, this was the last service-layer
entry point where a session could still affect state without passing
the authz trie.

Implementation on DriverNodeManager:
- `Call` override — pre-iterates methodsToCall, gates each through
  AuthorizationGate with the operation kind returned by
  MapCallOperation. Denied calls get errors[i] = BadUserAccessDenied
  before delegating to base.Call.
- `MapCallOperation(NodeId methodId)` — maps well-known Part 9 method
  NodeIds to dedicated operation kinds:
    MethodIds.AcknowledgeableConditionType_Acknowledge →
        OpcUaOperation.AlarmAcknowledge
    MethodIds.AcknowledgeableConditionType_Confirm →
        OpcUaOperation.AlarmConfirm
    everything else → OpcUaOperation.Call
  Lets the ACL distinguish "can acknowledge alarms" from "can invoke
  arbitrary methods" without conflating the two roles.
- Shelve dispatch paths through per-instance ShelvedStateMachine methods
  with dynamic NodeIds that can't be constant-matched — falls through
  to generic Call. Fine-grained OpcUaOperation.AlarmShelve is a follow-
  up when the method-invocation path grows a "method-role" annotation.

Extracted GateCallMethodRequests + MapCallOperation as static internal
for unit-testability. 8 new tests (MapCallOperation Acknowledge /
Confirm / generic; gate-null no-op, denied-Acknowledge, allowed-
Acknowledge, mixed-batch, pre-populated-error-preserved).
Server.Tests 269 → 277.

Known follow-ups:
- Shelve per-operation gating (see above).
- TranslateBrowsePathsToNodeIds gating (Browse follow-up from #120).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:22:19 -04:00
Joseph Doherty 6a6b0f56f2 Phase 6.2 Stream C — CreateMonitoredItems per-item gating
Closes task #121 (partial — creation-time gate; decision #153 per-item
revocation stamp is a follow-up).

Before this commit a session could subscribe to any node via
CreateMonitoredItems, even nodes where Read was denied — the
subscription would surface BadUserAccessDenied on each data-change
read, but the client saw a successful CreateMonitoredItems response
and held the subscription open, wasting resources and leaking the
address-space shape through the item metadata.

New override on DriverNodeManager.CreateMonitoredItems:
- Pre-iterates itemsToCreate, gates each through AuthorizationGate with
  OpcUaOperation.CreateMonitoredItems at the target node's scope.
- For denied slots: sets errors[i] = new ServiceResult(
  StatusCodes.BadUserAccessDenied). The OPC Foundation base stack
  honours pre-populated non-success errors and skips item creation for
  those slots — the subscription never holds a handle to a denied
  node.
- Preserves prior errors (e.g. BadNodeIdUnknown) — first diagnosis wins.
- Non-string-identifier references (stack-synthesized numeric ids)
  bypass the gate.

Extracted the pure filter logic into
GateMonitoredItemCreateRequests(items, errors, identity, gate,
scopeResolver) — static internal, unit-testable without the OPC UA
server stack.

Tests — 6 new in MonitoredItemGatingTests.cs (gate-null no-op,
denied-gets-BadUserAccessDenied, allowed-passes, mixed-batch-denies-
per-item, pre-populated-error-preserved, numeric-id-bypass). Server.Tests
263 → 269.

Known follow-ups:
- Per-item (AuthGenerationId, MembershipVersion) stamp (decision #153)
  for detecting revocation mid-subscription — needs subscription-layer
  plumbing.
- TransferSubscriptions not yet wired (same pattern, smaller scope).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:17:40 -04:00
Joseph Doherty e8b8541554 Phase 6.2 Stream C — Browse gating on DriverNodeManager
Closes task #120 (partial — strict point-check; ancestor-visibility
implication is a follow-up).

Before this commit DriverNodeManager exposed every materialized node to
every browsing session regardless of the user's ACL. Read + Write +
HistoryRead were already gated through AuthorizationGate in Phase 6.2
Stream C core; Browse was the one surface where the session could still
enumerate nodes it had no permission to touch, discovering structure
even when reads failed with BadUserAccessDenied.

Implementation
- New `Browse` override on DriverNodeManager that calls base.Browse
  first (lets the stack populate the reference list normally), then
  post-filters the IList<ReferenceDescription> so denied nodes are
  removed silently. OPC UA convention: Browse filtering is invisible to
  the client; no BadUserAccessDenied surfaces.
- Extracted the filter loop into the static internal
  `FilterBrowseReferences(references, userIdentity, gate, scopeResolver)`
  so the policy is unit-testable without standing up the full OPC UA
  server stack.
- Non-string NodeId identifiers (stack-synthesized standard-type
  references with numeric identifiers) bypass the gate — only driver-
  materialized nodes key into the authz trie.
- When AuthorizationGate or NodeScopeResolver is null, the filter is a
  no-op — preserves the pre-Phase-6.2 dispatch path for integration
  tests that construct DriverNodeManager without authz.

Tests — 6 new in BrowseGatingTests.cs (gate-null no-op, empty-list
no-op, denied-removed, allowed-passes-through, numeric-id bypass,
lax-mode null-identity keeps references). Server.Tests 257 → 263.

Known follow-up (tracked implicitly under #120 re-scope):
- Ancestor-visibility implication (acl-design.md §Browse line 111): a
  user with Read at `Line/Tag` should be able to Browse `Line` even
  without an explicit Browse grant. Current filter does a strict
  point-check. Proper fix needs TriePermissionEvaluator to expose a
  "subtree-has-any-grant" query.
- TranslateBrowsePathsToNodeIds not yet filtered (same extension
  pattern; small follow-up).

Docs: v2-release-readiness.md Phase 6.2 Stream C hardening list marks
the Browse bullet struck-through with "Partial" close-out note.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:11:19 -04:00
Joseph Doherty a23de2a7e4 Phase 6.3 A.2 + D.1 — GenerationRefreshHostedService: poll + lease-wrap apply
Closes tasks #132 + #118 (GA hardening backlog).

Before this commit, the Server only observed the generation in force at
process start (SealedBootstrap). Peer-published generations accumulated
in the shared config DB while the running node kept serving the
generation it had sealed on boot. Two consequences:

1. Operator role-swaps required a process restart — Admin publishes a
   new generation, but the Server's RedundancyCoordinator never re-read
   the topology.
2. ApplyLeaseRegistry had no apply to wrap. ServiceLevelBand sat at
   PrimaryHealthy (255) during every publish because nothing opened a
   lease; PrimaryMidApply (200) was effectively dead code.

New GenerationRefreshHostedService (src/.../Server/Hosting/):
- Polls sp_GetCurrentGenerationForCluster every 5s (tunable).
- On change: opens leases.BeginApplyLease(newGenerationId, Guid.NewGuid()),
  calls coordinator.RefreshAsync inside the `await using`, releases on
  scope exit (success / exception / cancellation via IAsyncDisposable).
- Diagnostic properties: LastAppliedGenerationId, TickCount, RefreshCount.
- Delegate-injected currentGenerationQuery for test drive-through; real
  path is the private static DefaultQueryCurrentGenerationAsync.
- Registered as HostedService in Program.cs alongside the Phase 6.3
  redundancy / peer-probe stack.

Scope intentionally narrow: only the coordinator refreshes today. Driver
re-init, virtual-tag re-bind, script-engine reload remain as follow-up
wiring. The lease wrap is the right seam for those subscribers to hook
once they grow hot-reload support — the doc comments say so.

Tests
- 5 new unit tests in GenerationRefreshHostedServiceTests (first-apply,
  identity no-op, change-triggers-refresh, null-generation-is-no-op,
  lease-is-released-on-exit). Stub generation-query delegate; real
  coordinator backed by EF InMemory DB.
- Server.Tests total 252 → 257.

Docs
- v2-release-readiness.md Phase 6.3 follow-ups list marks the
  sp_PublishGeneration lease wrap bullet struck-through with close-out
  note.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:02:33 -04:00
Joseph Doherty de77d42eab Phase 6.3 Stream B — peer-probe HostedServices populating PeerReachabilityTracker
Closes task #116 (GA hardening backlog). Before this commit the
RedundancyStatePublisher saw PeerReachability.Unknown for every peer
because the tracker had no writers — every healthy peer got
degraded to the Isolated-Primary band (230) even when fully reachable.
Not release-blocking (safe default), but not the full non-transparent-
redundancy UX either.

Two-layer probe model per docs/v2/implementation/phase-6-3-redundancy-runtime.md
§Stream B:

- PeerHttpProbeLoop (Stream B.1) — fast-fail layer at 2 s / 1 s timeout.
  Hits each peer's http://{Host}:{DashboardPort}/healthz via an injected
  IHttpClientFactory. Writes the HTTP bit of PeerReachability while
  preserving the UA bit from the last UA probe so a transient HTTP blip
  doesn't clobber the authoritative UA reading.

- PeerUaProbeLoop (Stream B.2) — authoritative layer at 10 s / 5 s
  timeout. Calls DiscoveryClient.GetEndpoints against opc.tcp://{Host}:
  {OpcUaPort} — cheap compared to a full Session.Create, no cert trust
  required. Short-circuits when the HTTP probe last reported the peer
  unhealthy (no wasted handshakes on a known-dead endpoint), clearing
  the stale UaHealthy bit in that case.

Both inherit from BackgroundService, follow the tick/delay/catch pattern
RedundancyPublisherHostedService + ResilienceStatusPublisherHostedService
established, and expose TickAsync() as internal for test drive-through.

New PeerProbeOptions class carries the four intervals/timeouts so
operators can tune cadence per site. Registered as singleton in Program.cs;
HTTP client registered by name so the OtOpcUa handler chain
(Serilog enrichers, potential future OpenTelemetry instrumentation) isn't
bypassed.

Tests — 9 new unit tests across PeerHttpProbeLoopTests (5) and
PeerUaProbeLoopTests (4). All pass. Server.Tests total 243 → 252.
Full solution build clean.

Docs: v2-release-readiness.md Phase 6.3 follow-ups list marks the
peer-probe bullet struck-through with a close-out note.

Still deferred in Phase 6.3:
  - OPC UA variable-node binding (task #117 — ServiceLevel + ServerUriArray)
  - sp_PublishGeneration lease wrap (task #118)
  - Client interop matrix (task #119)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:53:38 -04:00
Joseph Doherty 96918b148c Unblock phase-6 compliance meta-runner on task-galaxy-e2e
Two small fixes so `scripts/compliance/phase-6-all.ps1` exits 0 — this is
GA exit-criterion #1 from docs/v2/v2-release-readiness.md.

1. Admin csproj: bump OpenTelemetry.Extensions.Hosting 1.15.2 → 1.15.3 +
   OpenTelemetry.Exporter.Prometheus.AspNetCore 1.15.2-beta.1 →
   1.15.3-beta.1. Fixes NU1902 moderate-severity advisory
   (GHSA-g94r-2vxg-569j) on the transitive OpenTelemetry.Api 1.15.2 pull.
   TreatWarningsAsErrors on the Admin project promoted the advisory to an
   error and failed the whole `dotnet test` run at restore.

2. SchemaComplianceTests.All_expected_tables_exist: the expected-tables
   list drifted behind four Phase 7 migration additions — Script,
   ScriptedAlarm, ScriptedAlarmState, VirtualTag. The EF model + live
   migrations have carried these tables for a while; the compliance test
   just needed the four names added. Applied migrations against a scratch
   DB to confirm the list is exhaustive.

Verification: full solution test pass 2301 / 2301 (one tolerated
pre-existing CLI flake). Phase 6 aggregate compliance: all four phases
PASS with no test-count regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:36:20 -04:00
Joseph Doherty 69e0d02c72 task-galaxy-e2e branch — non-FOCAS work-in-progress snapshot
Catch-all commit for pending work on the task-galaxy-e2e branch that
wasn't part of the FOCAS migration. Grouping by topic so future per-topic
commits can be cherry-picked if needed.

TwinCAT
- src/.../Driver.TwinCAT/AdsTwinCATClient.cs + TwinCATDriverFactoryExtensions.cs:
  factory-registration extensions + ADS client refinements.
- src/.../Driver.TwinCAT.Cli/Commands/BrowseCommand.cs: new browse command
  for the TwinCAT test-client CLI.
- tests/.../Driver.TwinCAT.IntegrationTests/TwinCAT3SmokeTests.cs + TwinCatProject/:
  fixture scaffold with a minimal POU + README pointing at the TCBSD/ESXi
  VM for e2e.
- docs/Driver.TwinCAT.Cli.md + docs/drivers/TwinCAT-Test-Fixture.md:
  documentation for the above.
- docs/v3/twincat-backlog.md: forward-looking backlog seed.

Admin UI + fleet status
- src/.../Admin/Components/Pages/Clusters/DriversTab.razor + Hosts.razor:
  UI refresh for fleet-status rendering.
- src/.../Admin/Hubs/FleetStatusHub.cs + FleetStatusPoller.cs +
  Admin/Program.cs: SignalR hub + poller plumbing for live fleet data.
- tests/.../Admin.Tests/FleetStatusPollerTests.cs: poller coverage.

Server + redundancy runtime (Phase 6.3 follow-ups)
- src/.../Server/Hosting/RedundancyPublisherHostedService.cs: HostedService
  that owns the RedundancyStatePublisher lifecycle + wires peer reachability.
- src/.../Server/Redundancy/ServerRedundancyNodeWriter.cs: OPC UA
  variable-node writer binding ServiceLevel + ServerUriArray to the
  publisher's events.
- src/.../Server/Program.cs + Server.csproj: hosted-service registration.
- tests/.../Server.Tests/ServerRedundancyNodeWriterTests.cs +
  Server.Tests.csproj: coverage for the above.

Configuration
- src/.../Configuration/Validation/DraftValidator.cs +
  tests/.../Configuration.Tests/DraftValidatorTests.cs: draft-validation
  refinements.

E2E scripts (shared infrastructure)
- scripts/e2e/README.md + _common.ps1 + test-all.ps1: shared helpers + the
  all-drivers test-all runner.
- scripts/e2e/test-opcuaclient.ps1: OPC UA Client e2e runner.

Docs
- docs/v2/implementation/phase-6-{1,2,3,4}*.md + exit-gate-phase-{3,7}.md:
  phase-gate + implementation doc updates.
- docs/v2/plan.md: top-level plan refresh.
- docs/v2/redundancy-interop-playbook.md: client interop playbook for the
  Phase 6.3 redundancy-runtime work.

Two orphan FOCAS docs remain on disk but deliberately unstaged —
docs/v2/focas-deployment.md and docs/v2/implementation/focas-simulator-plan.md
describe the now-retired Tier-C topology and should either be rewritten
or deleted in a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:12:19 -04:00
Joseph Doherty 4b0664bd55 FOCAS — retire Tier-C split, inline managed wire client, make read-only
Migration closes the FOCAS Tier-C architecture. OtOpcUa previously had
`Driver.FOCAS.Host` (NSSM-wrapped Windows service loading Fwlib64.dll via
P/Invoke) + `Driver.FOCAS.Shared` (MessagePack IPC contracts) + a C shim
DLL stand-in for unit tests. All of it is deleted; the driver is now a
single in-process managed assembly talking the FOCAS/2 Ethernet binary
protocol directly on TCP:8193.

Architecture

- Pure-managed `FocasWireClient` inlined at `src/.../Driver.FOCAS/Wire/`
  (owner-imported — see Wire/FocasWireClient.cs for the full surface).
  Opens two TCP sockets, runs the initiate handshake, serialises requests
  on socket 2 through a semaphore, closes cleanly with PDU + socket
  teardown. Both sync `IDisposable` and async `IAsyncDisposable`.
- `WireFocasClient` (same folder) adapts the wire client to OtOpcUa's
  `IFocasClient` surface — fixed-tree reads, PARAM/MACRO/PMC addresses,
  alarms. Writes return `BadNotWritable` by design — OtOpcUa is read-only
  against FOCAS.
- `FocasDriverFactoryExtensions` now accepts `"Backend": "wire"` (default)
  and `"Backend": "unimplemented"`. Legacy `ipc` and `fwlib` backends are
  rejected at startup with a diagnostic pointing at the migration doc.

Deletions

- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host/` — whole project + Ipc/,
  Backend/, Stability/, Program.cs.
- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared/` — Contracts/, FrameReader,
  FrameWriter, whole project.
- `tests/...Driver.FOCAS.Host.Tests/` + `.Shared.Tests/` — whole projects.
- `src/.../Driver.FOCAS/FwlibNative.cs` + `FwlibFocasClient.cs` — 21
  P/Invokes + 7 `Pack=1` marshalling structs + the Fwlib-backed
  `IFocasClient` implementation.
- `src/.../Driver.FOCAS/Ipc/` + `Supervisor/` — IPC client wrapper +
  Host-process supervisor (backoff, circuit breaker, heartbeat, post-
  mortem reader, process launcher).
- `scripts/install/Install-FocasHost.ps1` — NSSM service installer.
- `tests/.../Driver.FOCAS.Tests/{IpcFocasClientTests, IpcLoopback,
  FwlibNativeHelperTests, PostMortemReaderCompatibilityTests,
  SupervisorTests, FocasDriverFactoryExtensionsTests}.cs` — tests that
  exercised the retired surfaces.
- `tests/.../Driver.FOCAS.IntegrationTests/Shim/` — the zig-built C shim
  DLL that masqueraded as Fwlib64.dll.

Solution changes

- `ZB.MOM.WW.OtOpcUa.slnx` drops the 4 retired project refs.
- `src/.../Driver.FOCAS.csproj` drops the Shared ProjectReference, adds
  `Microsoft.Extensions.Logging.Abstractions` for the optional `ILogger`
  hook in `FocasWireClient`.
- `src/.../Driver.FOCAS.Cli.csproj` drops the six `<Content Include>`
  entries that copied `vendor/fanuc/*.dll` into the CLI bin. CLI now uses
  `WireFocasClient` directly.
- `FocasDriver` default factory flips to `Wire.WireFocasClientFactory`.

Integration tests

- New `tests/.../Driver.FOCAS.IntegrationTests/` project covering fixed-
  tree reads (identity, axes, dynamic, program, operation mode, timers,
  spindle load + max RPM, servo meters), user-authored PARAM / MACRO /
  PMC reads, `DiscoverAsync` emission, `SubscribeAsync` + `OnDataChange`,
  `IAlarmSource` raise/clear transitions, and `ProbeAsync` /
  `OnHostStatusChanged`. 9 e2e tests against the focas-mock fixture
  (Docker container with the vendored Python mock's native FOCAS/2
  Ethernet responder).
- `scripts/integration/run-focas.ps1` orchestrates compose up → tests →
  compose down. Dropped the shim-build stage + DLL-copy step + the split
  testhost workaround (the latter only existed because of native-DLL
  lifecycle bugs the shim tripped).
- Docker compose collapses from 11 per-series services to one `focas-sim`
  service. Tests seed per-series state via `mock_load_profile` at test
  start.
- Vendored focas-mock snapshot refreshed to pick up upstream's native
  FOCAS/2 Ethernet responder (was 660 lines, now 1018) — the
  pre-refresh snapshot only spoke the JSON admin protocol.

Tests

- 145/145 unit tests in `Driver.FOCAS.Tests` pass (was 208 pre-deletion;
  63 removed tests exercised the retired IPC/shim/supervisor/Fwlib
  surfaces).
- 9/9 integration tests pass against the refreshed mock.
- `FocasScaffoldingTests.Unimplemented_factory_throws_on_Create…` updated
  to assert the new diagnostic message pointing at
  `docs/drivers/FOCAS.md` rather than the now-gone `Fwlib64.dll`.

Docs

- `docs/drivers/FOCAS.md` rewritten for the managed wire topology —
  deployment collapses to one `"Backend": "wire"` config block, no
  separate service, no DLL deployment, no pipe ACL.
- `docs/drivers/FOCAS-Test-Fixture.md` updated — single TCP probe skip
  gate instead of TCP + shim probe; fewer moving parts.
- `docs/drivers/README.md` row for FOCAS reflects the Tier-A managed
  topology (previously listed Tier-C + `Fwlib64.dll` P/Invoke).
- `docs/Driver.FOCAS.Cli.md` drops the Tier-C architecture-note section.
- `docs/v2/implementation/focas-isolation-plan.md` marked historical —
  the plan it documents was executed then superseded by the wire client.
- `docs/v2/v2-release-readiness.md` re-audited 2026-04-24. Phase 5
  driver complement closed. FOCAS change-log entry added.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:10:59 -04:00
Joseph Doherty 404b54add0 FOCAS — commit previously-orphaned support files
Brings seven FOCAS-related files into git that shipped as part of earlier
FOCAS work but were never staged. Adding them now so the tree reflects the
compilable state + pre-empts dead references from the migration commit that
follows:

- src/.../Driver.FOCAS/FocasAlarmProjection.cs — raise/clear diffing + severity
  mapping surfaced via IAlarmSource on FocasDriver. Referenced by committed
  FocasDriver.cs; tests in FocasAlarmProjectionTests.cs.
- src/.../Admin/Services/FocasDriverDetailService.cs — Admin UI per-instance
  detail page data source.
- src/.../Admin/Components/Pages/Drivers/FocasDetail.razor — Blazor page
  rendering the above (from task #69).
- tests/.../Admin.Tests/FocasDriverDetailServiceTests.cs — exercises the
  detail service.
- tests/.../Driver.FOCAS.Tests/FocasAlarmProjectionTests.cs — raise/clear
  diff semantics against FakeFocasClient.
- tests/.../Driver.FOCAS.Tests/FocasHandleRecycleTests.cs — proactive recycle
  cadence test.
- docs/v2/implementation/focas-wire-protocol.md — captured FOCAS/2 Ethernet
  wire protocol reference. Useful going forward even though the Tier-C /
  simulator plan docs are historical.

No runtime behaviour change — these files compile today and the solution
build/test pass already depends on them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:09:51 -04:00
386 changed files with 45271 additions and 5301 deletions
+10 -4
View File
@@ -12,14 +12,16 @@
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ZB.MOM.WW.OtOpcUa.Driver.Modbus.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Addressing/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Addressing.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.S7/ZB.MOM.WW.OtOpcUa.Driver.S7.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/ZB.MOM.WW.OtOpcUa.Driver.AbCip.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Client.Shared/ZB.MOM.WW.OtOpcUa.Client.Shared.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Client.CLI/ZB.MOM.WW.OtOpcUa.Client.CLI.csproj"/>
@@ -49,7 +51,12 @@
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.Tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Addressing.Tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Addressing.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common.Tests/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli.Tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli.Tests.csproj"/>
@@ -65,8 +72,7 @@
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.AbCip.IntegrationTests/ZB.MOM.WW.OtOpcUa.Driver.AbCip.IntegrationTests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.Tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests.csproj"/>
+17 -21
View File
@@ -5,27 +5,22 @@ protocol. Uses the **same** `FocasDriver` the OtOpcUa server does — PMC R/G/F
file registers, axis bits, parameters, and macro variables — all through
`FocasAddressParser` syntax.
Sixth of the driver test-client CLIs, added alongside the Tier-C isolation
work tracked in task #220.
Sixth of the driver test-client CLIs.
## Architecture note
FOCAS is a Tier-C driver: `Fwlib32.dll` is a proprietary 32-bit Fanuc library
with a documented habit of crashing its hosting process on network errors.
The target runtime deployment splits the driver into an in-process
`FocasProxyDriver` (.NET 10 x64) and an out-of-process `Driver.FOCAS.Host`
(.NET 4.8 x86 Windows service) that owns the DLL — see
[v2/implementation/focas-isolation-plan.md](v2/implementation/focas-isolation-plan.md)
and
[v2/implementation/phase-6-1-resilience-and-observability.md](v2/implementation/phase-6-1-resilience-and-observability.md)
for topology + supervisor / respawn / back-pressure design.
FOCAS is an in-process driver. The pure-managed `WireFocasClient`
speaks the FOCAS2 binary protocol directly over TCP:8193, removing the
Tier-C process-isolation split that the historical P/Invoke + out-of-
process Host arrangement required. The CLI loads `FocasDriver` with
`WireFocasClientFactory` and talks to the CNC without any native
components.
The CLI skips the proxy and loads `FocasDriver` directly (via
`FwlibFocasClientFactory`, which P/Invokes `Fwlib32.dll` in the CLI's own
process). There is **no public simulator** for FOCAS; a meaningful probe
requires a real CNC + a licensed `Fwlib32.dll` on `PATH` (or next to the
executable). On a dev box without the DLL, every wire call surfaces as
`BadCommunicationError` — still useful as a "CLI wire-up is correct" signal.
A dev-friendly mock is available — start
`tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/docker-compose.yml`
and point `--cnc-host` at `localhost` for end-to-end CLI exercises
without a real CNC. See
[drivers/FOCAS-Test-Fixture.md](drivers/FOCAS-Test-Fixture.md).
## Build + run
@@ -152,7 +147,8 @@ fails.
**"Why did this macro flip?"** → `subscribe` to the macro, let the
operator reproduce the cycle, watch the HH:mm:ss.fff timeline.
**"Is the Fwlib32 DLL wired up?"** → `probe` against any host. A
`DllNotFoundException` surfacing as `BadCommunicationError` with a
matching `Last error` line means the driver is loading but the DLL is
missing; anything else means a transport-layer problem.
**"Can I reach the CNC on TCP:8193?"** → `probe` against any host. A
`BadCommunicationError` means the wire client couldn't open a socket
(firewall / wrong host / FOCAS Ethernet option unlicensed on the CNC).
`BadDeviceFailure` after a successful connect means the CNC is rejecting
the session setup — check the CNC's FOCAS option and password settings.
+34
View File
@@ -119,3 +119,37 @@ address.
**"What's the right byte order for this family?"** → `read` with
`--byte-order BigEndian`, then with `--byte-order WordSwap`. The one that
gives plausible values is the correct one for that device.
## v2 addressing grammar
The driver accepts the industry-standard tag-address grammar so you can
paste tag spreadsheets from Wonderware / Kepware / Ignition without
per-row manual translation. Full reference + grammar rules:
[`docs/v2/modbus-addressing.md`](v2/modbus-addressing.md).
Quick examples:
```
40001 HoldingRegisters[0], Int16
400001 same, 6-digit form
40001:F Float32
40001:F:CDAB Float32 word-swapped
40001:STR20 20-char ASCII string
40001:S:5 Int16[5] array (3-field shorthand)
40001:F:CDAB:10 Float32[10] with explicit word-swap (4-field strict)
40001.5 bit 5 of HR[0]
HR1:I Int32 via mnemonic region prefix (matches Wonderware)
C100 Coil 100 (mnemonic, 1-based)
V2000:F:CDAB DL205 V-memory at PDU 1024 + Float32 + word-swap (Family=DL205)
D100:I MELSEC D-register 100, Int32 (Family=MELSEC)
```
**Type-code reminder** (post-#146): `:I` is **Int32** (matches Wonderware
DASMBTCP + Ignition `HRI`). The explicit Int16 code is `:S`. Bare HR/IR
with no type still defaults to Int16. Pre-#146 codes `:DI` / `:L` /
`:UDI` / `:UL` / `:LI` / `:ULI` / `:LBCD` are removed; configs that use
them get a clear "Unknown type code" diagnostic at parse time.
In `DriverConfig` JSON, set the per-tag `addressString` field instead of
the structured `region` + `address` + `dataType` fields. Both styles can
coexist within one driver instance.
+39 -4
View File
@@ -1,9 +1,9 @@
# `otopcua-twincat-cli` — Beckhoff TwinCAT test client
Ad-hoc probe / read / write / subscribe tool for Beckhoff TwinCAT 2 / TwinCAT 3
runtimes via ADS. Uses the **same** `TwinCATDriver` the OtOpcUa server does
(`Beckhoff.TwinCAT.Ads` package). Native ADS notifications by default;
`--poll-only` falls back to the shared `PollGroupEngine`.
Ad-hoc probe / read / write / subscribe / browse tool for Beckhoff TwinCAT 2 /
TwinCAT 3 runtimes via ADS. Uses the **same** `TwinCATDriver` the OtOpcUa
server does (`Beckhoff.TwinCAT.Ads` package). Native ADS notifications by
default; `--poll-only` falls back to the shared `PollGroupEngine`.
Fifth (final) of the driver test-client CLIs.
@@ -50,6 +50,13 @@ caller interpret semantics.
### `probe`
Per-command flags:
| Flag | Default | Purpose |
|---|---|---|
| `-s` / `--symbol` | **required** | Symbol path to probe (e.g. `MAIN.bRunning`) |
| `--type` | `DInt` | Declared data type — see the [Data types](#data-types) list |
```powershell
# Local TwinCAT 3, probe a canonical global
otopcua-twincat-cli probe -n 127.0.0.1.1.1 -s "TwinCAT_SystemInfoVarList._AppInfo.OnlineChangeCnt"
@@ -89,6 +96,14 @@ Structure writes refused — drop to driver config JSON for those.
### `subscribe`
Per-command flags:
| Flag | Default | Purpose |
|---|---|---|
| `-s` / `--symbol` | **required** | Symbol path — same format as `read` |
| `-t` / `--type` | `DInt` | Declared data type |
| `-i` / `--interval-ms` | `1000` | Publishing interval in **milliseconds** — native mode passes this as the ADS `NotificationSettings.CycleTime` |
```powershell
# Native ADS notifications (default) — PLC pushes on its own cycle
otopcua-twincat-cli subscribe -n 192.168.1.40.1.1 -s GVL.Counter -t DInt -i 500
@@ -99,3 +114,23 @@ otopcua-twincat-cli subscribe -n 192.168.1.40.1.1 -s GVL.Counter -t DInt -i 500
The subscribe banner announces which mechanism is in play — "ADS notification"
or "polling" — so it's obvious in screen-recorded bug reports.
### `browse`
Walks the controller's symbol table via ADS `SymbolLoaderFactory` (same path
`TwinCATDriver.DiscoverAsync` takes when `EnableControllerBrowse = true`).
Output filters to symbols whose type maps onto the driver's atomic surface —
UDTs / function-block instances don't appear.
| Flag | Default | Purpose |
|---|---|---|
| `--prefix` | _(none)_ | Case-sensitive instance-path prefix filter (e.g. `GVL_Fixture`) |
| `--max` | `500` | Max symbols to print. `0` = unbounded |
```powershell
# Everything under a single GVL
otopcua-twincat-cli browse -n 192.168.1.40.1.1 --prefix GVL_Fixture
# Full dump (beware: flat-mode walks on a real controller can top 10k symbols)
otopcua-twincat-cli browse -n 192.168.1.40.1.1 --max 0
```
+1 -1
View File
@@ -83,7 +83,7 @@ The host spins up `StaPump` (the STA thread with message pump), creates the MXAc
### Pipe security
`PipeServer` builds a `PipeAcl` from the provided `SecurityIdentifier` + uses `NamedPipeServerStream` with `maxNumberOfServerInstances: 1`. The handshake requires a matching shared secret in the first Hello frame; callers whose SID doesn't match `OTOPCUA_ALLOWED_SID` are rejected before any frame is processed. **By design the pipe ACL denies BUILTIN\Administrators** — live smoke tests must therefore run from a non-elevated shell that matches the allowed principal. The installed dev host (`OtOpcUaGalaxyHost`) runs as `dohertj2` with the secret at `.local/galaxy-host-secret.txt`.
`PipeServer` builds a `PipeAcl` from the provided `SecurityIdentifier` + uses `NamedPipeServerStream` with `maxNumberOfServerInstances: 1`. The handshake requires a matching shared secret in the first Hello frame; callers whose SID doesn't match `OTOPCUA_ALLOWED_SID` are rejected before any frame is processed via `NamedPipeServerStream.RunAsClient` + a SID comparison against the configured allow list. The DACL grants `ReadWrite | Synchronize` only to the allowed SID and denies `LocalSystem`. The installed dev host (`OtOpcUaGalaxyHost`) runs as `dohertj2` with the secret at `.local/galaxy-host-secret.txt`.
### Installation
+113 -96
View File
@@ -2,132 +2,149 @@
Coverage map + gap inventory for the FANUC FOCAS2 CNC driver.
**TL;DR: there is no integration fixture.** Every test uses a
`FakeFocasClient` injected via `IFocasClientFactory`. Fanuc's FOCAS library
(`Fwlib32.dll`) is closed-source proprietary with no public simulator;
CNC-side behavior is trusted from field deployments.
**Status:** as of 2026-04-24, OtOpcUa speaks FOCAS2 directly over TCP
via the pure-managed [`Focas.Wire`](https://github.com/Ladder99/focas-mock/tree/main/dotnet/Focas.Wire)
client. Integration tests run the managed driver end-to-end against the
vendored `focas-mock` Python server (at
[`tests/.../Docker/focas-mock/`](../../tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/focas-mock/VENDORED.md))
whose native FOCAS Ethernet responder is verified PDU-by-PDU against the
real `fwlibe64.dll`.
## What the fixture is
No shim DLL, no P/Invoke, no licensed binary — any dev box or CI runner
with Docker can run the full fixture end-to-end.
Nothing at the integration layer.
`tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests/` is unit-only. The driver ships
as Tier C (process-isolated) per `docs/v2/driver-stability.md` because the
FANUC DLL has known crash modes; tests can't replicate those in-process.
Hardware validation against a real CNC is still useful to catch
series-specific firmware quirks (see [§ Hardware-only gaps](#hardware-only-gaps))
but the mock's wire responder covers every FOCAS call OtOpcUa issues.
## What it actually covers (unit only)
## What the fixture covers
- `FocasCapabilityTests` — data-type mapping (PMC bit / word / float,
macro variable types, parameter types)
- `FocasCapabilityMatrixTests` — per-CNC-series range validation (macro
/ parameter / PMC letter + number) across 16i / 0i-D / 0i-F /
30i / PowerMotion. See [`docs/v2/focas-version-matrix.md`](../v2/focas-version-matrix.md)
for the authoritative matrix. 46 theory cases lock every documented
range boundary — widening a range without updating the doc fails a
test.
- `FocasReadWriteTests` — read + write against the fake, FOCAS native status
→ OPC UA StatusCode mapping
### Unit layer (no container required)
`tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests/` uses `FakeFocasClient`
injected via `IFocasClientFactory`:
- `FocasCapabilityTests` — data-type mapping (PMC bit / byte / word /
long / float / double, macro variable types, parameter types)
- `FocasCapabilityMatrixTests` — per-CNC-series range validation across
16i / 0i-D / 0i-F / 30i / Power Motion, 46 theory cases locking every
documented range boundary. See
[`docs/v2/focas-version-matrix.md`](../v2/focas-version-matrix.md).
- `FocasReadWriteTests` — read / write contract against the fake, FOCAS
native status → OPC UA `StatusCode` mapping
- `FocasScaffoldingTests``IDriver` lifecycle + multi-device routing
- `FocasPmcBitRmwTests` — PMC bit read-modify-write synchronization (per-byte
`SemaphoreSlim`, mirrors the AB / Modbus pattern from #181)
- `FwlibNativeHelperTests``Focas32.dll``Fwlib32.dll` bridge validation
+ P/Invoke signature validation
- `FocasPmcBitRmwTests` — PMC bit read-modify-write synchronisation
- `FocasAlarmProjectionTests` — raise / clear diffing, severity mapping
- `FocasHandleRecycleTests` — proactive session recycle cadence
Capability surfaces whose contract is verified: `IDriver`, `IReadable`,
`IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`,
`IPerCallHostResolver`.
`ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`,
`IPerCallHostResolver`, `IAlarmSource`. `IWritable` intentionally
returns `BadNotWritable` — OtOpcUa is read-only against FOCAS.
Pre-flight validation runs in `FocasDriver.InitializeAsync` — configs
referencing out-of-range addresses fail at load time with a diagnostic
message naming the CNC series + documented limit. This closes the
cheap half of the hardware-free stability gap; Tier-C process
isolation (task #220) closes the expensive half — see
[`docs/v2/implementation/focas-isolation-plan.md`](../v2/implementation/focas-isolation-plan.md).
message naming the CNC series + documented limit.
## What it does NOT cover
### Integration layer (mock only, no CNC, no shim)
### 1. FOCAS wire traffic
`tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/` drives the
managed `FocasDriver` end-to-end. A single gate:
No FOCAS TCP frame is sent. `Fwlib32.dll`'s TCP-to-FANUC-gateway exchange is
closed-source; the driver trusts the P/Invoke layer per #193. Real CNC
correctness is trusted from field deployments.
**Docker compose up** — tests skip when the TCP probe to
`localhost:8193` fails with a pointer to the compose command.
### 2. Alarm / parameter-change callbacks
When the mock is up, `WireFocasClient` dials it over TCP exactly like a
real CNC, and the mock's native FOCAS Ethernet responder replies with
binary PDUs against the documented command IDs. Covered assertions:
FOCAS has no push model — the driver polls via the shared `PollGroupEngine`.
There are no CNC-initiated callbacks to test; the absence is by design.
- Session open / close (`cnc_allclibhndl3` + `cnc_freelibhndl`)
- Parameter read-back after `mock_patch` seed → `cnc_rdparam`
- Macro read-back after seed → `cnc_rdmacro` (scaled-decimal
translation verified)
- PMC range read after seed → `pmc_rdpmcrng`
- `IAlarmSource` raise + clear transitions after `mock_patch`
alarm-list changes → `cnc_rdalmmsg2`
- Fixed-tree bootstrap: identity / axes / spindle / program / timers /
servo meters populate via `cnc_sysinfo`, `cnc_rdaxisname`,
`cnc_rdspdlname`, `cnc_rddynamic2`, `cnc_exeprgname2`,
`cnc_rdblkcount`, `cnc_rdopmode`, `cnc_rdsvmeter`, `cnc_rdspload`,
`cnc_rdspmaxrpm`, `cnc_rdtimer`
- Per-series profile selection via `mock_load_profile` — tests can
pin one profile and assert series-gated capability suppression
### 3. Macro / ladder variable types
### E2E script (CLI)
FANUC has CNC-specific extensions (macro variables `#100-#999`, system
variables `#1000-#5000`, PMC timers / counters / keep-relays) whose
per-address semantics differ across 0i-F / 30i / 31i / 32i Series. Driver
covers the common address shapes; per-model quirks are not stressed.
`scripts/e2e/test-focas.ps1` drives the Client.CLI against a running
OtOpcUa server. Accepts:
### 4. Model-specific behavior
- `-CncHost` / `-CncPort` for real hardware
- `-ProfileName <compose-profile>` for the Docker mock
- `-Series <csv>` for per-series matrix mode
- `-HandleLeakCycles <N>` for handle-leak stress
- Alarm retention across power cycles (model-specific CNC behavior)
- Parameter range enforcement (CNC rejects out-of-range writes)
- MTB (machine tool builder) custom screens that expose non-standard data
## Hardware-only gaps
### 5. Tier-C process isolation — architecture shipped, Fwlib32 integration hardware-gated
The mock has parity with the real `fwlibe64.dll` for the calls OtOpcUa
issues, but a real CNC can still surface things a reference
implementation can't:
The Tier-C architecture is now in place as of PRs #169#173 (FOCAS
PR AE, task #220):
1. **Series-specific firmware quirks** — alarm retention across power
cycles, parameter range enforcement by the CNC (not the driver),
MTB custom screens, series-specific option bits. Each series has
documented behaviours that only a bench CNC exercises.
2. **Wire-level stress** — burst reads, concurrent device writes,
network-partition recovery under load. The mock handles these
correctly but production behaviour is the source of truth.
3. **Transient operational states** — alarm floods, emergency-stop
transitions, power-on resync. These are easy to stub but hard to
cover comprehensively in the mock.
- `Driver.FOCAS.Shared` carries MessagePack IPC contracts
- `Driver.FOCAS.Host` (.NET 4.8 x86 Windows service via NSSM) accepts
a connection on a strictly-ACL'd named pipe + dispatches frames to
an `IFocasBackend`
- `Driver.FOCAS.Ipc.IpcFocasClient` implements the `IFocasClient` DI
seam by forwarding over IPC — swap the DI registration and the
driver runs Tier-C with zero other changes
- `Driver.FOCAS.Supervisor.FocasHostSupervisor` owns the spawn +
heartbeat + respawn + 3-in-5min crash-loop breaker + sticky alert
- `Driver.FOCAS.Host.Stability.PostMortemMmf`
`Driver.FOCAS.Supervisor.PostMortemReader` — ring-buffer of the
last ~1000 IPC operations survives a Host crash
Track the close-out under task #54 (live-CNC smoke). When the rig
lands, the hardware path runs alongside the mock path; the mock
stays as the CI quality gate.
The one remaining gap is the production `FwlibHostedBackend`: an
`IFocasBackend` implementation that wraps the licensed
`Fwlib32.dll` P/Invoke. That's hardware-gated on task #222 — we
need a CNC on the bench (or the licensed FANUC developer kit DLL
with a test harness) to validate it. Until then, the Host ships
`FakeFocasBackend` + `UnconfiguredFocasBackend`. Setting
`OTOPCUA_FOCAS_BACKEND=fake` lets operators smoke-test the whole
Tier-C pipeline end-to-end without any CNC.
## When to trust each layer
## When to trust FOCAS tests, when to reach for a rig
| Question | Unit | Integration (mock) | Real CNC |
| --- | :---: | :---: | :---: |
| "Does PMC address `R100.3` route to the right bit?" | ✅ | ✅ | ✅ |
| "Does the Fanuc status → OPC UA StatusCode map cover every documented code?" | ✅ (contract) | ✅ | ✅ |
| "Does `FocasDriver.ReadAsync` correctly decode a seeded parameter?" | no | ✅ | ✅ |
| "Does `IAlarmSource` fire raise + clear events?" | ✅ (Fake) | ✅ (wire) | ✅ |
| "Does a real read against a 30i Series return correct bytes?" | no | ✅ (via profile) | ✅ (required) |
| "Do series-specific firmware quirks behave as documented?" | no | no | ✅ (required) |
| "Does the driver survive real network partitions?" | no | partial (socket kill) | ✅ (required) |
| Question | Unit tests | Real CNC |
| --- | --- | --- |
| "Does PMC address `R100.3` route to the right bit?" | yes | yes |
| "Does the FANUC status → OPC UA StatusCode map cover every documented code?" | yes (contract) | yes |
| "Does a real read against a 30i Series return correct bytes?" | no | yes (required) |
| "Does `Fwlib32.dll` crash on concurrent reads?" | no | yes (stress) |
| "Do macro variables round-trip across power cycles?" | no | yes (required) |
## Running the integration fixture
## Follow-up candidates
```powershell
# 1) Start the mock on a chosen profile.
docker compose -f tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/docker-compose.yml up -d
1. **Nothing public** — Fanuc's FOCAS Developer Kit ships an emulator DLL
but it's under NDA + tied to licensed dev-kit installations; can't
redistribute for CI.
2. **Lab rig** — used FANUC 0i-F simulator controller (or a retired machine
tool) on a dedicated network; only path that covers real CNC behavior.
3. **Process isolation first** — before trusting FOCAS in production at
scale, shipping the Tier-C out-of-process Host architecture (similar to
Galaxy) is higher value than a CI simulator.
# 2) Run the tests. No shim build, no DLL copy — the driver dials the mock directly.
dotnet test tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/
```
Or use `scripts/integration/run-focas.ps1` which wraps compose up / test
/ compose down and accepts `-Profile <name>` to pin a per-series run.
## Key fixture / config files
- `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/focas-mock/`
— vendored `focas-mock` Python source + Dockerfile
- `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/docker-compose.yml`
— per-series compose profiles
- `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/FocasSimFixture.cs`
— collection fixture + mock admin API client
- `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Series/FixedTreePopulatesTests.cs`
— fixed-tree end-to-end tests
- `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Series/WireBackendTests.cs`
— pure-wire-backend end-to-end tests
- `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests/FakeFocasClient.cs`
in-process fake implementing `IFocasClient`
- `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests/FocasCapabilityMatrixTests.cs`
— parameterized theories locking the per-series matrix
- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FocasDriver.cs` — ctor takes
`IFocasClientFactory`
in-process unit fake
- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/Wire/WireFocasClient.cs` — the
managed wire client backing production deployments
- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FocasCapabilityMatrix.cs`
per-CNC-series range validator (the matrix the doc describes)
per-series range validator
- `docs/v2/focas-version-matrix.md` — authoritative range reference
- `docs/v2/implementation/focas-isolation-plan.md` — Tier-C isolation
plan (task #220)
- `docs/v2/driver-stability.md` — Tier C scope + process-isolation rationale
+238
View File
@@ -0,0 +1,238 @@
# FOCAS Driver
Getting-started guide for the FANUC FOCAS2 driver. This is the short path — for
the exhaustive per-node mapping read [`docs/v2/driver-specs.md §7`](../v2/driver-specs.md),
for deployment details read [`docs/v2/focas-deployment.md`](../v2/focas-deployment.md),
for the test-harness map read [FOCAS-Test-Fixture.md](FOCAS-Test-Fixture.md).
## What it talks to
FANUC CNCs (0i-D / 0i-F / 0i-MF / 0i-TF / 16i / 30i / 31i / 32i / Power Motion i)
over the proprietary FOCAS2 protocol on TCP port 8193. The wire is spoken
directly by the pure-managed [`Focas.Wire`](https://github.com/Ladder99/focas-mock)
client — no Fwlib64.dll, no P/Invoke, no out-of-process isolation needed.
OtOpcUa is **read-only** against FOCAS; all reads go over the native wire
protocol using the documented command IDs. Writes return
`BadNotWritable` by design.
## Project split
| Project | Target | Role |
|---------|--------|------|
| `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/` | net10.0 | In-process driver — hosts `WireFocasClient` which speaks FOCAS2 over TCP directly |
Previous `Driver.FOCAS.Host` / `Driver.FOCAS.Shared` Tier-C split has been
retired — the managed wire client removes the native-crash blast radius
that justified the out-of-process service.
## Minimum deployment
Register the driver instance in the main server's `appsettings.json`. No
separate service, no DLL deployment, no shared-secret handshake:
```jsonc
"Drivers": {
"focas-cnc-1": {
"Type": "FOCAS",
"Config": {
"Backend": "wire",
"Devices": [
{ "HostAddress": "focas://10.20.30.40:8193", "Series": "ThirtyOne_i" }
],
"Tags": [
{ "Name": "Mode", "DeviceHostAddress": "focas://10.20.30.40:8193",
"Address": "PARAM:3402", "DataType": "Int32", "Writable": false },
{ "Name": "SpndLoad", "DeviceHostAddress": "focas://10.20.30.40:8193",
"Address": "MACRO:500", "DataType": "Float64", "Writable": false }
]
}
}
}
```
The main server opens two TCP sockets per configured device and speaks the
FOCAS2 binary protocol directly. No local privileged components, no
platform bitness constraint — the driver runs on every host OtOpcUa runs
on.
## Address forms
| Form | Example | Meaning |
|------|---------|---------|
| `X0.0` / `R100` / `R100.3` | PMC bit or byte | Letter + number; optional `.bit` for bit access |
| `PARAM:1815` / `PARAM:1815/0` | CNC parameter | Number + optional axis index |
| `MACRO:500` | Custom macro variable | System / user macro variable number |
Addresses are validated against the per-device `Series` at `InitializeAsync`
a config referencing a number outside the documented range for that series
fails at load time with an error message naming the limit. See
[`docs/v2/focas-version-matrix.md`](../v2/focas-version-matrix.md) for the
authoritative range table.
## Backend selection
The driver picks its client from `Config.Backend`:
| Value | Client | Use it for |
|-------|--------|------------|
| `wire` (default) | `WireFocasClient` | Production — pure-managed FOCAS2 over TCP |
| `unimplemented` / `none` / `stub` | `UnimplementedFocasClientFactory` | Scaffolding a DriverInstance row before the CNC endpoint is reachable |
Previous backends (`fwlib`, `fwlib32`, `ipc`) have been retired along
with `Driver.FOCAS.Host` and the Fwlib P/Invoke path. Configs that still
reference them will throw at startup with a message pointing here.
## Capability surface
| Capability | Wire path | Notes |
|------------|-----------|-------|
| `IReadable` | `ReadAsync``cnc_rdpmcrng` / `cnc_rdparam` / `cnc_rdmacro` | One TCP request/response per read; `Focas.Wire` serializes requests on socket 2 internally |
| `IWritable` | returns `BadNotWritable` | OtOpcUa is read-only against FOCAS by design — no `cnc_wrparam` / `pmc_wrpmcrng` / `cnc_wrmacro` path is implemented |
| `ITagDiscovery` | `DiscoverAsync` | Emits `FOCAS/{device}/{tag}` folders per configured device |
| `ISubscribable` | polled via shared `PollGroupEngine` | FOCAS has no push model — subscriptions turn into per-tag polling groups |
| `IHostConnectivityProbe` | periodic `cnc_rdcncstat` | Probe cadence is `Probe.Interval`; transitions fire `OnHostStatusChanged` |
| `IPerCallHostResolver` | lookup in `_tagsByName` | Each call routes to the device of the referenced tag |
| `IAlarmSource` | polled `cnc_rdalmmsg2` via `FocasAlarmProjection` | Opt-in — set `AlarmProjection.Enabled=true`; diffs `(AlarmNumber, Type)` between ticks |
Ack is a no-op — FANUC clears alarms on its own once the underlying condition
resolves, so `AcknowledgeAsync` swallows the batch rather than surfacing
`BadNotSupported`.
## Fixed node tree
Enable a pre-defined hierarchy of CNC nodes populated automatically from
`cnc_sysinfo` + `cnc_rdaxisname` + `cnc_rddynamic2` + related FWLIB calls,
so operators get an out-of-the-box view of identity / axes / program /
timers without declaring per-address tags.
```jsonc
"Config": {
"Devices": [ ... ],
"Tags": [ ... ],
"FixedTree": {
"Enabled": true,
"PollInterval": "00:00:00.250", // fast — per-axis dynamic reads
"ProgramPollInterval": "00:00:01", // medium — program + mode changes
"TimerPollInterval": "00:00:30" // slow — cumulative counters
}
}
```
What gets populated (all under `FOCAS/{deviceHostAddress}/`):
| Subtree | Nodes | Source call |
|---------|-------|-------------|
| `Identity/` | `SeriesNumber`, `Version`, `MaxAxes`, `CncType`, `MtType`, `AxisCount` | `cnc_sysinfo` once at bootstrap |
| `Axes/{name}/` | `AbsolutePosition`, `MachinePosition`, `RelativePosition`, `DistanceToGo`, `ServoLoad` — one folder per discovered axis | `cnc_rdaxisname` once + `cnc_rddynamic2` + `cnc_rdsvmeter` per tick |
| `Axes/FeedRate/Actual`, `Axes/SpindleSpeed/Actual` | Current feed + spindle RPM | `cnc_rddynamic2` |
| `Spindle/{name}/` | `Load` (percentage), `MaxRpm` — one folder per discovered spindle | `cnc_rdspdlname` once + `cnc_rdspload` + `cnc_rdspmaxrpm` |
| `Program/` | `Name` (filename), `ONumber`, `Number`, `MainNumber`, `Sequence`, `BlockCount` | `cnc_exeprgname2` + `cnc_rdblkcount` + cached `cnc_rddynamic2` |
| `OperationMode/` | `Mode` (int), `ModeText` ("AUTO", "MDI", "EDIT", …) | `cnc_rdopmode` |
| `Timers/` | `PowerOnSeconds`, `OperatingSeconds`, `CuttingSeconds`, `CycleSeconds` | `cnc_rdtimer` × 4 |
### Per-series node suppression
The driver probes each optional call once at bootstrap. If the target CNC
returns `EW_FUNC` / `EW_NOOPT` / `EW_VERSION` on the wire, the
corresponding subtree is **not emitted** — the operator doesn't see nodes
that will only ever return `BadDeviceFailure`. Capability suppression
covers `Spindle/`, `Program/` + `OperationMode/`, `Timers/`, and
per-axis `ServoLoad` independently. Identity + `Axes/*` position reads
(which every Fanuc CNC supports) are always emitted.
Position values are scaled integers (matching FOCAS's convention). The
managed side exposes them as `Float64` OPC UA nodes; a future
`cnc_getfigure` integration will add per-axis decimal scaling. Until
then, treat the raw integer as the value the CNC reports and scale on
the client side if decimal precision matters.
**Still user-authored**: `PARAM:6711`, `MACRO:500`, `R100` etc. — specific
numbers whose meaning is MTB-specific. Those go under the device folder
alongside the fixed subtree.
## Alarm projection
Alarm surfacing is **disabled by default** because the polling cost is wasted
on sites that don't consume CNC alarms. Opt in per driver instance:
```jsonc
"Config": {
"Devices": [ ... ],
"Tags": [ ... ],
"AlarmProjection": {
"Enabled": true,
"PollInterval": "00:00:02"
}
}
```
Every alarm transition fires `OnAlarmEvent` with:
- `SourceNodeId` = the device host address (FOCAS has no per-node alarm model;
the CNC exposes a single flat active-alarm list per session)
- `ConditionId` = `"{host}#{Type}:{AlarmNumber}"`
- `AlarmType` = projected from FANUC's `ALM_TYPE_*` (e.g. `Overtravel`, `Servo`,
`Parameter`, `MacroAlarm`)
- `Severity` = Overtravel / Servo / PulseCode → `Critical`; Parameter / Macro
`Medium`; everything else → `High`
Cleared alarms fire a second event with `" (cleared)"` appended to the message
so downstream consumers can ignore the clear if they only care about raises.
## Handle recycling
FANUC CNCs have a finite FWLIB handle pool (~510 concurrent connections) and
certain series have documented handle-leak bugs that manifest after long uptime.
The driver can proactively close + reopen each device's session on a cadence to
release its slot back to the pool:
```jsonc
"Config": {
"Devices": [ ... ],
"HandleRecycle": {
"Enabled": true,
"Interval": "01:00:00"
}
}
```
Disabled by default — a healthy CNC + driver doesn't need it. Enable when field
experience shows handle exhaustion. Typical tuning: 30 min for sites running
multiple OtOpcUa instances against the same CNC (they share the pool); 6 h for a
single-client deployment. Reads / writes during recycle simply wait for the
reconnect rather than failing — worst case, an operator sees a brief read
latency spike once per cadence.
## Testing
- **Unit tests** — `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests/` cover the
driver surface via `FakeFocasClient`. Includes the alarm-projection raise /
clear diffing tests.
- **Integration tests** — `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/`
hold the Docker simulator scaffold (Stream B / C of the simulator plan —
`docs/v2/implementation/focas-simulator-plan.md`).
- **E2E script** — `scripts/e2e/test-focas.ps1` stages Host + Proxy + a real
CNC (or the simulator) and exercises connect → read → write → subscribe
round-trips. See [`docs/drivers/FOCAS-Test-Fixture.md`](FOCAS-Test-Fixture.md)
for the coverage map.
## Troubleshooting
| Symptom | Likely cause | Fix |
|---------|--------------|-----|
| `BadCommunicationError` on every read | CNC unreachable on TCP:8193 | Check firewall / LAN reachability; FOCAS Ethernet option must be licensed on the CNC side |
| Every read returns `BadNotWritable` on writes | Expected — OtOpcUa is read-only against FOCAS | If you actually need writes, open a feature request — the driver's managed wire client doesn't expose the write commands |
| `BadOutOfRange` on reads for a macro/parameter | Config address outside the declared `Series` range | Check `docs/v2/focas-version-matrix.md` — either fix the address or widen the `Series` |
| Alarm events never fire | `AlarmProjection.Enabled` left at default (false) | Set it to `true` in the driver config |
## Further reading
- [`docs/v2/driver-specs.md §7`](../v2/driver-specs.md) — full OPC UA node
mapping, pre-defined tag set, per-API notes
- [`docs/v2/focas-version-matrix.md`](../v2/focas-version-matrix.md) —
per-series macro / parameter / PMC range table
- [`docs/v2/implementation/focas-wire-protocol.md`](../v2/implementation/focas-wire-protocol.md)
— captured FOCAS2 wire semantics (magic prefix, handshake, command-id table)
- [upstream `Focas.Wire`](https://github.com/Ladder99/focas-mock/tree/main/dotnet/Focas.Wire)
— the managed client implementation OtOpcUa consumes as a NuGet dependency
+6 -5
View File
@@ -35,13 +35,14 @@ Multi-project test topology:
## How tests skip
- **E2E parity**: `ParityFixture.SkipIfUnavailable()` runs at class init and
checks Windows-only, non-admin user, ZB SQL reachable on
`localhost:1433`, Host EXE built in the expected `bin/` folder. Any miss
→ tests skip.
checks Windows-only, ZB SQL reachable on `localhost:1433`, Host EXE built
in the expected `bin/` folder. Any miss → tests skip.
- **Live-smoke** (`GalaxyRepositoryLiveSmokeTests`): `Assert.Skip` when ZB
unreachable. A `per project_galaxy_host_installed` memory on this repo's
dev box notes the MXAccess runtime is installed + pipe ACL denies Admins,
so live tests must run from a non-elevated shell.
dev box notes the MXAccess runtime is installed. The pipe ACL allows the
configured SID outright; elevation of the caller doesn't matter because
the per-connection SID check in `PipeServer.VerifyCaller` only compares
user SIDs (not group membership or integrity level).
- **Unit** tests (Shared, Proxy contract, most Host.Tests) have no skip —
they run anywhere.
+1 -1
View File
@@ -31,7 +31,7 @@ The same Tier-C isolation story applies to FOCAS (decision record in `docs/v2/pl
- Pipe name: `otopcua-galaxy-{DriverInstanceId}` (localhost-only, no TCP surface)
- Wire format: MessagePack-CSharp, length-prefixed frames
- ACL: pipe is created with a DACL that grants only the Server's service identity; the Admins group is explicitly denied so a live-smoke test running from an elevated shell fails fast rather than silently bypassing the handshake
- ACL: pipe is created with a DACL that grants `ReadWrite | Synchronize` only to the configured Server service-principal SID + denies `LocalSystem`. The per-connection SID check in `PipeServer.VerifyCaller` is the real authorization boundary — any caller whose impersonated token SID doesn't match the allowed SID is dropped before the first frame is read.
- Handshake: Proxy presents a shared secret at `OpenSessionRequest`; Host rejects anything else with `MessageKind.OpenSessionResponse{Success=false}`
- Heartbeat: Proxy sends a periodic ping; missed heartbeats trigger the Proxy-side crash-loop supervisor to restart the Host
+4 -1
View File
@@ -26,7 +26,7 @@ Driver type metadata is registered at startup in `DriverTypeRegistry` (`src/ZB.M
| AB CIP | `Driver.AbCip` | A | libplctag CIP | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver, IAlarmSource | ControlLogix / CompactLogix. Tag discovery uses the `@tags` walker to enumerate controller-scoped + program-scoped symbols; UDT member resolution via the UDT template reader |
| AB Legacy | `Driver.AbLegacy` | A | libplctag PCCC | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver | SLC 500 / MicroLogix. File-based addressing (`N7:0`, `F8:0`) — no symbol table, tag list is user-authored in the config DB |
| TwinCAT | `Driver.TwinCAT` | B | Beckhoff `TwinCAT.Ads` (`TcAdsClient`) | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver | The only native-notification driver outside Galaxy — ADS delivers `ValueChangedCallback` events the driver forwards straight to `ISubscribable.OnDataChange` without polling. Symbol tree uploaded via `SymbolLoaderFactory` |
| FOCAS | `Driver.FOCAS` | C | FANUC FOCAS2 (`Fwlib32.dll` P/Invoke) | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver | Tier C — FOCAS DLL has crash modes that warrant process isolation. CNC-shaped data model (axes, spindle, PMC, macros, alarms) not a flat tag map |
| [FOCAS](FOCAS.md) | `Driver.FOCAS` | A | Pure-managed `FocasWireClient` FOCAS/2 Ethernet binary protocol on TCP:8193, inlined into the driver assembly | IDriver, ITagDiscovery, IReadable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver, IAlarmSource | Read-only by design (WriteAsync returns `BadNotWritable`). CNC-shaped data model (axes, spindle, PMC, macros, alarms) not a flat tag map. Previously Tier-C (Host + P/Invoke + shim DLL); retired in the 2026-04-24 migration when the managed wire client landed |
| OPC UA Client | `Driver.OpcUaClient` | B | OPCFoundation `Opc.Ua.Client` | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IAlarmSource, IHistoryProvider, IHostConnectivityProbe | Gateway/aggregation driver. Opens a single `Session` against a remote OPC UA server and re-exposes its address space. Owns its own `ApplicationConfiguration` (distinct from `Client.Shared`) because it's always-on with keep-alive + `TransferSubscriptions` across SDK reconnect, not an interactive CLI |
## Per-driver documentation
@@ -35,6 +35,9 @@ Driver type metadata is registered at startup in `DriverTypeRegistry` (`src/ZB.M
- [Galaxy.md](Galaxy.md) — COM bridge, STA pump, IPC, runtime probes
- [Galaxy-Repository.md](Galaxy-Repository.md) — ZB SQL reader, `LocalPlatform` scope filter, change detection
- **FOCAS** has a short getting-started doc because the Tier-C two-project deployment + backend-selection env var + alarm projection opt-in all need explaining up front:
- [FOCAS.md](FOCAS.md) — deployment, config, capability surface, alarm projection, troubleshooting
- **All other drivers** share a single per-driver specification in [docs/v2/driver-specs.md](../v2/driver-specs.md) — addressing, data-type maps, connection settings, and quirks live there. That file is the authoritative per-driver reference; this index points at it rather than duplicating.
## Test-fixture coverage maps
+122 -94
View File
@@ -2,60 +2,85 @@
Coverage map + gap inventory for the Beckhoff TwinCAT ADS driver.
**TL;DR:** Integration-test scaffolding lives at
`tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/` (task #221).
`TwinCATXarFixture` probes TCP 48898 on an operator-supplied VM; three
smoke tests (read / write / native notification) run end-to-end through
the real ADS stack when the VM is reachable, skip cleanly otherwise.
**Remaining operational work**: stand up a TwinCAT 3 XAR runtime in a
Hyper-V VM, author the `.tsproj` project documented at
`TwinCatProject/README.md`, rotate the 7-day trial license (or buy a
paid runtime). Unit tests via `FakeTwinCATClient` still carry the
exhaustive contract coverage.
**TL;DR:** Integration-test suite lives at
`tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/`. `TwinCATXarFixture`
probes TCP 48898 on an operator-supplied runtime; the suite runs **14
`[TwinCATFact]` methods + one 16-case `[TwinCATTheory]` = 30 test cases** end-to-end
through the real ADS stack when the runtime is reachable, skips cleanly
otherwise. The runtime can be a Hyper-V XAR VM or a TCBSD VM
(`TwinCatProject/README.md` covers both). Unit tests via `FakeTwinCATClient`
still carry the exhaustive contract coverage alongside.
TwinCAT is the only driver outside Galaxy that uses **native
notifications** (no polling) for `ISubscribable`, and the fake exposes a
fire-event harness so notification routing is contract-tested rigorously
at the unit layer.
TwinCAT is the only driver outside Galaxy that uses **native notifications**
(no polling) for `ISubscribable`. The integration suite verifies that path on
the wire; the fake exposes a fire-event harness so notification routing is
also contract-tested rigorously at the unit layer.
## What the fixture is
**Integration layer** (task #221, scaffolded):
`tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/`
`TwinCATXarFixture` TCP-probes ADS port 48898 on the host specified by
`TWINCAT_TARGET_HOST` + requires `TWINCAT_TARGET_NETID` (AmsNetId of the
VM). No fixture-owned lifecycle — XAR can't run in Docker because it
bypasses the Windows kernel scheduler, so the VM stays
operator-managed. `TwinCatProject/README.md` documents the required
`.tsproj` project state; the file itself lands once the XAR VM is up +
the project is authored. Three smoke tests:
`Driver_reads_seeded_DINT_through_real_ADS`,
`Driver_write_then_read_round_trip_on_scratch_REAL`, and
`Driver_subscribe_receives_native_ADS_notifications_on_counter_changes`
— all skip cleanly via `[TwinCATFact]` when the runtime isn't
reachable.
**Integration layer**: `tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/`
`TwinCATXarFixture` TCP-probes ADS port 48898 on the host supplied by
`TWINCAT_TARGET_HOST` (defaults to `localhost`) + requires
`TWINCAT_TARGET_NETID` (AmsNetId of the runtime). Optionally takes
`TWINCAT_TARGET_PORT` (default `851` = TC3 PLC runtime 1). No fixture-owned
lifecycle — XAR / TCBSD can't run in Docker because they bypass the host
kernel scheduler, so the runtime stays operator-managed.
`TwinCatProject/README.md` documents the required project state; the tests
gate on `[TwinCATFact]` / `[TwinCATTheory]` and skip cleanly when
`TWINCAT_TARGET_NETID` is unset or the probe fails.
**Unit layer**: `tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Tests/` is
still the primary coverage. `FakeTwinCATClient` also fakes the
`AddDeviceNotification` flow so tests can trigger callbacks without a
running runtime.
**Unit layer**: `tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Tests/` remains the
primary contract coverage. `FakeTwinCATClient` fakes the
`AddDeviceNotification` flow so tests can trigger callbacks without a running
runtime.
## What it actually covers
### Integration (XAR VM, task #221 — code scaffolded, needs VM + project)
### Integration (live runtime)
- `TwinCAT3SmokeTests.Driver_reads_seeded_DINT_through_real_ADS` — real AMS
handshake + ADS read of `GVL_Fixture.nCounter` (seeded at 1234, MAIN
increments each cycle)
- `TwinCAT3SmokeTests.Driver_write_then_read_round_trip_on_scratch_REAL`
real ADS write + read on `GVL_Fixture.rSetpoint`
- `TwinCAT3SmokeTests.Driver_subscribe_receives_native_ADS_notifications_on_counter_changes`
— real `AddDeviceNotification` against the cycle-incrementing counter;
observes `OnDataChange` firing within 3 s of subscribe
Every capability the driver implements is exercised on the wire:
All three gated on `TWINCAT_TARGET_HOST` + `TWINCAT_TARGET_NETID` env
vars; skip cleanly via `[TwinCATFact]` when the VM isn't reachable or
vars are unset.
- **Read** — `Driver_reads_seeded_DINT_through_real_ADS` (AMS handshake +
symbolic read of `GVL_Fixture.nCounter`)
- **Write + read round-trip** — `Driver_write_then_read_round_trip_on_scratch_REAL`
on `GVL_Fixture.rSetpoint`
- **Array element round-trip** — `Driver_round_trips_array_element_write_and_read`
on `GVL_Arrays.aReal1D[5]` (exercises `TwinCATSymbolPath` subscript
rendering)
- **Subscribe (native ADS notifications)** —
`Driver_subscribe_receives_native_ADS_notifications_on_counter_changes`;
observes `OnDataChange` firing within 10 s of subscribe
- **Symbol browse (direct client path)** —
`Driver_browses_committed_symbol_hierarchy_via_real_ADS` via
`ITwinCATClient.BrowseSymbolsAsync`
- **Symbol browse (through DiscoverAsync + `IAddressSpaceBuilder` pipeline)** —
`DiscoverAsync_renders_declared_tags_and_controller_browse_hits_address_space_builder`
verifies the real `TwinCAT/ → device/ → Discovered/` folder tree
- **Auto-reconnect** — `Driver_auto_reconnects_after_underlying_client_is_disposed`
disposes the `AdsClient` mid-flight; next read must re-establish
- **Primitive type coverage** — `Driver_reads_every_primitive_type_with_correct_mapping`
runs as a `[Theory]` against the 16 primitives in `GVL_Primitives`
(Bool, SInt, USInt, Int, UInt, DInt, UDInt, LInt, ULInt, Real, LReal,
String, Time, TimeOfDay, Date, DateTime) — asserts status + CLR type +
seed value where ergonomic
- **Bit-indexed BOOL** — `Driver_reads_bit_indexed_BOOL_from_word` against
`GVL_Primitives.vWord.3` + `.4` (bits of `0xBEEF`)
- **Nested UDT navigation** — `Driver_reads_deeply_nested_UDT_path` reads
`GVL_Plant.Line1.Stations[1].Axes[1].Motor.Temperature` (LREAL) + `.Running` (BOOL)
- **Multi-device routing + isolation** —
`Driver_routes_reads_per_device_and_isolates_unreachable_peers` pairs the
real runtime with a bogus AmsNetId; healthy device reads still succeed
- **Probe loop + `IHostConnectivityProbe`** —
`Probe_loop_raises_host_status_transition_to_Running_on_reachable_target`
asserts `OnHostStatusChanged → Running` and snapshot parity
- **Negative error mappings** —
`Driver_reports_errors_for_unknown_tag_and_nonexistent_symbol_and_readonly_write`
covers `BadNodeIdUnknown`, ghost-symbol communication errors, and the
`BadNotWritable` short-circuit
All tests gate on `TWINCAT_TARGET_NETID` (required) via `[TwinCATFact]` /
`[TwinCATTheory]`; `TWINCAT_TARGET_HOST` (default `localhost`) and
`TWINCAT_TARGET_PORT` (default `851`) are optional overrides.
### Unit
@@ -65,54 +90,69 @@ vars are unset.
- `TwinCATReadWriteTests` — read + write through the fake, status mapping
- `TwinCATSymbolPathTests` — symbol-path routing for nested struct members
- `TwinCATSymbolBrowserTests``ITagDiscovery.DiscoverAsync` via
`ReadSymbolsAsync` (#188) + system-symbol filtering
- `TwinCATNativeNotificationTests``AddDeviceNotification` (#189)
registration, callback-delivery-to-`OnDataChange` wiring, unregister on
unsubscribe
`BrowseSymbolsAsync` + system-symbol filtering
- `TwinCATNativeNotificationTests``AddDeviceNotification` registration,
callback-delivery-to-`OnDataChange` wiring, unregister on unsubscribe
- `TwinCATDriverTests``IDriver` lifecycle
Capability surfaces whose contract is verified: `IDriver`, `IReadable`,
`IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`,
`IPerCallHostResolver`.
Capability surfaces whose contract is verified at the unit layer: `IDriver`,
`IReadable`, `IWritable`, `ITagDiscovery`, `ISubscribable`,
`IHostConnectivityProbe`, `IPerCallHostResolver`. The integration suite now
verifies `ITagDiscovery` + `IHostConnectivityProbe` on the wire as well.
## Bugs caught by live runs
The integration suite surfaced three driver defects that `FakeTwinCATClient`
couldn't, since each lived below the abstraction boundary:
1. **Notification cycle time unit**`NotificationSettings(cycleTime, maxDelay)`
takes **milliseconds** per Beckhoff InfoSys
(`tcadsnetref/7313319051`), but the driver was multiplying by `10_000`
under a "100 ns units" assumption. A requested 250 ms cycle was being set
to ~41 minutes — subscribe never fired. Fix in `AdsTwinCATClient.AddNotificationAsync`.
2. **`STRING(N)` / `WSTRING(N)` type mapper** — `MapSymbolTypeName` only
matched bare `"STRING"` / `"WSTRING"`, so sized strings (the common case)
fell off `BrowseSymbolsAsync` entirely. Fix: strip the `(…)` bound before
the switch.
3. **Bit-indexed BOOL path** — driver was sending `"GVL.vWord.3"` to ADS as
a BOOL read. TwinCAT's symbol table doesn't expose bit-access paths; the
read returned `DeviceSymbolNotFound`. Fix: strip the `.N` suffix, read
the parent word as `uint`, extract the bit locally via `ExtractBit`.
All three paths are now pinned by live-wire tests.
## What it does NOT cover
### 1. AMS / ADS wire traffic
### 1. AMS / ADS wire framing
No real AMS router frame is exchanged. Beckhoff's `TwinCAT.Ads` NuGet (their
own .NET SDK, not libplctag-style OSS) has no in-process fake; tests stub
the `ITwinCATClient` abstraction above it.
No raw AMS packet is inspected. Beckhoff's `TwinCAT.Ads` NuGet (their own
.NET SDK, not libplctag-style OSS) has no in-process fake at the frame
level; tests run against a real router.
### 2. Multi-route AMS
ADS supports chained routes (`<localNetId> → <routerNetId> → <targetNetId>`)
for PLCs behind an EC master / IPC gateway. Parse coverage exists; wire-path
coverage doesn't.
coverage is single-hop only.
### 3. Notification reliability under jitter
### 3. Notification coalescing under jitter
`AddDeviceNotification` delivers at the runtime's cycle boundary; under high
CPU load or network jitter real notifications can coalesce. The fake fires
one callback per test invocation — real callback-coalescing behavior is
untested.
`AddDeviceNotification` delivers at the runtime's cycle boundary; under
sustained CPU load or network jitter real notifications can coalesce. The
live test only asserts at-least-one delivery within a generous window —
coalescing behavior under stress isn't verified.
### 4. TC2 vs TC3 variant handling
TwinCAT 2 (ADS v1) and TwinCAT 3 (ADS v2) have subtly different
`GetSymbolInfoByName` semantics + symbol-table layouts. Driver targets TC3;
TC2 compatibility is not exercised.
`GetSymbolInfoByName` semantics + symbol-table layouts. Driver + tests target
TC3; TC2 compatibility is not exercised.
### 5. Cycle-time alignment for `ISubscribable`
### 5. Alarms / history
Native ADS notifications fire on the PLC cycle boundary. The fake test
harness assumes notifications fire on a timer the test controls;
cycle-aligned firing under real PLC control is not verified.
### 6. Alarms / history
Driver doesn't implement `IAlarmSource` or `IHistoryProvider` — not in
scope for this driver family. TwinCAT 3's TcEventLogger could theoretically
back an `IAlarmSource`, but shipping that is a separate feature.
Driver doesn't implement `IAlarmSource` or `IHistoryProvider` — not in scope
for this driver family. TwinCAT 3's TcEventLogger could theoretically back
an `IAlarmSource`, but shipping that is a separate feature.
## When to trust TwinCAT tests, when to reach for a rig
@@ -122,37 +162,25 @@ back an `IAlarmSource`, but shipping that is a separate feature.
| "Does notification → `OnDataChange` wire correctly?" | yes (contract) | yes |
| "Does symbol browsing filter TwinCAT internals?" | yes | yes |
| "Does a real ADS read return correct bytes?" | no | yes (required) |
| "Do notifications coalesce under load?" | no | yes (required) |
| "Does auto-reconnect work on router restart?" | no (contract only) | yes (required) |
| "Do notifications coalesce under sustained load?" | no | yes (required) |
| "Does a TC2 PLC work the same as TC3?" | no | yes (required) |
## Follow-up candidates
1. **XAR VM live-population** — scaffolding is in place (this PR); the
remaining work is operational: stand up the Hyper-V VM, install XAR,
author the `.tsproj` per `TwinCatProject/README.md`, configure the
bilateral ADS route, set `TWINCAT_TARGET_HOST` + `TWINCAT_TARGET_NETID`
on the dev box. Then the three smoke tests transition skip → pass.
Tracked as #221.
2. **License-rotation automation** — XAR's 7-day trial expires on
schedule. Either automate `TcActivate.exe /reactivate` via a
scheduled task on the VM (not officially supported; reportedly works
for some TC3 builds), or buy a paid runtime license (~$1k one-time
per runtime per CPU) to kill the rotation. The doc at
`TwinCatProject/README.md` §License rotation walks through both.
3. **Lab rig** — cheapest IPC (CX7000 / CX9020) on a dedicated network;
the only route that covers TC2 + real EtherCAT I/O timing + cycle
jitter under CPU load.
Deferred to v3 — see [`docs/v3/twincat-backlog.md`](../v3/twincat-backlog.md).
Covers TC2 coverage, notification-coalescing-under-load, multi-hop AMS,
license-rotation automation, and a dedicated lab IPC.
## Key fixture / config files
- `tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCATXarFixture.cs`
— TCP probe + skip-attributes + env-var parsing
- `tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCAT3SmokeTests.cs`
three wire-level smoke tests
— wire-level test suite (14 `[TwinCATFact]` + 16-case `[TwinCATTheory]`)
- `tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/README.md`
— project spec + VM setup + license-rotation notes
- `tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Tests/FakeTwinCATClient.cs`
in-process fake with the notification-fire harness used by
`TwinCATNativeNotificationTests`
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs` — ctor takes
`ITwinCATClientFactory`
in-process fake with the notification-fire harness
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs` — ctor is
`(TwinCATDriverOptions, string driverInstanceId, ITwinCATClientFactory? = null)`
+1 -1
View File
@@ -174,7 +174,7 @@ Common contract for the proxy in the main server:
Named pipes default to allowing connections from any local user. Without explicit ACLs, any process on the host machine that knows the pipe name could connect, bypass the OPC UA server's authentication and authorization layers, and issue reads, writes, or alarm acknowledgments directly against the driver host. **This is a real privilege-escalation surface** — a service account with no OPC UA permissions could write field values it should never have access to. Every Tier C driver enforces the following:
1. **Pipe ACL**: the host creates the pipe with a `PipeSecurity` ACL that grants `ReadWrite | Synchronize` only to the OtOpcUa server's service principal SID. All other local users — including LocalSystem and Administrators — are explicitly denied. The ACL is set at pipe-creation time so it's atomic with the pipe being listenable.
1. **Pipe ACL**: the host creates the pipe with a `PipeSecurity` ACL that grants `ReadWrite | Synchronize` only to the OtOpcUa server's service principal SID. `LocalSystem` is explicitly denied. The ACL is set at pipe-creation time so it's atomic with the pipe being listenable. Administrators are **not** added to the deny list — UAC's filtered token carries the Admins group SID as deny-only, so a deny ACE on Administrators would fire even for non-elevated callers whose user account happens to be a member (common on dev boxes). The per-connection SID check in §2 remains the authorization boundary.
2. **Caller identity verification**: on each new pipe connection, the host calls `NamedPipeServerStream.GetImpersonationUserName()` (or impersonates and inspects the token) and verifies the connected client's SID matches the configured server service SID. Mismatches are logged and the connection is dropped before any RPC frame is read.
3. **Per-message authorization context**: every RPC frame includes the operation's authenticated OPC UA principal (forwarded by the Core after it has done its own authn/authz). The host treats this as input only — the driver-level authorization (e.g. "is this principal allowed to write Tune attributes?") is performed by the Core, but the host's own audit log records the principal so post-incident attribution is possible.
4. **No anonymous endpoints**: the heartbeat pipe has the same ACL as the data-plane pipe. There are no "open" pipes a generic client can probe.
+2 -2
View File
@@ -6,7 +6,7 @@ enforces at driver init time. Every row cites the Fanuc FOCAS Developer
Kit function whose documented input range determines the ceiling.
**Why this exists** — we have no FOCAS hardware on the bench and no
working simulator. Fwlib32 returns `EW_NUMBER` / `EW_PARAM` when you
working simulator. FWLIB (Fwlib64, or Fwlib32 on legacy deployments) returns `EW_NUMBER` / `EW_PARAM` when you
hand it an address outside the controller's supported range; the
driver would map that to a per-read `BadOutOfRange` at steady state.
Catching at `InitializeAsync` with this matrix surfaces operator
@@ -140,6 +140,6 @@ matrix: Macro variable #50000 is outside the documented range
This validation closes the cheap half of the FOCAS hardware-free
stability gap — config errors now fail at load instead of per-read.
The expensive half is Tier-C process isolation so that a crashing
`Fwlib32.dll` doesn't take the main OPC UA server down with it. See
`Fwlib64.dll` doesn't take the main OPC UA server down with it. See
[`docs/v2/implementation/focas-isolation-plan.md`](implementation/focas-isolation-plan.md)
for that plan (task #220).
@@ -0,0 +1,38 @@
# Admin UI Phase 6 status audit (2026-04-24)
Audit pass that closes the Phase 6 Admin-UI tasks that were tracked as still-open (#128#131) but already had their Blazor pages shipped. Every page listed below compiles against the current `OtOpcUaConfigDbContext` schema + the current Admin service surface, has substantive (non-stub) content, and is covered by `ZB.MOM.WW.OtOpcUa.Admin.Tests` (112/112 green).
## Task #128 — /hosts column refresh (Phase 6.1 Stream E.2/E.3)
`Components/Pages/Hosts.razor` — 233 LOC. Route `/hosts`. Ships:
- Per-driver circuit-breaker columns (`ConsecutiveFailures`, `LastCircuitBreakerOpenUtc`).
- Stale-row detection via `HostStatusService.IsStale` (publisher heartbeat ≥ 30 s stale threshold).
- Summary cards: Running / Stale / Faulted / total.
- Auto-refresh every `RefreshIntervalSeconds` driven by the `FleetStatusHub` SignalR feed.
- Health band via `DriverHostState` enum colour coding.
## Task #129 — RoleGrantsTab + AclsTab + Probe (Phase 6.2 Stream D)
- `Components/Pages/RoleGrants.razor` — 192 LOC. Route `/role-grants`. Edits LDAP-group → OPC-UA-role mappings with live reload over `AclChangeNotifier` SignalR.
- `Components/Pages/Clusters/AclsTab.razor` — 279 LOC. NodeAcl CRUD + the **"Probe this permission"** form (task #196 slice 1, embedded at line 38 onward). Binds `_probeGroup` / `_probeNamespaceId` / `_probeUnsAreaId` / `_probeUnsLineId` / `_probeEquipmentId` / `_probeTagId` / `_probePermission` through `PermissionProbeService`.
## Task #130 — RedundancyTab (Phase 6.3 Stream E)
`Components/Pages/Clusters/RedundancyTab.razor` — 175 LOC. Topology table, per-peer reachability (via `FleetStatusHub`), ServiceLevel band + `ApplyLeaseRegistry` / `RecoveryStateManager` state surfaces, failover action button. Live updates over the same SignalR hub `RedundancyPublisherHostedService` ticks.
## Task #131 — Draft / publish / diff / identification (Phase 6.4 Streams AD)
- `Components/Pages/Clusters/DraftEditor.razor` — 105 LOC. Route `/clusters/{ClusterId}/draft/{GenerationId:long}`. Calls `DraftValidationService` + `GenerationService`.
- `Components/Pages/Clusters/Generations.razor` — 73 LOC. Publish flow (generation state transitions through `sp_PublishGeneration`).
- `Components/Pages/Clusters/DiffViewer.razor` — 87 LOC. Route `/clusters/{ClusterId}/draft/{GenerationId:long}/diff`. Renders `sp_ComputeGenerationDiff` output.
- `Components/Pages/Clusters/IdentificationFields.razor` — 49 LOC. OPC 40010 Identification folder editor bound to the `Equipment` entity.
## What's NOT in this audit
- `#124` — Phase 6.2 3-user interop matrix. Authz layer is now covered by `ThreeUserInteropMatrixTests` in `ZB.MOM.WW.OtOpcUa.Server.Tests` (drives the 5 GLAuth users + admin through `LdapUserAuthenticator``AuthorizationGate.IsAllowed` for the role × operation matrix). The wire-level OPC UA-client cross-vendor leg still needs a UserName-token endpoint policy + manual client drill — that part stays a manual deliverable.
- `#119` — Phase 6.3 client interop matrix. Manual Ignition/Kepware/Aveva drills.
- `#113` — OPC UA CTT conformance pass. Manual CTT run.
- `#114` / `#115` — Redundancy cutover + deployment checklist. Manual.
Those remain GA-gating but require a human at a console, not a code change.
+129
View File
@@ -0,0 +1,129 @@
# Phase 3 Exit Gate — Driver Fleet (reconstructed retroactively)
> **Status**: **CLOSED (reconstructed 2026-04-23)**. The original plan split the
> driver work across Phases 3 / 4 / 5 (Modbus alone → four PLC drivers → two
> specialty drivers). In execution, all seven non-Galaxy drivers shipped under
> one umbrella against `Core.Abstractions` + `Core`'s generic driver-hosting
> machinery. This doc captures the closure retroactively; no forward work
> remains under these three original phase numbers.
>
> **Plan doc**: none — phases 3/4/5 were intentionally not split out into
> separate plan docs once it was clear the capability-interface contract
> introduced in Phase 1 (`Core.Abstractions` — plan decision #4) was stable
> enough that each driver could land as its own stream rather than as a
> gated mini-phase. See `docs/v2/plan.md` §6 for the now-consolidated
> migration strategy.
## Scope
All seven drivers in the v2 target list (Decision #5) minus Galaxy (closed
separately under Phase 2). The Galaxy Proxy+Host+Shared split exited under
`exit-gate-phase-2-final.md`; this gate does not re-cover it.
## What shipped
### Drivers
| Driver | Project | Capability surface | Test projects |
|---|---|---|---|
| Modbus TCP | `Driver.Modbus` + `Driver.Modbus.Cli` | `IDriver` + `ITagDiscovery` + `IReadable` + `IWritable` + `ISubscribable` + `IHostConnectivityProbe` | `Tests`, `IntegrationTests`, `Cli.Tests` |
| AB CIP | `Driver.AbCip` + `Driver.AbCip.Cli` | all of the above + `IPerCallHostResolver` + `IAlarmSource` | `Tests`, `IntegrationTests`, `Cli.Tests` |
| AB Legacy (PCCC / DF1) | `Driver.AbLegacy` + `Driver.AbLegacy.Cli` | `IDriver` + `IReadable` + `IWritable` + `ITagDiscovery` + `ISubscribable` + `IHostConnectivityProbe` + `IPerCallHostResolver` | `Tests`, `IntegrationTests`, `Cli.Tests` |
| Siemens S7 | `Driver.S7` + `Driver.S7.Cli` | `IDriver` + `ITagDiscovery` + `IReadable` + `IWritable` + `ISubscribable` + `IHostConnectivityProbe` | `Tests`, `IntegrationTests`, `Cli.Tests` |
| Beckhoff TwinCAT (ADS) | `Driver.TwinCAT` + `Driver.TwinCAT.Cli` | `IDriver` + `IReadable` + `IWritable` + `ITagDiscovery` + `ISubscribable` + `IHostConnectivityProbe` + `IPerCallHostResolver` | `Tests`, `IntegrationTests`, `Cli.Tests` |
| FANUC FOCAS | `Driver.FOCAS` + `Driver.FOCAS.Host` + `Driver.FOCAS.Shared` + `Driver.FOCAS.Cli` | `IDriver` + `IReadable` + `IWritable` + `ITagDiscovery` + `ISubscribable` + `IHostConnectivityProbe` + `IPerCallHostResolver`; Tier-C out-of-process backend mirrors the Galaxy Proxy/Host split. `Fwlib64FocasBackend` shipped 2026-04-23 as the production backend (P/Invoke against `Fwlib64.dll`); Host retargeted from net48 x86 to net10.0-windows x64 at the same time. | `Tests`, `Host.Tests`, `Shared.Tests`, `Cli.Tests` |
| OPC UA Client (gateway) | `Driver.OpcUaClient` | `IDriver` + `ITagDiscovery` + `IReadable` + `IWritable` + `ISubscribable` + `IHostConnectivityProbe` + `IAlarmSource` + `IHistoryProvider` (richest surface in the fleet — it's bridging another UA server) | `Tests`, `IntegrationTests` |
### Supporting infrastructure
| PR / Task | Summary |
|---|---|
| #248 | `DriverFactoryRegistry` + `DriverInstanceBootstrapper` — central DB `DriverInstance` rows materialise into live `IDriver` instances at server startup. |
| #210 | Modbus server-side factory + seed SQL (closed first child of umbrella #209). |
| #211 #212 #213 | AB CIP / S7 / AB Legacy server-side factories + seed SQL. |
| #220 (FOCAS) | FOCAS factory wired into the bootstrap pipeline; Tier-C split (`Driver.FOCAS.Host` process launcher, named-pipe IPC, NSSM install scripts, post-mortem MMF) shipped across the five-PR series. |
| (this session) | TwinCAT factory wired in + Server project reference added; all seven driver factories now register uniformly in `Server/Program.cs`. |
| #249 #250 #251 | Per-driver test-client CLI suite (`otopcua-<driver>-cli`) — shared lib + one CLI per driver for direct-to-PLC smoke testing independent of the server. |
| #253 + follow-ups | E2E CLI test scripts (`scripts/e2e/test-<driver>.ps1`) — five-stage bidirectional bridge + subscribe-sees-change assertions per driver, plus `test-all.ps1` matrix runner. |
| (this session) | OPC UA Client e2e script shipped (`test-opcuaclient.ps1`, 8 stages) — the only driver that was missing an e2e script. |
### Docs
Per-driver test-fixture documentation:
- `docs/drivers/Modbus-Test-Fixture.md`
- `docs/drivers/AbServer-Test-Fixture.md` (covers AB CIP fixture)
- `docs/drivers/AbLegacy-Test-Fixture.md`
- `docs/drivers/S7-Test-Fixture.md`
- `docs/drivers/TwinCAT-Test-Fixture.md`
- `docs/drivers/FOCAS-Test-Fixture.md`
- `docs/drivers/OpcUaClient-Test-Fixture.md`
Driver-level ops docs:
- `docs/Driver.Modbus.Cli.md`, `docs/Driver.AbCip.Cli.md`, `docs/Driver.AbLegacy.Cli.md`, `docs/Driver.S7.Cli.md`, `docs/Driver.TwinCAT.Cli.md`, `docs/Driver.FOCAS.Cli.md`
- `docs/v2/driver-specs.md` — unified capability-matrix spec for all eight drivers (Galaxy + seven).
## Compliance evidence
No dedicated `phase-3-compliance.ps1` exists — scope was too broad to fit the
single-script pattern that worked for Phases 6.x and 7. Verification instead
takes the form of the per-driver test suites + e2e scripts:
- [x] **Unit tests** — every driver has a `Tests` project with capability-interface contract tests; `dotnet test tests/ZB.MOM.WW.OtOpcUa.Driver.*.Tests` is green.
- [x] **Integration tests**`Driver.*.IntegrationTests` stands up Docker-hosted simulators (pymodbus, ab_server, python-snap7, opc-plc) at collection init and exercises real wire-level read/write/subscribe/probe per driver.
- [x] **CLI tests**`Driver.*.Cli.Tests` covers the per-driver test-client CLIs (#249#251).
- [x] **E2E scripts**`scripts/e2e/test-<driver>.ps1` covers the driver-CLI → PLC → OtOpcUa server → OPC UA client round-trip for all seven drivers + Galaxy; `test-all.ps1` aggregates; README status section (rewritten this session) summarises live-boot evidence.
- [x] **Factory registration** — all seven factories plus Galaxy register in `src/ZB.MOM.WW.OtOpcUa.Server/Program.cs` inside the `DriverFactoryRegistry` composition; the `DriverInstanceBootstrapper` can materialise any configured row.
- [x] **Seed SQL**#210#213 provide per-driver Config DB seed scripts so a fresh Config DB is populatable without Admin UI interaction.
### Live-boot verification
Recorded across the session-level tracking tasks:
| Driver | Fixture | Stages | Tracking |
|---|---|---|---|
| Modbus | pymodbus (dl205 profile) | 5/5 | #209 exit gate; bidirectional + subscribe-sees-change added in #253 follow-ups |
| AB CIP | `ab_server` ControlLogix | 5/5 | #220 |
| S7 | python-snap7 | 5/5 | #220 |
| AB Legacy | `ab_server` SLC500 / MicroLogix / PLC-5 (requires `/1,0` cip-path for Docker fixture) | 5/5 | #222 partial |
| OPC UA Client | opc-plc Docker fixture | 5/8 (probe, remote read, forward bridge, subscribe, browse) | (this session) |
| TwinCAT | TCBSD VM @ 10.100.0.128 (AmsNetId `41.169.163.43.1.1`) — real TwinCAT runtime under FreeBSD on ESXi; bypasses the Hyper-V/RTIME conflict that blocks XAR on this dev box | features validated | fixture is the TCBSD VM; `TWINCAT_TRUST_WIRE=1` still gates the e2e script by default so unintentional runs against cold fixtures don't false-pass |
| FOCAS | Lab-rig CNC + `Fwlib64.dll` | — | **deferred**`Fwlib64FocasBackend` shipped 2026-04-23; wire-level live-boot gated `FOCAS_TRUST_WIRE=1`, lab rig tracked under #222 follow-up |
| Galaxy | Live Galaxy + `OtOpcUaGalaxyHost` (this dev box) | 7/7 (read / write / subscribe / alarms / history) | closed under Phase 2 |
## Deferred to post-gate follow-ups
Items intentionally not blocking closure of this umbrella — each is hardware-
dependent and tracked separately:
- [ ] **FOCAS wire-level live-boot**`test-focas.ps1` against a real CNC once `Fwlib64.dll` is on PATH and `FOCAS_TRUST_WIRE=1` (#222 follow-up). The `Fwlib64FocasBackend` shipped 2026-04-23 — code exists, unit-tests green; only the live-CNC smoke test remains.
- [x] **FOCAS `Fwlib64FocasBackend`****CLOSED 2026-04-23**. The production backend in `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host/Backend/Fwlib64FocasBackend.cs` wraps `FwlibFocasClient` to fulfil `IFocasBackend` against the licensed `Fwlib64.dll`. Host project retargeted to `net10.0-windows` x64. Default when `OTOPCUA_FOCAS_BACKEND` is unset. 6 new backend tests green. Only wire-level live-boot against real hardware remains — see item above.
- [ ] **OPC UA Client stages 5/7/8** — reverse-bridge, alarm, history stages are opt-in via sidecar NodeId params because opc-plc's default image has no writable nodes and doesn't historize. Against a richer upstream (Prosys, UA Expert sample server) all eight stages can run.
## Completion checklist
- [x] Modbus driver shipped + unit + integration + CLI tests green
- [x] AB CIP driver shipped + tests green + live-boot 5/5
- [x] AB Legacy driver shipped + tests green + live-boot 5/5
- [x] S7 driver shipped + tests green + live-boot 5/5
- [x] TwinCAT driver shipped + tests green + features validated against the TCBSD VM virtual-PLC fixture
- [x] FOCAS driver shipped (Tier-C split) + tests green (wire-live deferred)
- [x] OPC UA Client driver shipped + tests green + live-boot 5/8
- [x] `DriverFactoryRegistry` + `DriverInstanceBootstrapper` shipped
- [x] All seven factories registered in `Server/Program.cs`
- [x] Per-driver test-client CLI suite shipped
- [x] E2E test scripts shipped + `test-all.ps1` aggregator green
- [x] Per-driver test-fixture docs present
- [x] `docs/v2/driver-specs.md` unified capability spec present
- [x] `scripts/e2e/README.md` status section reflects current live-boot matrix
- [x] Exit gate doc checked in (this file)
- [x] TwinCAT validated against the TCBSD VM virtual-PLC fixture — `TWINCAT_TRUST_WIRE=1` + e2e script still gated by default to prevent false-pass against cold fixtures
- [ ] FOCAS lab-rig follow-up filed + tracked (#222)
## Why no compliance script
The Phases 6.1/6.2/6.3/6.4/7 pattern of a single `phase-N-compliance.ps1`
worked because each of those phases touched a narrow slice of server-side
runtime. A "phase-3-compliance.ps1" would have had to boot seven simulators,
configure seven DriverInstance rows, and run seven e2e scripts — which is
exactly what `scripts/e2e/test-all.ps1` already does. The aggregate runner
+ its README is the compliance artefact for this umbrella.
+9 -9
View File
@@ -1,6 +1,6 @@
# Phase 7 Exit Gate — Scripting, Virtual Tags, Scripted Alarms, Historian Sink
> **Status**: Open. Closed when every compliance check passes + every deferred item either ships or is filed as a post-v2-release follow-up.
> **Status**: **FULLY CLOSED** 2026-04-23 audit — the three original follow-ups (#239 / #240 / #241) were all shipped under later branches but this exit-gate doc wasn't updated at the time. All three verified against the repo + tests green.
>
> **Compliance script**: `scripts/compliance/phase-7-compliance.ps1`
> **Plan doc**: `docs/v2/implementation/phase-7-scripting-and-alarming.md`
@@ -45,13 +45,13 @@ Covered by `scripts/compliance/phase-7-compliance.ps1`:
- [x] Walker emits `NodeSourceKind.Virtual` + `NodeSourceKind.ScriptedAlarm` variables
- [x] `DriverNodeManager` dispatch routes Reads by source; Writes to non-Driver rejected with `BadUserAccessDenied` (plan #6)
## Deferred to Post-Gate Follow-ups
## Deferred to Post-Gate Follow-ups (all closed as of 2026-04-23 audit)
Kept out of the capstone so the gate can close cleanly while the less-critical wiring lands in targeted PRs:
Originally kept out of the capstone so the gate could close cleanly. Each landed as a targeted follow-up PR; audit this session verified them against the repo:
- [ ] **SealedBootstrap composition root** (task #239) — instantiate `VirtualTagEngine` + `ScriptedAlarmEngine` + `SqliteStoreAndForwardSink` in `Program.cs`; pass `VirtualTagSource` + `ScriptedAlarmSource` as the new `IReadable` parameters on `DriverNodeManager`. Without this, the engines are dormant in production even though every piece is tested.
- [ ] **Live OPC UA end-to-end smoke** (task #240) — Client.CLI browse + read a virtual tag computed by Roslyn; Client.CLI acknowledge a scripted alarm via the Part 9 method node; historian-disabled deployment returns `BadNotFound` for virtual nodes rather than silent failure.
- [ ] **sp_ComputeGenerationDiff extension** (task #241) — emit Script / VirtualTag / ScriptedAlarm sections alongside the existing Namespace/DriverInstance/Equipment/Tag/NodeAcl rows so the Admin DiffViewer shows Phase 7 changes between generations.
- [x] **SealedBootstrap composition root** (task #239) — **CLOSED**. `src/ZB.MOM.WW.OtOpcUa.Server/Phase7/Phase7Composer.cs` instantiates `VirtualTagEngine` + `ScriptedAlarmEngine` via `Phase7EngineComposer.Compose`, and `SqliteStoreAndForwardSink` in `ResolveHistorianSink` when a registered driver provides `IAlarmHistorianWriter` (today: `GalaxyProxyDriver`). `OpcUaServerService.ExecuteAsync` calls `Phase7Composer.PrepareAsync` then `OpcUaApplicationHost.SetPhase7Sources` **before** `applicationHost.StartAsync` so `OtOpcUaServer` + `DriverNodeManager` capture the `VirtualReadable` / `ScriptedAlarmReadable` at construction. 38 tests green under `tests/ZB.MOM.WW.OtOpcUa.Server.Tests/Phase7/` + `SealedBootstrapIntegrationTests`. The work landed under the label "Phase 7 follow-up #246" and was never re-labelled against #239.
- [x] **Live OPC UA end-to-end smoke** (task #240) — **CLOSED**. `scripts/e2e/test-phase7-virtualtags.ps1` drives a full Client.CLI read of a driver-sourced input, reads the VirtualTag computed off it, triggers a scripted alarm by writing the trigger value, and subscribes to the alarm condition — all through a running OtOpcUa server. Covered in `scripts/e2e/test-all.ps1` + `scripts/e2e/README.md` matrix.
- [x] **sp_ComputeGenerationDiff extension** (task #241) — **CLOSED**. Migration `20260420232000_ExtendComputeGenerationDiffWithPhase7.cs` extends the stored proc to emit Script / VirtualTag / ScriptedAlarm sections alongside the existing NodeAcl / Tag / Equipment / DriverInstance / Namespace output. Admin DiffViewer picks them up through its existing section-plugin architecture (Phase 6.4 Stream C).
## Completion Checklist
@@ -66,9 +66,9 @@ Kept out of the capstone so the gate can close cleanly while the less-critical w
- [x] `phase-7-compliance.ps1` present and passes
- [x] Full solution `dotnet test` passes (no new failures beyond pre-existing tolerated CLI flake)
- [x] Exit-gate doc checked in
- [ ] `SealedBootstrap` composition follow-up filed + tracked
- [ ] Live end-to-end smoke follow-up filed + tracked
- [ ] `sp_ComputeGenerationDiff` extension follow-up filed + tracked
- [x] `SealedBootstrap` composition follow-up shipped (#239 / Phase 7 follow-up #246)
- [x] Live end-to-end smoke follow-up shipped (#240`scripts/e2e/test-phase7-virtualtags.ps1`)
- [x] `sp_ComputeGenerationDiff` extension follow-up shipped (#241 — migration `ExtendComputeGenerationDiffWithPhase7`)
## How to run
+16 -5
View File
@@ -1,10 +1,21 @@
# FOCAS Tier-C isolation — plan for task #220
> **Status**: PRs AE shipped. Architecture is in place; the only
> remaining FOCAS work is the hardware-dependent production
> integration of `Fwlib32.dll` into a real `IFocasBackend`
> (`FwlibHostedBackend`), which needs an actual CNC on the bench
> and is tracked as a follow-up on #220.
> **Status**: **FULLY SHIPPED** (code). PRs AE shipped the architecture; the
> 2026-04-23 follow-up shipped the production `Fwlib64FocasBackend` wrapping
> the licensed `Fwlib64.dll`. Only the wire-level live-boot against real
> hardware remains (task #222 / requires a bench CNC).
>
> **Major update 2026-04-23 — Host retargeted to .NET 10 x64 + Fwlib64**:
> Both `Fwlib32.dll` and `Fwlib64.dll` are licensed for this project. The
> original plan put the Host on .NET 4.8 x86 because Fwlib32 was assumed.
> With Fwlib64 available, the Host moves to `net10.0-windows` x64 — same
> runtime as the rest of the fleet. **Tier-C isolation stays anyway** — the
> blast-radius argument against a closed-source vendor P/Invoke is independent
> of bitness. Galaxy (forced x86 by MXAccess COM) is a pure bitness forcing;
> FOCAS is a pure blast-radius choice. Body of this document still reflects
> the original x86 assumptions in a few places — read them as historical
> design context; the current shape is in `docs/drivers/FOCAS-Test-Fixture.md`
> and `exit-gate-phase-3.md`.
>
> **Pre-reqs shipped**: version matrix + pre-flight validation
> (PR #168 — the cheap half of the hardware-free stability gap).
@@ -0,0 +1,291 @@
# FOCAS wire protocol — what's authoritative vs. what's guessed
Companion to [`focas-simulator-plan.md`](focas-simulator-plan.md). Written during
Stream B on 2026-04-23 after a research pass through `strangesast/fwlib` +
public FOCAS documentation. Purpose: separate what we *know* about the FOCAS
wire protocol (can quote with confidence) from what we're *guessing* (will need
Wireshark traces to validate in Stream C).
This document directly informs `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/server/`.
## Authoritative — from Fanuc's public `fwlib32.h`
The header file is distributed with the FOCAS Developer Kit and mirrored in OSS
repos (notably `strangesast/fwlib`). The **struct layouts** documented there
are stable across FOCAS versions and authoritative for the payload shapes our
Python mock has to emit.
### ODBM — macro variable read buffer
```c
typedef struct odbm {
short datano; // macro variable number
short dummy; // reserved / alignment padding
long mcr_val; // 32-bit signed macro value
short dec_val; // decimal-point count (0-9)
} ODBM;
```
With `#pragma pack(push, 4)` (the FOCAS default), total size is **10 bytes** on
Windows: 2 + 2 + 4 + 2. Our `FwlibNative.cs` matches this exactly.
Our mock's `_READ_RESP_STRUCT = struct.Struct(">iH")` is **only 6 bytes**
missing `datano` + `dummy`. A real Fwlib decoding the scaffold response will
read garbage. Stream C fix: prepend two `short` fields.
### IODBPSD — CNC parameter read/write buffer
```c
typedef struct iodbpsd {
short datano; // parameter number
short type; // axis index (0 for non-axis parameters)
union {
char cdata;
short idata;
long ldata;
char cdatas[MAX_AXIS]; // MAX_AXIS varies — 8 on 0i, 32 on 30i
short idatas[MAX_AXIS];
long ldatas[MAX_AXIS];
} u;
} IODBPSD;
```
With `pack(4)` and `MAX_AXIS=8`, total size = 2 + 2 + 32 = **36 bytes**. Our
`FwlibNative.cs` matches this (`[SizeConst = 32]` data buffer).
Our mock's current param handler doesn't return bytes in IODBPSD shape —
response payload is just the raw value. Stream C fix: wrap in 4-byte header
+ union-padded data.
### ODBST — status info
```c
typedef struct odbst {
short dummy; // reserved
short tmmode; // Memory / Tape / MDI / EDIT / DNC
short aut; // automatic mode
short run; // running state
short motion; // motion state
short mstb; // M/S/T/B finish signal
short emergency; // emergency stop
short alarm; // alarm state
short edit; // edit mode sub-state
} ODBST;
```
9 × short = **18 bytes**. Our mock already emits 18 bytes via
`struct.Struct(">9h")`. ✓ correct.
### IODBPMC — PMC range read/write buffer
```c
typedef struct iodbpmc {
short type_a; // PMC address letter encoded as ADR_* numeric code
short type_d; // data type: 0=byte, 1=word, 2=long, 4=float, 5=double
unsigned short datano_s; // start address number
unsigned short datano_e; // end address number
union {
char cdata[5];
short idata[5];
long ldata[5];
float fdata[5];
double dbdata[5];
} u; // 40-byte union (widest = dbdata = 5×8 bytes)
} IODBPMC;
```
With `pack(4)` the union is 40 bytes; struct total = 8 + 40 = **48 bytes**.
Our `FwlibNative.cs` matches this.
Our mock's PMC handler takes a different layout (uint16 handle + uint8 letter
+ ...). Stream C fix: rewrite to IODBPMC shape.
## Reference trace findings (2026-04-23 dev-box reversing)
**Good news** — we don't need a bench CNC for first-pass reversing. Loading
`Fwlib64.dll` in `otopcua-focas-cli` + pointing it at our Python simulator on
`127.0.0.1:8193` + enabling `OTOPCUA_FOCAS_RAW_CAPTURE=1` on the sim lets us
observe Fwlib's outbound bytes + iterate on reply shapes. Each cycle is ~5s;
progress measure is "Fwlib sends more bytes before disconnecting".
### Confirmed wire facts
**Magic prefix** — every frame Fwlib sends begins with `0xA0 0xA0 0xA0 0xA0`
(4 bytes). This is NOT a length prefix — our scaffold tried to decode it as
uint32-big-endian = 2.7 GB and died. It's a fixed protocol marker.
**Handshake request** — `cnc_allclibhndl3` produces this 8-byte frame:
```
a0 a0 a0 a0 00 01 01 01
└─ magic ─┘ └── negotiation ──┘
```
The 4-byte negotiation field is stable across our observations (always
`00 01 01 01`). Interpretation TBD — possibly `(version_major=0x0001,
version_minor=0x0101)` or `(protocol=0x01, subtype=0x010101)`.
**Handshake reply that Fwlib accepts** (empirically confirmed — doesn't
disconnect):
```
a0 a0 a0 a0 00 01 01 01 00 XX 00 YY
└─ magic ─┘ └── echo ──┘ handle api_version
```
12 bytes: magic + echoed negotiation + 2-byte handle + 2-byte api_version code.
### Post-handshake frame shape — decoded via drain mode
The simulator's `OTOPCUA_FOCAS_DRAIN_AFTER_HANDSHAKE=1` mode reads all inbound
bytes for 1000 ms after the handshake reply without attempting any decode.
Captured payload from `cnc_allclibhndl3`:
```
00 02 00 02 a0 a0 a0 a0 00 01 21 01 00 00
└── prefix ─┘ └── magic ─┘ └─── body ────┘
4 bytes 4 bytes 6 bytes (total = 14 bytes)
```
**Key discovery**: post-handshake frames have a **4-byte prefix BEFORE the
magic**, not magic-first. Frame shape:
```
uint16 msg_counter // starts at 2; handshake was #1 implicitly
uint16 handle_echo // matches the handle our open reply returned
4 bytes FOCAS_MAGIC // 0xA0A0A0A0
N bytes body // function-specific
```
Session 1's drain captured only the prefix (`00 02 00 01`) before timing
out — TCP multiplexed the two test sessions's bytes differently. Session 2
caught the full 14-byte frame.
### Body bytes — first post-handshake request
Body on `cnc_allclibhndl3` first post-handshake frame:
```
00 01 21 01 00 00
```
Informed guesses (unvalidated):
- `00 01` = body length (1 useful byte?) or sub-request count
- `21 01` = function code / operation tag — `0x21` is seen in public FOCAS
reverse-engineering notes associated with "system info" / "controller
identification" queries
- `00 00` = padding / reserved
Likely this is Fwlib's "tell me what CNC you are" query — part of
`cnc_allclibhndl3`'s internal handshake continuation before the handle is
fully established. Returning an empty or malformed response causes Fwlib
to declare the far end "not a CNC" and error with `EW_FUNC` (16).
### Iteration 3 — echo response, error-code advances
Sending back `<prefix><magic><echoed body>` (14 bytes matching request shape)
advances Fwlib's client-side error code from **`EW_-16` (socket-level)** to
**`EW_-17` (protocol-level rejection)**. Fwlib reads our response in full
before disconnecting with `peer closed mid-frame`.
Meaning: our **frame structure is correct enough** that Fwlib parses it as a
valid FOCAS frame; the **body content** (the 6 bytes after magic) is where
the semantic mismatch now lives. Fwlib expects specific bytes back for the
`0x2101` system-info query and an echo doesn't match.
### Current iteration block
Going deeper without reference requires either:
- **A bench CNC** (#54) to capture a real response to the `0x2101` query.
Stream C.2 Wireshark trace gives us the exact byte pattern Fwlib expects.
- **Published FOCAS response specs** for sub-function `0x2101` — not present
in `strangesast/fwlib` headers; likely only in the licensed Developer Kit
binary docs.
- **Blind enumeration** — try N variations of the 6-byte body response until
Fwlib's error code changes again. High cost, low signal.
The first two are both blocked on resources we don't have. The third is
~hundreds of cycles with no guarantee of convergence.
### Diminishing-returns checkpoint
**What we've proven without hardware**:
1. Magic prefix `0xA0A0A0A0` confirmed
2. Handshake request format decoded (`magic + 4-byte negotiation`)
3. Handshake response format that Fwlib accepts (`magic + echo + handle + api`)
4. Post-handshake frame format decoded (`prefix + magic + body`)
5. First post-handshake function code observed (`0x2101` — likely system-info)
6. Error code progression `EW_SOCKET``EW_PROTOCOL` confirms our framing is
structurally correct
**What we can't prove without bench CNC or reference docs**:
1. The exact 6-byte response body Fwlib expects for `0x2101`
2. The full list of post-handshake function codes + their body shapes
3. Whether subsequent frames use length prefixes or fixed body sizes
**Recommendation**: checkpoint here. The framing discoveries above are
preserved in `server/frames.py` + `server/state.py` + `server/focas_server.py`
+ `server/handlers/__init__.py`. When bench-CNC access unblocks Stream C.2's
reference trace, the iteration loop (with the framing work already done)
should converge in hours rather than days.
### Still unknown
- **Response shape** for the post-handshake body request — we can frame the
prefix + magic correctly now, but what the 6-byte body response should
carry (CNC series ID? version? capability flags?) needs further iteration.
- **Function-id numeric values** for the 9 FWLIB calls our driver makes —
one per call, need to be observed separately.
- **Error encoding** on the wire.
### Next iteration cycles
With the handshake working, each subsequent function gets its own probe-and-observe
loop. The simulator now has a `RAW_FRAME_MARKER = 0xFFFF` sentinel that lets a
handler return exact wire bytes (bypassing the scaffold envelope) — use that to
try different post-handshake replies and watch Fwlib's reaction.
## Stream C work order
Given what's authoritative vs. guessed, here's the most efficient path:
### Phase 1 — payload shapes (no hardware required)
- [ ] Rewrite `server/handlers/macro.py` response to return 10-byte ODBM:
`short datano, short dummy, int32 mcr_val, short dec_val`
- [ ] Rewrite `server/handlers/param.py` response to return 36-byte IODBPSD:
`short datano, short type, bytes[32] u`
- [ ] Rewrite `server/handlers/pmc.py` response to return 48-byte IODBPMC:
`short type_a, short type_d, uint16 datano_s, uint16 datano_e, bytes[40] u`
- [ ] Add unit tests asserting byte-exact sizes
- [ ] Update validate_harness.py to match the new shapes
Effect: when Stream C gets its first Wireshark trace, the payload-layer of the
mock is already correct. Only the framing layer needs iteration.
### Phase 2 — framing (requires hardware)
This is the iterative Wireshark loop — no point starting until the Windows rig
+ licensed Fwlib64.dll + real CNC are all available. See the implementer's
checklist in
[`tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/README.md`](../../../tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/README.md).
### Phase 3 — flip the C# test gate
Once Phase 2 proves Fwlib64 can talk to the mock:
- [ ] Flip `OTOPCUA_FOCAS_SIM_WIRE_COMPAT=1` in the CI env
- [ ] Expand `tests/.../IntegrationTests/Series/WireCompatGatedTests.cs` with
real per-series assertions
- [ ] Update `scripts/e2e/test-focas.ps1` to accept `-ProfileName`
- [ ] Close Stream D
## References
- [`src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FwlibNative.cs`](../../../src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FwlibNative.cs) — P/Invoke surface, authoritative struct layouts
- [`src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FwlibFocasClient.cs`](../../../src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FwlibFocasClient.cs) — reference C# implementation of each FWLIB call
- [`src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FocasStatusMapper.cs`](../../../src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FocasStatusMapper.cs) — EW_* → OPC UA status mapping
- Fanuc FOCAS Developer Kit (licensed, not in repo) — ultimate source of truth
- `strangesast/fwlib` on GitHub — redistributes `fwlib32.h` + runtime binaries; no wire protocol docs
@@ -172,7 +172,7 @@ Lift the existing `GalaxyRuntimeProbeManager` into the new project. Behaviors pe
#### Task B.6 — Named-pipe IPC server with mandatory ACL
Per decision #76 + `driver-stability.md` §"IPC Security":
- Pipe ACL on creation: `ReadWrite | Synchronize` granted only to the OtOpcUa server's service principal SID; LocalSystem and Administrators **explicitly denied**
- Pipe ACL on creation: `ReadWrite | Synchronize` granted only to the OtOpcUa server's service principal SID; LocalSystem **explicitly denied**. Administrators was dropped from the deny list so non-elevated admins on dev boxes aren't blocked via UAC-filtered-token deny-only semantics — the per-connection SID check (§2 of driver-stability.md) remains the real authorization boundary.
- Caller identity verification on each new connection: `GetImpersonationUserName()` cross-checked against configured server service SID; mismatches dropped before any RPC frame is read
- Per-process shared secret: passed by the supervisor at spawn time, required on first frame of every connection
- Heartbeat pipe: separate from data-plane pipe, same ACL
@@ -1,6 +1,8 @@
# Phase 6.1 — Resilience & Observability Runtime
> **Status**: **SHIPPED** 2026-04-19 — Streams A/B/C/D + E data layer merged to `v2` across PRs #78-82. Final exit-gate PR #83 turns the compliance script into real checks (all pass) and records this status update. One deferred piece: Stream E.2/E.3 SignalR hub + Blazor `/hosts` column refresh lands in a visual-compliance follow-up PR on the Phase 6.4 Admin UI branch.
> **Status**: **SHIPPED** 2026-04-19 — Streams A/B/C/D + E data layer merged to `v2` across PRs #78-82. Final exit-gate PR #83 turns the compliance script into real checks (all pass) and records this status update.
>
> **Stream E.2/E.3 closed 2026-04-23**`FleetStatusPoller` now polls `DriverInstanceResilienceStatus`, detects per-`(DriverInstanceId, HostName)` deltas, and pushes `ResilienceStatusChangedMessage` via `FleetStatusHub` on the fleet group. Admin `/hosts` page subscribes on load and upserts the matching `HostStatusRow` in-memory on receipt, so operator-visible resilience state now reflects the runtime within one poller tick (~5 s) instead of the Admin page's own 10-second refresh. `FleetStatusPollerTests.Poller_pushes_ResilienceStatusChanged_on_delta` covers the first-observation push, the no-delta-no-push invariant, and the mutated-row re-push.
>
> Baseline: 906 solution tests → post-Phase-6.1: 1042 passing (+136 net). One pre-existing Client.CLI Subscribe flake unchanged.
>
@@ -129,7 +131,7 @@ Closes these gaps flagged in the 2026-04-19 audit:
- [ ] Stream B: Tier registry + generalised watchdog + scheduled recycle + wedge detector
- [ ] Stream C: `/healthz` + `/readyz` + structured logging + JSON Serilog sink
- [ ] Stream D: LiteDB cache + Polly fallback in Configuration
- [ ] Stream E: Admin `/hosts` page refresh
- [x] Stream E: Admin `/hosts` page refresh (E.1 in PRs #78-82 with the data layer; E.2/E.3 closed 2026-04-23)
- [ ] Cross-cutting: `phase-6-1-compliance.ps1` exits 0; full solution `dotnet test` passes; exit-gate doc recorded
## Adversarial Review — 2026-04-19 (Codex, thread `019da489-e317-7aa1-ab1f-6335e0be2447`)
@@ -1,10 +1,9 @@
# Phase 6.2 — Authorization Runtime (ACL + LDAP grants)
> **Status**: **SHIPPED (core)** 2026-04-19 — Streams A, B, C (foundation), D (data layer) merged to `v2` across PRs #84-87. Final exit-gate PR #88 turns the compliance stub into real checks (all pass, 2 deferred surfaces tracked).
> **Status**: **FULLY SHIPPED** (updated 2026-04-23 audit). Streams A-D core merged to `v2` across PRs #84-87 + exit-gate PR #88 on 2026-04-19; both named deferrals landed separately and were confirmed against the repo this session:
>
> Deferred follow-ups (tracked separately):
> - Stream C dispatch wiring on the 11 OPC UA operation surfaces (task #143).
> - Stream D Admin UI — RoleGrantsTab, AclsTab Probe-this-permission, SignalR invalidation, draft-diff ACL section + visual-compliance reviewer signoff (task #144).
> - **Task #143 Stream C dispatch wiring**`DriverNodeManager` calls `AuthorizationGate.IsAllowed(context.UserIdentity, OpcUaOperation.<Op>, scope)` on Read (line 249), Write (line 536) with per-classification `OpcUaOperation.WriteOperate` / `WriteTune` / `WriteConfigure` routed via `WriteAuthzPolicy`, and HistoryRead (4 call sites). `TriePermissionEvaluator` + `PermissionTrieCache` back the gate.
> - **Task #144 Stream D Admin UI**`RoleGrants.razor` (LDAP group → Admin role mapping) + `AclsTab.razor` (per-cluster node-ACL editor with a probe-this-permission surface via `PermissionProbeService`) + `AclChangeNotifier` SignalR hub for cache invalidation all present and wired.
>
> Baseline pre-Phase-6.2: 1042 solution tests → post-Phase-6.2 core: 1097 passing (+55 net). One pre-existing Client.CLI Subscribe flake unchanged.
>
@@ -1,13 +1,20 @@
# Phase 6.3 — Redundancy Runtime
> **Status**: **SHIPPED (core)** 2026-04-19 — Streams B (ServiceLevelCalculator + RecoveryStateManager) and D core (ApplyLeaseRegistry) merged to `v2` in PR #89. Exit gate in PR #90.
> **Status**: **SHIPPED (core + Stream C)** — original body merged 2026-04-19; audit 2026-04-23 promoted **Stream C (task #147)** into shipped state.
>
> Deferred follow-ups (tracked separately):
> - Stream A — RedundancyCoordinator cluster-topology loader (task #145).
> - Stream COPC UA node wiring: ServiceLevel + ServerUriArray + RedundancySupport (task #147).
> - Stream E — Admin UI RedundancyTab + OpenTelemetry metrics + SignalR (task #149).
> - Stream Fclient interop matrix + Galaxy MXAccess failover test (task #150).
> - sp_PublishGeneration pre-publish validator rejecting unsupported RedundancyMode values (task #148 part 2 — SQL-side).
> **In** (verified in repo):
> - Stream A — `ClusterTopologyLoader`, `RedundancyCoordinator`, `RedundancyTopology`, `PeerReachability` all present under `src/ZB.MOM.WW.OtOpcUa.Server/Redundancy/`. Coordinator is now also hosted by `Program.cs` via the new `RedundancyPublisherHostedService`, which calls `RefreshAsync` on startup.
> - Stream B`ServiceLevelCalculator` + `RecoveryStateManager`.
> - **Stream C (task #147) — OPC UA node wiring**. `ServerRedundancyNodeWriter` maintains `Server.ServiceLevel` (i=2267), `Server.ServerRedundancy.RedundancySupport` (i=2994), and `Server.ServerRedundancy.ServerUriArray` (non-transparent subtype) by writing the `PropertyState.Value` + calling `ClearChangeMasks`. `RedundancyPublisherHostedService` drives the publisher on a 1 s tick and fans `OnStateChanged` / `OnServerUriArrayChanged` into the writer. Mapping of `Configuration.RedundancyMode` → Part 4 `RedundancySupport` is Warm/Hot/None (v2 clusters don't enumerate Cold / HotAndMirrored per decision #85). Idempotent per-value dedupe prevents spurious OPC UA notifications. Unit coverage: `ServerRedundancyNodeWriterTests` (4 tests, green).
> - Stream D`ApplyLeaseRegistry`.
> - Stream E — `RedundancyTab.razor` with SignalR `RoleChanged` wiring (via `FleetStatusPoller` + `FleetStatusHub`) — stale-flag + role-swap banner.
>
> **Closed this session (2026-04-23)**:
> - **Task #148 part 2**`DraftValidator.ValidateClusterTopology(cluster, nodes)` now catches three pre-publish invariants the SQL CHECK can't see: (a) unsupported `NodeCount`/`RedundancyMode` pairs; (b) `Enabled`-node count vs. declared `NodeCount` mismatch (catches disabled-node drift with mode still Hot/Warm); (c) multiple-Primary per decision #84. Returns every failure in one pass — same shape as `Validate`. 8 new tests in `DraftValidatorTests` green.
> - **Task #150 Stream F**`docs/v2/redundancy-interop-playbook.md` captures the manual validation matrix against UaExpert + Kepware + AVEVA MXAccess failover. Automating these closed-source GUI clients in PR-CI is out of scope; the automatable half is already covered by `ServiceLevelCalculatorTests` / `RedundancyStatePublisherTests` / `ClusterTopologyLoaderTests` / `ServerRedundancyNodeWriterTests`.
>
> **Remaining (documented limitation, not blocking v2.0)**:
> - Non-transparent redundancy-state node upgrade — the SDK's default `Server.ServerRedundancy` object is the base `ServerRedundancyState`, so `ApplyServerUriArray` currently logs-and-skips. Operators on the rare deployment that needs `ServerUriArray` read-back get a clear warning with the upgrade path. Documented in the interop playbook's "Known limitations" section.
>
> Baseline pre-Phase-6.3: 1097 solution tests → post-Phase-6.3 core: 1137 passing (+40 net).
>
@@ -1,12 +1,17 @@
# Phase 6.4 — Admin UI Completion
> **Status**: **SHIPPED (data layer)** 2026-04-19 — Stream A.2 (UnsImpactAnalyzer + DraftRevisionToken) and Stream B.1 (EquipmentCsvImporter parser) merged to `v2` in PR #91. Exit gate in PR #92.
> **Status**: **SHIPPED (mostly)** 2026-04-19; audit 2026-04-23 confirms what landed separately after the data-layer PR #91:
>
> Deferred follow-ups (Blazor UI + staging tables + address-space wiring):
> - Stream A UI — UnsTab MudBlazor drag/drop + 409 concurrent-edit modal + Playwright smoke (task #153).
> - Stream B follow-up — EquipmentImportBatch staging + FinaliseImportBatch transaction + CSV import UI (task #155).
> - Stream C — DiffViewer refactor into base + 6 section plugins + 1000-row cap + SignalR paging (task #156).
> - Stream D — IdentificationFields.razor + DriverNodeManager OPC 40010 sub-folder exposure (task #157).
> **In** (verified in repo):
> - **Task #153 Stream A UI**`UnsTab.razor` with drag/drop handlers + concurrent-edit via `DraftRevisionToken` + `UnsImpactAnalyzer`; Playwright smoke test in `tests/ZB.MOM.WW.OtOpcUa.Admin.E2ETests/UnsTabDragDropE2ETests.cs`.
> - **Task #155 Stream B**`EquipmentImportBatch` entity + migration, `EquipmentImportBatchService.CreateBatchAsync` / `FinaliseBatchAsync` / `DropBatchAsync` / `ListByUserAsync`, `ImportEquipment.razor` UI.
> - **Task #156 Stream C**`DiffViewer.razor` + `DiffSection.razor` refactor in place.
> - Admin UI `IdentificationFields.razor` surface shipped (part of #157).
>
> **Closed this session (2026-04-23)**:
> - **Task #157 Stream D server-side half** was a stale audit claim. `src/ZB.MOM.WW.OtOpcUa.Core/OpcUa/IdentificationFolderBuilder.cs` ships the OPC 40010 Identification sub-folder materializer (Manufacturer / Model / SerialNumber / HardwareRevision / SoftwareRevision / YearOfConstruction / AssetLocation / ManufacturerUri / DeviceManualUri); `EquipmentNodeWalker.Walk` calls it per equipment; `IdentificationFolderBuilderTests` (158 lines) + two walker-level tests (`Walk_Materializes_Identification_Subfolder_When_AnyFieldPresent`, `Walk_Omits_Identification_Subfolder_When_AllFieldsNull`) cover the null-handling branches. The initial audit grepped only `src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/`; the builder lives in `Core/OpcUa/`.
>
> **Phase 6.4 is now FULLY SHIPPED — no deferred surfaces remain.**
>
> Baseline pre-Phase-6.4: 1137 solution tests → post-Phase-6.4 data layer: 1159 passing (+22).
>
+111 -14
View File
@@ -14,7 +14,7 @@ End-to-end validation that the Phase 7 production wiring chain (#243 / #244 / #2
| SQL Server reachable, `OtOpcUaConfig` DB exists with all migrations applied | `sqlcmd -S "localhost,14330" -d OtOpcUaConfig -U sa -P "..." -Q "SELECT COUNT(*) FROM dbo.__EFMigrationsHistory"` returns ≥ 11 |
| Server's `appsettings.json` `Node:ConfigDbConnectionString` matches your SQL Server | `cat src/ZB.MOM.WW.OtOpcUa.Server/appsettings.json` |
> **Galaxy.Host pipe ACL.** Per `docs/ServiceHosting.md`, the pipe ACL deliberately denies `BUILTIN\Administrators`. **Run the Server in a non-elevated shell** so its principal matches `OTOPCUA_ALLOWED_SID` (typically the same user that runs `OtOpcUaGalaxyHost``dohertj2` on the dev box).
> **Galaxy.Host pipe ACL.** The pipe allows the configured `OTOPCUA_ALLOWED_SID` (typically the user that runs `OtOpcUaGalaxyHost``dohertj2` on the dev box). Run the Server under the same user; elevation doesn't matter — `PipeAcl.cs` no longer denies `BUILTIN\Administrators` since UAC's deny-only Admins SID would have blocked non-elevated dev-box admins too.
## Setup
@@ -36,11 +36,49 @@ sqlcmd -S "localhost,14330" -d OtOpcUaConfig -U sa -P "OtOpcUaDev_2026!" `
Expected output ends with `Phase 7 smoke seed complete.` plus a Cluster / Node / Generation summary. Idempotent — re-running wipes the prior smoke state and starts clean.
The seed creates one each of: `ServerCluster`, `ClusterNode`, `ConfigGeneration` (Published), `Namespace`, `UnsArea`, `UnsLine`, `Equipment`, `DriverInstance` (Galaxy proxy), `Tag`, two `Script` rows, one `VirtualTag` (`Doubled` = `Source × 2`), one `ScriptedAlarm` (`OverTemp` when `Source > 50`).
The seed creates one each of: `ServerCluster`, `ClusterNode`, `ClusterNodeCredential` (binds the SQL login to the node — without this `sp_GetCurrentGenerationForCluster` returns `Unauthorized: caller X is not bound to NodeId p7-smoke-node`), `ConfigGeneration` (Published), `Namespace`, `UnsArea`, `UnsLine`, `Equipment`, `DriverInstance` (Galaxy proxy), `Tag`, two `Script` rows, one `VirtualTag` (`MachineStatus` = `Source > 0`, Boolean, historized), one `ScriptedAlarm` (`OverTemp` when `Source > 50`).
### 3. Replace the Galaxy attribute placeholder
### 3. (Optional) Swap the Galaxy attribute
`scripts/smoke/seed-phase-7-smoke.sql` inserts a `dbo.Tag.TagConfig` JSON with `FullName = "REPLACE_WITH_REAL_GALAXY_ATTRIBUTE"`. Edit the SQL + re-run, or `UPDATE dbo.Tag SET TagConfig = N'{"FullName":"YourReal.GalaxyAttr","DataType":"Float64"}' WHERE TagId='p7-smoke-tag-source'`. Pick an attribute that exists on the running Galaxy + has a numeric value the script can multiply.
The shipped seed points `dbo.Tag.TagConfig` at `TestMachine_001.TestHistoryValue` — the dev-box Galaxy ships it as Int32, writable (`security_classification = Operate`), and historized (`HistoryExtension` primitive), so every E2E stage has a real live target. To swap to another attribute on a different Galaxy, pick a candidate via the same shape:
```sql
-- Run against the Galaxy Repository DB (ZB).
;WITH dpc AS (
SELECT g.gobject_id, p.package_id, p.derived_from_package_id, 0 AS depth
FROM gobject g INNER JOIN package p ON p.package_id = g.deployed_package_id
WHERE g.is_template = 0 AND g.deployed_package_id <> 0
UNION ALL
SELECT c.gobject_id, p.package_id, p.derived_from_package_id, c.depth + 1
FROM dpc c INNER JOIN package p ON p.package_id = c.derived_from_package_id
WHERE c.derived_from_package_id <> 0 AND c.depth < 10
)
SELECT DISTINCT g.tag_name + '.' + da.attribute_name AS full_ref,
dt.description AS dtype, da.security_classification
FROM dpc
INNER JOIN dynamic_attribute da ON da.package_id = dpc.package_id
INNER JOIN gobject g ON g.gobject_id = dpc.gobject_id
LEFT JOIN data_type dt ON dt.mx_data_type = da.mx_data_type
WHERE da.attribute_name NOT LIKE '[_]%'
AND da.attribute_name NOT LIKE '%.Description'
AND da.mx_data_type IN (1, 2, 3, 4)
AND da.security_classification > 0 -- writable
AND EXISTS (
SELECT 1 FROM primitive_instance pi
INNER JOIN primitive_definition pd
ON pd.primitive_definition_id = pi.primitive_definition_id
AND pd.primitive_name = 'HistoryExtension'
WHERE pi.package_id = dpc.package_id AND pi.primitive_name = da.attribute_name)
ORDER BY full_ref;
```
Then update the seed:
```sql
UPDATE dbo.Tag
SET TagConfig = N'{"FullName":"YourReal.GalaxyAttr","DataType":"Int32"}'
WHERE TagId = 'p7-smoke-tag-source';
```
### 4. Point Server.appsettings at the smoke node
@@ -54,9 +92,38 @@ The seed creates one each of: `ServerCluster`, `ClusterNode`, `ConfigGeneration`
}
```
### 4a. (Optional) Enable LDAP + SecurityProfile for the write stage
Anonymous OPC UA sessions are denied writes against `Operate`-classified tags by the PR 26 server-layer classification gate. To exercise the reverse-bridge + alarm-fires stages fully, the Server has to advertise a `UserName` UserTokenPolicy (any profile other than `None`) and authenticate against LDAP.
```json
{
"OpcUa": {
"SecurityProfile": "Basic256Sha256-Sign",
"Ldap": {
"Enabled": true,
"Server": "localhost",
"Port": 3893,
"SearchBase": "dc=lmxopcua,dc=local",
"ServiceAccountDn": "cn=serviceaccount,dc=lmxopcua,dc=local",
"ServiceAccountPassword": "serviceaccount123",
"GroupToRole": {
"ReadOnly": "ReadOnly",
"WriteOperate": "WriteOperate",
"WriteTune": "WriteTune",
"WriteConfigure": "WriteConfigure",
"AlarmAck": "AlarmAck"
}
}
}
}
```
Dev-box GLAuth ships `writeop` / `writeop123` in the `WriteOperate` group, `admin` / `admin123` across all write groups. See `C:\publish\glauth\auth.md`.
## Run
### 5. Start the Server (non-elevated shell)
### 5. Start the Server
```powershell
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Server
@@ -82,27 +149,39 @@ Any line missing = follow up the failure surface (each step has its own log sign
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- browse -u opc.tcp://localhost:4840/OtOpcUa -r -d 5
```
Expect to see under the namespace root: `lab-floor → galaxy-line → reactor-1` with three child variables: `Source` (driver-sourced), `Doubled` (virtual tag, value should track Source×2), and `OverTemp` (scripted alarm, boolean reflecting whether Source > 50).
Expect to see under the namespace root: `lab-floor → galaxy-line → reactor-1` with three child variables: `Source` (driver-sourced Int32), `MachineStatus` (virtual tag Boolean, `Source > 0`), and `OverTemp` (scripted alarm Boolean, `Source > 50`). NodeIds are path-based per OPC UA Part 3 §5.2.2 — the walker mints them from `{driverId}/{folder-path}/{browseName}` and stores the driver-side FullReference in an internal NodeId→FullRef map, so client subscriptions survive backend address renames.
#### Read the virtual tag
```powershell
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- read -u opc.tcp://localhost:4840/OtOpcUa -n "ns=2;s=p7-smoke-vt-derived"
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- read `
-u opc.tcp://localhost:4840/OtOpcUa `
-n "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/MachineStatus"
```
Expected: a `Float64` value approximately equal to `2 × Source`. Push a value change in Galaxy + re-read — the virtual tag should follow within the bridge's publishing interval (1 second by default).
Expected: `Boolean`. Push a value change into the Source Galaxy attribute and re-read — `MachineStatus` should follow within the bridge's publishing interval (1 second by default).
#### Read the scripted alarm
```powershell
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- read -u opc.tcp://localhost:4840/OtOpcUa -n "ns=2;s=p7-smoke-al-overtemp"
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- read `
-u opc.tcp://localhost:4840/OtOpcUa `
-n "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/OverTemp"
```
Expected: `Boolean``false` when Source ≤ 50, `true` when Source > 50.
#### Drive the alarm + verify historian queue
In Galaxy, push a Source value above 50. Within ~1 second, `OverTemp.Read` flips to `true`. The alarm engine emits a transition to `Phase7EngineComposer.RouteToHistorianAsync``SqliteStoreAndForwardSink.EnqueueAsync` → drain worker (every 2s) → `GalaxyHistorianWriter.WriteBatchAsync` → Galaxy.Host pipe → Aveva Historian alarm schema.
Push a Source value above 50 — either from Galaxy itself, or via the Server's OPC UA write path using LDAP credentials (step 4a). Within ~1 second, `OverTemp.Read` flips to `true`. The alarm engine emits a transition to `Phase7EngineComposer.RouteToHistorianAsync``SqliteStoreAndForwardSink.EnqueueAsync` → drain worker (every 2s) → `GalaxyHistorianWriter.WriteBatchAsync` → Galaxy.Host pipe → Aveva Historian alarm schema.
```powershell
# OPC UA write path — requires LDAP from step 4a + a writeop-class user.
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- write `
-u opc.tcp://localhost:4840/OtOpcUa -S sign `
-n "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/Source" `
-v 75 -U writeop -P writeop123
```
Verify the queue absorbed the event:
@@ -120,14 +199,32 @@ Open the Historian Client (or InTouch alarm summary) — the `OverTemp` activati
- [ ] EF migrations applied through `20260420232000_ExtendComputeGenerationDiffWithPhase7`
- [ ] Smoke seed completes without errors + creates exactly 1 Published generation
- [ ] Server starts in non-elevated shell + logs the Phase 7 composition lines
- [ ] Client.CLI browse shows the UNS tree with Source / Doubled / OverTemp under reactor-1
- [ ] Read on `Doubled` returns `2 × Source` value
- [ ] Server starts + logs the Phase 7 composition lines
- [ ] Client.CLI browse shows the UNS tree with Source / MachineStatus / OverTemp under reactor-1
- [ ] Read on `Source` returns a Good-quality Int32 value (proves MXAccess round-trip)
- [ ] Read on `MachineStatus` returns the live boolean truth of `Source > 0`
- [ ] Read on `OverTemp` returns the live boolean truth of `Source > 50`
- [ ] Pushing Source past 50 in Galaxy flips `OverTemp` to `true` within 1 s
- [ ] `test-galaxy.ps1 -Username writeop -Password writeop123` drives Source past 50 and flips `OverTemp` to `true` within 1 s
- [ ] SQLite queue drains (`COUNT(*)` returns to 0 within 2 s of an alarm transition)
- [ ] Historian shows the `OverTemp` activation event with the rendered message
## Second-run evidence (2026-04-24 dev box)
Full live stack ran end-to-end once the IPC unblocks (commit `d11dd05`), path-based NodeIds (commit `8be82e0`), cold-start engine guards (commit `69e1d32`), and seed retarget to `TestMachine_001.TestHistoryValue` (commit `ec1a590`) landed. Anonymous `scripts/e2e/test-galaxy.ps1` run reaches 3/7:
```
[PASS] source NodeId readable (Galaxy pipe → proxy → server → client chain up)
[PASS] source value = System.Byte[]
[INFO] BadUserAccessDenied — attribute's Galaxy-side ACL blocks writes for this session.
```
The `INFO` stage is correct behaviour — Source is `Operate`-classified and the anonymous session carries no LDAP roles. The Virtual-tag / Subscribe / Alarm / History stages stay at `[FAIL]` for two further environmental reasons once write is unblocked:
1. `TestMachine_001.TestHistoryValue` is driven by whatever Galaxy code runs on the object — idle in the default dev-box state, so no subscription pushes fire.
2. Historian writes require the Aveva Historian SDK to accept the alarm schema event — dev box doesn't have that path live.
Running `./test-galaxy.ps1 -Username writeop -Password writeop123` with step 4a's LDAP + `SecurityProfile = Basic256Sha256-Sign` applied unblocks the reverse-bridge + alarm-fires stages. The virtual-tag, subscribe, and history stages depend on further deployment choices (pick an attribute Galaxy is actively writing to, wire Aveva Historian SDK).
## First-run evidence (2026-04-20 dev box)
Ran the smoke against the live dev environment. Captured log signatures prove the Phase 7 wiring chain executes in production:
+233
View File
@@ -0,0 +1,233 @@
# Modbus tag-addressing reference
Foundational doc for the Modbus addressing grammar shipped across #136#144.
Covers the address-string parser (`ModbusAddressParser`) that the wire driver
and the Admin UI both consume, the per-tag suffix modifiers, and the family-
native branch.
## Grammar
```
<region><offset>[.<bit>][:<type>[<len>]][:<order>][:<count>]
```
Each field is optional from left to right; the parser fills defaults.
### Region + offset
Three accepted forms — pick whichever matches your tag spreadsheet's
convention. All three resolve to the same `(Region, ushort PduOffset)`
on the wire.
| Form | Example | Means |
|---|---|---|
| Modicon 5-digit | `40001` | Holding register 1 (PDU 0) |
| Modicon 6-digit | `400001` | Holding register 1 (PDU 0); supports up to `465536` (PDU 65535) |
| Mnemonic | `HR1`, `IR1`, `C100`, `DI1` | Same regions; `1`-based register number |
Modicon leading-digit → region:
| Digit | Region | OPC UA wire FC |
|---|---|---|
| `0` | Coils | FC01 / FC05 / FC15 |
| `1` | DiscreteInputs | FC02 (read-only) |
| `3` | InputRegisters | FC04 (read-only) |
| `4` | HoldingRegisters | FC03 / FC06 / FC16 |
### Bit suffix `.N`
`40001.5` = bit 5 (LSB-first) of HR[0]. Implies `DataType=BitInRegister`;
mixing with an explicit type or array-count is rejected.
### Type code `:T`
Codes verified 2026-04-25 against [Wonderware DASMBTCP user
guide](https://cdn.logic-control.com/media/DASMBTCP.pdf) and the
[Ignition Modbus addressing
manual](https://www.docs.inductiveautomation.com/docs/8.1/ignition-modules/opc-ua/opc-ua-drivers/modbus/modbus-addressing).
The `I` / `UI` / `I_64` / `UI_64` / `BCD_32` shapes match Wonderware's
suffix convention and Ignition's underscore-N prefix variants where
those vendors agree.
| Code | Type | Registers | Vendor reference |
|---|---|---|---|
| `BOOL` | Boolean | 1 (region must be Coils / DiscreteInputs) | universal |
| `S` | Int16 | 1 | Wonderware DASMBTCP `S` = 16-bit signed |
| `US` | UInt16 | 1 | Ignition `HRUS` = Unsigned Short |
| `I` | Int32 | 2 | Wonderware DASMBTCP `I` = 32-bit signed; Ignition `HRI` |
| `UI` | UInt32 | 2 | Ignition `HRUI` |
| `I_64` | Int64 | 4 | Ignition `HRI_64` |
| `UI_64` | UInt64 | 4 | Ignition `HRUI_64` |
| `F` | Float32 | 2 | Wonderware `F`; Ignition `HRF` |
| `D` | Float64 | 4 | Ignition `HRD` |
| `BCD` | 16-bit BCD | 1 | Ignition `HRBCD` |
| `BCD_32` | 32-bit BCD | 2 | Ignition `HRBCD_32` |
| `STR<len>` | ASCII string, `len` chars (2 chars / register) | `ceil(len/2)` | analogous to Ignition `HRS<addr>:<len>` |
Default when omitted:
- Coils / DiscreteInputs → `BOOL`
- HoldingRegisters / InputRegisters → `S` (Int16) — matches Ignition's bare-`HR` default
**Codes removed in #146** (silent wrong-data risk, never compatible with the
two reference vendors): `:DI`, `:L`, `:UDI`, `:UL`, `:LI`, `:ULI`, `:LBCD`.
Pre-#146 configs that use these get a clear "Unknown type code" diagnostic at
parse time; rewrite to the post-#146 codes per the table above.
### Byte order `:O`
| Mnemonic | Meaning | Wire |
|---|---|---|
| `ABCD` | Big-endian (Modbus spec default) | `[A,B,C,D]` |
| `CDAB` | Word swap (Siemens, several AB) | `[C,D,A,B]` |
| `BADC` | Byte swap (legacy little-endian-internal devices) | `[B,A,D,C]` |
| `DCBA` | Full reverse (some EtherNet/IP gateways) | `[D,C,B,A]` |
For 8-byte values (Int64 / Float64) the same labels apply pairwise.
### Array count `:N`
`40001:F:5` = `Float32[5]` (consumes HR[0..9]). Array + bit suffix is
rejected. Strings are not arrays.
### Composition
The 3-field shorthand `40001:F:5` is parsed as `(type=F, count=5)` because
`5` isn't a valid byte-order mnemonic. Use the explicit 4-field form
`40001:F:CDAB:5` when you need a non-default order.
## Family-native syntax (#144)
When the driver instance has `Family != Generic`, the parser tries the
family's native syntax FIRST, then falls back to Modicon / mnemonic.
### DL205 (AutomationDirect DirectLOGIC)
| Form | Example | Mapping |
|---|---|---|
| `Vnnnn` (octal) | `V2000` | HoldingRegisters[1024] (octal 2000 = decimal 1024) |
| `Ynn` (octal) | `Y17` | Coils[2048 + 15] (Y-output base + offset) |
| `Cnn` (octal) | `C100` | Coils[3072 + 64] (C-relay base + offset) |
| `Xnn` (octal) | `X17` | DiscreteInputs[15] |
| `SPnn` (octal) | `SP10` | DiscreteInputs[1024 + 8] |
**Cross-family ambiguity**: `C100` means Coils[99] under `Generic`
(mnemonic) but Coils[3136] under `DL205`. Per-driver Family choice
disambiguates.
### MELSEC (Mitsubishi)
| Form | Example | Mapping (sub-family Q_L_iQR / F_iQF) |
|---|---|---|
| `Dnnn` (decimal) | `D100` | HoldingRegisters[100] |
| `Mnnn` (decimal) | `M50` | Coils[50] |
| `Xnn` | `X20` | DiscreteInputs[32 hex / 16 octal] |
| `Ynn` | `Y20` | Coils[32 hex / 16 octal] |
X / Y digit interpretation depends on `MelsecSubFamily`:
- `Q_L_iQR` → hex (default)
- `F_iQF` → octal
Bank-base offsets default to 0 in the grammar string. Sites with non-zero
"Modbus Device Assignment" bases use the structured tag form.
## Driver-instance options
Beyond per-tag addressing, `ModbusDriverOptions` exposes (#139#143):
### Connection (#139)
- `KeepAlive { Enabled, Time, Interval, RetryCount }` — TCP-level probes.
Defaults match the historical PR 53 wire output (Enabled=true, Time=30s,
Interval=10s, RetryCount=3).
- `IdleDisconnectTimeout` — proactively close + reconnect after this much
socket idle time. Default null = disabled.
- `Reconnect { InitialDelay, MaxDelay, BackoffMultiplier }` — geometric
backoff for the post-drop reconnect loop. Default
`(0, 30s, 2.0)` = immediate first retry, geometric thereafter.
### Protocol (#140)
- `MaxCoilsPerRead` (default 2000) — separate cap for FC01/FC02 coil reads.
- `UseFC15ForSingleCoilWrites` — force FC15 (write multiple coils
qty=1) for single-coil writes. Safety/audit PLCs may require this.
- `UseFC16ForSingleRegisterWrites` — same for FC16 vs FC06.
- `DisableFC23` — kill switch for FC23 (currently unused; reserved).
### Subscribe (#141)
- Per-tag `Deadband` — suppress sub-threshold publishes on numeric tags.
- `WriteOnChangeOnly` (driver-level) — short-circuit identical-value
writes. Cache invalidates on read-divergence.
### Multi-unit (#142)
- Per-tag `UnitId` — overrides the driver-level UnitId in the MBAP
header. Required for one-Ethernet-gateway / N-RTU-slave deployments.
- `IPerCallHostResolver.ResolveHost` returns `host:port/unitN` per tag so
per-PLC circuit breakers fire per slave.
- Per-tag `CoalesceProhibited` — escape hatch for #143's planner (read
this tag in isolation regardless of `MaxReadGap`).
### Block-read coalescing (#143)
- `MaxReadGap` (default 0 = off) — gap budget the planner is willing to
bridge between adjacent register tags. With `MaxReadGap=10`, three tags
at HR 100/102/110 collapse into one FC03 of quantity 11.
### Coalescing auto-recovery (#148 / #150 / #151 / #152)
- A coalesced read that fails with a Modbus exception (write-only or
protected register mid-block) records the failed range as
auto-prohibited. The planner stops re-coalescing across the range; the
per-tag fallback path keeps healthy members working in the same scan.
- **Bisection (#150)**: every re-probe pass narrows multi-register
prohibitions by trying the two halves separately. Over log2(span)
ticks the prohibition pins at the actual offending register(s);
intermediate halves that succeed get cleared.
- **Periodic re-probe (#151)**: opt in via
`AutoProhibitReprobeInterval` (TimeSpan?). Default null = disabled
(prohibitions persist for the driver lifetime; clear on
`ReinitializeAsync`).
- **Per-tag escape hatch**: `CoalesceProhibited` (bool, default false)
on `ModbusTagDefinition`. The planner reads such tags in isolation
regardless of `MaxReadGap`. Use for known-bad addresses you want to
exclude from the auto-discovery loop.
- **Diagnostics (#152)**: `ModbusDriver.GetAutoProhibitedRanges()`
returns a snapshot of every active prohibition as
`ModbusAutoProhibition` records (UnitId / Region / StartAddress /
EndAddress / LastProbedUtc / BisectionPending). Surface in the
driver-diagnostics RPC channel when that wiring lands; for now
consumable by in-process callers (Server health endpoints, log
aggregation).
## JSON DTO shape
The factory accepts both the structured form (legacy) and the new
`AddressString` form per-tag. Mix freely — newer pasted rows use the
grammar string; legacy rows keep the structured fields.
```json
{
"host": "10.1.2.3",
"port": 502,
"unitId": 1,
"family": "DL205",
"keepAlive": { "enabled": true, "timeMs": 30000, "intervalMs": 10000, "retryCount": 3 },
"idleDisconnectMs": 120000,
"reconnect": { "initialDelayMs": 0, "maxDelayMs": 30000, "backoffMultiplier": 2.0 },
"maxCoilsPerRead": 2000,
"writeOnChangeOnly": false,
"maxReadGap": 8,
"tags": [
{ "name": "Temp", "addressString": "V2000:F:CDAB" },
{ "name": "Setpoint", "addressString": "40001:I" },
{ "name": "Outputs", "addressString": "Y0:5" },
{ "name": "AlarmCount", "region": "HoldingRegisters", "address": 200, "dataType": "Int16", "deadband": 5.0 }
]
}
```
## Vendor compatibility caveat
The exact spelling of type codes (e.g. `I` vs `INT`, `BCD` vs `L_BCD`) and
the byte-order mnemonics were synthesised from training-era vendor docs
(Wonderware DASMBTCP, Kepware KEPServerEX, Ignition, Matrikon, OAS).
Before locking the grammar for a production deployment, verify against
the current Kepware "Modbus Ethernet Driver Help" PDF and Ignition's
"Modbus Addressing" user-manual page — if a critical tool's mnemonics
have shifted, add aliases in `ModbusAddressParser.TryParseType` rather
than asking users to rewrite spreadsheets.
+46
View File
@@ -0,0 +1,46 @@
# Multi-host dispatch — per-PLC circuit breakers
Phase 6.1 decision #144 / task #135. Motivation: a single DriverInstance that fronts N PLCs (Modbus with multiple slaves, AB CIP with multiple ControlLogix chassis, etc.) must not let one dead PLC trip the resilience breaker for its healthy siblings.
This note documents the shipped contract so future driver authors don't re-derive it.
## Contract
The resilience pipeline keys on `(DriverInstanceId, HostName, DriverCapability)`. One dead PLC opens only the pipeline keyed on its HostName; healthy sibling PLCs keep their own pipelines intact.
Three participants:
1. **`DriverResiliencePipelineBuilder.GetOrCreate(driverInstanceId, hostName, capability, options)`** — the pipeline cache. First call per key builds a Polly pipeline (timeout → retry → breaker). Subsequent calls return the cached instance. Covered by `DriverResiliencePipelineBuilderTests.Pipeline_IsIsolated_PerHost`.
2. **`CapabilityInvoker.ExecuteAsync(capability, hostName, callSite, ct)`** — takes `hostName` per-call. Threads it straight through to the pipeline builder. Covered by `CapabilityInvokerTests`.
3. **`IPerCallHostResolver.ResolveHost(fullReference)`** — an optional interface a multi-device driver implements. `DriverNodeManager.ResolveHostFor` calls it on every capability dispatch so the host flowing into the invoker comes from the tag's per-PLC metadata, not the driver instance. Single-device drivers don't implement it — `DriverNodeManager` falls back to `DriverInstanceId` as the hostname, which still flows through the same `(instance, host, capability)` key shape (one pipeline per single-device instance).
End-to-end `dead PLC, healthy PLC` scenario proven by `PerCallHostResolverDispatchTests.DeadPlc_DoesNotOpenBreaker_For_HealthyPlc_With_Resolver`.
## Driver author checklist
To light up per-PLC circuit breakers on a multi-device driver:
1. **Options model** — extend the driver's options type with an explicit device list. See `AbCipDriverOptions.Devices : IReadOnlyList<AbCipDeviceConfig>`.
2. **Tag → device mapping** — parse the tag's `DeviceId` from `TagConfig`. The driver's per-tag definition records the device HostAddress alongside the wire address. See `AbCipTagDefinition.DeviceHostAddress`.
3. **`IPerCallHostResolver`** — implement it on the driver. `ResolveHost(fullReference)` looks up the tag's definition and returns the device HostAddress. Unknown references should return a deterministic fallback (e.g. the first configured device's host) rather than throw — the invoker handles the mislookup at capability level when the actual read surfaces `BadNodeIdUnknown`.
4. **Health surface**`IHostConnectivityProbe.GetHostStatuses()` returns one `HostConnectivityStatus` per configured device so the Admin UI fleet page lights the per-PLC status distinctly.
5. **Transport per device** — one network connection per PLC, serialized per device via `SemaphoreSlim` (or equivalent). Do not share a transport across PLCs; the breaker-isolation guarantee disappears if they share a queue.
## Current fleet status (2026-04-24)
| Driver | Per-tag device | `IPerCallHostResolver` | Per-PLC breaker isolation |
|---|---|---|---|
| AB CIP | ✅ `DeviceId` | ✅ | ✅ live |
| AB Legacy | 1 device / instance | — (not needed) | trivial |
| Modbus | 1 device / instance today | — | trivial — multi-device refactor tracked separately |
| S7 | 1 device / instance today | — | trivial — same |
| TwinCAT | 1 device / instance today | — | trivial — same |
| FOCAS | 1 CNC / instance | — (not needed) | trivial |
| Galaxy | 1 Galaxy Host / instance | — (not needed) | trivial — Host recycle runs per instance |
| OPC UA Client | 1 upstream / instance | — (not needed) | trivial |
"Trivial" above means the pipeline key ends up as `(DriverInstanceId, DriverInstanceId, capability)` via `DriverNodeManager.ResolveHostFor`'s fallback — one pipeline per driver instance, which is correct for single-device drivers.
Extending Modbus / S7 / TwinCAT to multi-device follows the AB CIP template verbatim; it's per-driver surgery (schema row + options model + resolver implementation + transport fan-out) rather than shared-infrastructure work.
+19 -13
View File
@@ -689,7 +689,7 @@ Galaxy.Proxy ──→ Galaxy.Shared ←── Galaxy.Host
**Decided:**
- Mono-repo (Decision #31 above).
- `Core.Abstractions` is **internal-only for now** — no standalone NuGet. Keep the contract mutable while the first 8 drivers are being built; revisit publishing after Phase 5 when the shape has stabilized. Design the contract *as if* it will eventually be public (no leaky types, stable names) to minimize churn later.
- `Core.Abstractions` is **internal-only for now** — no standalone NuGet. Keep the contract mutable while the first 8 drivers are being built; revisit publishing after the driver fleet (originally Phase 5, folded into the Phase 3 umbrella — see exit gate) once the shape has stabilized. Design the contract *as if* it will eventually be public (no leaky types, stable names) to minimize churn later.
---
@@ -742,24 +742,30 @@ Each step leaves the system runnable. The generic extraction is effectively free
10. **Build `Galaxy.Proxy`** — .NET 10 in-process proxy implementing IDriver interfaces, forwarding over IPC
11. **Validate parity** — v2 Galaxy driver must pass the same integration tests as v1
**Phase 3 — Modbus TCP driver (prove the abstraction)**
12. **Build `Driver.ModbusTcp`** — NModbus, config-driven tags from central DB, internal poll loop, device-as-folder hierarchy
13. **Add Modbus config screens to Admin** (first driver-specific config UI)
**Phase 3 — Driver fleet (all seven non-Galaxy drivers) — ✅ CLOSED 2026-04-23** (see [`implementation/exit-gate-phase-3.md`](implementation/exit-gate-phase-3.md))
**Phase 4 PLC drivers**
14. **Build `Driver.AbCip`** — libplctag, ControlLogix/CompactLogix symbolic tags + Admin config screens
15. **Build `Driver.AbLegacy`** — libplctag, SLC 500/MicroLogix file-based addressing + Admin config screens
16. **Build `Driver.S7`** — S7netplus, Siemens S7-300/400/1200/1500 + Admin config screens
17. **Build `Driver.TwinCat`** — Beckhoff.TwinCAT.Ads v6, native ADS notifications, symbol upload + Admin config screens
Originally split across Phase 3 (Modbus alone), Phase 4 (PLC drivers), and
Phase 5 (specialty drivers). In execution, once `Core.Abstractions` had
stabilised under Phase 1 + Phase 2, each driver landed as its own stream
rather than as a gated mini-phase; the phase numbers were folded into a
single umbrella. Shipped:
**Phase 5 — Specialty drivers**
18. **Build `Driver.Focas`**FANUC FOCAS2 P/Invoke, pre-defined CNC tag set, PMC/macro config + Admin config screens
19. **Build `Driver.OpcUaClient`**OPC UA client gateway/aggregation, namespace remapping, subscription proxying + Admin config screens
12. **`Driver.Modbus`** — NModbus, config-driven tags, internal poll loop, device-as-folder hierarchy (umbrella closure #210)
13. **`Driver.AbCip`** — libplctag, ControlLogix/CompactLogix symbolic tags (#211, live-booted under #220)
14. **`Driver.AbLegacy`** — libplctag, SLC 500 / MicroLogix / PLC-5 file-based addressing (#213, live-booted under #222)
15. **`Driver.S7`** — S7netplus, Siemens S7-300/400/1200/1500 (#212, live-booted under #220)
16. **`Driver.TwinCAT`** — Beckhoff.TwinCAT.Ads v7, native ADS notifications, symbol upload (factory wired 2026-04-23; wire-live deferred, #221)
17. **`Driver.FOCAS`** — FANUC FOCAS2 P/Invoke via Tier-C out-of-process `Driver.FOCAS.Host` (#220 five-PR split; wire-live deferred, #222 follow-up)
18. **`Driver.OpcUaClient`** — OPC UA client gateway / aggregation, namespace remapping, subscription proxying (scaffold #66; live-boot 5/8 stages via `test-opcuaclient.ps1`)
Supporting infrastructure: `DriverFactoryRegistry` + `DriverInstanceBootstrapper`
(#248); per-driver test-client CLI suite (#249#251); e2e test scripts with
aggregate runner (#253); server-side factory + seed SQL per driver (#210#213).
**Decided:**
- **Parity test for Galaxy**: existing v1 IntegrationTests suite + scripted Client.CLI walkthrough (see Section 4 above).
- **Timeline**: no hard deadline. Each phase ships when it's right — tests passing, Galaxy parity bar met. Quality cadence over calendar cadence.
- **FOCAS SDK**: license already secured. Phase 5 can proceed as scheduled; `Fwlib64.dll` available for P/Invoke.
- **FOCAS SDK**: license already secured. FOCAS driver shipped as part of the Phase 3 umbrella with Tier-C host; `Fwlib64.dll` available for P/Invoke (wire-level live-boot gated on lab rig, #222 follow-up).
---
+128
View File
@@ -0,0 +1,128 @@
# Redundancy Interop Playbook (Phase 6.3 Stream F — task #150)
> **Scope**: manual validation that third-party OPC UA clients + AVEVA MXAccess
> observe our non-transparent redundancy signals (ServiceLevel, ServerUriArray,
> RedundancySupport) and fail over to the Backup node when the Primary drops.
>
> **Why manual**: the third-party clients named here are Windows-GUI binaries
> (UaExpert, Kepware QuickClient) or embedded inside AVEVA System Platform.
> Automating any of them into PR-CI is out of scope for v2. This playbook
> captures the minimal dev-box-plus-VM setup and the expected pass criteria so
> the work can be executed repeatably at v2 release readiness and after any
> Phase 6.3 follow-up change.
## Prerequisites
1. Two `OtOpcUa.Server` nodes in one `ServerCluster`:
- Declared as `NodeCount = 2`, `RedundancyMode = Hot` (or `Warm`).
- Each with a distinct `ApplicationUri` (enforced by unique index per
decision #86).
- Each node's `StaticRoutes.xml` points at the other (`ServerCluster.Node[].Host`).
2. `scripts/install/Install-Services.ps1` applied on each node so the
`RedundancyPublisherHostedService` is running.
3. At least one `DriverInstance` with a reachable simulator or PLC so both
servers have a non-empty address space to browse.
4. On the client host:
- `UaExpert` ≥ 1.7 installed
- Kepware `ClientAce QuickClient` (or equivalent) — optional, for a second
client
5. For the AVEVA leg: a `Galaxy.Host` running against an MXAccess deployment
with an external OPC UA client object pointed at the cluster (not at a
single node).
## Expected signals on a running cluster
| Node | `ServiceLevel` | `RedundancySupport` | `ServerUriArray` |
|---|---|---|---|
| Primary, healthy, peer reachable | 200 | `Hot` (or `Warm`) | self + peer |
| Primary, mid-apply | 75 (`PrimaryMidApply`) | same | same |
| Primary, peer UNreachable | 150 (`PrimaryPeerDown`) | same | same |
| Backup, healthy | 100 (`Secondary`) | same | same |
| Either, dwelling in recovery | 50 (`Recovering`) | same | same |
| Either, invariant violation (two Primary, disabled-node mismatch) | 2 (`InvalidTopology`) | same | same |
(The band constants live in `ServiceLevelCalculator.Classify`.)
## Test matrix
Each row is one manual run; pass criterion in the right column.
### Block A — UA protocol signals (UaExpert)
| # | Scenario | Procedure | Pass criterion |
|---|---|---|---|
| A1 | ServiceLevel published | Connect UaExpert to Primary. Browse to `Server.ServerStatus.ServiceLevel`. | Value = 200 (or the expected Band byte per table above) |
| A2 | ServiceLevel updates on peer down | Connect to Primary. Stop Backup (`sc stop OtOpcUa`). Watch `ServiceLevel`. | Transitions 200 → 150 within ~2 s of peer probe timeout |
| A3 | RedundancySupport | Browse to `Server.ServerRedundancy.RedundancySupport`. | Value matches the declared `RedundancyMode` (Warm / Hot / None) |
| A4 | ServerUriArray (non-transparent upgrade) | Requires a redundancy-object-type upgrade follow-up. | When upgrade lands: `ServerUriArray` reports both ApplicationUris, self first |
| A5 | Mid-apply dip | On Primary trigger a `sp_PublishGeneration` apply. | `ServiceLevel` drops to 75 for the apply duration + dwell |
### Block B — Client failover
| # | Scenario | Procedure | Pass criterion |
|---|---|---|---|
| B1 | UaExpert picks Primary by ServiceLevel | In UaExpert configure a Redundancy Group with both endpoint URLs. | Client picks the Primary URL (higher ServiceLevel) |
| B2 | UaExpert cuts over on Primary kill | Kill the Primary's `OtOpcUa` service. | Client session reconnects to Backup within UaExpert's reconnect timeout (default 5 s). Data-change monitored items resume. |
| B3 | UaExpert cuts back when Primary returns | Start the Primary service. Wait ≥ recovery dwell (see `RecoveryStateManager.DwellTime`). | `ServiceLevel` on returning Primary goes through 50 (Recovering) → 200; UaExpert may or may not switch back (client-policy dependent; both are accepted outcomes) |
| B4 | Kepware QuickClient failover | Repeat B1B3 with Kepware in place of UaExpert. | Same pass criteria; establishes we're not UaExpert-specific |
### Block C — Galaxy MXAccess failover
This block validates that an AVEVA System Platform app consuming our cluster
via MXAccess tolerates a Primary drop the same way a native OPC UA client does.
The MXAccess toolkit internally wraps the OPC UA Client and does its own
redundancy negotiation; we're asserting that negotiation honors our
`ServiceLevel` signal.
| # | Scenario | Procedure | Pass criterion |
|---|---|---|---|
| C1 | Galaxy binds to Primary on first connect | Bring the cluster up. Start a Galaxy `$MxAccessClient` object pointed at the cluster with both node URLs. | Galaxy reports `QUALITY = Good` + initial values from the Primary |
| C2 | Galaxy redirects on Primary drop | Stop the Primary. | Galaxy's `QUALITY` briefly goes `Uncertain`, then back to `Good`; values continue streaming from the Backup within MXAccess's `ReconnectInterval` (default 20 s) |
| C3 | Galaxy handles mid-apply dip | Trigger a generation apply on the Primary. | Galaxy continues reading — the mid-apply dip is advertisory (ServiceLevel 75), not a session drop; MXAccess should stay bound |
## Recording results
Copy the tables above into a tracking doc per run. The tracking doc shape:
```
Run date: 2026-MM-DD
Cluster: <id> Primary: <node> Backup: <node> Release: <sha>
A1: PASS evidence: UaExpert screenshot uaexpert-a1.png
A2: PASS evidence: ServiceLevel trend grafana-a2.png
```
One pass of every row is the acceptance criterion. Re-run after any Phase 6.3
follow-up ships (especially the non-transparent redundancy-type upgrade, which
flips A4 from "deferred" to "expected pass").
## Known limitations
- **A4 pending**: `Server.ServerRedundancy` on our current SDK build lands as
the base `ServerRedundancyState`, which has no `ServerUriArray` child.
`ServerRedundancyNodeWriter.ApplyServerUriArray` logs-and-skips until the
redundancy-object-type upgrade follow-up lands.
- **Recovery dwell default**: `RecoveryStateManager.DwellTime` defaults to 60 s
in `Program.cs`. Adjust via future config knob if B3 takes too long to
observe.
- **C-block external dependency**: The `Galaxy.Host` side of the redundancy
story is largely out of our code — it's MXAccess's own client-redundancy
policy talking to our published ServiceLevel. A negative result on C1-C3
does not necessarily indicate an OtOpcUa bug; cross-check with UaExpert
(Block A / B) first.
## Automation notes (why this is a playbook, not a test)
- UaExpert and Kepware binaries are closed-source Windows GUIs; they don't
ship headless CLIs for the browse/connect/subscribe flows.
- The OPC Foundation reference SDK *can* drive every scenario, but our own
`Driver.OpcUaClient` tests already cover that client's behaviour; Block B
adds value specifically because these two clients have independent
redundancy implementations we don't control.
- For the sub-set of scenarios that *can* be automated — the self-loopback
case where our own `otopcua-cli` drives Primary + Backup — the existing
`tests/ZB.MOM.WW.OtOpcUa.Server.Tests/RedundancyStatePublisherTests` +
`ServiceLevelCalculatorTests` (unit) + `ClusterTopologyLoaderTests`
(integration) already cover the math + data path. The wire-level assertion
that the values actually land on the right OPC UA nodes is covered by
`ServerRedundancyNodeWriterTests`.
+61 -40
View File
@@ -1,7 +1,7 @@
# v2 Release Readiness
> **Last updated**: 2026-04-19 (all three release blockers CLOSED — Phase 6.3 Streams A/C core shipped)
> **Status**: **RELEASE-READY (code-path)** for v2 GA — all three code-path release blockers are closed. Remaining work is manual (client interop matrix, deployment checklist signoff, OPC UA CTT pass) + hardening follow-ups; see exit-criteria checklist below.
> **Last updated**: 2026-04-24 (Phase 5 driver complement closed — AB CIP, AB Legacy, TwinCAT, FOCAS all shipped; FOCAS Tier-C retired for a pure-managed in-process client)
> **Status**: **RELEASE-READY (code-path)** for v2 GA. All three original code-path release blockers remain closed. Phase 5 is now complete. Remaining work is manual (live-hardware validations, client interop matrix, deployment checklist signoff, OPC UA CTT pass) + hardening follow-ups; see exit-criteria checklist below.
This doc is the single view of where v2 stands against its release criteria. Update it whenever a deferred follow-up closes or a new release blocker is discovered.
@@ -14,67 +14,78 @@ This doc is the single view of where v2 stands against its release criteria. Upd
| Phase 2 — Galaxy driver split (Proxy/Host/Shared) | ✓ | Shipped |
| Phase 3 — OPC UA server + LDAP + security profiles | ✓ | Shipped |
| Phase 4 — Redundancy scaffold (entities + endpoints) | ✓ | Shipped (runtime closes in 6.3) |
| Phase 5 — Drivers | ⚠ partial | Galaxy / Modbus / S7 / OpcUaClient shipped; AB CIP / AB Legacy / TwinCAT / FOCAS deferred (task #120) |
| Phase 6.1 — Resilience & Observability | ✓ | **SHIPPED** (PRs #7883) |
| Phase 6.2 — Authorization runtime | ◐ core | **SHIPPED (core)** (PRs #8488); dispatch wiring + Admin UI deferred |
| Phase 6.3 — Redundancy runtime | ◐ core | **SHIPPED (core)** (PRs #8990); coordinator + UA-node wiring + Admin UI + interop deferred |
| Phase 6.4 — Admin UI completion | ◐ data layer | **SHIPPED (data layer)** (PRs #9192); Blazor UI + OPC 40010 address-space wiring deferred |
| Phase 5 — Drivers | ✓ | **Shipped** Galaxy, Modbus (+ DL205/S7/MELSEC profiles), S7 native, OPC UA Client, AB CIP, AB Legacy, TwinCAT ADS, FOCAS (managed wire client) |
| Phase 6.1 — Resilience & Observability | ✓ | Shipped (PRs #7883) |
| Phase 6.2 — Authorization runtime | ◐ core | Core shipped (PRs #8488, #94 dispatch wiring); finer-grained Browse/Subscribe/Alarm/Call gating + 3-user interop matrix deferred |
| Phase 6.3 — Redundancy runtime | ◐ core | Core shipped (PRs #8990, #9899); peer-probe HostedServices, OPC UA variable-node binding, `sp_PublishGeneration` lease wrap, client interop matrix deferred |
| Phase 6.4 — Admin UI completion | ◐ data layer + Identification | Data layer + OPC 40010 Identification folder shipped (PRs #9192, Identification audit close-out 2026-04-23); Blazor UI pieces deferred |
**Aggregate test counts:** 906 baseline (pre-Phase-6) → **1159 passing** across Phase 6. One pre-existing Client.CLI `SubscribeCommandTests.Execute_PrintsSubscriptionMessage` flake tracked separately.
**Driver integration-test counts** (end-to-end against live or simulated targets): Modbus 26, FOCAS 9, AbCip 7, OpcUaClient 3, S7 3, AbLegacy 2, TwinCAT 2. Plus Galaxy's separate cross-FX parity/stability suite.
**Aggregate test counts** (2026-04-19 baseline): 1159 passing across the solution. One pre-existing Client.CLI `SubscribeCommandTests.Execute_PrintsSubscriptionMessage` flake tracked separately. Rerun `dotnet test ZB.MOM.WW.OtOpcUa.slnx` after the FOCAS migration commits land to refresh the number.
## Release blockers (must close before v2 GA)
Ordered by severity + impact on production fitness.
All code-path release blockers are closed. The remaining items are live-hardware / manual validations listed under exit criteria.
### ~~Security — Phase 6.2 dispatch wiring~~ (task #143**CLOSED** 2026-04-19, PR #94)
**Closed**. `AuthorizationGate` + `NodeScopeResolver` now thread through `OpcUaApplicationHost → OtOpcUaServer → DriverNodeManager`. `OnReadValue` + `OnWriteValue` + all four HistoryRead paths call `gate.IsAllowed(identity, operation, scope)` before the invoker. Production deployments activate enforcement by constructing `OpcUaApplicationHost` with an `AuthorizationGate(StrictMode: true)` + populating the `NodeAcl` table.
**Closed**. `AuthorizationGate` + `NodeScopeResolver` thread through `OpcUaApplicationHost → OtOpcUaServer → DriverNodeManager`. `OnReadValue` + `OnWriteValue` + all four HistoryRead paths call `gate.IsAllowed(identity, operation, scope)` before the invoker. Production deployments activate enforcement by constructing `OpcUaApplicationHost` with an `AuthorizationGate(StrictMode: true)` + populating the `NodeAcl` table.
Additional Stream C surfaces (not release-blocking, hardening only):
Remaining Stream C surfaces (hardening, not release-blocking):
- Browse + TranslateBrowsePathsToNodeIds gating with ancestor-visibility logic per `acl-design.md` §Browse.
- CreateMonitoredItems + TransferSubscriptions gating with per-item `(AuthGenerationId, MembershipVersion)` stamp so revoked grants surface `BadUserAccessDenied` within one publish cycle (decision #153).
- Alarm Acknowledge / Confirm / Shelve gating.
- Call (method invocation) gating.
- Finer-grained scope resolution — current `NodeScopeResolver` returns a flat cluster-level scope. Joining against the live Configuration DB to populate UnsArea / UnsLine / Equipment path is tracked as Stream C.12.
- ~~Browse + TranslateBrowsePathsToNodeIds gating with ancestor-visibility logic per `acl-design.md` §Browse.~~ **Partial, 2026-04-24.** `DriverNodeManager.Browse` override post-filters the `ReferenceDescription` list via a new `FilterBrowseReferences` helper — denied nodes disappear silently per OPC UA convention. Ancestor-visibility implication (Read-grant at `Line/Tag` implying Browse on `Line`) still to ship; needs a subtree-has-any-grant query on the trie evaluator. `TranslateBrowsePathsToNodeIds` surface not yet wired.
- ~~CreateMonitoredItems + TransferSubscriptions gating with per-item `(AuthGenerationId, MembershipVersion)` stamp so revoked grants surface `BadUserAccessDenied` within one publish cycle (decision #153).~~ **Partial, 2026-04-24.** `DriverNodeManager.CreateMonitoredItems` override pre-gates each request and pre-populates `BadUserAccessDenied` into the errors slot for denied items (the base stack honours pre-set errors and skips those items). Decision #153's per-item `(AuthGenerationId, MembershipVersion)` stamp for detecting mid-subscription revocation is still to ship — needs subscription-layer plumbing. TransferSubscriptions not yet wired (same pattern).
- ~~Alarm Acknowledge / Confirm / Shelve gating.~~ **Partial, 2026-04-24.** Acknowledge + Confirm map to dedicated `OpcUaOperation.AlarmAcknowledge` / `AlarmConfirm` via `MapCallOperation`; Shelve falls through to generic `OpcUaOperation.Call` (needs per-instance method NodeId resolution to distinguish — follow-up).
- ~~Call (method invocation) gating.~~ **Closed 2026-04-24.** `DriverNodeManager.Call` override pre-gates each `CallMethodRequest` via `GateCallMethodRequests`. Denied calls return `BadUserAccessDenied` without running the method. Alarm methods map to alarm-specific operation kinds; everything else gates as generic `Call`.
- ~~Finer-grained scope resolution — current `NodeScopeResolver` returns a flat cluster-level scope. Joining against the live Configuration DB to populate UnsArea / UnsLine / Equipment path is tracked as Stream C.12.~~ **Closed 2026-04-24.** `AuthorizationBootstrap` now loads `NodeAcl` rows for the current generation into a `PermissionTrieCache`, builds the gate, and merges every registered driver's `EquipmentNamespaceContent` into a full-path `NodeScopeResolver` index. `OpcUaServerService` calls the bootstrap after the equipment registry is populated, before `OpcUaApplicationHost.StartAsync`. Disabled by default — operators flip `Node:Authorization:Enabled=true` to enforce, `StrictMode=true` to reject anonymous/no-groups identities.
- 3-user integration matrix covering every operation × allow/deny.
These are additional hardening — the three highest-value surfaces (Read / Write / HistoryRead) are now gated, which covers the base-security gap for v2 GA.
### ~~Config fallback — Phase 6.1 Stream D wiring~~ (task #136**CLOSED** 2026-04-19, PR #96)
**Closed**. `SealedBootstrap` consumes `ResilientConfigReader` + `GenerationSealedCache` + `StaleConfigFlag` end-to-end: bootstrap calls go through the timeout → retry → fallback-to-sealed pipeline; every central-DB success writes a fresh sealed snapshot so the next cache-miss has a known-good fallback; `StaleConfigFlag.IsStale` is now consumed by `HealthEndpointsHost.usingStaleConfig` so `/healthz` body reports reality.
**Closed**. `SealedBootstrap` consumes `ResilientConfigReader` + `GenerationSealedCache` + `StaleConfigFlag` end-to-end; `/healthz` surfaces the stale flag.
Production activation: Program.cs switches `NodeBootstrap → SealedBootstrap` + constructs `OpcUaApplicationHost` with the `StaleConfigFlag` as an optional ctor parameter.
Remaining follow-ups (hardening, not release-blocking):
Remaining follow-ups (hardening):
- A `HostedService` that polls `sp_GetCurrentGenerationForCluster` periodically so peer-published generations land in this node's cache without a restart.
- Richer snapshot payload via `sp_GetGenerationContent` so fallback can serve the full generation content (DriverInstance enumeration, ACL rows, etc.) from the sealed cache alone.
- Richer snapshot payload via `sp_GetGenerationContent` so fallback can serve full generation content (DriverInstance enumeration, ACL rows, etc.) from the sealed cache alone.
### ~~Redundancy — Phase 6.3 Streams A/C core~~ (tasks #145 + #147**CLOSED** 2026-04-19, PRs #9899)
**Closed**. The runtime orchestration layer now exists end-to-end:
- `RedundancyCoordinator` reads `ClusterNode` + peer list at startup (Stream A shipped in PR #98). Invariants enforced: 1-2 nodes (decision #83), unique ApplicationUri (#86), ≤1 Primary in Warm/Hot (#84). Startup fails fast on violation; runtime refresh logs + flips `IsTopologyValid=false` so the calculator falls to band 2 without tearing down.
- `RedundancyStatePublisher` orchestrates topology + apply lease + recovery state + peer reachability through `ServiceLevelCalculator` + emits `OnStateChanged` / `OnServerUriArrayChanged` edge-triggered events (Stream C core shipped in PR #99). The OPC UA `ServiceLevel` Byte variable + `ServerUriArray` String[] variable subscribe to these events.
**Closed**. `RedundancyCoordinator` + `RedundancyStatePublisher` + `PeerReachabilityTracker` orchestrate topology + apply lease + recovery state + peer reachability through `ServiceLevelCalculator` + emit `OnStateChanged` / `OnServerUriArrayChanged` edge-triggered events.
Remaining Phase 6.3 surfaces (hardening, not release-blocking):
- `PeerHttpProbeLoop` + `PeerUaProbeLoop` HostedServices that poll the peer + write to `PeerReachabilityTracker` on each tick. Without these the publisher sees `PeerReachability.Unknown` for every peer → Isolated-Primary band (230) even when the peer is up. Safe default (retains authority) but not the full non-transparent-redundancy UX.
- OPC UA variable-node wiring layer: bind the `ServiceLevel` Byte node + `ServerUriArray` String[] node to the publisher's events via `BaseDataVariable.OnReadValue` / direct value push. Scoped follow-up on the Opc.Ua.Server stack integration.
- `sp_PublishGeneration` wraps its apply in `await using var lease = coordinator.BeginApplyLease(...)` so the `PrimaryMidApply` band (200) fires during actual publishes (task #148 part 2).
- Client interop matrix validation — Ignition / Kepware / Aveva OI Gateway (Stream F, task #150). Manual + doc-only work; doesn't block code ship.
- ~~`PeerHttpProbeLoop` + `PeerUaProbeLoop` HostedServices populating `PeerReachabilityTracker` on each tick.~~ **Closed 2026-04-24.** Two-layer probe model shipped: HTTP probe at 2 s / 1 s timeout against `/healthz`; OPC UA probe at 10 s / 5 s timeout via `DiscoveryClient.GetEndpoints`, short-circuiting when HTTP reports the peer unhealthy. Registered on the Server as `AddHostedService<PeerHttpProbeLoop>` + `AddHostedService<PeerUaProbeLoop>`. Publisher now sees accurate `PeerReachability` per peer instead of degrading to `Unknown` → Isolated-Primary band (230).
- OPC UA variable-node wiring: bind `ServiceLevel` Byte + `ServerUriArray` String[] to the publisher's events via `BaseDataVariable.OnReadValue` / direct value push.
- ~~`sp_PublishGeneration` wraps its apply in `await using var lease = coordinator.BeginApplyLease(...)` so the `PrimaryMidApply` band (200) fires during actual publishes (task #148 part 2).~~ **Closed 2026-04-24.** The apply loop now lives in `GenerationRefreshHostedService` — polls `sp_GetCurrentGenerationForCluster` every 5s, opens a lease when a new generation is detected, calls `RedundancyCoordinator.RefreshAsync` inside the `await using`, releases the lease on all exit paths. Replaces the previous "topology never refreshes without a process restart" behaviour.
- Client interop matrix — Ignition / Kepware / Aveva OI Gateway (Stream F, task #150). Manual + doc-only.
### Remaining drivers (task #120)
### ~~Phase 5 driver complement~~ (task #120**CLOSED** 2026-04-24)
AB CIP, AB Legacy, TwinCAT ADS, FOCAS drivers are planned but unshipped. Decision pending on whether these are release-blocking for v2 GA or can slip to a v2.1 follow-up.
**Closed**. All four deferred drivers shipped:
- **AB CIP** (PRs #202222) — `Driver.AbCip`, `Driver.AbCip.IntegrationTests` (7 tests), AB CIP Cli. Live-boot verified against a ControlLogix rig.
- **AB Legacy** (PRs #202, #223) — `Driver.AbLegacy`, `Driver.AbLegacy.IntegrationTests` (2 tests), AB Legacy Cli. PCCC cip-path workaround for SLC/MicroLogix.
- **TwinCAT ADS** (PRs #205, this branch `task-galaxy-e2e`) — `Driver.TwinCAT`, `Driver.TwinCAT.IntegrationTests` (2 tests), TwinCAT Cli. TCBSD/ESXi fixture for e2e since local Hyper-V / TwinCAT RTIME are mutually exclusive on the dev box.
- **FOCAS** (PRs #173, #199 + this session's migration) — `Driver.FOCAS` with an **in-process managed `FocasWireClient`** that speaks FOCAS/2 over TCP directly. Tier-C isolation retired — `Driver.FOCAS.Host` + `Driver.FOCAS.Shared` + `FwlibNative` P/Invoke + shim DLL + NSSM service all deleted. `Driver.FOCAS.IntegrationTests` covers 9 scenarios (fixed tree identity/axes/program/timers/spindle + user-authored PARAM/MACRO/PMC reads, Browse, Subscribe, IAlarmSource raise/clear, Probe transitions).
Decision recorded: FOCAS is **read-only** against the CNC by design — writes return `BadNotWritable`. See `docs/drivers/FOCAS.md` + `docs/drivers/FOCAS-Test-Fixture.md` for the deployment + coverage map.
## Nice-to-haves (not release-blocking)
- **Admin UI** — Phase 6.1 Stream E.2/E.3 (`/hosts` column refresh), Phase 6.2 Stream D (`RoleGrantsTab` + `AclsTab` Probe), Phase 6.3 Stream E (`RedundancyTab`), Phase 6.4 Streams A/B UI pieces, Stream C DiffViewer, Stream D `IdentificationFields.razor`. Tasks #134, #144, #149, #153, #155, #156, #157.
- **Background services** — Phase 6.1 Stream B.4 `ScheduledRecycleScheduler` HostedService (task #137), Phase 6.1 Stream A analyzer (task #135 — Roslyn analyzer asserting every capability surface routes through `CapabilityInvoker`).
- **Multi-host dispatch** — Phase 6.1 Stream A follow-up (task #135). Currently every driver gets a single pipeline keyed on `driver.DriverInstanceId`; multi-host drivers (Modbus with N PLCs) need per-PLC host resolution so failing PLCs trip per-PLC breakers without poisoning siblings. Decision #144 requires this but we haven't wired it yet.
- **Multi-host dispatch** — Phase 6.1 Stream A follow-up (task #135). Every driver currently gets a single pipeline keyed on `driver.DriverInstanceId`; multi-host drivers (Modbus with N PLCs) need per-PLC host resolution so failing PLCs trip per-PLC breakers without poisoning siblings. Decision #144 requires this but not wired.
- **Phase 7** — scripting + alarming + historian sink (plan drafted 2026-04-20 in `docs/v2/implementation/phase-7-*.md`). Out of scope for v2 GA.
## Live-hardware validations (task #54 + task family)
The code ships; these tasks remain open as lab/field verification:
- **#54** — FOCAS live-CNC wire-level smoke against a real FANUC control. The mock's wire responder is PDU-verified against `fwlibe64.dll` upstream but OtOpcUa's managed client has not been pointed at a production CNC.
- **AB CIP live-boot** — already passed on a ControlLogix rig (PR #222). Continue to run ahead of each release.
- **TwinCAT wire-live** — TCBSD/ESXi fixture covers the common path; production PLC verification remains lab-gated.
## Running the release-readiness check
@@ -82,7 +93,12 @@ AB CIP, AB Legacy, TwinCAT ADS, FOCAS drivers are planned but unshipped. Decisio
pwsh ./scripts/compliance/phase-6-all.ps1
```
This meta-runner invokes each `phase-6-N-compliance.ps1` script in sequence and reports an aggregate PASS/FAIL. It is the single-command verification that what we claim is shipped still compiles + tests pass + the plan-level invariants are still satisfied.
This meta-runner invokes each `phase-6-N-compliance.ps1` script in sequence and reports an aggregate PASS/FAIL:
- `phase-6-1-compliance.ps1` — Resilience & Observability
- `phase-6-2-compliance.ps1` — Authorization runtime
- `phase-6-3-compliance.ps1` — Redundancy runtime
- `phase-6-4-compliance.ps1` — Admin UI completion
Exit 0 = every phase passes its compliance checks + no test-count regression.
@@ -92,18 +108,23 @@ v2 GA requires all of the following:
- [ ] All four Phase 6.N compliance scripts exit 0.
- [ ] `dotnet test ZB.MOM.WW.OtOpcUa.slnx` passes with ≤ 1 known-flake failure.
- [ ] Release blockers listed above all closed (or consciously deferred to v2.1 with a written decision).
- [x] Release blockers listed above all closed.
- [x] Phase 5 driver complement shipped (Galaxy, Modbus, S7, OpcUaClient, AbCip, AbLegacy, TwinCAT, FOCAS).
- [ ] Production deployment checklist (separate doc) signed off by Fleet Admin.
- [ ] At least one end-to-end integration run against the live Galaxy on the dev box succeeds.
- [ ] FOCAS live-CNC wire-level smoke (#54) runs clean against a real FANUC control.
- [ ] OPC UA conformance test (CTT or UA Compliance Test Tool) passes against the live endpoint.
- [ ] Non-transparent redundancy cutover validated with at least one production client (Ignition 8.3 recommended — see decision #85).
## Change log
- **2026-04-19**Release blocker #3 **closed** (PRs #9899). Phase 6.3 Streams A + C core shipped: `ClusterTopologyLoader` + `RedundancyCoordinator` + `RedundancyStatePublisher` + `PeerReachabilityTracker`. Code-path release blockers all closed; remaining Phase 6.3 surfaces (peer-probe HostedServices, OPC UA variable-node binding, sp_PublishGeneration lease wrap, client interop matrix) are hardening follow-ups.
- **2026-04-19**Release blocker #2 **closed** (PR #96). `SealedBootstrap` consumes `ResilientConfigReader` + `GenerationSealedCache` + `StaleConfigFlag`; `/healthz` now surfaces the stale flag. Remaining follow-ups (periodic poller + richer snapshot payload) downgraded to hardening.
- **2026-04-19**Release blocker #1 **closed** (PR #94). `AuthorizationGate` wired into `DriverNodeManager` Read / Write / HistoryRead dispatch. Remaining Stream C surfaces (Browse / Subscribe / Alarm / Call + finer-grained scope resolution) downgraded to hardening follow-ups — no longer release-blocking.
- **2026-04-19**Phase 6.4 data layer merged (PRs #9192). Phase 6 core complete. Capstone doc created.
- **2026-04-24**Phase 5 driver complement closed (task #120 CLOSED). AB CIP, AB Legacy, TwinCAT, FOCAS all shipped. FOCAS migration: retired the Tier-C split (`Driver.FOCAS.Host` + `Driver.FOCAS.Shared` + `FwlibNative` + shim DLL deleted) in favour of a pure-managed in-process `FocasWireClient` inlined into `Driver.FOCAS`; driver is now read-only against the CNC by design. Integration test matrix grew to cover Browse / Subscribe / IAlarmSource / Probe end-to-end.
- **2026-04-23**Phase 6.4 audit close-out. IdentificationFolderBuilder + OPC 40010 Identification folder verified against the shipped code.
- **2026-04-20**Phase 7 plan drafted (`phase-7-scripting-and-alarming.md`, `phase-7-e2e-smoke.md`). Out of scope for v2 GA.
- **2026-04-19**Release blocker #3 closed (PRs #9899). Phase 6.3 Streams A + C core shipped: `ClusterTopologyLoader` + `RedundancyCoordinator` + `RedundancyStatePublisher` + `PeerReachabilityTracker`. Code-path release blockers all closed; remaining Phase 6.3 surfaces (peer-probe HostedServices, OPC UA variable-node binding, `sp_PublishGeneration` lease wrap, client interop matrix) are hardening follow-ups.
- **2026-04-19** — Release blocker #2 closed (PR #96). `SealedBootstrap` consumes `ResilientConfigReader` + `GenerationSealedCache` + `StaleConfigFlag`; `/healthz` surfaces the stale flag. Remaining follow-ups (periodic poller + richer snapshot payload) downgraded to hardening.
- **2026-04-19** — Release blocker #1 closed (PR #94). `AuthorizationGate` wired into `DriverNodeManager` Read / Write / HistoryRead dispatch. Remaining Stream C surfaces (Browse / Subscribe / Alarm / Call + finer-grained scope resolution) downgraded to hardening follow-ups — no longer release-blocking.
- **2026-04-19** — Phase 6.4 data layer merged (PRs #9192). Phase 6 core complete.
- **2026-04-19** — Phase 6.3 core merged (PRs #8990). `ServiceLevelCalculator` + `RecoveryStateManager` + `ApplyLeaseRegistry` land as pure logic; coordinator / UA-node wiring / Admin UI / interop deferred.
- **2026-04-19** — Phase 6.2 core merged (PRs #8488). `AuthorizationGate` + `TriePermissionEvaluator` + `LdapGroupRoleMapping` land; dispatch wiring + Admin UI deferred.
- **2026-04-19** — Phase 6.1 shipped (PRs #7883). Polly resilience + Tier A/B/C stability + health endpoints + LiteDB generation-sealed cache + Admin `/hosts` data layer all live.
+31
View File
@@ -0,0 +1,31 @@
# TwinCAT driver — v3 backlog
The v2 TwinCAT driver is considered solid: 28 integration tests (14 `[TwinCATFact]` +
16-case `[TwinCATTheory]`) running live against the TCBSD fixture, 110 unit tests,
three latent driver bugs shaken out (notification cycle units, `STRING(N)` mapper,
bit-indexed BOOL path). Further work is deferred.
Archived from `docs/drivers/TwinCAT-Test-Fixture.md` § Follow-up candidates.
## Deferred items
1. **TC2 coverage** — spin up a TC2 runtime (Windows CE IPC or legacy XAR)
and run the same suite; any delta surfaces. Blocked on hardware.
2. **Notification coalescing under load** — run the subscribe test while
the PLC cycle is saturated (bump `lineSim` complexity, watch for
dropped notifications). Doable on current rig; deferred as lower
priority than v3 feature work.
3. **Multi-hop AMS route** — add a test behind an IPC gateway with a
chained route entry. Blocked on hardware (gateway IPC).
4. **License-rotation automation** — XAR's 7-day trial expires on
schedule. Either automate `TcActivate.exe /reactivate` via a scheduled
task on the VM (not officially supported; reportedly works for some
TC3 builds), or buy a paid runtime license (~$1k one-time per runtime
per CPU) to kill the rotation. Ops item, not code.
5. **Lab rig** — cheapest IPC (CX7000 / CX9020) on a dedicated network;
the only route that covers TC2 + real EtherCAT I/O timing + cycle
jitter under CPU load. Blocked on hardware + budget.
+274
View File
@@ -0,0 +1,274 @@
# Galaxy / LMX Backend — Restructuring Options
## Context
Today the Galaxy driver is structured very differently from every other driver
in this repo:
- **Galaxy.Proxy** (.NET 10, in-process): tiny shim that frames IPC to the host.
- **Galaxy.Host** (.NET Framework 4.8 **x86**, NSSM-wrapped Windows service):
owns MXAccess COM, the STA pump, the ZB Galaxy Repository SQL queries, the
Wonderware Historian SDK plugin, the per-platform `ScanState` probe manager,
the alarm tracker (`.InAlarm`/`.Priority`/`.DescAttrName`/`.Acked` state
machine + ack writer), recycle policy, and post-mortem MMF.
Other drivers (Modbus, S7, AB CIP, OpcUaClient, TwinCAT, FOCAS Tier-C) are
**in-process Tier-A drivers** in the .NET 10 server. They do data + browse
only; historian and alarming are driver-agnostic concerns at the server layer.
A sibling project, **mxaccessgw**
(`C:\Users\dohertj2\Desktop\mxaccessgw`), already provides:
- A .NET 10 x64 gRPC gateway in front of per-session .NET 4.8 x86 worker
processes that own MXAccess COM, the STA, and event sinks
(`MxGateway.Server` + `MxGateway.Worker`).
- A full MXAccess command + event surface (`Register`, `AddItem`, `Advise`,
`Write`, `WriteSecured`, `OnDataChange`, `OnWriteComplete`, etc.).
- A cached, deploy-gated, paged **Galaxy Repository browse** RPC
(`galaxy_repository.v1`) reading the same ZB tables we read today, with the
query bodies kept byte-identical to OtOpcUa.
- A .NET client library (`clients/dotnet/MxGateway.Client`).
- API-key auth, Blazor dashboard, structured logs, metrics, watchdog/recycle.
The proposal is to **strip Galaxy down to data + browse** — push historian and
alarming out to server-level subsystems where they live for every other driver
— and pick how the slimmed-down driver talks to MXAccess.
---
## What "push historian and alarming out" means
Both options below assume the same scope reduction; they only differ in how
the driver reaches MXAccess.
| Concern | Today (Galaxy.Host) | After |
|---|---|---|
| Galaxy hierarchy browse | `GalaxyRepository` (SQL) inside Host | Driver (Option 1: via gw browse RPC; Option 2: own SQL or worker) |
| Live read / write / subscribe | `MxAccessClient` + STA pump in Host | gw (Option 1) or embedded worker (Option 2) |
| Wonderware Historian SDK | `HistorianDataSource` in Host (x86) | Separate Historian data source plugged into the server's HA service. Likely stays its own .NET 4.8 x86 sidecar because the SDK is x86-only; **independent of the Galaxy driver lifecycle**. |
| Alarm state machine (`.InAlarm`/`.Acked` quartet, transitions, ack writer) | `GalaxyAlarmTracker` in Host | Server-level A&E subsystem subscribes to alarm-bearing attributes the driver advertises and runs the AlarmCondition state machine generically. Driver only flags `IsAlarm=true` in node metadata. |
| `ScanState` per-platform probes | `GalaxyRuntimeProbeManager` in Host | Driver-side: ScanState is just another tag subscription; the driver re-advises one per discovered `$WinPlatform`/`$AppEngine` and reports `HostConnectivityStatus` from the value stream. No special host-side machinery. |
After the strip-down, the Galaxy driver looks like Modbus or OpcUaClient: it
discovers nodes, reads/writes/subscribes, and reports per-host transport
health. Everything else is the server's problem.
---
## Option 1 — Tier-A driver against the MxAccess Gateway
`Driver.Galaxy` becomes a regular **in-process .NET 10 driver** in the OtOpcUa
server (no `.Host`, no `.Proxy` split, no x86). It talks to a separately
deployed `MxGateway.Server` over gRPC using `MxGateway.Client`. Browse comes
from `galaxy_repository.v1.DiscoverHierarchy`. Live data comes from
`MxAccessGateway.OpenSession`/`AddItem`/`Advise`/`StreamEvents`.
```
OtOpcUa.Server (.NET 10 x64)
└── Driver.Galaxy (in-proc, .NET 10)
└── gRPC ──► MxGateway.Server (.NET 10 x64)
└── pipe ──► MxGateway.Worker (.NET 4.8 x86)
└── MXAccess COM (STA)
```
### Pros
- **Architectural parity with other drivers.** No bespoke `Host` service, no
x86 build target, no NSSM wrapper, no STA pump in this repo, no
`PostMortemMmf`/`RecyclePolicy` we maintain ourselves.
- **OtOpcUa server stops needing AVEVA installed on its own host.** The
gateway runs where MXAccess lives; the OPC UA server can live on a different
box, in a container, or on a hardened jump host.
- **One canonical MXAccess surface across the org.** Any future tool — a
diagnostic CLI, a Historian replacement, an integration harness — talks to
the same gw with the same parity guarantees we get.
- **Multi-instance friendly.** Two OtOpcUa servers (warm/hot redundancy) share
one gw and one MXAccess footprint instead of each running their own
`Galaxy.Host` with duplicate Wonderware client identities.
- **Browse + cache for free.** `galaxy_repository.v1` already implements the
hierarchy cache, deploy-time gating, paging, and `WatchDeployEvents` — we
delete `GalaxyRepository.cs`, `GalaxyHierarchyRow.cs`, the change-detection
poll loop, and the matching SQL plumbing.
- **Operability for free.** API-key auth, Blazor dashboard at `/dashboard`,
metrics via `Meter`, structured logs with redaction. We currently have
none of that in `Galaxy.Host`.
- **Future backend swap.** When AVEVA exposes managed NMX or another modern
path, gw routes to it without OtOpcUa changes (gw's stated roadmap).
- **Tighter blast radius.** A hung COM event, a leaking COM object, a
crashing worker — all owned by gw's session/worker isolation, not the
OPC UA server process.
- **Simpler version story for OtOpcUa.** Driver is plain .NET 10; the
bitness/runtime split lives entirely in mxaccessgw's repo.
### Cons
- **Extra deployment dependency.** mxaccessgw is now a service that has to be
installed, monitored, and kept on a compatible protocol version. For a
single-box install this is one more moving piece.
- **Two hops on every call** (driver→gw, gw→worker) instead of one
(proxy→host). Today's hop is MessagePack over a named pipe; the new outer
hop is gRPC over TCP. Per-call overhead is a few hundred microseconds, not
a regression for OPC UA workloads but measurable for very chatty bursts.
- **Auth/secret surface added.** OtOpcUa now holds an API key for gw and
rotates it; gw's SQLite-backed key store has to be managed.
- **Failure model spans two processes we don't own** — gw + worker. Reconnect
logic in our driver has to ride both: gw transport drop, gw session lease
expiry, gw-detected worker crash, plus the worker's own MXAccess reconnect.
All of it is exposed in the gRPC contract, but it's still surface area.
- **Cross-repo protocol coupling.** Bumping `mxaccessgw` major version (gRPC
contract changes, session shape changes) ripples into OtOpcUa releases.
Mitigated by versioned contracts; not free.
- **Galaxy redundancy still has to think about gw.** A redundancy fail-over of
OtOpcUa is independent of the gw's session lifecycle. Need to decide whether
the standby holds an open session or only opens it on takeover.
- **Sensitive writes (`WriteSecured`, `AuthenticateUser`) cross the network**
if gw is remote. TLS + mTLS solves it but adds setup.
---
## Option 2 — Embed mxaccessgw worker, no gateway
`Driver.Galaxy` is still in-process .NET 10, but instead of speaking gRPC to a
gateway service, it directly **launches and supervises one (or more)
`MxGateway.Worker` processes** and talks to them over the same named-pipe
worker protocol gw uses internally
(`docs/WorkerFrameProtocol.md`, `docs/WorkerProcessLauncher.md`). Browse stays
local — driver runs the SQL queries against ZB itself.
```
OtOpcUa.Server (.NET 10 x64)
└── Driver.Galaxy (in-proc, .NET 10)
├── ZB SQL (local, in-proc)
└── pipe ──► MxGateway.Worker (.NET 4.8 x86, child process)
└── MXAccess COM (STA)
```
### Pros
- **One hop, not two.** Driver → worker pipe is the same shape as today's
Proxy → Host pipe. Latency is on par with the current implementation.
- **No new service to deploy.** Worker is launched as a child process the
same way `Galaxy.Host` is launched today (just with mxaccessgw's worker
binary). Single-machine install story stays simple.
- **Keeps the trust boundary local.** No API keys, no TLS, no exposed gRPC
port on the OtOpcUa box.
- **Reuses mxaccessgw's parity-tested worker code** — STA pump, COM lifetime,
event conversion, fault model — without inheriting gw's ASP.NET Core /
Blazor / SQLite footprint.
- **Tighter ownership.** OtOpcUa owns the worker lifecycle; recycle, kill,
restart, post-mortem all decided by the driver, not by an external service
we don't control.
- **Easier to reason about during integration tests.** No second service to
spin up in CI; just a child process per test fixture.
### Cons
- **OtOpcUa server box must still have AVEVA + MXAccess installed**, since
the worker runs locally. The major deployment win of Option 1
(separating where MXAccess runs from where OtOpcUa runs) is lost.
- **OtOpcUa still ships an x86 .NET 4.8 binary alongside it.** Even if we
vendor mxaccessgw's worker rather than write our own, installer complexity
and bitness considerations remain.
- **We re-implement everything gw already gives.** Process supervision,
watchdog, recycle policy, heartbeat, post-mortem — these are exactly what
`Galaxy.Host` does today, and they'd live in our repo again, just calling a
different worker binary.
- **No browse cache, no deploy gating, no `WatchDeployEvents`** — we keep
running our own ZB queries and our own `time_of_last_deploy` poll, or we
port gw's cache code into the driver. Either way it's duplicated logic.
- **No auth, no dashboard, no metrics.** Operability stays where it is today
(i.e., minimal). Adding it ourselves is a separate project.
- **Multiple OtOpcUa instances multiply MXAccess sessions.** Redundancy pair
→ two MXAccess clients on the Galaxy from the same software, vs. Option 1
where one gw arbitrates.
- **Worker protocol coupling without the contract surface.** We depend on
mxaccessgw's worker IPC frame format — a surface that mxaccessgw treats as
*internal* to its own gw↔worker boundary. If they refactor it, we have to
follow. The public gRPC contract (Option 1) is more stable by design.
- **Loses the "common MXAccess access point" benefit.** Other consumers
(CLI, integration harnesses, future tools) can't share state with our
embedded worker.
---
## Status quo (for comparison)
Keep `Galaxy.Host` as today, and in-place rip out historian + alarming +
probe manager. End state: the Host shrinks to `MxAccessClient` + `GalaxyRepository`,
which is roughly what Option 2 ends up looking like — but with our hand-rolled
COM bridge instead of mxaccessgw's worker. Not a serious option once
mxaccessgw exists; we'd be maintaining a parallel implementation of the same
thing.
---
## Recommendation (effort-agnostic)
**Go with Option 1 — Tier-A driver against the MxAccess Gateway.**
The decisive arguments:
1. **It's the only option that aligns Galaxy with how every other driver in
this repo is structured.** The user's stated goal — "keep lmx to data +
browsing, similar to other drivers" — only fully resolves if there is no
`.Host` and no x86 build artifact in this repo at all. Option 2 still has
an x86 child process and supervisor code; it's `Galaxy.Host` with a
different worker binary inside.
2. **It separates *where MXAccess runs* from *where OtOpcUa runs*.** That is
a strategically larger win than a few hundred microseconds of per-call
latency. The OPC UA server stops being chained to AVEVA install footprint,
bitness, and Wonderware client identity — which removes a class of
deployment, redundancy, and CI problems we hit today (e.g., the
`DESKTOP-6JL3KKO` Hyper-V/Docker conflict, the `dohertj2`-only pipe ACL,
the live-Galaxy smoke test prerequisites).
3. **It collapses scope.** A non-trivial fraction of `Galaxy.Host` (browse
cache, deploy-event watch, worker supervision, COM bridge, post-mortem,
recycle, ACL hardening) is reproduced *better* in mxaccessgw. Option 1
deletes our copy. Option 2 keeps it.
4. **It positions historian and alarming for the right home.** Once the
Galaxy driver is "just another driver", historian becomes a server-level
data source (one that can also feed Modbus/S7 history if we ever want it),
and alarming becomes a server-level A&E subsystem. Option 2 nominally
allows the same move, but the temptation to keep them in `Galaxy.Host`
"while we're already there" is real.
5. **It future-proofs against AVEVA's roadmap.** Managed NMX, ASB, or any
replacement that shows up over the next few years gets adopted in
mxaccessgw without a release in this repo.
The case for Option 2 is real but narrow: it's the right call **only** if we
commit to single-box deployments forever, refuse to take a gRPC dependency,
and value local-trust simplicity over the consolidation/operability benefits
gw provides. None of those constraints hold here.
### What flips the recommendation
- If the gw protocol is unstable or perf-tested under our subscription
patterns turns out worse than expected → revisit Option 2.
- If org-policy forbids running an MXAccess gateway as its own service →
Option 2.
- If Galaxy goes from one of several drivers to *the* primary driver and
raw call-rate matters more than architectural fit → revisit.
Otherwise: Option 1.
---
## Out-of-scope follow-ups (don't decide here, but flag them)
- **Where does the Wonderware Historian SDK live?** Likely its own
.NET 4.8 x86 sidecar exposing a small `IHistorianDataSource` over a pipe or
gRPC, plugged into the OPC UA server's HA service alongside any future
historian sources. Independent of which option above is chosen.
- **Alarm subsystem ownership.** Decide whether the server hosts a generic
AlarmCondition state machine driven by driver-advertised alarm metadata, or
whether each driver continues to emit pre-shaped alarm transitions. Galaxy's
4-attr quartet is a strong forcing function for the generic approach.
- **Redundancy + gw sessions.** Standby OtOpcUa holds an open gw session
(warm) vs. opens on takeover (cold). Affects gw worker count and Galaxy
client-identity collisions.
- **Auth between OtOpcUa and gw.** API key in DPAPI-protected secret file vs.
Windows-auth gRPC. Both supported by gw; pick before rollout.
+476
View File
@@ -0,0 +1,476 @@
# Galaxy → MxAccessGateway Migration Plan
Implements **Option 1** from `lmx_backend.md`: replace the bespoke `Galaxy.Host`
+ `Galaxy.Proxy` IPC pair with an **in-process Tier-A** `Driver.Galaxy` running
in the .NET 10 OtOpcUa server, talking to a separately-deployed
`MxGateway.Server` (mxaccessgw repo) over gRPC for live MXAccess work and
Galaxy Repository browse.
## Outcome
After this work:
- `OtOpcUa.Server` is fully .NET 10 x64 — no x86 build artifacts in this repo.
- `Driver.Galaxy.Host` (Windows service, NSSM-wrapped, .NET 4.8 x86) is
retired. `Driver.Galaxy.Proxy` and `Driver.Galaxy.Shared` are deleted.
AVEVA platform is no longer required on the OtOpcUa box.
- A new in-process `Driver.Galaxy` lives next to `Driver.Modbus`,
`Driver.OpcUaClient`, etc. It implements the same `IDriver` capability set
the proxy implements today, but its body calls `MxGateway.Client`
(`MxGatewayClient`, `MxGatewaySession`, `GalaxyRepositoryClient`).
- Wonderware Historian SDK access moves out of the Galaxy driver into a
driver-agnostic historian data source (`Driver.Historian.Wonderware`,
separate sidecar, .NET 4.8 x86). The OPC UA HA service plugs into it the
same way it would plug into any future historian.
- Alarm condition tracking moves out of the driver into the OPC UA server's
generic A&E subsystem. The driver only flags `IsAlarm=true` on attribute
metadata and forwards live `.InAlarm`/`.Acked`/etc value changes; the
server runs the AlarmCondition state machine.
- Per-platform `ScanState` probes degrade to plain attribute subscriptions —
no special probe manager.
---
## Pre-flight: improvements to land in mxaccessgw first
These are **integration-quality changes** in the mxaccessgw repo that make
the OtOpcUa side dramatically simpler / faster / more robust. They aren't
strictly required to start, but ship enough of them before phase 3 that we're
not designing around gaps.
### gw-1. Galaxy attribute metadata parity
**What's there:** `galaxy_repository.v1.DiscoverHierarchy` returns
`GalaxyObject` with name, parent, category, and dynamic attributes.
**What's missing for OtOpcUa:** every field today's `MxAccessGalaxyBackend`
copies into `GalaxyAttributeInfo` — confirm gw's `Attribute` proto carries:
- `mx_data_type` (int)
- `is_array` (bool)
- `array_dimension` (uint, optional)
- `security_classification` (int)
- `is_historized` (bool, from `HistorizedExtension` primitive)
- `is_alarm` (bool, from `AlarmExtension` primitive)
If any are missing, add them to the proto and the server-side query mapper.
Without `IsAlarm` and `IsHistorized` the OPC UA server can't decide which
nodes get HasHistoricalConfiguration / which become AlarmConditions.
### gw-2. Stable, documented event-stream resume semantics
**What's needed:** the OtOpcUa driver must survive a transient gw transport
drop without losing subscription state or duplicating change events. gw's
`StreamEventsAsync(afterWorkerSequence)` already exposes resumption.
Document the per-session retention window (how long does the worker buffer
events the gateway hasn't acked?) and the "events were dropped, you must
re-subscribe" signal. If retention is bounded by count rather than time,
expose the bound in `OpenSessionReply` so the client can size its own buffer.
### gw-3. Reconnectable sessions
Listed under "post-v1 revisit" in `gateway.md`. Without it, every gw or
OtOpcUa restart re-`Register`s, re-`AddItem`s, re-`Advise`s the entire
address space — for a 50k-tag Galaxy that's a non-trivial cold-start. With
reconnectable sessions, the driver presents its `SessionId` after a restart
and the worker keeps its handles.
If full reconnection is too large, ship a **bulk replay** instead: a single
RPC that takes the full subscription set and the worker performs the
register/add/advise inside one round trip. We can drive it from a
client-side cache rather than gw state. See gw-5 below.
### gw-4. Driver-shaped subscribe primitive
`MxGatewaySession` already has `SubscribeBulkAsync` (one RPC: `Register`
implicit + `AddItem` + `Advise` for a list of tag addresses, returning
per-tag `SubscribeResult`). That's exactly what `ISubscribable.SubscribeAsync`
wants. Confirm it returns enough per-tag detail to surface a partial-failure
list to OPC UA monitored items (good handle, status code, error text).
If not already, expose **`SubscribeBulk` with optional update-rate hint**
forwarded to `SetBufferedUpdateInterval` so the OPC UA publishing interval
becomes a single field on the subscribe call rather than a follow-up RPC.
### gw-5. Subscription replay snapshot
Provide an RPC `ReplaySubscriptionsAsync(SessionId, IEnumerable<TagAddress>)`
that re-establishes a list of subscriptions after a session reset and returns
per-tag results. The client stores its tag list locally (the driver already
has it from `Discover`), and the gw worker turns it into one
register/add/advise sequence. This is the minimum surface we need; full
"reattach to a previous session by id" (gw-3) is a richer version of the
same thing.
### gw-6. Transport-health stream
The gw already exposes worker / session health on its dashboard. Add a small
streaming RPC `StreamSessionHealth(SessionId) → stream SessionHealth` so the
OtOpcUa driver can surface "MXAccess transport up/down" to its
`IHostConnectivityProbe` without faking it via probe-tag subscriptions.
Today `MxAccessClient.ConnectionStateChanged` does this in-process; we want
the same signal at the gw boundary.
### gw-7. Optional .NET 10 client polish
- Async-disposable session pattern is already there.
- Add a **typed `MxValue` ⇄ `object` adapter** for the seven Galaxy types
OtOpcUa cares about (Boolean, Int32, Float, Double, String, DateTime,
arrays of the same). Today every consumer writes its own `MxValue.From<T>`
helpers; this shaves boilerplate from the driver.
- Add a **`SubscribeWithCallback`** convenience wrapper that combines
`OpenSession` + `SubscribeBulk` + `StreamEvents` and routes events through
a delegate per tag. Keeps the OPC UA driver from re-implementing the
fan-out / sequencer pattern.
### gw-8. Auth minimums
Document API-key scoping as it applies to OtOpcUa: the server identity needs
`session`, `invoke`, `event`, and `metadata:read` scopes. Provide a CLI to
mint a key bound to those scopes for an OtOpcUa instance.
### gw-9. Performance: bulk paths and value coalescing
- Confirm `SubscribeBulkAsync` is implemented as a single MXAccess
`AddItem`+`Advise` loop on the worker, not N pipe round trips. If not, fix
before we drive 50k-tag Galaxies through it.
- Expose `SetBufferedUpdateInterval` per session so OtOpcUa can request
buffered updates at the OPC UA publishing interval and get one batched
`OnBufferedDataChange` per tick rather than N `OnDataChange` events.
These can all ship in mxaccessgw independently and improve every consumer.
---
## OtOpcUa-side improvements to land in parallel
Some are forced by removing `Galaxy.Host`; others are quality-of-life.
### ot-1. Promote `IHistorianDataSource` to a server-level extension point
Today `IHistorianDataSource` is a Galaxy-internal abstraction in
`Driver.Galaxy.Host`. Lift it to `OtOpcUa.Core.Abstractions` (or a similar
home next to `IDriver`) and let the OPC UA HA service consume **any number
of registered data sources** keyed by node namespace. Drivers don't own
historian access; the server mounts data sources alongside drivers. This is
the prerequisite that lets us move Wonderware Historian out of the Galaxy
driver without losing the feature.
### ot-2. Generic alarm condition state machine in the server
Move the `.InAlarm`/`.Priority`/`.DescAttrName`/`.Acked` quartet handling
out of `GalaxyAlarmTracker` into a server-level alarm subsystem keyed off the
`IsAlarm=true` flag drivers set during discovery. The server subscribes to
the four sub-attributes itself and runs the AlarmCondition state machine.
Driver only:
- declares `IsAlarm=true` in `DriverAttributeInfo`,
- forwards plain attribute value changes (already done by `ISubscribable`).
This is also a precondition for future drivers (Modbus DL205 alarm bits,
S7 alarm DBs) to emit alarms without each writing their own tracker.
### ot-3. Driver capabilities trim
After ot-1 and ot-2, `Driver.Galaxy` no longer needs to implement:
- `IHistoryProvider` (server's HA service handles it via Wonderware
historian data source)
- `IAlarmHistorianWriter` (server's A&E historian, or kept generic — Galaxy
shouldn't own the SQLite path)
- `IAlarmSource` ack route (server-level alarm subsystem writes back via the
driver's `IWritable.WriteAsync`, which the gw already supports)
Keep:
- `IDriver`, `ITagDiscovery`, `IReadable`, `IWritable`, `ISubscribable`,
`IRediscoverable`, `IHostConnectivityProbe`.
### ot-4. Treat `time_of_last_deploy` as `IRediscoverable`'s pump
Replace the Host-side change-detection poll with a managed
`GalaxyRepositoryClient.WatchDeployEventsAsync` consumer in the driver.
Each event raises `OnRediscoveryNeeded` with the new deploy time as the
`scopeHint`. No polling code in this repo.
### ot-5. Connection pool at the server, not the driver
If the redundancy pair runs two OtOpcUa instances against one gw, both
should share a single `GrpcChannel` per process (already gRPC default) but
**different sessions** (one MXAccess client identity per OtOpcUa instance,
not one shared session that fights over Wonderware client state). Encode
the per-instance MXAccess client name in driver config — already partly
there (`OTOPCUA_GALAXY_CLIENT_NAME`); make it explicit in the new driver's
`appsettings.json` shape.
---
## Phased implementation
Each phase is a working, mergeable slice. Keep `Galaxy.Host` running
alongside the new driver until phase 7 — gated by a config switch
`Galaxy:Backend = legacy-host | mxgateway`.
### Phase 0 — pre-flight (mxaccessgw repo)
Ship gw-1, gw-2, gw-4, gw-9 (the parity, performance, and contract bits the
plan immediately depends on). gw-3, gw-5, gw-6, gw-7 can come during or
after phase 5.
**Exit:** local OtOpcUa dev box can `MxGatewayClient.Create` a client, open a
session, `SubscribeBulkAsync` 100 tags, and observe `OnDataChange` events at
the configured update rate.
### Phase 1 — server-level historian extension point (ot-1)
1. Extract `IHistorianDataSource` (and its DTOs `HistorianSample`,
`HistorianAggregateSample`, `HistoricalEvent`) from
`Driver.Galaxy.Host/Backend/Historian/` into
`src/ZB.MOM.WW.OtOpcUa.Core/Abstractions/Historian/`.
2. Extend the OPC UA HA service to look up a registered
`IHistorianDataSource` per namespace and call into it for `HistoryRead`,
`HistoryReadProcessed`, `HistoryReadAtTime`, `HistoryReadEvents`. Drivers
stop implementing `IHistoryProvider` directly; the server proxies.
3. Add a no-op default registration so drivers without history keep working.
**Exit:** all current Galaxy history reads route through an
`IHistorianDataSource` registered by `Driver.Galaxy.Host` (still legacy)
without behavior change. Other drivers untouched.
### Phase 2 — server-level alarm subsystem (ot-2)
1. Add an `IAlarmConditionDeclaration` API on the address-space builder so
discovery can flag a node as alarm-bearing and supply the four
sub-attribute references.
2. Add a hosted `AlarmConditionService` in the server that, on driver
`Discover`, subscribes to the four sub-attributes via the driver's own
`ISubscribable`, runs the state machine, and emits
`IAlarmSource.OnAlarmEvent` itself. Acks route back through the driver's
`IWritable.WriteAsync` to the `.AckMsg` attribute.
3. Add Galaxy-specific defaults (sub-attribute naming) as a small adapter
so the same service can serve future drivers with different conventions.
**Exit:** Galaxy alarms still work end-to-end; the tracker code that runs
inside `Galaxy.Host` is dead but kept for the legacy-host backend path.
### Phase 3 — Wonderware Historian sidecar (`Driver.Historian.Wonderware`)
1. New solution project: `Driver.Historian.Wonderware`, .NET 4.8 x86,
console app + NSSM (mirrors today's Galaxy.Host packaging exactly,
minus Galaxy responsibilities).
2. Hosts the existing `HistorianDataSource`, `HistorianClusterEndpointPicker`,
`HistorianHealthSnapshot` code lifted from `Galaxy.Host/Backend/Historian/`
and exposes them over a small named-pipe protocol (or local gRPC if
.NET 4.8 cost is acceptable; named pipe is simpler).
3. Add `Driver.Historian.Wonderware.Client` — .NET 10 — implementing
`IHistorianDataSource` against the sidecar.
4. Server registers it as a data source for the `Galaxy` namespace.
**Exit:** OPC UA history reads work via the sidecar with the legacy-host
backend still in place. We've decoupled history from MXAccess.
### Phase 4 — new `Driver.Galaxy` against gw
This is the meat. New project: `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/`, .NET 10,
in-process. Capabilities (post ot-3): `IDriver`, `ITagDiscovery`, `IReadable`,
`IWritable`, `ISubscribable`, `IRediscoverable`, `IHostConnectivityProbe`.
Shape:
```
Driver.Galaxy/
GalaxyDriver.cs # IDriver root
Browse/
GalaxyDiscoverer.cs # consumes GalaxyRepositoryClient.DiscoverHierarchyAsync
DataTypeMap.cs # mx_data_type → DriverDataType
SecurityMap.cs # security_classification → SecurityClassification
Runtime/
GalaxyMxSession.cs # owns one MxGatewaySession; Register + map per-driver client name
SubscriptionRegistry.cs # tag → server/item handles; persists to memory only
EventPump.cs # consumes session.StreamEventsAsync, fans out to OnDataChange
ReconnectSupervisor.cs # gw transport drop / session-lost recovery
DeployWatcher.cs # GalaxyRepositoryClient.WatchDeployEventsAsync → OnRediscoveryNeeded
Health/
HostConnectivityForwarder.cs # gw-6 SessionHealth → IHostConnectivityProbe
Config/
GalaxyDriverOptions.cs # endpoint, ApiKey, ClientName, TLS, retry, intervals
GalaxyDriverFactoryExtensions.cs # AddGalaxyDriver(IServiceCollection)
```
Key behaviors:
- **Discovery** calls `GalaxyRepositoryClient.DiscoverHierarchyAsync()`
once at init and on every `WatchDeployEvents` event, then drives the
address space builder. Same node naming as today (parent contained-name
hierarchy + leaf attributes named `tag_name.AttributeName`).
- **Read** uses one-off `AddItem` + `Advise` + read-after-first-callback
is overkill; instead, use **`Register` + per-call `AddItem`/`Read`** if gw
exposes a synchronous read, otherwise short-lived advise. *Action item:*
confirm gw's read story; if absent, request a synchronous `ReadAsync` RPC
on top of MXAccess `Read` (which exists in the COM API).
- **Write** maps `WriteRequest.Value` to `MxValue` via gw-7 helpers and
calls `WriteAsync(serverHandle, itemHandle, value, userId=0)`. Routes
`WriteSecured` (where `SecurityClassification == SecuredWrite/Verified`)
to `WriteSecuredAsync` once exposed on `MxGatewaySession`.
- **Subscribe** calls `SubscribeBulkAsync` once per `ISubscribable.Subscribe`
call. Stores `(tag → itemHandle, sid)` in `SubscriptionRegistry`. The
single `EventPump` consumes one `StreamEventsAsync` per session and fans
out per `sid`.
- **Unsubscribe** calls `UnsubscribeBulkAsync` and drops registry entries.
- **Reconnect** — when the gRPC channel drops or `StreamEvents` returns,
`ReconnectSupervisor` reopens the session and replays subscriptions via
gw-5 `ReplaySubscriptionsAsync`. The driver flags `DriverState.Degraded`
during recovery; the server keeps publishing last-good values with
`Uncertain` quality.
- **Host connectivity** — single synthesized host entry named after
`OTOPCUA_GALAXY_CLIENT_NAME` driven by gw-6 `SessionHealth` updates
(or, until gw-6 lands, by transport drops).
Wire into the server next to other Tier-A drivers in the
`AddDrivers(...)` call site.
**Exit:** flipping `Galaxy:Backend` to `mxgateway` runs the OPC UA server
end-to-end with no `Galaxy.Host` involvement. Live read, live write, live
subscribe pass against the dev Galaxy. Historian + alarms still work via
phases 13.
### Phase 5 — parity test matrix
Reuse the existing live-Galaxy integration tests; run each scenario twice:
once with `Galaxy:Backend=legacy-host`, once with `mxgateway`. Compare:
- discovered hierarchy node count + names + datatypes,
- subscribed publish rates (allow ±10% tolerance vs. legacy),
- write success / status codes for each `SecurityClassification`,
- alarm condition transitions (Active / Acked / Inactive) — already
routed through phase 2's server-level subsystem,
- history reads — phase 3 sidecar, identical results both backends,
- reconnect behavior under gw kill, worker kill, network drop, ZB drop.
Document the matrix; resolve every discrepancy or explicitly accept it.
**Exit:** parity matrix has zero unexplained deltas. Performance budget
agreed: e.g. ≤ 2× per-call latency vs. named-pipe baseline at the 95th
percentile, equal or better throughput in `SubscribeBulk` setup time.
### Phase 6 — perf + hardening
- Land gw-9 buffered-update intervals.
- Add OpenTelemetry traces from the driver around every gw call,
correlated via `client_correlation_id`.
- Write soak test: 50k tags subscribed, 24h, count missed events, gw
restarts, OtOpcUa restarts.
- Tune `MxGatewayClientOptions.MaxGrpcMessageBytes`, retry pipeline,
call timeouts based on soak results.
**Exit:** production-acceptable perf numbers documented in
`docs/Galaxy.Driver.md`.
### Phase 7 — retirement
1. Default `Galaxy:Backend = mxgateway` everywhere (sample configs,
install scripts, e2e configs).
2. Delete `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host`,
`src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy`,
`src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared`, and matching tests.
3. Remove `OtOpcUaGalaxyHost` NSSM registration from
`scripts/install/Install-Services.ps1`. Add a registration block for the
Wonderware historian sidecar from phase 3.
4. Remove every x86 .NET 4.8 reference, build target, and CI step from this
repo; remove `mxaccess_documentation.md`-driven dependencies that no
longer apply.
5. Update CLAUDE.md, `docs/v2/dev-environment.md`, `docs/ServiceHosting.md`,
`docs/Redundancy.md` to reflect the new topology.
6. Memory housekeeping: retire `project_galaxy_host_service.md` and
`project_galaxy_host_installed.md`; add a short note about the gw
dependency.
**Exit:** `git grep -i 'Galaxy\.Host'` returns nothing in source.
---
## Configuration shape (new driver)
```jsonc
"Drivers": {
"Galaxy": {
"Type": "Galaxy",
"InstanceId": "galaxy-prod-1",
"Gateway": {
"Endpoint": "https://mxgw.aveva.local:5001",
"ApiKeySecretRef": "galaxy:apiKey", // resolved via existing secret store
"UseTls": true,
"CaCertificatePath": "C:\\publish\\mxgw\\ca.crt",
"ConnectTimeoutSeconds": 10,
"DefaultCallTimeoutSeconds": 5,
"StreamTimeoutSeconds": 0 // unbounded
},
"MxAccess": {
"ClientName": "OtOpcUa-A", // unique per OtOpcUa instance
"PublishingIntervalMs": 1000, // hint for SetBufferedUpdateInterval
"WriteUserId": 0
},
"Repository": {
"DiscoverPageSize": 5000,
"WatchDeployEvents": true
},
"Reconnect": {
"InitialBackoffMs": 500,
"MaxBackoffMs": 30000,
"ReplayOnSessionLost": true
}
}
}
```
The OtOpcUa secret store already handles DPAPI-protected values for LDAP
binds; reuse it for the gw API key. Never put the key in plaintext in the
sample config.
---
## Risks and mitigations
| Risk | Mitigation |
|---|---|
| gw protocol regression breaks production | Pin gw NuGet to a contract version range; CI runs parity matrix on every gw bump; staged rollout via `Galaxy:Backend` flag. |
| Per-call latency regresses for chatty workloads | Land gw-9 (buffered updates) before phase 5; soak the 95p in phase 6. |
| Reconnect storm after gw restart re-registers 50k tags | Land gw-3 or gw-5 before phase 6; client-side bulk replay throttled by `SubscribeBulkAsync` chunk size. |
| Alarm parity gap from moving tracker server-side | Phase 2 ships before phase 4; parity matrix gates phase 7. |
| Historian sidecar adds a second .NET 4.8 x86 service | Acceptable: it's a *driver-agnostic* component, and it ships only where Wonderware historian access is actually needed. |
| Two OtOpcUa instances both registering as same MXAccess client | `ClientName` is per-instance config (ot-5); install scripts lint that the redundancy pair has distinct names. |
| Cross-machine MXAccess writes traverse plaintext gRPC | Phase 0 enforces `UseTls=true` for any non-loopback `Endpoint`; CI lints the sample configs. |
| gw API key leaked in logs | gw and `MxGatewayClient` already redact `authorization` metadata; phase 6 audit. |
| Memory leak in `EventPump` under high event rate | Bounded channel between `StreamEventsAsync` and per-sub fan-out, drop-newest with a metric counter; soak test catches. |
---
## Cross-cutting deliverables
- **Docs:** `docs/v2/Galaxy.Driver.md` (new), updates to
`docs/v2/dev-environment.md`, `docs/ServiceHosting.md`,
`docs/Redundancy.md`, `CLAUDE.md`.
- **Install scripts:** `scripts/install/Install-Services.ps1` removes
`OtOpcUaGalaxyHost`, adds `OtOpcUaWonderwareHistorian`, no Galaxy
service registration on the OtOpcUa node.
- **e2e:** `scripts/e2e/e2e-config.sample.json` — drop `OTOPCUA_GALAXY_*`
pipe vars, add `Drivers:Galaxy:Gateway:Endpoint` etc.
- **Memory:** retire stale Galaxy.Host entries; add gw dependency entry,
redundancy + client-name guidance.
---
## Order-of-work summary
```
Phase 0 (gw repo): gw-1, gw-2, gw-4, gw-9
Phase 1 (this): ot-1 — historian extension point
Phase 2 (this): ot-2 — alarm subsystem
Phase 3 (this): Driver.Historian.Wonderware sidecar
Phase 4 (this): Driver.Galaxy (new) behind backend flag
— depends on Phase 0, 1, 2
Phase 5 (this+gw): parity matrix
— drives gw-3 / gw-5 / gw-6 / gw-7 if gaps surface
Phase 6 (this): perf + hardening
Phase 7 (this): retire Galaxy.Host / Proxy / Shared
```
Phases 13 are independent of each other and can run in parallel. Phase 4
needs all three plus Phase 0. Phase 5 requires Phase 4. Phases 6 and 7 are
sequential after Phase 5.
+1050
View File
File diff suppressed because it is too large Load Diff
+43 -20
View File
@@ -53,27 +53,47 @@ read-only tag.
## Status
Stages 1 + 2 (driver-side probe + loopback) are verified end-to-end
against the pymodbus / ab_server / python-snap7 fixtures. Stages 3-5
(anything crossing the OtOpcUa server) are **blocked** on server-side
driver factory wiring:
All seven driver factories are registered in
`src/ZB.MOM.WW.OtOpcUa.Server/Program.cs` — Galaxy, FOCAS, Modbus,
AB CIP, AB Legacy, S7, TwinCAT. `DriverInstanceBootstrapper` can
materialise any `DriverType` row from the central Config DB into a
live driver. The factory-wiring block that originally gated stages
3-5 is closed.
- `src/ZB.MOM.WW.OtOpcUa.Server/Program.cs` only registers Galaxy +
FOCAS factories. `DriverInstanceBootstrapper` skips any `DriverType`
without a registered factory — so Modbus / AB CIP / AB Legacy / S7 /
TwinCAT rows in the Config DB are silently no-op'd even when the seed
is perfect.
- No Config DB seed script exists for non-Galaxy drivers; Admin UI is
currently the only path to author one.
Live-boot verification:
Tracking: **#209** (umbrella) → #210 (Modbus), #211 (AB CIP), #212 (S7),
#213 (AB Legacy, also hardware-gated — #222). Each child issue lists
the factory class to write + the seed SQL shape + the verification
command.
- **Galaxy** — 7/7 stages (read / write / subscribe / alarms / history)
against a real Galaxy + `OtOpcUaGalaxyHost` on this dev box.
- **AB CIP, S7** — 5/5 stages each under task #220 against the
`ab_server` + `python-snap7` fixtures.
- **AB Legacy** — 5/5 stages under task #222 against `ab_server` SLC500
/ MicroLogix / PLC-5 profiles (requires the `cip-path /1,0` workaround
for the Docker fixture).
- **Modbus** — 5/5 stages against the `pymodbus` + dl205 profile,
including HR[200] scratch register + per-protocol bidirectional +
subscribe-sees-change stages.
- **TwinCAT** — factory registered; driver features validated against the
TCBSD VM virtual-PLC fixture (FreeBSD + TwinCAT/BSD runtime on ESXi —
bypasses the Hyper-V/RTIME conflict that blocks XAR on the dev box).
`TWINCAT_TRUST_WIRE=1` is still required to run the script —
false-pass-prevention belt, not an "unverified" flag.
- **FOCAS** — factory registered; gated by `FOCAS_TRUST_WIRE=1` pending
the lab-rig CNC (task #222 follow-up).
- **OpcUaClient (gateway)** — eight-stage script (`test-opcuaclient.ps1`)
covers probe / remote read / forward bridge / subscribe / reverse
bridge / browse mirror / alarm / history against the opc-plc Docker
fixture at `opc.tcp://localhost:50000`. Reverse-bridge / alarm /
history stages are opt-in per the parameter docs (opc-plc's default
image has no writable nodes and does not historize).
Until those ship, stages 3-5 will fail with "read failed" (nothing
published at that NodeId) and `[FAIL]` the suite even on a running
server.
Remaining work is **per-protocol seed authoring**: each dev fills in
the NodeIds their server publishes under `e2e-config.json` (sidecar
is `.gitignore`-d; see `e2e-config.sample.json` for the shape). Admin
UI remains the supported path for authoring the matching driver
instance rows in the Config DB.
Tracking: umbrella #209 is closed; remaining TwinCAT / FOCAS work
tracks under their hardware-fixture tasks (#221 / #222).
## Prereqs
@@ -85,7 +105,9 @@ server.
for the simulator matrix — pymodbus / ab_server / python-snap7 /
opc-plc cover Modbus / AB / S7 / OPC UA Client. FOCAS and TwinCAT
have no public simulator; they are gated with env-var skip flags
below.
below. For OpcUaClient, `docker compose -f
tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/Docker/
docker-compose.yml up -d` brings up `opc-plc` on port 50000.
3. **PowerShell 7+**. The runner uses null-coalescing + `Set-StrictMode`;
the Windows-PowerShell-5.1 shell will not parse `test-all.ps1`.
4. **.NET 10 SDK**. Each script either runs `dotnet run --project
@@ -136,7 +158,8 @@ section to skip it.
| Galaxy | — | **PASS** (requires OtOpcUaGalaxyHost + a live Galaxy; 7 stages including alarms + history) |
| S7 | — | **PASS** (python-snap7 fixture) |
| FOCAS | `FOCAS_TRUST_WIRE=1` | **SKIP** (no public simulator — task #222 lab rig) |
| TwinCAT | `TWINCAT_TRUST_WIRE=1` | **SKIP** (needs XAR or standalone Router — task #221) |
| TwinCAT | `TWINCAT_TRUST_WIRE=1` | **SKIP** by default; features **validated** against the TCBSD VM fixture — set the env var to run |
| OpcUaClient | — | **PASS** stages 1-4 + browse (opc-plc Docker fixture); stages 5/7/8 are opt-in (require writable / alarm / historizing upstream) |
| Phase 7 | — | **PASS** if the Modbus instance seeds a `VT_DoubledHR100` virtual tag + `AlarmHigh` scripted alarm |
Set the `*_TRUST_WIRE` env vars to `1` when you've pointed the script at
+5 -2
View File
@@ -422,8 +422,11 @@ function Write-Summary {
[Parameter(Mandatory)] [string]$Title,
[Parameter(Mandatory)] [array]$Results
)
$passed = ($Results | Where-Object { $_.Passed }).Count
$failed = ($Results | Where-Object { -not $_.Passed }).Count
# @(...) forces an array even when Where-Object matches 0 or 1 items,
# otherwise .Count trips Set-StrictMode -Version 3.0 ("property 'Count'
# cannot be found on this object") on $null or on a single hashtable.
$passed = @($Results | Where-Object { $_.Passed }).Count
$failed = @($Results | Where-Object { -not $_.Passed }).Count
Write-Host ""
Write-Host "=== $Title summary: $passed/$($Results.Count) passed ===" `
-ForegroundColor $(if ($failed -eq 0) { "Green" } else { "Red" })
+16 -1
View File
@@ -34,7 +34,7 @@
},
"focas": {
"$comment": "Gated behind FOCAS_TRUST_WIRE=1 — no public simulator. Point at a real CNC + ensure Fwlib32.dll is on PATH.",
"$comment": "Gated behind FOCAS_TRUST_WIRE=1 for real-CNC runs, or pass -ProfileName to run against the focas-mock Docker fixture. Managed wire client — no native dependencies.",
"host": "192.168.1.20",
"port": 8193,
"address": "R100",
@@ -60,6 +60,21 @@
"historyLookbackSec": 3600
},
"opcuaclient": {
"$comment": "OPC UA Client (gateway) driver. Default opc-plc Docker fixture exposes ns=3;s=FastUInt1 as a ticker. The `bridgeNodeId` is the local mirror of remoteNodeId after the OpcUaClient driver's DiscoverAsync runs — dev-specific. Stages 5/7/8 are opt-in: supply writable* NodeIds to enable reverse-bridge, alarmNodeId to enable alarm, historyNodeId to enable history (opc-plc does not historize by default — a Prosys / UA Expert sample server is needed for stage 8).",
"remoteUrl": "opc.tcp://localhost:50000",
"remoteNodeId": "ns=3;s=FastUInt1",
"bridgeNodeId": "ns=2;s=OpcUaClient/FastUInt1",
"bridgeRootNodeId": "ns=2;s=OpcUaClient",
"browseDepth": 3,
"browseMinNodes": 5,
"changeWaitSec": 8,
"writableRemoteNodeId": "",
"writableBridgeNodeId": "",
"alarmNodeId": "",
"historyNodeId": ""
},
"phase7": {
"$comment": "Virtual tags + scripted alarms. The VirtualNodeId must resolve to a server-side virtual tag whose script reads the modbus InputNodeId and writes VT = input * 2. The AlarmNodeId is the ConditionId of a scripted alarm that fires when VT > 100.",
"modbusEndpoint": "127.0.0.1:5502",
+30 -1
View File
@@ -189,6 +189,33 @@ if ($galaxy) {
}
else { $summary["galaxy"] = "SKIP (no config entry)" }
# ---------------------------------------------------------------------------
# OPC UA Client (gateway driver)
# ---------------------------------------------------------------------------
$opcuaclient = Get-Or $config "opcuaclient"
if ($opcuaclient) {
Write-Header "== OPC UA CLIENT =="
Run-Suite "opcuaclient" {
& "$PSScriptRoot/test-opcuaclient.ps1" `
-RemoteUrl (Get-Or $opcuaclient "remoteUrl" "opc.tcp://localhost:50000") `
-OpcUaUrl (Get-Or $opcuaclient "opcUaUrl" $OpcUaUrl) `
-RemoteNodeId (Get-Or $opcuaclient "remoteNodeId" "ns=3;s=FastUInt1") `
-BridgeNodeId $opcuaclient["bridgeNodeId"] `
-WritableRemoteNodeId (Get-Or $opcuaclient "writableRemoteNodeId" "") `
-WritableBridgeNodeId (Get-Or $opcuaclient "writableBridgeNodeId" "") `
-BridgeRootNodeId (Get-Or $opcuaclient "bridgeRootNodeId" "i=85") `
-BrowseDepth (Get-Or $opcuaclient "browseDepth" 3) `
-BrowseMinNodes (Get-Or $opcuaclient "browseMinNodes" 5) `
-AlarmNodeId (Get-Or $opcuaclient "alarmNodeId" "") `
-AlarmWaitSec (Get-Or $opcuaclient "alarmWaitSec" 15) `
-HistoryNodeId (Get-Or $opcuaclient "historyNodeId" "") `
-HistoryLookbackSec (Get-Or $opcuaclient "historyLookbackSec" 3600) `
-ChangeWaitSec (Get-Or $opcuaclient "changeWaitSec" 8)
}
}
else { $summary["opcuaclient"] = "SKIP (no config entry)" }
$phase7 = Get-Or $config "phase7"
if ($phase7) {
Write-Header "== PHASE 7 virtual tags + scripted alarms =="
@@ -220,7 +247,9 @@ $summary.GetEnumerator() | ForEach-Object {
Write-Host (" {0,-10} {1}" -f $_.Key, $_.Value) -ForegroundColor $color
}
$failed = ($summary.Values | Where-Object { $_ -eq "FAIL" }).Count
# @() wrap — Where-Object returns $null / a single scalar for 0-or-1 matches,
# and .Count on either trips Set-StrictMode -Version 3.0.
$failed = @($summary.Values | Where-Object { $_ -eq "FAIL" }).Count
if ($failed -gt 0) {
Write-Host "$failed suite(s) failed." -ForegroundColor Red
exit 1
+171 -47
View File
@@ -1,16 +1,35 @@
#Requires -Version 7.0
#Requires -Version 7.0
<#
.SYNOPSIS
End-to-end CLI test for the FOCAS (Fanuc CNC) driver.
.DESCRIPTION
**Hardware-gated.** There is no public FOCAS simulator; the driver's
FwlibFocasClient P/Invokes Fanuc's licensed Fwlib32.dll. Against a dev
box without the DLL on PATH the test will skip with a clear message.
Against a real CNC with the DLL present it runs probe / driver-loopback /
server-bridge the same way the other scripts do.
Runs the CLI against either the managed wire client (default Driver.FOCAS.Cli
dials the CNC on TCP:8193 directly, no native dependencies) or the focas-mock
Docker fixture. Hardware-gated by default because the default CncHost is
127.0.0.1; set FOCAS_TRUST_WIRE=1 once -CncHost points at a real CNC, or pass
-ProfileName to run against the Docker sim.
Set FOCAS_TRUST_WIRE=1 when -CncHost points at a real CNC to un-gate.
The script also supports three nice-to-have modes shipped 2026-04-24:
-Series per-series matrix mode. Accepts a comma-separated list; the
core stages are run once per series, swapping the -Address to
the supplied per-series probe. Fails fast if any series's
configured address is outside the documented range (the driver
itself enforces that at InitializeAsync).
-ProfileName for use with the Python Docker simulator (see
docs/v2/implementation/focas-simulator-plan.md). Selects a
docker-compose profile + matching -Series. When set, the
FOCAS_TRUST_WIRE gate is considered satisfied because the sim
is a legitimate non-hardware target.
-HandleLeakCycles <int> stress stage that opens + closes <N> sessions
via the CLI's `probe` command with a short sleep between
cycles. Exercises the Tier-C supervisor's handle-recycle path
without touching user data. Typical values: 1001000. A CNC's
FWLIB handle pool is finite (~510), so this shakes out
handle-leak bugs if either side forgets to free.
.PARAMETER CncHost
IP or hostname of the CNC. Default 127.0.0.1 override for real runs.
@@ -19,7 +38,22 @@
FOCAS TCP port. Default 8193.
.PARAMETER Address
FOCAS address to exercise. Default R100 (PMC R-file register).
FOCAS address to exercise. Default R100 (PMC R-file register). Ignored
when -Series is set and the series profile supplies its own probe.
.PARAMETER Series
Comma-separated list of CNC series to run the matrix against. Known:
ZeroI_D, ZeroI_F, ZeroI_MF, ZeroI_TF, Sixteen_i, Thirty_i, ThirtyOne_i,
ThirtyTwo_i, PowerMotion_i. When empty the script runs a single pass
without a series constraint.
.PARAMETER ProfileName
docker-compose profile name from tests/.../Docker/profiles/. When set,
the script assumes the Python simulator is the target + un-gates
FOCAS_TRUST_WIRE.
.PARAMETER HandleLeakCycles
Run a handle-leak stress stage with <N> open/close cycles. 0 = skip.
.PARAMETER OpcUaUrl
OtOpcUa server endpoint.
@@ -32,6 +66,9 @@ param(
[string]$CncHost = "127.0.0.1",
[int]$CncPort = 8193,
[string]$Address = "R100",
[string]$Series = "",
[string]$ProfileName = "",
[int]$HandleLeakCycles = 0,
[string]$OpcUaUrl = "opc.tcp://localhost:4840",
[Parameter(Mandatory)] [string]$BridgeNodeId
)
@@ -39,11 +76,41 @@ param(
$ErrorActionPreference = "Stop"
. "$PSScriptRoot/_common.ps1"
if (-not ($env:FOCAS_TRUST_WIRE -eq "1" -or $env:FOCAS_TRUST_WIRE -eq "true")) {
Write-Skip "FOCAS_TRUST_WIRE not set — no public simulator exists (task #222 tracks the lab rig). Set =1 when -CncHost points at a real CNC with Fwlib32.dll on PATH."
$simGated = -not [string]::IsNullOrWhiteSpace($ProfileName)
if (-not $simGated -and -not ($env:FOCAS_TRUST_WIRE -eq "1" -or $env:FOCAS_TRUST_WIRE -eq "true")) {
Write-Skip "FOCAS_TRUST_WIRE not set. Pass -ProfileName <profile> to run against the Docker mock in tests/.../Driver.FOCAS.IntegrationTests/Docker/, or set FOCAS_TRUST_WIRE=1 when -CncHost points at a real CNC."
exit 0
}
if ($simGated) {
Write-Info "Sim mode — profile '$ProfileName'. FOCAS_TRUST_WIRE gate bypassed."
}
# Per-series probe addresses — each one is inside the authoritative range for
# that series (docs/v2/focas-version-matrix.md). Picking one representative per
# kind (PMC / parameter / macro) is enough to exercise the driver's validator.
$seriesProbes = @{
"ZeroI_D" = "R100"
"ZeroI_F" = "R100"
"ZeroI_MF" = "R100"
"ZeroI_TF" = "R100"
"Sixteen_i" = "R100"
"Thirty_i" = "R100"
"ThirtyOne_i" = "R100"
"ThirtyTwo_i" = "R100"
"PowerMotion_i"= "R100"
}
$seriesList = @()
if (-not [string]::IsNullOrWhiteSpace($Series)) {
$seriesList = @($Series.Split(',') | ForEach-Object { $_.Trim() } | Where-Object { $_ })
$unknown = @($seriesList | Where-Object { -not $seriesProbes.ContainsKey($_) })
if ($unknown.Count -gt 0) {
Write-Fail "Unknown -Series entries: $($unknown -join ', '). Known: $($seriesProbes.Keys -join ', ')."
exit 2
}
}
$focasCli = Get-CliInvocation `
-ProjectFolder "src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Cli" `
-ExeName "otopcua-focas-cli"
@@ -51,46 +118,103 @@ $opcUaCli = Get-CliInvocation `
-ProjectFolder "src/ZB.MOM.WW.OtOpcUa.Client.CLI" `
-ExeName "otopcua-cli"
$commonFocas = @("-h", $CncHost, "-p", $CncPort)
$results = @()
$allResults = @()
$results += Test-Probe `
-Cli $focasCli `
-ProbeArgs (@("probe") + $commonFocas + @("-a", $Address, "--type", "Int16"))
function Invoke-FocasCore {
param(
[string]$Label,
[string]$ProbeAddress
)
$writeValue = Get-Random -Minimum 1 -Maximum 9999
$results += Test-DriverLoopback `
-Cli $focasCli `
-WriteArgs (@("write") + $commonFocas + @("-a", $Address, "-t", "Int16", "-v", $writeValue)) `
-ReadArgs (@("read") + $commonFocas + @("-a", $Address, "-t", "Int16")) `
-ExpectedValue "$writeValue"
Write-Header "FOCAS stages — $Label"
$commonFocas = @("-h", $CncHost, "-p", $CncPort)
$results = @()
$bridgeValue = Get-Random -Minimum 10000 -Maximum 19999
$results += Test-ServerBridge `
-DriverCli $focasCli `
-DriverWriteArgs (@("write") + $commonFocas + @("-a", $Address, "-t", "Int16", "-v", $bridgeValue)) `
-OpcUaCli $opcUaCli `
-OpcUaUrl $OpcUaUrl `
-OpcUaNodeId $BridgeNodeId `
-ExpectedValue "$bridgeValue"
$results += Test-Probe `
-Cli $focasCli `
-ProbeArgs (@("probe") + $commonFocas + @("-a", $ProbeAddress, "--type", "Int16"))
$reverseValue = Get-Random -Minimum 20000 -Maximum 29999
$results += Test-OpcUaWriteBridge `
-OpcUaCli $opcUaCli `
-OpcUaUrl $OpcUaUrl `
-OpcUaNodeId $BridgeNodeId `
-DriverCli $focasCli `
-DriverReadArgs (@("read") + $commonFocas + @("-a", $Address, "-t", "Int16")) `
-ExpectedValue "$reverseValue"
$writeValue = Get-Random -Minimum 1 -Maximum 9999
$results += Test-DriverLoopback `
-Cli $focasCli `
-WriteArgs (@("write") + $commonFocas + @("-a", $ProbeAddress, "-t", "Int16", "-v", $writeValue)) `
-ReadArgs (@("read") + $commonFocas + @("-a", $ProbeAddress, "-t", "Int16")) `
-ExpectedValue "$writeValue"
$subValue = Get-Random -Minimum 30000 -Maximum 32766
$results += Test-SubscribeSeesChange `
-OpcUaCli $opcUaCli `
-OpcUaUrl $OpcUaUrl `
-OpcUaNodeId $BridgeNodeId `
-DriverCli $focasCli `
-DriverWriteArgs (@("write") + $commonFocas + @("-a", $Address, "-t", "Int16", "-v", $subValue)) `
-ExpectedValue "$subValue"
$bridgeValue = Get-Random -Minimum 10000 -Maximum 19999
$results += Test-ServerBridge `
-DriverCli $focasCli `
-DriverWriteArgs (@("write") + $commonFocas + @("-a", $ProbeAddress, "-t", "Int16", "-v", $bridgeValue)) `
-OpcUaCli $opcUaCli `
-OpcUaUrl $OpcUaUrl `
-OpcUaNodeId $BridgeNodeId `
-ExpectedValue "$bridgeValue"
Write-Summary -Title "FOCAS e2e" -Results $results
if ($results | Where-Object { -not $_.Passed }) { exit 1 }
$reverseValue = Get-Random -Minimum 20000 -Maximum 29999
$results += Test-OpcUaWriteBridge `
-OpcUaCli $opcUaCli `
-OpcUaUrl $OpcUaUrl `
-OpcUaNodeId $BridgeNodeId `
-DriverCli $focasCli `
-DriverReadArgs (@("read") + $commonFocas + @("-a", $ProbeAddress, "-t", "Int16")) `
-ExpectedValue "$reverseValue"
$subValue = Get-Random -Minimum 30000 -Maximum 32766
$results += Test-SubscribeSeesChange `
-OpcUaCli $opcUaCli `
-OpcUaUrl $OpcUaUrl `
-OpcUaNodeId $BridgeNodeId `
-DriverCli $focasCli `
-DriverWriteArgs (@("write") + $commonFocas + @("-a", $ProbeAddress, "-t", "Int16", "-v", $subValue)) `
-ExpectedValue "$subValue"
return $results
}
function Invoke-HandleLeakStage {
param([int]$Cycles)
Write-Header "FOCAS handle-leak stress — $Cycles cycles"
$commonFocas = @("-h", $CncHost, "-p", $CncPort)
$failed = 0
for ($i = 1; $i -le $Cycles; $i++) {
$probe = Test-Probe `
-Cli $focasCli `
-ProbeArgs (@("probe") + $commonFocas + @("-a", $Address, "--type", "Int16"))
if (-not $probe.Passed) {
$failed++
# First 3 failures are informative; the rest just tally.
if ($failed -le 3) {
Write-Fail "cycle $i failed: $($probe.Reason)"
}
}
# Tiny delay so a broken loop can't DDoS the CNC; FWLIB handles take a
# few tens of ms to recycle in practice.
Start-Sleep -Milliseconds 50
}
$passed = $Cycles - $failed
if ($failed -eq 0) {
Write-Pass "handle-leak stress: $passed/$Cycles cycles succeeded"
return @{ Passed = $true; Reason = "$passed/$Cycles" }
} else {
Write-Fail "handle-leak stress: $failed/$Cycles cycles failed"
return @{ Passed = $false; Reason = "$failed/$Cycles failed" }
}
}
if ($seriesList.Count -eq 0) {
$allResults += Invoke-FocasCore -Label "single" -ProbeAddress $Address
} else {
foreach ($series in $seriesList) {
$probeAddr = $seriesProbes[$series]
Write-Info "Running matrix pass for series '$series' with address $probeAddr"
$allResults += Invoke-FocasCore -Label $series -ProbeAddress $probeAddr
}
}
if ($HandleLeakCycles -gt 0) {
$allResults += Invoke-HandleLeakStage -Cycles $HandleLeakCycles
}
Write-Summary -Title "FOCAS e2e" -Results $allResults
if ($allResults | Where-Object { -not $_.Passed }) { exit 1 }
+39 -19
View File
@@ -49,16 +49,20 @@
OtOpcUa server endpoint. Default opc.tcp://localhost:4840.
.PARAMETER SourceNodeId
NodeId of the driver-sourced Galaxy tag (numeric, writable preferred).
Default matches the Phase 7 seed `ns=2;s=p7-smoke-tag-source`.
NodeId of the driver-sourced Galaxy tag (numeric, writable preferred). NodeIds
are path-based per OPC UA Part 3 §5.2.2 the default matches the Phase 7 seed
walking `p7-smoke-galaxy` (DriverInstanceId) `lab-floor` `galaxy-line`
`reactor-1` `Source` (Tag.Name).
.PARAMETER VirtualNodeId
NodeId of the VirtualTag computed as Source × 2 (Phase 7 scripting).
Default matches the Phase 7 seed `ns=2;s=p7-smoke-vt-derived`.
NodeId of the VirtualTag that computes MachineStatus = (Source > 0) (Phase 7
scripting). Same path-based scheme, ending in the VirtualTag.Name
(`MachineStatus`). The tag is historized so the write/subscribe exercise
doubles as a historian-sink check.
.PARAMETER AlarmNodeId
NodeId of the scripted-alarm Condition (fires when Source > 50).
Default matches the Phase 7 seed `ns=2;s=p7-smoke-al-overtemp`.
NodeId of the scripted-alarm Condition (fires when Source > 50). Same
path-based scheme, ending in ScriptedAlarm.Name (`OverTemp`).
.PARAMETER AlarmTriggerValue
Value written to -SourceNodeId to push it over the alarm threshold.
@@ -90,13 +94,20 @@
param(
[string]$OpcUaUrl = "opc.tcp://localhost:4840",
[string]$SourceNodeId = "ns=2;s=p7-smoke-tag-source",
[string]$VirtualNodeId = "ns=2;s=p7-smoke-vt-derived",
[string]$AlarmNodeId = "ns=2;s=p7-smoke-al-overtemp",
[string]$SourceNodeId = "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/Source",
[string]$VirtualNodeId = "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/MachineStatus",
[string]$AlarmNodeId = "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/OverTemp",
[string]$AlarmTriggerValue = "75",
[int]$ChangeWaitSec = 10,
[int]$AlarmWaitSec = 10,
[int]$HistoryLookbackSec = 3600
[int]$HistoryLookbackSec = 3600,
# The default Phase 7 seed uses a Galaxy attribute with
# security_classification=Operate. Anonymous OPC UA sessions are denied writes
# against Operate-classified tags (PR 26 / docs/Security.md). Supply an LDAP
# user with WriteOperate to exercise the reverse-bridge stage — e.g.
# `-Username writeop -Password writeop123` against the dev-box GLAuth.
[string]$Username = "",
[string]$Password = ""
)
$ErrorActionPreference = "Stop"
@@ -106,6 +117,13 @@ $opcUaCli = Get-CliInvocation `
-ProjectFolder "src/ZB.MOM.WW.OtOpcUa.Client.CLI" `
-ExeName "otopcua-cli"
# Auth-extension helper — appends `-U / -P` to the CLI args when credentials
# were supplied. Stays empty for anonymous runs so the default smoke path
# doesn't require an LDAP round-trip.
$authArgs = @()
if ($Username) { $authArgs += @("-U", $Username) }
if ($Password) { $authArgs += @("-P", $Password) }
$results = @()
# ---------------------------------------------------------------------------
@@ -115,7 +133,7 @@ $results = @()
# ---------------------------------------------------------------------------
Write-Header "Probe"
$probe = Invoke-Cli -Cli $opcUaCli -Args @("read", "-u", $OpcUaUrl, "-n", $SourceNodeId)
$probe = Invoke-Cli -Cli $opcUaCli -Args (@("read", "-u", $OpcUaUrl, "-n", $SourceNodeId) + $authArgs)
if ($probe.ExitCode -eq 0 -and $probe.Output -match "Status:\s+0x00000000") {
Write-Pass "source NodeId readable (Galaxy pipe → proxy → server → client chain up)"
$results += @{ Passed = $true }
@@ -132,7 +150,7 @@ if ($probe.ExitCode -eq 0 -and $probe.Output -match "Status:\s+0x00000000") {
# ---------------------------------------------------------------------------
Write-Header "Source read"
$sourceRead = Invoke-Cli -Cli $opcUaCli -Args @("read", "-u", $OpcUaUrl, "-n", $SourceNodeId)
$sourceRead = Invoke-Cli -Cli $opcUaCli -Args (@("read", "-u", $OpcUaUrl, "-n", $SourceNodeId) + $authArgs)
$sourceValue = $null
if ($sourceRead.ExitCode -eq 0 -and $sourceRead.Output -match "Value:\s+([^\r\n]+)") {
$sourceValue = $Matches[1].Trim()
@@ -156,7 +174,7 @@ if ([string]::IsNullOrEmpty($VirtualNodeId)) {
Write-Skip "VirtualNodeId not supplied — skipping Phase 7 bridge check"
} else {
Write-Header "Virtual-tag bridge"
$vtRead = Invoke-Cli -Cli $opcUaCli -Args @("read", "-u", $OpcUaUrl, "-n", $VirtualNodeId)
$vtRead = Invoke-Cli -Cli $opcUaCli -Args (@("read", "-u", $OpcUaUrl, "-n", $VirtualNodeId) + $authArgs)
if ($vtRead.ExitCode -eq 0 -and $vtRead.Output -match "Value:\s+([^\r\n]+)") {
$vtValue = $Matches[1].Trim()
Write-Pass "virtual-tag value = $vtValue (source was $sourceValue)"
@@ -179,7 +197,7 @@ $stdout = New-TemporaryFile
$stderr = New-TemporaryFile
$subArgs = @($opcUaCli.PrefixArgs) + @(
"subscribe", "-u", $OpcUaUrl, "-n", $SourceNodeId,
"-i", "500", "--duration", "$ChangeWaitSec")
"-i", "500", "--duration", "$ChangeWaitSec") + $authArgs
$subProc = Start-Process -FilePath $opcUaCli.File `
-ArgumentList $subArgs -NoNewWindow -PassThru `
-RedirectStandardOutput $stdout.FullName `
@@ -191,8 +209,10 @@ $subOut = (Get-Content $stdout.FullName -Raw) + (Get-Content $stderr.FullName -R
Remove-Item $stdout.FullName, $stderr.FullName -ErrorAction SilentlyContinue
# Any `=` followed by `(Good)` line after the initial subscribe-confirmation
# indicates at least one data-change tick arrived.
$changeLines = ($subOut -split "`n") | Where-Object { $_ -match "=\s+.*\(Good\)" }
# indicates at least one data-change tick arrived. The `@(...)` forces an array
# so `.Count` works on the 0-match + single-match cases that Set-StrictMode
# -Version 3.0 otherwise flags as `property 'Count' cannot be found`.
$changeLines = @(($subOut -split "`n") | Where-Object { $_ -match "=\s+.*\(Good\)" })
if ($changeLines.Count -gt 0) {
Write-Pass "$($changeLines.Count) data-change events observed"
$results += @{ Passed = $true }
@@ -210,8 +230,8 @@ if ($changeLines.Count -gt 0) {
Write-Header "Reverse bridge (OPC UA write)"
$writeValue = [int]$AlarmTriggerValue # reuse the alarm trigger value — two stages for one write
$w = Invoke-Cli -Cli $opcUaCli -Args @(
"write", "-u", $OpcUaUrl, "-n", $SourceNodeId, "-v", "$writeValue")
$w = Invoke-Cli -Cli $opcUaCli -Args (@(
"write", "-u", $OpcUaUrl, "-n", $SourceNodeId, "-v", "$writeValue") + $authArgs)
if ($w.ExitCode -ne 0) {
# Connection/protocol failure — still a test failure.
Write-Fail "write CLI exit=$($w.ExitCode)"
@@ -226,7 +246,7 @@ if ($w.ExitCode -ne 0) {
} elseif ($w.Output -match "Write successful") {
# Read back — Galaxy poll interval + MXAccess advise may need a second or two to settle.
Start-Sleep -Seconds 2
$r = Invoke-Cli -Cli $opcUaCli -Args @("read", "-u", $OpcUaUrl, "-n", $SourceNodeId)
$r = Invoke-Cli -Cli $opcUaCli -Args (@("read", "-u", $OpcUaUrl, "-n", $SourceNodeId) + $authArgs)
if ($r.Output -match "Value:\s+$([Regex]::Escape("$writeValue"))\b") {
Write-Pass "write propagated — source reads back $writeValue"
$results += @{ Passed = $true }
+392
View File
@@ -0,0 +1,392 @@
#Requires -Version 7.0
<#
.SYNOPSIS
End-to-end CLI test for the OPC UA Client (gateway) driver bridged through
the OtOpcUa server.
.DESCRIPTION
The OpcUaClient driver is unique in the fleet it's a gateway that connects
to ANOTHER OPC UA server and re-exposes its address space through the local
OtOpcUa server. So there's no protocol-specific driver CLI; both directions
of this test use `otopcua-cli` against two different endpoints:
remote = the upstream OPC UA server the driver connects to (opc-plc fixture
by default, opc.tcp://localhost:50000)
local = the OtOpcUa server itself, which mirrors remote nodes through the
OpcUaClient driver instance (opc.tcp://localhost:4840)
Eight stages cover the driver's full capability surface:
1. Remote probe otopcua-cli connect to the upstream. Confirms the
simulator / target server is reachable and
speaking UA Secure Channel.
2. Remote read otopcua-cli read of -RemoteNodeId on the upstream.
Captures the current value + confirms the node
exists. Baseline for the forward-bridge stage.
3. Forward bridge otopcua-cli read of -BridgeNodeId on the LOCAL
server. Proves the driver discovered + mirrored
the remote node into the local address space and
the read path is live (IReadable via session).
4. Subscribe-sees-change subscribe on local -BridgeNodeId in the
background. opc-plc's tickers (FastUInt1, StepUp)
mutate autonomously, so no driver poke is needed
a data-change event should arrive within the
subscription window. Covers ISubscribable +
upstream subscription transfer.
5. Reverse bridge otopcua-cli write to local -WritableBridgeNodeId,
then otopcua-cli read of -WritableRemoteNodeId
directly on the upstream. Confirms writes flow
through the driver to the remote (IWritable). Opt-
in opc-plc default image has no writable nodes
without `--sn`; pass -WritableBridgeNodeId AND
-WritableRemoteNodeId to enable.
6. Browse mirror otopcua-cli browse of the local -BridgeRootNodeId
at depth -BrowseDepth. Asserts at least
-BrowseMinNodes descendants appear. Covers
ITagDiscovery local-namespace projection.
7. Alarm fires otopcua-cli alarms subscription on local
-AlarmNodeId. opc-plc with `--alm` cycles a
TripAlarm autonomously; assert an Active alarm
event surfaces. Covers IAlarmSource OPC UA A&E
projection. Opt-in via -AlarmNodeId.
8. History read historyread on local -HistoryNodeId over a
lookback window. Covers IHistoryProvider
upstream HistoryRead dispatch. Opt-in via
-HistoryNodeId. Note: opc-plc's default image
does not historize a historizing upstream
(Prosys, UaExpert sample server) is required.
Prereqs:
1. Upstream OPC UA server reachable at -RemoteUrl. Default expects the
opc-plc Docker fixture (`tests/.../Driver.OpcUaClient.IntegrationTests/
Docker/docker-compose.yml`): `docker compose up -d` before running.
2. OtOpcUa server running at -OpcUaUrl with an OpcUaClient DriverInstance
in its Config DB whose EndpointUrl = -RemoteUrl. The server's
DiscoverAsync populates the mirrored namespace at startup; the
-BridgeNodeId / -BridgeRootNodeId you pass must correspond to whatever
NodeIds that discovery produced on your local server.
3. To exercise stages 5 / 7 / 8, the upstream must expose writable nodes /
alarm conditions / history. opc-plc alone doesn't cover all three see
parameter docs below for the combinations that work with opc-plc.
.PARAMETER RemoteUrl
Upstream OPC UA server endpoint (the server the driver connects to).
Default matches the opc-plc Docker fixture opc.tcp://localhost:50000.
.PARAMETER OpcUaUrl
Local OtOpcUa server endpoint. Default opc.tcp://localhost:4840.
.PARAMETER RemoteNodeId
NodeId on the upstream used for stages 1-2. Default ns=3;s=FastUInt1 opc-plc
ticker that increments every 100 ms.
.PARAMETER BridgeNodeId
NodeId on the LOCAL server that mirrors -RemoteNodeId after the OpcUaClient
driver discovers it. Dev-specific whatever the local DiscoverAsync produced
for the upstream node. No default; mandatory for stages 3-4.
.PARAMETER WritableRemoteNodeId
Writable NodeId on the upstream for the reverse-bridge stage. opc-plc's
default image has no writable nodes; add `--sn=1` to the compose command to
expose `ns=3;s=SlowUInt1` as writable (or similar per opc-plc docs). Omit to
skip stage 5.
.PARAMETER WritableBridgeNodeId
Matching local mirror of -WritableRemoteNodeId. Omit to skip stage 5.
.PARAMETER BridgeRootNodeId
Root NodeId on the local server under which the mirrored upstream sits. The
browse stage walks from this node down to -BrowseDepth. Default i=85
(ObjectsFolder) works but produces a lot of output; pass a narrower root
for faster / more targeted coverage.
.PARAMETER BrowseDepth
Max depth for the browse stage. Default 3.
.PARAMETER BrowseMinNodes
Minimum number of descendants expected under -BridgeRootNodeId. Default 5.
.PARAMETER AlarmNodeId
NodeId of the ConditionType on the local server for the alarm-fires stage.
opc-plc with `--alm` exposes e.g. TripAlarm conditions; the local mirror path
of that condition goes here. Omit to skip stage 7.
.PARAMETER AlarmWaitSec
Seconds to wait for the alarm to cycle. opc-plc's TripAlarm fires on its own
cadence; 15 s usually covers one cycle. Default 15.
.PARAMETER HistoryNodeId
NodeId on the local server whose history to query. Omit to skip stage 8.
.PARAMETER HistoryLookbackSec
Seconds back from now to query history. Default 3600.
.PARAMETER ChangeWaitSec
Seconds the subscribe-sees-change stage waits for a natural ticker update.
opc-plc's FastUInt1 ticks every 100 ms so a short window suffices. Default 8.
.EXAMPLE
# Bare-minimum: stages 1-4 + browse, against the opc-plc compose fixture.
# Requires the local OtOpcUa server to have discovered opc-plc and placed
# FastUInt1 under (for example) ns=2;s=OpcUaClient/FastUInt1.
./scripts/e2e/test-opcuaclient.ps1 -BridgeNodeId "ns=2;s=OpcUaClient/FastUInt1"
.EXAMPLE
# Full matrix — all eight stages. Requires an opc-plc image with --sn (for
# writable) + --alm (for alarms; default compose has this) + a historizing
# upstream (opc-plc does not; Prosys does).
./scripts/e2e/test-opcuaclient.ps1 `
-BridgeNodeId "ns=2;s=OpcUaClient/FastUInt1" `
-WritableRemoteNodeId "ns=3;s=SlowUInt1" `
-WritableBridgeNodeId "ns=2;s=OpcUaClient/SlowUInt1" `
-BridgeRootNodeId "ns=2;s=OpcUaClient" `
-AlarmNodeId "ns=2;s=OpcUaClient/TripAlarm" `
-HistoryNodeId "ns=2;s=OpcUaClient/StepUp"
#>
param(
[string]$RemoteUrl = "opc.tcp://localhost:50000",
[string]$OpcUaUrl = "opc.tcp://localhost:4840",
[string]$RemoteNodeId = "ns=3;s=FastUInt1",
[Parameter(Mandatory)] [string]$BridgeNodeId,
[string]$WritableRemoteNodeId = "",
[string]$WritableBridgeNodeId = "",
[string]$BridgeRootNodeId = "i=85",
[int]$BrowseDepth = 3,
[int]$BrowseMinNodes = 5,
[string]$AlarmNodeId = "",
[int]$AlarmWaitSec = 15,
[string]$HistoryNodeId = "",
[int]$HistoryLookbackSec = 3600,
[int]$ChangeWaitSec = 8
)
$ErrorActionPreference = "Stop"
. "$PSScriptRoot/_common.ps1"
$opcUaCli = Get-CliInvocation `
-ProjectFolder "src/ZB.MOM.WW.OtOpcUa.Client.CLI" `
-ExeName "otopcua-cli"
$results = @()
# ---------------------------------------------------------------------------
# Stage 1 — Remote probe. `otopcua-cli connect` exits 0 when the Secure Channel
# + Session handshake to the upstream complete cleanly. A failure here means
# opc-plc isn't running or the endpoint is unreachable — nothing downstream is
# worth trying.
# ---------------------------------------------------------------------------
Write-Header "Remote probe"
$probe = Invoke-Cli -Cli $opcUaCli -Args @("connect", "-u", $RemoteUrl)
if ($probe.ExitCode -eq 0 -and $probe.Output -match "Connection successful") {
Write-Pass "upstream $RemoteUrl reachable + speaks UA"
$results += @{ Passed = $true }
} else {
Write-Fail "upstream connect failed (exit=$($probe.ExitCode))"
Write-Host $probe.Output
$results += @{ Passed = $false; Reason = "remote probe failed" }
# Fail fast: if the upstream is down every other stage will cascade.
Write-Summary -Title "OpcUaClient e2e" -Results $results
exit 1
}
# ---------------------------------------------------------------------------
# Stage 2 — Remote read. Pulls the current value of -RemoteNodeId directly from
# the upstream. Recorded for later stages to compare against, and confirms the
# chosen NodeId actually exists on this upstream.
# ---------------------------------------------------------------------------
Write-Header "Remote read"
$remoteRead = Invoke-Cli -Cli $opcUaCli -Args @("read", "-u", $RemoteUrl, "-n", $RemoteNodeId)
$remoteValue = $null
if ($remoteRead.ExitCode -eq 0 -and $remoteRead.Output -match "Value:\s+([^\r\n]+)") {
$remoteValue = $Matches[1].Trim()
Write-Pass "remote $RemoteNodeId = $remoteValue"
$results += @{ Passed = $true }
} else {
Write-Fail "remote read of $RemoteNodeId failed"
Write-Host $remoteRead.Output
$results += @{ Passed = $false; Reason = "remote read failed" }
}
# ---------------------------------------------------------------------------
# Stage 3 — Forward bridge. Read -BridgeNodeId on the LOCAL server. If the
# OpcUaClient driver is live + its discovery mapped -RemoteNodeId into the
# local namespace, this should return a Good value. For ticker nodes like
# FastUInt1 we don't require exact equality with stage 2 (the ticker has
# likely advanced between reads); a Good-status read is the real signal.
# ---------------------------------------------------------------------------
Write-Header "Forward bridge (remote → local)"
$localRead = Invoke-Cli -Cli $opcUaCli -Args @("read", "-u", $OpcUaUrl, "-n", $BridgeNodeId)
if ($localRead.ExitCode -eq 0 -and $localRead.Output -match "Status:\s+0x00000000" -and $localRead.Output -match "Value:\s+([^\r\n]+)") {
$localValue = $Matches[1].Trim()
Write-Pass "local bridge $BridgeNodeId = $localValue (remote was $remoteValue)"
$results += @{ Passed = $true }
} else {
Write-Fail "local bridge read failed — driver instance may not be configured or discovery hasn't run"
Write-Host $localRead.Output
$results += @{ Passed = $false; Reason = "forward bridge failed" }
}
# ---------------------------------------------------------------------------
# Stage 4 — Subscribe sees change. opc-plc's FastUInt1 ticks autonomously so we
# don't need to drive a write. A properly wired OpcUaClient driver forwards
# remote MonitoredItem data-change callbacks to the local server, which then
# publishes them to our subscribe client. If nothing arrives within the
# window, either the remote node isn't a ticker OR the upstream subscription
# chain is broken (probe state, keep-alive, SDK publish queue).
# ---------------------------------------------------------------------------
Write-Header "Subscribe sees change"
$stdout = New-TemporaryFile
$stderr = New-TemporaryFile
$subArgs = @($opcUaCli.PrefixArgs) + @(
"subscribe", "-u", $OpcUaUrl, "-n", $BridgeNodeId,
"-i", "200", "--duration", "$ChangeWaitSec")
$subProc = Start-Process -FilePath $opcUaCli.File `
-ArgumentList $subArgs -NoNewWindow -PassThru `
-RedirectStandardOutput $stdout.FullName `
-RedirectStandardError $stderr.FullName
Write-Info "subscription started (pid $($subProc.Id)) for ${ChangeWaitSec}s"
$subProc.WaitForExit(($ChangeWaitSec + 5) * 1000) | Out-Null
if (-not $subProc.HasExited) { Stop-Process -Id $subProc.Id -Force }
$subOut = (Get-Content $stdout.FullName -Raw) + (Get-Content $stderr.FullName -Raw)
Remove-Item $stdout.FullName, $stderr.FullName -ErrorAction SilentlyContinue
# SubscribeCommand prints `[timestamp] <NodeId> = <value> (0xNNNNNNNN)` per
# data-change event. 0x00000000 == Good; anything else is a non-Good status
# we intentionally don't count (a quality drop isn't a "saw the change").
$changeLines = @(($subOut -split "`n") | Where-Object { $_ -match "=\s+\S.*\(0x00000000\)" })
if ($changeLines.Count -gt 0) {
Write-Pass "$($changeLines.Count) data-change events observed on bridge"
$results += @{ Passed = $true }
} else {
Write-Fail "no data-change events in ${ChangeWaitSec}s — upstream node may be static, or subscription chain broken"
Write-Host $subOut
$results += @{ Passed = $false; Reason = "no data-change" }
}
# ---------------------------------------------------------------------------
# Stage 5 — Reverse bridge. Only runs when both writable NodeIds are supplied.
# Writes on the local bridge side, reads directly on the upstream to verify
# the write crossed the driver. 2s settle accounts for the driver's next poll
# (non-idempotent writes on upstream side may take a tick to propagate).
# ---------------------------------------------------------------------------
if ([string]::IsNullOrEmpty($WritableBridgeNodeId) -or [string]::IsNullOrEmpty($WritableRemoteNodeId)) {
Write-Header "Reverse bridge (local → remote)"
Write-Skip "WritableBridgeNodeId / WritableRemoteNodeId not supplied — opc-plc default has no writable nodes. Add --sn=N to the compose and re-run with both params set."
} else {
Write-Header "Reverse bridge (local → remote)"
$writeValue = Get-Random -Minimum 1 -Maximum 9999
$w = Invoke-Cli -Cli $opcUaCli -Args @(
"write", "-u", $OpcUaUrl, "-n", $WritableBridgeNodeId, "-v", "$writeValue")
if ($w.ExitCode -ne 0 -or $w.Output -notmatch "Write successful") {
Write-Fail "local-side write failed"
Write-Host $w.Output
$results += @{ Passed = $false; Reason = "reverse-bridge write failed" }
} else {
Write-Info "local write ok, waiting 2s for driver propagate"
Start-Sleep -Seconds 2
$r = Invoke-Cli -Cli $opcUaCli -Args @("read", "-u", $RemoteUrl, "-n", $WritableRemoteNodeId)
if ($r.ExitCode -eq 0 -and $r.Output -match "Value:\s+$([Regex]::Escape("$writeValue"))\b") {
Write-Pass "remote reads back $writeValue"
$results += @{ Passed = $true }
} else {
Write-Fail "remote value did not reflect $writeValue"
Write-Host $r.Output
$results += @{ Passed = $false; Reason = "reverse-bridge readback mismatch" }
}
}
}
# ---------------------------------------------------------------------------
# Stage 6 — Browse mirror. Walks -BridgeRootNodeId to -BrowseDepth levels. The
# BrowseCommand emits one line per encountered node; we count non-empty lines
# minus the root-summary line and compare against -BrowseMinNodes. A naked
# i=85 root always has something; a narrower dev-specific root is stricter.
# ---------------------------------------------------------------------------
Write-Header "Browse mirror"
$br = Invoke-Cli -Cli $opcUaCli -Args @(
"browse", "-u", $OpcUaUrl, "-n", $BridgeRootNodeId,
"-r", "-d", "$BrowseDepth")
if ($br.ExitCode -ne 0) {
Write-Fail "browse failed (exit=$($br.ExitCode))"
Write-Host $br.Output
$results += @{ Passed = $false; Reason = "browse failed" }
} else {
# BrowseCommand prints one line per node: `[Type] Name (NodeId: xxx)` with
# indentation for depth. Count every line carrying a NodeId marker.
$nodeLines = @(($br.Output -split "`n") | Where-Object { $_ -match "\(NodeId:" })
$count = $nodeLines.Count
if ($count -ge $BrowseMinNodes) {
Write-Pass "$count descendants under $BridgeRootNodeId (>= $BrowseMinNodes)"
$results += @{ Passed = $true }
} else {
Write-Fail "only $count descendants — expected >= $BrowseMinNodes"
Write-Host $br.Output
$results += @{ Passed = $false; Reason = "browse under-populated" }
}
}
# ---------------------------------------------------------------------------
# Stage 7 — Alarm fires. opc-plc with --alm (set in the compose) cycles a
# TripAlarm Condition autonomously. The local alarm subscription should
# surface at least one Active transition within the wait window. Opt-in:
# requires the user to know the local mirror of the upstream alarm Condition.
# ---------------------------------------------------------------------------
if ([string]::IsNullOrEmpty($AlarmNodeId)) {
Write-Header "Alarm fires"
Write-Skip "AlarmNodeId not supplied — skipping alarm stage"
} else {
Write-Header "Alarm fires"
$stdout = New-TemporaryFile
$stderr = New-TemporaryFile
$allArgs = @($opcUaCli.PrefixArgs) + @(
"alarms", "-u", $OpcUaUrl, "-n", $AlarmNodeId, "-i", "500", "--refresh")
$proc = Start-Process -FilePath $opcUaCli.File `
-ArgumentList $allArgs -NoNewWindow -PassThru `
-RedirectStandardOutput $stdout.FullName `
-RedirectStandardError $stderr.FullName
Write-Info "alarm subscription started (pid $($proc.Id)), waiting ${AlarmWaitSec}s for opc-plc alarm cycle"
Start-Sleep -Seconds $AlarmWaitSec
if (-not $proc.HasExited) { Stop-Process -Id $proc.Id -Force }
$out = (Get-Content $stdout.FullName -Raw) + (Get-Content $stderr.FullName -Raw)
Remove-Item $stdout.FullName, $stderr.FullName -ErrorAction SilentlyContinue
if ($out -match "ALARM\b" -and $out -match "Active\b") {
Write-Pass "alarm condition fired with Active state"
$results += @{ Passed = $true }
} else {
Write-Fail "no Active alarm event observed in ${AlarmWaitSec}s — check opc-plc compose has --alm + the AlarmNodeId is the local mirror of the upstream Condition"
Write-Host $out
$results += @{ Passed = $false; Reason = "no alarm event" }
}
}
# ---------------------------------------------------------------------------
# Stage 8 — History read. IHistoryProvider dispatch to the upstream's
# HistoryRead service. opc-plc does NOT historize by default, so this stage
# SKIPs when -HistoryNodeId is empty. Against a historizing upstream (Prosys,
# UA Expert sample server, AVEVA Historian) point -HistoryNodeId at the local
# mirror of a historized node.
# ---------------------------------------------------------------------------
if ([string]::IsNullOrEmpty($HistoryNodeId)) {
Write-Header "History read"
Write-Skip "HistoryNodeId not supplied — opc-plc default does not historize; supply a historized-upstream mirror NodeId to enable."
} else {
$results += Test-HistoryHasSamples `
-OpcUaCli $opcUaCli `
-OpcUaUrl $OpcUaUrl `
-NodeId $HistoryNodeId `
-LookbackSec $HistoryLookbackSec
}
Write-Summary -Title "OpcUaClient e2e" -Results $results
if ($results | Where-Object { -not $_.Passed }) { exit 1 }
-108
View File
@@ -1,108 +0,0 @@
<#
.SYNOPSIS
Registers the OtOpcUaFocasHost Windows service. Optional companion to
Install-Services.ps1 only run this on nodes where FOCAS driver instances will run
with Tier-C process isolation enabled.
.DESCRIPTION
FOCAS PR #220 / Tier-C isolation plan. Wraps OtOpcUa.Driver.FOCAS.Host.exe (net48 x86)
as a Windows service using NSSM, running under the same service account as the main
OtOpcUa service so the named-pipe ACL works. Passes the per-process shared secret via
environment variable at service-start time so it never hits disk.
.PARAMETER InstallRoot
Where the FOCAS Host binaries live (typically
C:\Program Files\OtOpcUa\Driver.FOCAS.Host).
.PARAMETER ServiceAccount
Service account SID or DOMAIN\name. Must match the main OtOpcUa server account so the
PipeAcl match succeeds.
.PARAMETER FocasSharedSecret
Per-process secret passed via env var. Generated freshly per install if not supplied.
.PARAMETER FocasBackend
Backend selector for the Host process. One of:
fwlib32 (default real Fanuc Fwlib32.dll integration; requires licensed DLL on PATH)
fake (in-memory; smoke-test mode)
unconfigured (safe default returning structured errors; use until hardware is wired)
.PARAMETER FocasPipeName
Pipe name the Host listens on. Default: OtOpcUaFocas.
.EXAMPLE
.\Install-FocasHost.ps1 -InstallRoot 'C:\Program Files\OtOpcUa\Driver.FOCAS.Host' `
-ServiceAccount 'OTOPCUA\svc-otopcua' -FocasBackend fwlib32
#>
[CmdletBinding()]
param(
[Parameter(Mandatory)] [string]$InstallRoot,
[Parameter(Mandatory)] [string]$ServiceAccount,
[string]$FocasSharedSecret,
[ValidateSet('fwlib32','fake','unconfigured')] [string]$FocasBackend = 'unconfigured',
[string]$FocasPipeName = 'OtOpcUaFocas',
[string]$ServiceName = 'OtOpcUaFocasHost',
[string]$NssmPath = 'C:\Program Files\nssm\nssm.exe'
)
$ErrorActionPreference = 'Stop'
function Resolve-Sid {
param([string]$Account)
if ($Account -match '^S-\d-\d+') { return $Account }
try {
$nt = New-Object System.Security.Principal.NTAccount($Account)
return $nt.Translate([System.Security.Principal.SecurityIdentifier]).Value
} catch {
throw "Could not resolve '$Account' to a SID. Pass an explicit SID or check the account name."
}
}
if (-not (Test-Path $NssmPath)) {
throw "nssm.exe not found at '$NssmPath'. Install NSSM or pass -NssmPath."
}
$hostExe = Join-Path $InstallRoot 'OtOpcUa.Driver.FOCAS.Host.exe'
if (-not (Test-Path $hostExe)) {
throw "FOCAS Host binary not found at '$hostExe'. Publish the Driver.FOCAS.Host project first."
}
if (-not $FocasSharedSecret) {
$FocasSharedSecret = [System.Guid]::NewGuid().ToString('N')
Write-Host "Generated FocasSharedSecret — store it alongside the OtOpcUa service config."
}
$allowedSid = Resolve-Sid $ServiceAccount
# Idempotent install — remove + re-create if present.
$existing = Get-Service -Name $ServiceName -ErrorAction SilentlyContinue
if ($existing) {
Write-Host "Removing existing '$ServiceName' service..."
& $NssmPath stop $ServiceName confirm | Out-Null
& $NssmPath remove $ServiceName confirm | Out-Null
}
& $NssmPath install $ServiceName $hostExe | Out-Null
& $NssmPath set $ServiceName DisplayName 'OT-OPC-UA FOCAS Host (Tier-C isolated Fwlib32)' | Out-Null
& $NssmPath set $ServiceName Description 'Out-of-process Fwlib32.dll host for OtOpcUa FOCAS driver. Crash-isolated from the main OPC UA server.' | Out-Null
& $NssmPath set $ServiceName ObjectName $ServiceAccount | Out-Null
& $NssmPath set $ServiceName Start SERVICE_AUTO_START | Out-Null
& $NssmPath set $ServiceName AppStdout (Join-Path $env:ProgramData 'OtOpcUa\focas-host-stdout.log') | Out-Null
& $NssmPath set $ServiceName AppStderr (Join-Path $env:ProgramData 'OtOpcUa\focas-host-stderr.log') | Out-Null
& $NssmPath set $ServiceName AppRotateFiles 1 | Out-Null
& $NssmPath set $ServiceName AppRotateBytes 10485760 | Out-Null
& $NssmPath set $ServiceName AppEnvironmentExtra `
"OTOPCUA_FOCAS_PIPE=$FocasPipeName" `
"OTOPCUA_ALLOWED_SID=$allowedSid" `
"OTOPCUA_FOCAS_SECRET=$FocasSharedSecret" `
"OTOPCUA_FOCAS_BACKEND=$FocasBackend" | Out-Null
& $NssmPath set $ServiceName DependOnService OtOpcUa | Out-Null
Write-Host "Installed '$ServiceName' under '$ServiceAccount' (SID=$allowedSid)."
Write-Host "Pipe: \\.\pipe\$FocasPipeName Backend: $FocasBackend"
Write-Host "Start the service with: Start-Service $ServiceName"
Write-Host ""
Write-Host "NOTE: the Fwlib32 backend requires the licensed Fwlib32.dll on PATH"
Write-Host "alongside the Host exe. See docs/v2/focas-deployment.md."
+87
View File
@@ -0,0 +1,87 @@
# Integration runners
Scripts that orchestrate multi-component integration-test loops —
each one wires up docker fixtures, support binaries, and `dotnet test`
in sequence so a developer (or a CI agent) can get from "freshly
cloned repo" to "green integration suite" with one command.
Unlike `scripts/e2e/test-*.ps1` (which drive the built server through
the CLI for black-box coverage), scripts in this folder operate
**below** the server layer — they bring up the raw fixtures the
driver-level `IntegrationTests` projects need.
## Scripts
| Script | Purpose |
|--------|---------|
| [`run-focas.ps1`](run-focas.ps1) | FOCAS driver: builds shim DLLs + starts focas-mock docker + copies shim into test bin + runs `WireCompatGatedTests` + `FocasSimSmokeTests` + tears down docker |
## run-focas.ps1
### Prerequisites
- **Windows + PowerShell 7+**
- **.NET 10 SDK** — `dotnet --version` prints 10.x
- **Native C compiler** — one of:
- Visual Studio Build Tools with the C++ workload (then run from an
"x64 Native Tools Command Prompt for VS" shell), or
- Zig (`zig.exe` on PATH) as a drop-in alternative
- **Docker Desktop** running, OR pass `-SkipDocker` and run the mock
externally
### One-shot run
```powershell
cd C:\Users\dohertj2\Desktop\lmxopcua
pwsh .\scripts\integration\run-focas.ps1
```
That's the default invocation: thirtyone profile, debug build,
docker cleans up on exit.
### Development iteration
Re-run the tests without rebuilding the shim or restarting docker:
```powershell
# First run bootstraps everything + keeps the mock up.
pwsh .\scripts\integration\run-focas.ps1 -KeepDocker
# Iterate on test bodies without re-doing the slow steps.
pwsh .\scripts\integration\run-focas.ps1 -SkipShimBuild -SkipDocker
```
### Per-series runs
```powershell
pwsh .\scripts\integration\run-focas.ps1 -Profile thirty # 30i series
pwsh .\scripts\integration\run-focas.ps1 -Profile zerod # 0i-D
pwsh .\scripts\integration\run-focas.ps1 -Profile powermotion
```
Full profile list is in
`tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/docker-compose.yml`.
### Exit codes
| Code | Meaning |
|------|---------|
| 0 | All tests passed or cleanly skipped |
| 1 | `dotnet test` reported failures |
| 2 | The runner itself crashed (missing file, unexpected exception) |
| 3 | No C compiler detected for shim build |
| 4 | Docker CLI not on PATH |
### CI integration
Wire this into the project's CI runner (Gitea Actions, Jenkins,
whatever's hosting this repo) by calling:
```yaml
- name: FOCAS integration
shell: pwsh
run: ./scripts/integration/run-focas.ps1
```
The script is idempotent; a previous run's `docker compose down`
failure won't block the next one.
+123
View File
@@ -0,0 +1,123 @@
#Requires -Version 7.0
<#
.SYNOPSIS
Orchestrates the FOCAS driver integration-test loop: bring up the
focas-mock Docker container, run the managed wire-client integration
tests, tear down Docker.
.DESCRIPTION
The FOCAS integration fixture now needs just two things running
together:
1. A single focas-mock container listening on :8193 (one service,
no per-series compose profile ceremony the mock's native
FOCAS Ethernet responder handles every call the managed driver
issues).
2. The integration-test assembly built. The managed
`WireFocasClient` dials the mock directly; there is no shim
DLL, no P/Invoke, no test-bin DLL copy step.
This script handles both and cleans up on exit.
Designed to run unattended on a build agent or on a developer box.
Exit code matches the test suite (0 = all pass or skip-clean,
non-zero when any integration test failed).
.PARAMETER Profile
focas-mock profile name to seed at startup (e.g. `ThirtyOne_i`,
`Sixteen_i`, `fwlib30i64`). Defaults to `ThirtyOne_i`. The fixture
resolves aliases via `FocasCncSeries`, and tests that need per-series
state can call `fixture.LoadProfileAsync` directly at test start to
override the default.
.PARAMETER SkipDocker
Skip docker up/down. Use when the mock is already running from
another shell.
.PARAMETER Configuration
Build configuration Debug or Release. Default: Debug.
.PARAMETER KeepDocker
Don't tear down the docker stack on exit. Useful for iterating on
tests.
#>
param(
[string]$Profile = "ThirtyOne_i",
[switch]$SkipDocker,
[ValidateSet("Debug", "Release")]
[string]$Configuration = "Debug",
[switch]$KeepDocker
)
Set-StrictMode -Version 3.0
$ErrorActionPreference = "Stop"
$repoRoot = (Resolve-Path (Join-Path $PSScriptRoot "..\..")).Path
$integTests = Join-Path $repoRoot "tests\ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests"
$dockerYml = Join-Path $integTests "Docker\docker-compose.yml"
function Write-Step { param([string]$Msg) Write-Host ""; Write-Host "=== $Msg ===" -ForegroundColor Cyan }
function Write-Info { param([string]$Msg) Write-Host "[INFO] $Msg" -ForegroundColor Gray }
function Write-Fail { param([string]$Msg) Write-Host "[FAIL] $Msg" -ForegroundColor Red }
$cleanupScripts = @()
trap {
Write-Host ""
Write-Fail "run-focas.ps1 crashed: $_"
foreach ($c in $cleanupScripts) { try { & $c } catch { Write-Host "cleanup failed: $_" -ForegroundColor DarkYellow } }
exit 2
}
Write-Step "Build FOCAS IntegrationTests ($Configuration)"
dotnet build (Join-Path $integTests "ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests.csproj") `
--configuration $Configuration --nologo --verbosity minimal
if ($LASTEXITCODE -ne 0) { throw "dotnet build failed (exit $LASTEXITCODE)" }
if (-not $SkipDocker) {
Write-Step "docker compose up"
if (-not (Get-Command docker -ErrorAction SilentlyContinue)) {
Write-Fail "docker CLI not on PATH. Install Docker Desktop or pass -SkipDocker + run the mock externally."
exit 4
}
docker compose -f $dockerYml up -d --build --wait 2>&1 | Write-Host
if ($LASTEXITCODE -ne 0) { throw "docker compose up failed (exit $LASTEXITCODE)" }
if (-not $KeepDocker) {
$cleanupScripts += {
Write-Step "docker compose down"
docker compose -f $dockerYml down --remove-orphans 2>&1 | Write-Host
}
}
Write-Info "probing localhost:8193..."
$tcp = [System.Net.Sockets.TcpClient]::new()
try {
$ok = $tcp.ConnectAsync("127.0.0.1", 8193).Wait([TimeSpan]::FromSeconds(5))
if (-not $ok -or -not $tcp.Connected) {
throw "TCP probe to localhost:8193 failed after docker compose --wait succeeded"
}
}
finally { $tcp.Dispose() }
Write-Info "mock is accepting connections"
}
else {
Write-Step "Docker (skipped)"
}
Write-Step "dotnet test (wire-backend integration)"
$env:OTOPCUA_FOCAS_SIM_PROFILE = $Profile
dotnet test (Join-Path $integTests "ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests.csproj") `
--configuration $Configuration --no-build --nologo --verbosity minimal
$testExit = $LASTEXITCODE
foreach ($c in $cleanupScripts) { & $c }
if ($testExit -ne 0) {
Write-Fail "integration tests failed with exit $testExit"
exit $testExit
}
Write-Host ""
Write-Host "run-focas.ps1 completed successfully." -ForegroundColor Green
exit 0
+31 -11
View File
@@ -71,6 +71,13 @@ INSERT dbo.ClusterNode(NodeId, ClusterId, RedundancyRole, Host, OpcUaPort, Dashb
VALUES (@NodeId, @ClusterId, 'Primary', 'localhost', 4840, 5000,
'urn:OtOpcUa:p7-smoke-node', 200, 1, 'p7-smoke');
-- sp_GetCurrentGenerationForCluster gates access by SUSER_SNAME() against
-- ClusterNodeCredential; without this binding the Server bootstrap fails with
-- `Unauthorized: caller sa is not bound to NodeId p7-smoke-node`. Dev Docker
-- SQL runs with `sa`; production deploys would rotate to a per-node login.
INSERT dbo.ClusterNodeCredential(NodeId, Kind, Value, Enabled, CreatedBy)
VALUES (@NodeId, N'SqlLogin', N'sa', 1, N'p7-smoke');
-- 2. Generation (created Draft, flipped to Published at the end so insert order
-- constraints (one Draft per cluster, etc.) don't fight us).
DECLARE @Gen bigint;
@@ -106,30 +113,43 @@ VALUES (@Gen, @DrvId, @ClusterId, @NsId, 'galaxy-smoke', 'Galaxy', N'{
}', 1);
-- 6. One driver-sourced Tag bound to the Equipment. TagConfig is the Galaxy
-- fullRef ("DelmiaReceiver_001.DownloadPath" style); replace with a real
-- attribute on this Galaxy. The script paths below use
-- /lab-floor/galaxy-line/reactor-1/Source which the EquipmentNodeWalker
-- emits + the DriverSubscriptionBridge maps to this driver fullRef.
-- fullRef; the EquipmentNodeWalker reads the JSON's FullName field and
-- hands that string to the Galaxy driver as the MXAccess reference.
-- Default points at TestMachine_001.TestHistoryValue — the live dev-box
-- Galaxy ships it as Int32 + writable (security_classification=Operate)
-- + historized (HistoryExtension primitive), so the E2E script can
-- exercise read + write + subscribe + alarm + history against the
-- same attribute. Swap to any other writable historized attribute on
-- this Galaxy by re-running `gr/queries/attributes_extended.sql` with
-- `WHERE is_historized=1 AND security_classification > 0`.
INSERT dbo.Tag(GenerationId, TagId, DriverInstanceId, EquipmentId, Name, DataType,
AccessLevel, TagConfig, WriteIdempotent)
VALUES (@Gen, @TagId, @DrvId, @EqId, 'Source', 'Float64', 'Read',
N'{"FullName":"REPLACE_WITH_REAL_GALAXY_ATTRIBUTE","DataType":"Float64"}', 0);
VALUES (@Gen, @TagId, @DrvId, @EqId, 'Source', 'Int32', 'ReadWrite',
N'{"FullName":"TestMachine_001.TestHistoryValue","DataType":"Int32"}', 0);
-- 7. Scripts (SourceHash is SHA-256 of SourceCode, computed externally — using
-- a placeholder here; the engine recomputes on first use anyway).
--
-- MachineStatus predicate — the dynamic "is the machine running?" status
-- derived from the raw Source value. Boolean: true when Source > 0. Matches
-- the shape Aveva operators typically want on a machine-status tile (green
-- when above zero, otherwise grey) without needing a separate threshold
-- attribute. Rename + re-threshold here to mirror site semantics.
INSERT dbo.Script(GenerationId, ScriptId, Name, SourceCode, SourceHash, Language)
VALUES
(@Gen, @VtScript, 'doubled-source',
N'return ((double)ctx.GetTag("/lab-floor/galaxy-line/reactor-1/Source").Value) * 2.0;',
(@Gen, @VtScript, 'machine-status-predicate',
N'return System.Convert.ToInt32(ctx.GetTag("/lab-floor/galaxy-line/reactor-1/Source").Value) > 0;',
'0000000000000000000000000000000000000000000000000000000000000000', 'CSharp'),
(@Gen, @AlScript, 'overtemp-predicate',
N'return ((double)ctx.GetTag("/lab-floor/galaxy-line/reactor-1/Source").Value) > 50.0;',
N'return System.Convert.ToInt32(ctx.GetTag("/lab-floor/galaxy-line/reactor-1/Source").Value) > 50;',
'0000000000000000000000000000000000000000000000000000000000000000', 'CSharp');
-- 8. VirtualTag — derived value computed by Roslyn each time Source changes.
-- 8. VirtualTag — MachineStatus boolean computed by Roslyn each time Source
-- changes. Historized so the dashboard can plot a running/idle timeline
-- next to the raw TestHistoryValue trend from Aveva Historian.
INSERT dbo.VirtualTag(GenerationId, VirtualTagId, EquipmentId, Name, DataType,
ScriptId, ChangeTriggered, TimerIntervalMs, Historize, Enabled)
VALUES (@Gen, @VtId, @EqId, 'Doubled', 'Float64', @VtScript, 1, NULL, 0, 1);
VALUES (@Gen, @VtId, @EqId, 'MachineStatus', 'Boolean', @VtScript, 1, NULL, 1, 1);
-- 9. ScriptedAlarm — Active when Source > 50.
INSERT dbo.ScriptedAlarm(GenerationId, ScriptedAlarmId, EquipmentId, Name, AlarmType,
@@ -51,6 +51,7 @@ else
<li class="nav-item"><button class="nav-link @Tab("uns")" @onclick='() => _tab = "uns"'>UNS Structure</button></li>
<li class="nav-item"><button class="nav-link @Tab("namespaces")" @onclick='() => _tab = "namespaces"'>Namespaces</button></li>
<li class="nav-item"><button class="nav-link @Tab("drivers")" @onclick='() => _tab = "drivers"'>Drivers</button></li>
<li class="nav-item"><button class="nav-link @Tab("tags")" @onclick='() => _tab = "tags"'>Tags</button></li>
<li class="nav-item"><button class="nav-link @Tab("acls")" @onclick='() => _tab = "acls"'>ACLs</button></li>
<li class="nav-item"><button class="nav-link @Tab("redundancy")" @onclick='() => _tab = "redundancy"'>Redundancy</button></li>
<li class="nav-item"><button class="nav-link @Tab("audit")" @onclick='() => _tab = "audit"'>Audit</button></li>
@@ -89,6 +90,10 @@ else
{
<DriversTab GenerationId="@_currentDraft.GenerationId" ClusterId="@ClusterId"/>
}
else if (_tab == "tags" && _currentDraft is not null)
{
<TagsTab GenerationId="@_currentDraft.GenerationId" ClusterId="@ClusterId"/>
}
else if (_tab == "acls" && _currentDraft is not null)
{
<AclsTab GenerationId="@_currentDraft.GenerationId" ClusterId="@ClusterId"/>
@@ -1,3 +1,5 @@
@using System.Text.Json
@using ZB.MOM.WW.OtOpcUa.Admin.Components.Pages.Modbus
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@inject DriverInstanceService DriverSvc
@@ -17,7 +19,21 @@ else
<tbody>
@foreach (var d in _drivers)
{
<tr><td><code>@d.DriverInstanceId</code></td><td>@d.Name</td><td>@d.DriverType</td><td><code>@d.NamespaceId</code></td></tr>
<tr>
<td><code>@d.DriverInstanceId</code></td>
<td>@d.Name</td>
<td>
@if (string.Equals(d.DriverType, "Focas", StringComparison.OrdinalIgnoreCase))
{
<a href="/drivers/focas/@d.DriverInstanceId">@d.DriverType</a>
}
else
{
@d.DriverType
}
</td>
<td><code>@d.NamespaceId</code></td>
</tr>
}
</tbody>
</table>
@@ -36,13 +52,14 @@ else
<label class="form-label">DriverType</label>
<select class="form-select" @bind="_type">
<option>Galaxy</option>
<option>ModbusTcp</option>
<option>Modbus</option>
<option>AbCip</option>
<option>AbLegacy</option>
<option>S7</option>
<option>Focas</option>
<option>OpcUaClient</option>
</select>
<div class="form-text">Type string must match the driver's registered factory name; this dropdown wraps the canonical names.</div>
</div>
<div class="col-md-6">
<label class="form-label">Namespace</label>
@@ -51,9 +68,19 @@ else
</select>
</div>
<div class="col-12">
<label class="form-label">DriverConfig JSON (schemaless per driver type)</label>
<textarea class="form-control font-monospace" rows="6" @bind="_config"></textarea>
<div class="form-text">Phase 1: generic JSON editor — per-driver schema validation arrives in each driver's phase (decision #94).</div>
@if (string.Equals(_type, "Modbus", StringComparison.OrdinalIgnoreCase))
{
@* #147 — typed editor for Modbus drivers. The generic textarea is a fall-back
for driver types that haven't yet shipped a typed editor. *@
<label class="form-label">Modbus options (typed editor)</label>
<ModbusOptionsEditor Model="_modbusOptions"/>
}
else
{
<label class="form-label">DriverConfig JSON (schemaless per driver type)</label>
<textarea class="form-control font-monospace" rows="6" @bind="_config"></textarea>
<div class="form-text">Phase 1: generic JSON editor — per-driver schema validation arrives in each driver's phase (decision #94).</div>
}
</div>
</div>
@if (_error is not null) { <div class="alert alert-danger mt-3">@_error</div> }
@@ -73,11 +100,17 @@ else
private List<Namespace>? _namespaces;
private bool _showForm;
private string _name = string.Empty;
private string _type = "ModbusTcp";
private string _type = "Modbus";
private string _nsId = string.Empty;
private string _config = "{}";
private string? _error;
// #147 — typed editor model for Modbus drivers. Defaults match ModbusDriverOptions
// defaults so an unedited form produces config equivalent to the historical
// pre-typed-editor wire output. Serialised to _config on Save when type=Modbus.
private ModbusOptionsEditor.ModbusOptionsViewModel _modbusOptions = new();
private static readonly JsonSerializerOptions ModbusJsonOptions = new() { WriteIndented = true };
protected override async Task OnParametersSetAsync() => await ReloadAsync();
private async Task ReloadAsync()
@@ -97,11 +130,57 @@ else
}
try
{
await DriverSvc.AddAsync(GenerationId, ClusterId, _nsId, _name, _type, _config, CancellationToken.None);
// #147 — for Modbus drivers serialize the typed editor model into the DriverConfig
// JSON column. Other driver types still use the raw textarea contents until each
// ships its own typed editor (decision #94 — per-driver schema validation arrives
// per driver phase).
var configJson = string.Equals(_type, "Modbus", StringComparison.OrdinalIgnoreCase)
? SerializeModbusOptions(_modbusOptions)
: _config;
await DriverSvc.AddAsync(GenerationId, ClusterId, _nsId, _name, _type, configJson, CancellationToken.None);
_name = string.Empty; _config = "{}";
_modbusOptions = new();
_showForm = false;
await ReloadAsync();
}
catch (Exception ex) { _error = ex.Message; }
}
/// <summary>
/// Maps the view-model field names onto the JSON shape <c>ModbusDriverFactoryExtensions</c>
/// consumes. Hand-rolled because the DTO uses millisecond / byte field flavours that the
/// view model exposes as TimeSpan-derived integers; a System.Text.Json round-trip would
/// emit the .NET-native names instead.
/// </summary>
private static string SerializeModbusOptions(ModbusOptionsEditor.ModbusOptionsViewModel m) =>
JsonSerializer.Serialize(new
{
host = m.Host,
port = m.Port,
unitId = m.UnitId,
family = m.Family.ToString(),
melsecSubFamily = m.MelsecSubFamily.ToString(),
keepAlive = new
{
enabled = m.KeepAliveEnabled,
timeMs = m.KeepAliveTimeSec * 1000,
intervalMs = m.KeepAliveIntervalSec * 1000,
retryCount = m.KeepAliveRetryCount,
},
reconnect = new
{
initialDelayMs = m.ReconnectInitialDelayMs,
maxDelayMs = m.ReconnectMaxDelayMs,
backoffMultiplier = m.ReconnectBackoffMultiplier,
},
maxRegistersPerRead = m.MaxRegistersPerRead,
maxRegistersPerWrite = m.MaxRegistersPerWrite,
maxCoilsPerRead = m.MaxCoilsPerRead,
maxReadGap = m.MaxReadGap,
useFC15ForSingleCoilWrites = m.UseFC15ForSingleCoilWrites,
useFC16ForSingleRegisterWrites = m.UseFC16ForSingleRegisterWrites,
writeOnChangeOnly = m.WriteOnChangeOnly,
tags = Array.Empty<object>(),
}, ModbusJsonOptions);
}
@@ -26,6 +26,15 @@
reservation conflict exists.
</div>
<div class="alert alert-secondary small mb-3">
<strong>Per-tag addressing for Modbus drivers</strong> isn't part of equipment import —
tags are configured at the driver-instance level via the
<a href="/clusters/@ClusterId/draft/@GenerationId">Drivers tab</a>. Use the
<a href="/modbus/address-preview" target="_blank">address-preview tool</a> to sanity-check
grammar strings (<code>40001:F:CDAB</code>, <code>HR1:I</code>, <code>V2000</code> for
DL205 family, etc.) before pasting them into the driver config.
</div>
<div class="card mb-3">
<div class="card-body">
<div class="row g-3">
@@ -0,0 +1,372 @@
@using System.Text.Json
@using ZB.MOM.WW.OtOpcUa.Admin.Components.Pages.Modbus
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@using ZB.MOM.WW.OtOpcUa.Configuration.Enums
@using ZB.MOM.WW.OtOpcUa.Driver.Modbus
@inject TagService TagSvc
@inject DriverInstanceService DriverSvc
@inject EquipmentService EquipmentSvc
@*
#155 — interactive Tag CRUD scoped to a draft generation. Conditional editor: when the
selected DriverInstance is Modbus, the address input switches to ModbusAddressEditor (#145)
so users get the live-parse preview + grammar validation. Other driver types fall back to
a generic JSON textarea, matching the DriversTab pattern from #147.
*@
<div class="d-flex justify-content-between mb-3">
<h4>Tags (draft gen @GenerationId)</h4>
<button class="btn btn-primary btn-sm" @onclick="StartAdd">Add tag</button>
</div>
<div class="row g-3 mb-3">
<div class="col-md-4">
<label class="form-label small text-muted">Filter by driver</label>
<select class="form-select form-select-sm" @bind="_filterDriverId" @bind:after="ReloadAsync">
<option value="">— all drivers —</option>
@if (_drivers is not null)
{
@foreach (var d in _drivers)
{
<option value="@d.DriverInstanceId">@d.Name (@d.DriverType)</option>
}
}
</select>
</div>
</div>
@if (_tags is null) { <p>Loading…</p> }
else if (_tags.Count == 0 && !_showForm) { <p class="text-muted">No tags in this filter.</p> }
else if (_tags.Count > 0)
{
<table class="table table-sm">
<thead>
<tr><th>Name</th><th>Driver</th><th>Equipment</th><th>DataType</th><th>Access</th><th>TagConfig</th><th></th></tr>
</thead>
<tbody>
@foreach (var t in _tags)
{
<tr>
<td>@t.Name</td>
<td><code>@t.DriverInstanceId</code></td>
<td>@(t.EquipmentId ?? "—")</td>
<td>@t.DataType</td>
<td>@t.AccessLevel</td>
<td class="font-monospace small text-truncate" style="max-width:18rem">@t.TagConfig</td>
<td>
<button class="btn btn-sm btn-outline-secondary me-1" @onclick="() => StartEdit(t)">Edit</button>
<button class="btn btn-sm btn-outline-danger" @onclick="() => DeleteAsync(t.TagRowId)">Remove</button>
</td>
</tr>
}
</tbody>
</table>
}
@if (_showForm)
{
<div class="card mt-3">
<div class="card-body">
<h5>@(_editMode ? "Edit tag" : "New tag")</h5>
<div class="row g-3">
<div class="col-md-4">
<label class="form-label">Name</label>
<input class="form-control" @bind="_draft.Name"/>
</div>
<div class="col-md-4">
<label class="form-label">DriverInstance</label>
<select class="form-select" @bind="_draft.DriverInstanceId" @bind:after="OnDriverChanged">
<option value="">— select driver —</option>
@if (_drivers is not null)
{
@foreach (var d in _drivers) { <option value="@d.DriverInstanceId">@d.Name (@d.DriverType)</option> }
}
</select>
</div>
<div class="col-md-4">
<label class="form-label">Equipment (optional)</label>
<select class="form-select" @bind="_draft.EquipmentId">
<option value="">— none (folder-path mode) —</option>
@if (_equipment is not null)
{
@foreach (var e in _equipment) { <option value="@e.EquipmentId">@e.Name</option> }
}
</select>
</div>
<div class="col-md-4">
<label class="form-label">DataType</label>
<input class="form-control" @bind="_draft.DataType" placeholder="Boolean / Int32 / Float / etc."/>
</div>
<div class="col-md-4">
<label class="form-label">AccessLevel</label>
<select class="form-select" @bind="_draft.AccessLevel">
@foreach (var a in Enum.GetValues<TagAccessLevel>())
{
<option value="@a">@a</option>
}
</select>
</div>
<div class="col-md-4">
<div class="form-check mt-4">
<input type="checkbox" class="form-check-input" @bind="_draft.WriteIdempotent"/>
<label class="form-check-label">WriteIdempotent</label>
</div>
</div>
</div>
<div class="mt-3">
@if (_isModbus)
{
<ModbusAddressEditor @bind-AddressString="_modbusAddress"
@bind-AddressString:after="OnAddressChanged"/>
}
else
{
<label class="form-label">TagConfig (driver-specific JSON or string)</label>
<textarea class="form-control font-monospace" rows="3" @bind="_draft.TagConfig"
placeholder='@("{\"address\": ...}")'></textarea>
}
</div>
@* #156 — advanced Modbus fields. Collapsed by default; the basic form covers the
typical edit workflow. Expander surfaces Deadband (#141) / UnitId override (#142) /
CoalesceProhibited (#143) for the multi-slave / noisy-analog / protected-hole
deployments. Save-side flushes these into TagConfig as a structured JSON object
that ModbusTagDto's BuildTag honours alongside the address string. *@
@if (_isModbus)
{
<div class="mt-3">
<button type="button" class="btn btn-sm btn-link p-0"
@onclick="() => _showAdvanced = !_showAdvanced">
@(_showAdvanced ? "▼ Advanced" : "▶ Advanced") (Deadband / UnitId override / CoalesceProhibited)
</button>
</div>
@if (_showAdvanced)
{
<div class="row g-3 mt-1 ps-3 border-start">
<div class="col-md-4">
<label class="form-label small">Deadband
<span class="text-muted">(numeric scalar types only)</span>
</label>
<input type="number" step="any" class="form-control form-control-sm"
@bind="_advancedDeadband" @bind:after="OnAdvancedChanged"
placeholder="e.g. 0.5"/>
<div class="form-text">Suppress publishes when |new - last| &lt; threshold.</div>
</div>
<div class="col-md-4">
<label class="form-label small">UnitId override
<span class="text-muted">(0255, blank = use driver default)</span>
</label>
<input type="number" min="0" max="255" class="form-control form-control-sm"
@bind="_advancedUnitId" @bind:after="OnAdvancedChanged"
placeholder="leave blank for driver-level"/>
<div class="form-text">Per-tag MBAP unit ID. Required when fronting a multi-slave gateway.</div>
</div>
<div class="col-md-4">
<label class="form-label small">CoalesceProhibited</label>
<div class="form-check mt-1">
<input type="checkbox" class="form-check-input"
@bind="_advancedCoalesceProhibited" @bind:after="OnAdvancedChanged"/>
<label class="form-check-label">Read in isolation (#143)</label>
</div>
<div class="form-text">Use when surrounding registers are write-only or fault on read.</div>
</div>
</div>
}
}
@if (_error is not null) { <div class="alert alert-danger mt-3">@_error</div> }
<div class="mt-3">
<button class="btn btn-sm btn-primary" @onclick="SaveAsync">Save</button>
<button class="btn btn-sm btn-secondary ms-2" @onclick="Cancel">Cancel</button>
</div>
</div>
</div>
}
@code {
[Parameter] public long GenerationId { get; set; }
[Parameter] public string ClusterId { get; set; } = string.Empty;
private List<Tag>? _tags;
private List<DriverInstance>? _drivers;
private List<Equipment>? _equipment;
private string _filterDriverId = string.Empty;
private bool _showForm;
private bool _editMode;
private Tag _draft = NewBlankDraft();
private string? _error;
private bool _isModbus;
private string? _modbusAddress;
// #156 — advanced Modbus fields. Bound separately from _draft.TagConfig because they
// round-trip through a structured JSON shape, not a single string. Synced into TagConfig
// by OnAdvancedChanged / OnAddressChanged (whichever fires).
private bool _showAdvanced;
private double? _advancedDeadband;
private byte? _advancedUnitId;
private bool _advancedCoalesceProhibited;
private static Tag NewBlankDraft() => new()
{
TagId = string.Empty, DriverInstanceId = string.Empty, Name = string.Empty,
DataType = "Int32", AccessLevel = TagAccessLevel.Read, TagConfig = string.Empty,
};
protected override async Task OnParametersSetAsync()
{
_drivers = await DriverSvc.ListAsync(GenerationId, CancellationToken.None);
_equipment = await EquipmentSvc.ListAsync(GenerationId, CancellationToken.None);
await ReloadAsync();
}
private async Task ReloadAsync()
{
_tags = await TagSvc.ListAsync(GenerationId,
string.IsNullOrWhiteSpace(_filterDriverId) ? null : _filterDriverId,
equipmentId: null,
CancellationToken.None);
}
private void StartAdd()
{
_draft = NewBlankDraft();
_editMode = false;
_modbusAddress = null;
_isModbus = false;
_error = null;
_showForm = true;
ResetAdvanced();
}
private void ResetAdvanced()
{
_showAdvanced = false;
_advancedDeadband = null;
_advancedUnitId = null;
_advancedCoalesceProhibited = false;
}
private void StartEdit(Tag row)
{
_draft = new Tag
{
TagRowId = row.TagRowId,
GenerationId = row.GenerationId,
TagId = row.TagId,
DriverInstanceId = row.DriverInstanceId,
DeviceId = row.DeviceId,
EquipmentId = row.EquipmentId,
Name = row.Name,
FolderPath = row.FolderPath,
DataType = row.DataType,
AccessLevel = row.AccessLevel,
WriteIdempotent = row.WriteIdempotent,
PollGroupId = row.PollGroupId,
TagConfig = row.TagConfig,
};
_editMode = true;
OnDriverChanged();
// Try to extract addressString + advanced fields from existing JSON config so the
// form pre-fills correctly when an operator hits Edit on an existing row.
ResetAdvanced();
if (_isModbus) HydrateModbusFromTagConfig(row.TagConfig);
_error = null;
_showForm = true;
}
private void HydrateModbusFromTagConfig(string tagConfig)
{
try
{
using var doc = JsonDocument.Parse(tagConfig);
var root = doc.RootElement;
if (root.TryGetProperty("addressString", out var addr) && addr.ValueKind == JsonValueKind.String)
_modbusAddress = addr.GetString();
if (root.TryGetProperty("deadband", out var db) && db.ValueKind is JsonValueKind.Number)
_advancedDeadband = db.GetDouble();
if (root.TryGetProperty("unitId", out var uid) && uid.ValueKind is JsonValueKind.Number)
_advancedUnitId = uid.GetByte();
if (root.TryGetProperty("coalesceProhibited", out var cp) && cp.ValueKind is JsonValueKind.True or JsonValueKind.False)
_advancedCoalesceProhibited = cp.GetBoolean();
// Auto-expand the advanced panel when any of those fields was actually set so
// operators see immediately what's been configured.
if (_advancedDeadband.HasValue || _advancedUnitId.HasValue || _advancedCoalesceProhibited)
_showAdvanced = true;
}
catch { /* Malformed JSON falls back to empty advanced state. */ }
}
private void OnDriverChanged()
{
var driver = _drivers?.FirstOrDefault(d => d.DriverInstanceId == _draft.DriverInstanceId);
_isModbus = driver is not null
&& string.Equals(driver.DriverType, "Modbus", StringComparison.OrdinalIgnoreCase);
}
private void OnAddressChanged() => RefreshTagConfigJson();
private void OnAdvancedChanged() => RefreshTagConfigJson();
/// <summary>
/// Re-serializes the current address + advanced fields into TagConfig as a structured
/// JSON object. ModbusTagDto's BuildTag honours every field — addressString drives
/// the parser, while the structured bits (deadband / unitId / coalesceProhibited)
/// pass through directly. Fields with default / empty values are omitted from the
/// JSON to keep diffs in the existing draft-diff viewer clean.
/// </summary>
private void RefreshTagConfigJson()
{
if (string.IsNullOrWhiteSpace(_modbusAddress)
&& !_advancedDeadband.HasValue
&& !_advancedUnitId.HasValue
&& !_advancedCoalesceProhibited)
{
return;
}
var payload = new Dictionary<string, object?>();
if (!string.IsNullOrWhiteSpace(_modbusAddress)) payload["addressString"] = _modbusAddress;
if (_advancedDeadband.HasValue) payload["deadband"] = _advancedDeadband.Value;
if (_advancedUnitId.HasValue) payload["unitId"] = _advancedUnitId.Value;
if (_advancedCoalesceProhibited) payload["coalesceProhibited"] = true;
_draft.TagConfig = JsonSerializer.Serialize(payload);
}
private void Cancel()
{
_showForm = false;
_editMode = false;
}
private async Task SaveAsync()
{
_error = null;
try
{
if (string.IsNullOrWhiteSpace(_draft.Name) || string.IsNullOrWhiteSpace(_draft.DriverInstanceId))
{
_error = "Name and DriverInstance are required.";
return;
}
if (_editMode)
await TagSvc.UpdateAsync(_draft, CancellationToken.None);
else
await TagSvc.CreateAsync(GenerationId, _draft, CancellationToken.None);
_showForm = false;
_editMode = false;
await ReloadAsync();
}
catch (Exception ex) { _error = ex.Message; }
}
private async Task DeleteAsync(Guid id)
{
await TagSvc.DeleteAsync(id, CancellationToken.None);
await ReloadAsync();
}
}
@@ -0,0 +1,224 @@
@page "/drivers/focas/{InstanceId}"
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@inject FocasDriverDetailService DetailSvc
<h1 class="mb-3">FOCAS driver <code>@InstanceId</code></h1>
@if (_loading)
{
<p>Loading…</p>
}
else if (_detail is null)
{
<div class="alert alert-warning">
No FOCAS driver instance with id <code>@InstanceId</code> was found.
<div class="small text-muted mt-1">
Either the id is wrong, or the instance's <code>DriverType</code> is not "Focas". The list of drivers per cluster draft is on the <a href="/clusters">Clusters</a> page.
</div>
</div>
}
else
{
<div class="row g-3 mb-4">
<div class="col-md-3"><div class="card"><div class="card-body">
<h6 class="text-muted mb-1">Name</h6>
<div class="fs-5">@_detail.Instance.Name</div>
</div></div></div>
<div class="col-md-3"><div class="card"><div class="card-body">
<h6 class="text-muted mb-1">Cluster</h6>
<div class="fs-5"><code>@_detail.Instance.ClusterId</code></div>
</div></div></div>
<div class="col-md-3"><div class="card"><div class="card-body">
<h6 class="text-muted mb-1">Namespace</h6>
<div class="fs-5"><code>@_detail.Instance.NamespaceId</code></div>
</div></div></div>
<div class="col-md-3"><div class="card @(_detail.Instance.Enabled ? "border-success" : "border-secondary")"><div class="card-body">
<h6 class="text-muted mb-1">Enabled</h6>
<div class="fs-5">@(_detail.Instance.Enabled ? "Yes" : "No")</div>
</div></div></div>
</div>
@if (_detail.ParseError is not null)
{
<div class="alert alert-danger">
<strong>DriverConfig JSON failed to parse:</strong> @_detail.ParseError
<div class="small text-muted mt-1">
Falling back to raw-JSON view below; the per-section tables are hidden because the shape couldn't be projected.
</div>
</div>
}
else if (_detail.Config is not null)
{
<h2 class="h5 mt-4">Devices</h2>
@if (_detail.Config.Devices is null || _detail.Config.Devices.Count == 0)
{
<p class="text-muted">No devices configured.</p>
}
else
{
<table class="table table-sm align-middle">
<thead><tr><th>HostAddress</th><th>DeviceName</th><th>Series</th></tr></thead>
<tbody>
@foreach (var d in _detail.Config.Devices)
{
<tr>
<td><code>@d.HostAddress</code></td>
<td>@(d.DeviceName ?? "—")</td>
<td>@(string.IsNullOrEmpty(d.Series) ? "Unknown" : d.Series)</td>
</tr>
}
</tbody>
</table>
}
<h2 class="h5 mt-4">Tags</h2>
@if (_detail.Config.Tags is null || _detail.Config.Tags.Count == 0)
{
<p class="text-muted">No tags configured.</p>
}
else
{
<p class="small text-muted">@_detail.Config.Tags.Count tag(s) configured.</p>
<table class="table table-sm align-middle">
<thead><tr><th>Name</th><th>Device</th><th>Address</th><th>DataType</th><th>Writable</th></tr></thead>
<tbody>
@foreach (var t in _detail.Config.Tags)
{
<tr>
<td>@t.Name</td>
<td><code class="small">@t.DeviceHostAddress</code></td>
<td><code>@t.Address</code></td>
<td>@t.DataType</td>
<td>@(t.Writable ? "Yes" : "No")</td>
</tr>
}
</tbody>
</table>
}
<h2 class="h5 mt-4">Driver behaviour</h2>
<table class="table table-sm align-middle" style="max-width: 640px;">
<tbody>
<tr>
<th style="width: 30%;">Probe</th>
<td>
@if (_detail.Config.Probe is { } probe)
{
<span class="badge @(probe.Enabled ? "bg-success" : "bg-secondary")">@(probe.Enabled ? "Enabled" : "Disabled")</span>
<span class="ms-2 small text-muted">Interval: @(probe.Interval ?? "default")</span>
}
else { <span class="text-muted">default (enabled)</span> }
</td>
</tr>
<tr>
<th>Alarm projection</th>
<td>
@if (_detail.Config.AlarmProjection is { } ap)
{
<span class="badge @(ap.Enabled ? "bg-success" : "bg-secondary")">@(ap.Enabled ? "Enabled" : "Disabled")</span>
<span class="ms-2 small text-muted">PollInterval: @(ap.PollInterval ?? "default")</span>
}
else { <span class="text-muted">disabled (default)</span> }
</td>
</tr>
<tr>
<th>Handle recycling</th>
<td>
@if (_detail.Config.HandleRecycle is { } hr)
{
<span class="badge @(hr.Enabled ? "bg-warning text-dark" : "bg-secondary")">@(hr.Enabled ? "Enabled" : "Disabled")</span>
<span class="ms-2 small text-muted">Interval: @(hr.Interval ?? "default (01:00:00)")</span>
}
else { <span class="text-muted">disabled (default)</span> }
</td>
</tr>
</tbody>
</table>
}
<h2 class="h5 mt-4">Host status</h2>
@if (_detail.HostStatuses.Count == 0)
{
<div class="alert alert-secondary small">
No <code>DriverHostStatus</code> rows yet for this instance. The Server publishes its first
tick ~2 s after the driver starts — if this stays empty after a minute, check that the Server is running and the instance is in a published generation.
</div>
}
else
{
<table class="table table-sm table-hover align-middle">
<thead>
<tr>
<th>Node</th>
<th>Host</th>
<th>State</th>
<th class="text-end" title="Consecutive failures">Fail#</th>
<th>Breaker last opened</th>
<th>Last recycled</th>
<th>Last seen</th>
<th>Detail</th>
</tr>
</thead>
<tbody>
@foreach (var r in _detail.HostStatuses)
{
<tr class="@(IsStale(r) ? "table-warning" : "")">
<td><code>@r.NodeId</code></td>
<td>@r.HostName</td>
<td><span class="badge @StateBadge(r.State)">@r.State</span></td>
<td class="text-end small">@r.ConsecutiveFailures</td>
<td class="small">@FormatUtc(r.LastCircuitBreakerOpenUtc)</td>
<td class="small">@FormatUtc(r.LastRecycleUtc)</td>
<td class="small @(IsStale(r) ? "text-warning" : "")">@FormatAge(r.LastSeenUtc)</td>
<td class="text-truncate small" style="max-width: 240px;" title="@r.Detail">@r.Detail</td>
</tr>
}
</tbody>
</table>
}
<h2 class="h5 mt-4">Raw DriverConfig JSON</h2>
<pre class="small bg-light border p-3"><code>@_detail.Instance.DriverConfig</code></pre>
<div class="mt-4 small text-muted">
Docs: <code>docs/drivers/FOCAS.md</code> (getting started) · <code>docs/v2/focas-deployment.md</code> (NSSM + pipe ACL) · <code>docs/drivers/FOCAS-Test-Fixture.md</code> (test coverage).
</div>
}
@code {
[Parameter] public string InstanceId { get; set; } = string.Empty;
private FocasDriverDetail? _detail;
private bool _loading = true;
protected override async Task OnParametersSetAsync()
{
_loading = true;
try { _detail = await DetailSvc.GetAsync(InstanceId, CancellationToken.None); }
finally { _loading = false; }
}
private static bool IsStale(FocasHostStatusRow r) =>
DateTime.UtcNow - r.LastSeenUtc > TimeSpan.FromSeconds(30);
private static string StateBadge(string state) => state switch
{
"Running" => "bg-success",
"Faulted" => "bg-danger",
"Starting" => "bg-info",
"Stopped" => "bg-secondary",
_ => "bg-secondary",
};
private static string FormatUtc(DateTime? utc) =>
utc is null ? "—" : utc.Value.ToString("yyyy-MM-dd HH:mm 'UTC'");
private static string FormatAge(DateTime utc)
{
var age = DateTime.UtcNow - utc;
if (age.TotalSeconds < 60) return $"{(int)age.TotalSeconds}s ago";
if (age.TotalMinutes < 60) return $"{(int)age.TotalMinutes}m ago";
if (age.TotalHours < 48) return $"{(int)age.TotalHours}h ago";
return utc.ToString("yyyy-MM-dd HH:mm 'UTC'");
}
}
@@ -1,9 +1,12 @@
@page "/hosts"
@using Microsoft.AspNetCore.SignalR.Client
@using Microsoft.EntityFrameworkCore
@using ZB.MOM.WW.OtOpcUa.Admin.Hubs
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Enums
@inject IServiceScopeFactory ScopeFactory
@implements IDisposable
@inject NavigationManager Nav
@implements IAsyncDisposable
<h1 class="mb-4">Driver host status</h1>
@@ -128,6 +131,7 @@ else
private bool _refreshing;
private DateTime? _lastRefreshUtc;
private Timer? _timer;
private HubConnection? _hub;
protected override async Task OnInitializedAsync()
{
@@ -136,6 +140,44 @@ else
state: null,
dueTime: TimeSpan.FromSeconds(RefreshIntervalSeconds),
period: TimeSpan.FromSeconds(RefreshIntervalSeconds));
await ConnectHubAsync();
}
// Phase 6.1 Stream E.2 — subscribe to FleetStatusHub so resilience deltas upsert the
// matching row without waiting for the next RefreshIntervalSeconds tick. The 10 s
// poll stays as a safety net in case the hub connection is down.
private async Task ConnectHubAsync()
{
var hubUrl = Nav.ToAbsoluteUri("/hubs/fleet");
_hub = new HubConnectionBuilder().WithUrl(hubUrl).WithAutomaticReconnect().Build();
_hub.On<ResilienceStatusChangedMessage>("ResilienceStatusChanged", OnResilienceChanged);
try
{
await _hub.StartAsync();
await _hub.SendAsync("SubscribeFleet");
}
catch
{
// Hub is best-effort; polling refresh is the fallback. Swallow connect errors
// so the page still renders against the initial RefreshAsync pass.
}
}
private async Task OnResilienceChanged(ResilienceStatusChangedMessage msg)
{
if (_rows is null) return;
var idx = _rows.FindIndex(r =>
r.DriverInstanceId == msg.DriverInstanceId && r.HostName == msg.HostName);
if (idx < 0) return;
var prior = _rows[idx];
_rows[idx] = prior with
{
ConsecutiveFailures = msg.ConsecutiveFailures,
LastCircuitBreakerOpenUtc = msg.LastCircuitBreakerOpenUtc,
CurrentBulkheadDepth = msg.CurrentBulkheadDepth,
LastRecycleUtc = msg.LastRecycleUtc,
};
await InvokeAsync(StateHasChanged);
}
private async Task RefreshAsync()
@@ -180,5 +222,12 @@ else
return t.ToString("yyyy-MM-dd HH:mm 'UTC'");
}
public void Dispose() => _timer?.Dispose();
public async ValueTask DisposeAsync()
{
_timer?.Dispose();
if (_hub is not null)
{
try { await _hub.DisposeAsync(); } catch { }
}
}
}
@@ -0,0 +1,79 @@
@using ZB.MOM.WW.OtOpcUa.Driver.Modbus
@*
#145 — Live address-string parser preview for Modbus tag editing. Bound to a string
AddressString; on every input keystroke the parser runs and surfaces the resolved
breakdown (Region, PduOffset, DataType, Bit, ByteOrder, ArrayCount, StringLength) or
the parse error. Family flag drives the parser's family-native branch (#144).
Re-uses the same ModbusAddressParser the wire driver uses, so grammar drift is
impossible by construction. Internal-namespace component called from the larger
DriverInstance editor.
*@
<div class="mb-3">
<label class="form-label">Address string</label>
<input type="text" class="form-control @(IsValid ? "is-valid" : Diagnostic is null ? "" : "is-invalid")"
value="@AddressString"
@oninput="@OnInputChanged"
placeholder="e.g. 40001:F:CDAB:5"/>
@if (IsValid && _parsed is not null)
{
<div class="form-text text-success">
<strong>Parsed:</strong>
Region=<code>@_parsed.Region</code>
Offset=<code>@_parsed.Offset</code>
Type=<code>@_parsed.DataType</code>
@if (_parsed.Bit.HasValue) { <text>Bit=<code>@_parsed.Bit</code></text> }
@if (_parsed.ByteOrder != ModbusByteOrder.BigEndian) { <text>Order=<code>@_parsed.ByteOrder</code></text> }
@if (_parsed.ArrayCount.HasValue) { <text>Array[<code>@_parsed.ArrayCount</code>]</text> }
@if (_parsed.StringLength > 0) { <text>StrLen=<code>@_parsed.StringLength</code></text> }
</div>
}
else if (Diagnostic is not null)
{
<div class="invalid-feedback">@Diagnostic</div>
}
</div>
@code {
[Parameter] public string? AddressString { get; set; }
[Parameter] public EventCallback<string?> AddressStringChanged { get; set; }
[Parameter] public ModbusFamily Family { get; set; } = ModbusFamily.Generic;
[Parameter] public MelsecFamily MelsecSubFamily { get; set; } = MelsecFamily.Q_L_iQR;
[Parameter] public EventCallback<ParsedModbusAddress?> ParsedChanged { get; set; }
private ParsedModbusAddress? _parsed;
private string? Diagnostic;
private bool IsValid => _parsed is not null && Diagnostic is null;
protected override void OnParametersSet() => Reparse();
private async Task OnInputChanged(ChangeEventArgs e)
{
AddressString = e.Value as string;
await AddressStringChanged.InvokeAsync(AddressString);
Reparse();
await ParsedChanged.InvokeAsync(_parsed);
}
private void Reparse()
{
if (string.IsNullOrWhiteSpace(AddressString))
{
_parsed = null;
Diagnostic = null;
return;
}
if (ModbusAddressParser.TryParse(AddressString, Family, MelsecSubFamily, out var parsed, out var err))
{
_parsed = parsed;
Diagnostic = null;
}
else
{
_parsed = null;
Diagnostic = err;
}
}
}
@@ -0,0 +1,85 @@
@page "/modbus/address-preview"
@using ZB.MOM.WW.OtOpcUa.Driver.Modbus
@*
#149 — standalone preview / sanity-check tool for Modbus address strings. The Admin UI
doesn't yet have a per-tag CRUD surface (tags are seeded via SQL or arrive at runtime
through ITagDiscovery), so the ModbusAddressEditor component shipped in #145 needs a
page where operators can paste an address string and confirm it parses to what they
expect before committing it to a config row.
Doubles as a "did the parser ship correctly" smoke target for QA + a copy-pasteable
grammar reference for users skimming the docs.
*@
<PageTitle>Modbus address preview</PageTitle>
<div class="container py-4">
<h1>Modbus address preview</h1>
<p class="text-muted">
Paste an address string and watch the parser break it down field by field. Useful for
sanity-checking a tag spreadsheet row before adding it to a driver's <code>DriverConfig</code>.
Full grammar: <a href="https://github.com/" target="_blank">docs/v2/modbus-addressing.md</a>.
</p>
<div class="row g-3">
<div class="col-md-4">
<label class="form-label">PLC family hint (drives the family-native branch)</label>
<select class="form-select" @bind="_family">
@foreach (var f in Enum.GetValues<ModbusFamily>())
{
<option value="@f">@f</option>
}
</select>
</div>
@if (_family == ModbusFamily.MELSEC)
{
<div class="col-md-4">
<label class="form-label">MELSEC sub-family</label>
<select class="form-select" @bind="_melsecSubFamily">
@foreach (var f in Enum.GetValues<MelsecFamily>())
{
<option value="@f">@f</option>
}
</select>
</div>
}
</div>
<div class="mt-4">
<ModbusAddressEditor @bind-AddressString="_address"
Family="_family"
MelsecSubFamily="_melsecSubFamily"/>
</div>
<h3 class="mt-5">Quick-reference grammar</h3>
<pre class="bg-light p-3 rounded small">@_grammarReference</pre>
</div>
@code {
private string? _address;
private ModbusFamily _family = ModbusFamily.Generic;
private MelsecFamily _melsecSubFamily = MelsecFamily.Q_L_iQR;
// Held as a const string rather than inline markup so the Razor parser doesn't try to
// interpret the angle-bracket grammar tokens as element open/close.
private const string _grammarReference = @"<region><offset>[.<bit>][:<type>[<len>]][:<order>][:<count>]
Examples (post-#146 type codes):
40001 HoldingRegisters[0], Int16
400001 same, 6-digit form
40001:F Float32 (HR[0..1])
40001:F:CDAB Float32 word-swapped
40001:STR20 20-char ASCII string
40001:S:5 Int16[5] array (3-field shorthand)
40001:I:CDAB:10 Int32[10] word-swapped (4-field strict)
40001.5 bit 5 of HR[0]
HR1:I Int32 via mnemonic region (matches Wonderware)
C100 Coil 100 (mnemonic, 1-based)
V2000:F:CDAB DL205 V-memory at PDU 1024 (Family=DL205)
D100:I MELSEC D-register 100, Int32 (Family=MELSEC)
Type codes: BOOL, S (Int16), US (UInt16), I (Int32), UI (UInt32),
I_64 (Int64), UI_64 (UInt64), F, D, BCD, BCD_32, STR<n>
Byte order: ABCD (BE default), CDAB (word-swap), BADC (byte-swap), DCBA (full reverse)";
}
@@ -0,0 +1,120 @@
@page "/modbus/diagnostics/{DriverInstanceId}"
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@inject DriverDiagnosticsClient Diagnostics
@*
#154 — operator-facing view of the Server's auto-prohibition state for a Modbus driver.
Fetches via DriverDiagnosticsClient (HttpClient against the Server's HealthEndpointsHost).
Refreshes on demand; auto-refresh is a future task once a SignalR diag channel exists.
*@
<PageTitle>Modbus diagnostics — @DriverInstanceId</PageTitle>
<div class="container py-4">
<h1>Modbus auto-prohibitions</h1>
<p class="text-muted">
Driver instance <code>@DriverInstanceId</code>. Live snapshot of coalesced ranges
the planner has learned to read individually (#148 / #150 / #151 / #152).
</p>
<div class="mb-3">
<button class="btn btn-sm btn-outline-primary" @onclick="LoadAsync" disabled="@_loading">
@(_loading ? "Loading…" : "Refresh")
</button>
@if (_lastRefreshed is not null)
{
<span class="text-muted ms-3 small">Last refreshed @_lastRefreshed.Value.ToLocalTime().ToString("HH:mm:ss")</span>
}
</div>
@if (_error is not null)
{
<div class="alert alert-danger">@_error</div>
}
else if (_response is null)
{
<p class="text-muted">Click <strong>Refresh</strong> to load.</p>
}
else if (_response.Count == 0)
{
<div class="alert alert-success">No auto-prohibitions. The planner is coalescing freely.</div>
}
else
{
<table class="table table-sm">
<thead>
<tr>
<th>Unit</th>
<th>Region</th>
<th>Start</th>
<th>End</th>
<th>Span</th>
<th>Status</th>
<th>Last probed</th>
</tr>
</thead>
<tbody>
@foreach (var r in _response.Ranges.OrderBy(r => r.UnitId).ThenBy(r => r.Region).ThenBy(r => r.StartAddress))
{
<tr>
<td><code>@r.UnitId</code></td>
<td><code>@r.Region</code></td>
<td><code>@r.StartAddress</code></td>
<td><code>@r.EndAddress</code></td>
<td>@(r.EndAddress - r.StartAddress + 1)</td>
<td>
@if (r.BisectionPending)
{
<span class="badge bg-warning text-dark">BISECTING</span>
}
else
{
<span class="badge bg-danger">ISOLATED</span>
}
</td>
<td class="small text-muted">@FormatTimeSince(r.LastProbedUtc)</td>
</tr>
}
</tbody>
</table>
}
</div>
@code {
[Parameter] public string DriverInstanceId { get; set; } = string.Empty;
private ModbusAutoProhibitionsResponse? _response;
private string? _error;
private bool _loading;
private DateTime? _lastRefreshed;
private async Task LoadAsync()
{
_loading = true;
_error = null;
try
{
_response = await Diagnostics.GetModbusAutoProhibitedRangesAsync(DriverInstanceId);
_lastRefreshed = DateTime.UtcNow;
if (_response is null)
_error = $"Server reports driver '{DriverInstanceId}' is not present or is not a Modbus driver.";
}
catch (Exception ex)
{
_error = $"Fetch failed: {ex.Message}";
}
finally
{
_loading = false;
}
}
private static string FormatTimeSince(DateTime utc)
{
var span = DateTime.UtcNow - utc;
if (span.TotalSeconds < 60) return $"{(int)span.TotalSeconds}s ago";
if (span.TotalMinutes < 60) return $"{(int)span.TotalMinutes}m ago";
if (span.TotalHours < 24) return $"{(int)span.TotalHours}h ago";
return $"{(int)span.TotalDays}d ago";
}
}
@@ -0,0 +1,169 @@
@using ZB.MOM.WW.OtOpcUa.Driver.Modbus
@*
#145 — Driver-instance options panel for the Modbus driver. Surfaces every option group
added by #136-#144 so users can configure the driver via the UI rather than hand-editing
DriverConfig JSON. Bound to a ModbusOptionsViewModel; the parent page round-trips that
model to the DriverConfig.json column on save.
*@
<div class="modbus-options-editor">
<h5>Connection</h5>
<div class="row mb-3">
<div class="col-sm-6">
<label class="form-label">Host</label>
<input class="form-control" @bind="Model.Host"/>
</div>
<div class="col-sm-3">
<label class="form-label">Port</label>
<input type="number" class="form-control" @bind="Model.Port"/>
</div>
<div class="col-sm-3">
<label class="form-label">Default UnitId</label>
<input type="number" class="form-control" @bind="Model.UnitId"/>
</div>
</div>
<h5>Family (#144)</h5>
<div class="row mb-3">
<div class="col-sm-6">
<label class="form-label">PLC family</label>
<select class="form-select" @bind="Model.Family">
@foreach (var f in Enum.GetValues<ModbusFamily>())
{
<option value="@f">@f</option>
}
</select>
</div>
@if (Model.Family == ModbusFamily.MELSEC)
{
<div class="col-sm-6">
<label class="form-label">MELSEC sub-family</label>
<select class="form-select" @bind="Model.MelsecSubFamily">
@foreach (var f in Enum.GetValues<MelsecFamily>())
{
<option value="@f">@f</option>
}
</select>
</div>
}
</div>
<h5>Keep-alive (#139)</h5>
<div class="row mb-3">
<div class="col-sm-3">
<div class="form-check mt-4">
<input type="checkbox" class="form-check-input" @bind="Model.KeepAliveEnabled"/>
<label class="form-check-label">Enabled</label>
</div>
</div>
<div class="col-sm-3">
<label class="form-label">Time (s)</label>
<input type="number" class="form-control" @bind="Model.KeepAliveTimeSec"/>
</div>
<div class="col-sm-3">
<label class="form-label">Interval (s)</label>
<input type="number" class="form-control" @bind="Model.KeepAliveIntervalSec"/>
</div>
<div class="col-sm-3">
<label class="form-label">Retry count</label>
<input type="number" class="form-control" @bind="Model.KeepAliveRetryCount"/>
</div>
</div>
<h5>Reconnect (#139)</h5>
<div class="row mb-3">
<div class="col-sm-4">
<label class="form-label">Initial delay (ms)</label>
<input type="number" class="form-control" @bind="Model.ReconnectInitialDelayMs"/>
</div>
<div class="col-sm-4">
<label class="form-label">Max delay (ms)</label>
<input type="number" class="form-control" @bind="Model.ReconnectMaxDelayMs"/>
</div>
<div class="col-sm-4">
<label class="form-label">Backoff multiplier</label>
<input type="number" step="0.1" class="form-control" @bind="Model.ReconnectBackoffMultiplier"/>
</div>
</div>
<h5>Protocol (#140)</h5>
<div class="row mb-3">
<div class="col-sm-3">
<label class="form-label">Max regs / read</label>
<input type="number" class="form-control" @bind="Model.MaxRegistersPerRead"/>
</div>
<div class="col-sm-3">
<label class="form-label">Max regs / write</label>
<input type="number" class="form-control" @bind="Model.MaxRegistersPerWrite"/>
</div>
<div class="col-sm-3">
<label class="form-label">Max coils / read</label>
<input type="number" class="form-control" @bind="Model.MaxCoilsPerRead"/>
</div>
<div class="col-sm-3">
<label class="form-label">Max read gap (#143)</label>
<input type="number" class="form-control" @bind="Model.MaxReadGap"/>
</div>
</div>
<div class="row mb-3">
<div class="col-sm-4">
<div class="form-check">
<input type="checkbox" class="form-check-input" @bind="Model.UseFC15ForSingleCoilWrites"/>
<label class="form-check-label">Use FC15 for single coil</label>
</div>
</div>
<div class="col-sm-4">
<div class="form-check">
<input type="checkbox" class="form-check-input" @bind="Model.UseFC16ForSingleRegisterWrites"/>
<label class="form-check-label">Use FC16 for single reg</label>
</div>
</div>
<div class="col-sm-4">
<div class="form-check">
<input type="checkbox" class="form-check-input" @bind="Model.WriteOnChangeOnly"/>
<label class="form-check-label">Write-on-change only (#141)</label>
</div>
</div>
</div>
</div>
@code {
[Parameter, EditorRequired] public ModbusOptionsViewModel Model { get; set; } = default!;
/// <summary>
/// UI binding model. Maps 1:1 onto the JSON DTO the driver factory accepts; serialised
/// to DriverConfig.json by the calling save handler. Defaults match
/// <c>ModbusDriverOptions</c> defaults so unedited rows produce the historical wire
/// output verbatim.
/// </summary>
public sealed class ModbusOptionsViewModel
{
public string Host { get; set; } = "127.0.0.1";
public int Port { get; set; } = 502;
public byte UnitId { get; set; } = 1;
public ModbusFamily Family { get; set; } = ModbusFamily.Generic;
public MelsecFamily MelsecSubFamily { get; set; } = MelsecFamily.Q_L_iQR;
public bool KeepAliveEnabled { get; set; } = true;
public int KeepAliveTimeSec { get; set; } = 30;
public int KeepAliveIntervalSec { get; set; } = 10;
public int KeepAliveRetryCount { get; set; } = 3;
public int ReconnectInitialDelayMs { get; set; } = 0;
public int ReconnectMaxDelayMs { get; set; } = 30000;
public double ReconnectBackoffMultiplier { get; set; } = 2.0;
public int MaxRegistersPerRead { get; set; } = 125;
public int MaxRegistersPerWrite { get; set; } = 123;
public int MaxCoilsPerRead { get; set; } = 2000;
public int MaxReadGap { get; set; } = 0;
public bool UseFC15ForSingleCoilWrites { get; set; } = false;
public bool UseFC16ForSingleRegisterWrites { get; set; } = false;
public bool WriteOnChangeOnly { get; set; } = false;
}
}
@@ -37,3 +37,18 @@ public sealed record NodeStateChangedMessage(
string? LastAppliedError,
DateTime? LastAppliedAt,
DateTime? LastSeenAt);
/// <summary>
/// Pushed by <c>FleetStatusPoller</c> when it observes a change in a
/// <c>DriverInstanceResilienceStatus</c> row. Closes the last Phase 6.1 Stream E.2/E.3
/// deferral — lets the Admin <c>/hosts</c> page upsert the matching row without the
/// 10-second polling round-trip. Keyed on (DriverInstanceId, HostName); the client
/// fan-outs to the matching row by matching both.
/// </summary>
public sealed record ResilienceStatusChangedMessage(
string DriverInstanceId,
string HostName,
int ConsecutiveFailures,
DateTime? LastCircuitBreakerOpenUtc,
int CurrentBulkheadDepth,
DateTime? LastRecycleUtc);
+11
View File
@@ -41,9 +41,20 @@ builder.Services.AddDbContext<OtOpcUaConfigDbContext>(opt =>
builder.Services.AddScoped<ClusterService>();
builder.Services.AddScoped<GenerationService>();
builder.Services.AddScoped<EquipmentService>();
builder.Services.AddScoped<TagService>();
builder.Services.AddScoped<UnsService>();
builder.Services.AddScoped<NamespaceService>();
builder.Services.AddScoped<DriverInstanceService>();
builder.Services.AddScoped<FocasDriverDetailService>();
// #154 — Server diagnostics client. Default base URL points at the same machine's
// HealthEndpointsHost (loopback :4841); deployments with remote Servers set
// "DriverDiagnostics:ServerBaseUrl" in appsettings.json.
builder.Services.AddHttpClient<DriverDiagnosticsClient>(client =>
{
var baseUrl = builder.Configuration["DriverDiagnostics:ServerBaseUrl"] ?? "http://localhost:4841/";
client.BaseAddress = new Uri(baseUrl);
});
builder.Services.AddScoped<NodeAclService>();
builder.Services.AddScoped<PermissionProbeService>();
builder.Services.AddScoped<AclChangeNotifier>();
@@ -0,0 +1,61 @@
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// #154 — Admin-side client for the Server's driver-diagnostics HTTP endpoints. Wraps
/// <see cref="HttpClient"/> so Blazor pages can fetch per-driver runtime state from a
/// remote Server process. The base URL is configured at registration time
/// (typically read from <c>appsettings.json</c> at startup).
/// </summary>
/// <remarks>
/// One client instance per Server endpoint. Multi-server deployments register multiple
/// keyed clients. Errors propagate as exceptions; pages catch and surface to the
/// operator rather than swallowing.
/// </remarks>
public sealed class DriverDiagnosticsClient
{
private readonly HttpClient _http;
public DriverDiagnosticsClient(HttpClient http) => _http = http;
/// <summary>
/// Fetch the current Modbus auto-prohibition list for the named driver instance.
/// Returns null when the Server reports the driver doesn't exist or isn't a Modbus
/// driver. Throws on transport / serialization failures.
/// </summary>
public async Task<ModbusAutoProhibitionsResponse?> GetModbusAutoProhibitedRangesAsync(
string driverInstanceId, CancellationToken ct = default)
{
var resp = await _http.GetAsync(
$"/diagnostics/drivers/{Uri.EscapeDataString(driverInstanceId)}/modbus/auto-prohibited", ct)
.ConfigureAwait(false);
if (resp.StatusCode is System.Net.HttpStatusCode.NotFound or System.Net.HttpStatusCode.BadRequest)
return null;
resp.EnsureSuccessStatusCode();
return await resp.Content.ReadFromJsonAsync<ModbusAutoProhibitionsResponse>(cancellationToken: ct).ConfigureAwait(false);
}
}
/// <summary>
/// Server response shape for the Modbus auto-prohibition diagnostic. Mirrors the JSON the
/// <c>HealthEndpointsHost</c> serialises; fields are flat strings/numbers so the
/// Admin-side client doesn't take a dependency on the Driver.Modbus assembly's
/// <c>ModbusAutoProhibition</c> record.
/// </summary>
public sealed class ModbusAutoProhibitionsResponse
{
public string DriverInstanceId { get; set; } = string.Empty;
public int Count { get; set; }
public List<ModbusAutoProhibitionRow> Ranges { get; set; } = new();
}
public sealed class ModbusAutoProhibitionRow
{
public byte UnitId { get; set; }
public string Region { get; set; } = string.Empty;
public ushort StartAddress { get; set; }
public ushort EndAddress { get; set; }
public DateTime LastProbedUtc { get; set; }
public bool BisectionPending { get; set; }
}
@@ -0,0 +1,123 @@
using System.Text.Json;
using System.Text.Json.Serialization;
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// Per-instance detail view for FOCAS driver rows. Loads the latest
/// <see cref="DriverInstance"/> row for the requested <c>DriverInstanceId</c> (most-recent
/// draft wins when multiple rows exist across generations), parses the schemaless
/// <c>DriverConfig</c> JSON into <see cref="FocasDriverConfigView"/>, and joins the
/// per-device <see cref="DriverHostStatus"/> rows so the Admin page can render host
/// state + consecutive-failure counters next to each configured device.
/// </summary>
public sealed class FocasDriverDetailService(OtOpcUaConfigDbContext db)
{
private static readonly JsonSerializerOptions JsonOpts = new()
{
PropertyNameCaseInsensitive = true,
NumberHandling = JsonNumberHandling.AllowReadingFromString,
};
public async Task<FocasDriverDetail?> GetAsync(string driverInstanceId, CancellationToken ct = default)
{
if (string.IsNullOrWhiteSpace(driverInstanceId)) return null;
var instance = await db.DriverInstances.AsNoTracking()
.Where(d => d.DriverInstanceId == driverInstanceId
&& d.DriverType.ToLower() == "focas")
.OrderByDescending(d => d.GenerationId)
.FirstOrDefaultAsync(ct);
if (instance is null) return null;
FocasDriverConfigView? config = null;
string? parseError = null;
try { config = JsonSerializer.Deserialize<FocasDriverConfigView>(instance.DriverConfig, JsonOpts); }
catch (JsonException ex) { parseError = ex.Message; }
var hostStatuses = await (from s in db.DriverHostStatuses.AsNoTracking()
where s.DriverInstanceId == driverInstanceId
join r in db.DriverInstanceResilienceStatuses.AsNoTracking()
on new { s.DriverInstanceId, s.HostName }
equals new { r.DriverInstanceId, r.HostName } into rj
from r in rj.DefaultIfEmpty()
orderby s.HostName
select new FocasHostStatusRow(
s.NodeId,
s.HostName,
s.State.ToString(),
s.StateChangedUtc,
s.LastSeenUtc,
s.Detail,
r != null ? r.ConsecutiveFailures : 0,
r != null ? r.LastCircuitBreakerOpenUtc : null,
r != null ? r.LastRecycleUtc : null)).ToListAsync(ct);
return new FocasDriverDetail(instance, config, parseError, hostStatuses);
}
}
/// <summary>Projected view of a FOCAS driver's parsed config. Unknown fields are ignored.</summary>
public sealed record FocasDriverConfigView
{
public List<FocasDeviceView>? Devices { get; set; }
public List<FocasTagView>? Tags { get; set; }
public FocasProbeView? Probe { get; set; }
public FocasAlarmProjectionView? AlarmProjection { get; set; }
public FocasHandleRecycleView? HandleRecycle { get; set; }
}
public sealed record FocasDeviceView
{
public string? HostAddress { get; set; }
public string? DeviceName { get; set; }
public string? Series { get; set; }
}
public sealed record FocasTagView
{
public string? Name { get; set; }
public string? DeviceHostAddress { get; set; }
public string? Address { get; set; }
public string? DataType { get; set; }
public bool Writable { get; set; } = true;
}
public sealed record FocasProbeView
{
public bool Enabled { get; set; } = true;
public string? Interval { get; set; }
}
public sealed record FocasAlarmProjectionView
{
public bool Enabled { get; set; }
public string? PollInterval { get; set; }
}
public sealed record FocasHandleRecycleView
{
public bool Enabled { get; set; }
public string? Interval { get; set; }
}
/// <summary>Composite payload returned to the Admin page.</summary>
public sealed record FocasDriverDetail(
DriverInstance Instance,
FocasDriverConfigView? Config,
string? ParseError,
IReadOnlyList<FocasHostStatusRow> HostStatuses);
public sealed record FocasHostStatusRow(
string NodeId,
string HostName,
string State,
DateTime StateChangedUtc,
DateTime LastSeenUtc,
string? Detail,
int ConsecutiveFailures,
DateTime? LastCircuitBreakerOpenUtc,
DateTime? LastRecycleUtc);
@@ -0,0 +1,71 @@
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// #155 — Tag CRUD scoped to a draft generation. Tags are the canonical signal definitions
/// (one row per OPC UA variable) the Server materialises into the address space at startup.
/// Mirrors the shape of <see cref="EquipmentService"/>; writes are restricted to draft
/// generations only (published generations are immutable per the validation pipeline).
/// </summary>
public sealed class TagService(OtOpcUaConfigDbContext db)
{
/// <summary>Lists all tags in a generation, ordered by name. Optional driver / equipment filter.</summary>
public Task<List<Tag>> ListAsync(long generationId,
string? driverInstanceId = null,
string? equipmentId = null,
CancellationToken ct = default)
{
var query = db.Tags.AsNoTracking().Where(t => t.GenerationId == generationId);
if (!string.IsNullOrWhiteSpace(driverInstanceId))
query = query.Where(t => t.DriverInstanceId == driverInstanceId);
if (!string.IsNullOrWhiteSpace(equipmentId))
query = query.Where(t => t.EquipmentId == equipmentId);
return query.OrderBy(t => t.Name).ToListAsync(ct);
}
/// <summary>
/// Creates a new tag row in the given draft. TagId is auto-derived as a GUID — the
/// human-friendly Name is the user-facing identifier.
/// </summary>
public async Task<Tag> CreateAsync(long draftId, Tag input, CancellationToken ct)
{
input.GenerationId = draftId;
if (string.IsNullOrWhiteSpace(input.TagId))
input.TagId = Guid.NewGuid().ToString("N");
db.Tags.Add(input);
await db.SaveChangesAsync(ct);
return input;
}
public async Task UpdateAsync(Tag updated, CancellationToken ct)
{
var existing = await db.Tags
.FirstOrDefaultAsync(t => t.TagRowId == updated.TagRowId, ct)
?? throw new InvalidOperationException($"Tag row {updated.TagRowId} not found");
// Editable fields. TagId / GenerationId are immutable; the Validation pipeline rejects
// changes that would break referential integrity (sp_ValidateDraft per decision #110).
existing.Name = updated.Name;
existing.DriverInstanceId = updated.DriverInstanceId;
existing.DeviceId = updated.DeviceId;
existing.EquipmentId = updated.EquipmentId;
existing.FolderPath = updated.FolderPath;
existing.DataType = updated.DataType;
existing.AccessLevel = updated.AccessLevel;
existing.WriteIdempotent = updated.WriteIdempotent;
existing.PollGroupId = updated.PollGroupId;
existing.TagConfig = updated.TagConfig;
await db.SaveChangesAsync(ct);
}
public async Task DeleteAsync(Guid tagRowId, CancellationToken ct)
{
var existing = await db.Tags.FirstOrDefaultAsync(t => t.TagRowId == tagRowId, ct);
if (existing is null) return;
db.Tags.Remove(existing);
await db.SaveChangesAsync(ct);
}
}
@@ -16,8 +16,8 @@
<PackageReference Include="Novell.Directory.Ldap.NETStandard" Version="3.6.0"/>
<PackageReference Include="Microsoft.AspNetCore.SignalR.Client" Version="10.0.0"/>
<PackageReference Include="Serilog.AspNetCore" Version="9.0.0"/>
<PackageReference Include="OpenTelemetry.Extensions.Hosting" Version="1.15.2"/>
<PackageReference Include="OpenTelemetry.Exporter.Prometheus.AspNetCore" Version="1.15.2-beta.1"/>
<PackageReference Include="OpenTelemetry.Extensions.Hosting" Version="1.15.3"/>
<PackageReference Include="OpenTelemetry.Exporter.Prometheus.AspNetCore" Version="1.15.3-beta.1"/>
</ItemGroup>
<ItemGroup>
@@ -25,6 +25,7 @@
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Core\ZB.MOM.WW.OtOpcUa.Core.csproj"/>
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Core.Scripting\ZB.MOM.WW.OtOpcUa.Core.Scripting.csproj"/>
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian\ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian.csproj"/>
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Driver.Modbus.Addressing\ZB.MOM.WW.OtOpcUa.Driver.Modbus.Addressing.csproj"/>
</ItemGroup>
<ItemGroup>
@@ -1,4 +1,5 @@
using System.Text.RegularExpressions;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Validation;
@@ -173,4 +174,65 @@ public static class DraftValidator
di.DriverInstanceId));
}
}
/// <summary>
/// Phase 6.3 Stream A.2 + task #148 part 2 — managed pre-publish guard for cluster
/// topology vs. <see cref="ServerCluster.RedundancyMode"/>. The SQL
/// <c>CK_ServerCluster_RedundancyMode_NodeCount</c> CHECK already enforces the
/// (NodeCount, RedundancyMode) pair on the row itself, but it cannot see the
/// <see cref="ClusterNode.Enabled"/> flag on child nodes — an operator can toggle
/// nodes off (effective count = 1) while leaving RedundancyMode at Hot and the
/// constraint stays green. This check catches that drift before publish so the
/// runtime doesn't boot into a topology the <see cref="Enums.RedundancyMode"/> claims
/// is invalid.
/// </summary>
/// <remarks>
/// Called from the publish pipeline separately from <see cref="Validate"/> because the
/// cluster/nodes rows aren't generation-versioned — they don't belong on
/// <see cref="DraftSnapshot"/>. Returns every failing rule in one pass, same shape as
/// <see cref="Validate"/>.
/// </remarks>
public static IReadOnlyList<ValidationError> ValidateClusterTopology(
ServerCluster cluster,
IReadOnlyList<ClusterNode> clusterNodes)
{
ArgumentNullException.ThrowIfNull(cluster);
ArgumentNullException.ThrowIfNull(clusterNodes);
var errors = new List<ValidationError>();
var enabledNodes = clusterNodes.Count(n => n.Enabled);
// Declared count must match declared mode (belt around the SQL CHECK).
var declaredOk = (cluster.NodeCount, cluster.RedundancyMode) switch
{
(1, RedundancyMode.None) => true,
(2, RedundancyMode.Warm) => true,
(2, RedundancyMode.Hot) => true,
_ => false,
};
if (!declaredOk)
errors.Add(new("ClusterRedundancyModeInvalid",
$"Cluster '{cluster.ClusterId}' declares NodeCount={cluster.NodeCount} + RedundancyMode={cluster.RedundancyMode}. " +
$"Supported combinations: (1, None), (2, Warm), (2, Hot).",
cluster.ClusterId));
// Enabled-node count must match declared count. Disabling a node to 1 while leaving
// mode at Hot/Warm would boot the runtime into InvalidTopology band.
if (enabledNodes != cluster.NodeCount)
errors.Add(new("ClusterEnabledNodeCountMismatch",
$"Cluster '{cluster.ClusterId}' declares NodeCount={cluster.NodeCount} but has {enabledNodes} Enabled nodes. " +
$"Toggle the missing node(s) back on or change RedundancyMode/NodeCount to match.",
cluster.ClusterId));
// Primary uniqueness — decision #84. Two Primary nodes is always an invariant violation
// regardless of mode; catch it here so publish fails loud rather than the runtime
// demoting both to ServiceLevelBand.InvalidTopology at boot.
var primaryCount = clusterNodes.Count(n => n.Enabled && n.RedundancyRole == RedundancyRole.Primary);
if (primaryCount > 1)
errors.Add(new("ClusterMultiplePrimary",
$"Cluster '{cluster.ClusterId}' has {primaryCount} Enabled Primary nodes. At most one Primary per cluster.",
cluster.ClusterId));
return errors;
}
}
@@ -0,0 +1,19 @@
namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions;
/// <summary>
/// Point-in-time state of a single historian cluster node, included inside
/// <see cref="HistorianHealthSnapshot.Nodes"/> when the backend is clustered.
/// </summary>
/// <param name="Name">Node identifier — backend-specific (typically a hostname).</param>
/// <param name="IsHealthy">True when the node is currently considered usable for reads.</param>
/// <param name="CooldownUntil">When the next retry against an unhealthy node is allowed; null when no cooldown is active.</param>
/// <param name="FailureCount">Consecutive failures observed against this node since the last success.</param>
/// <param name="LastError">Diagnostic text from the last failure against this node; null when no failures.</param>
/// <param name="LastFailureTime">UTC of the last failure against this node; null when no failures.</param>
public sealed record HistorianClusterNodeState(
string Name,
bool IsHealthy,
DateTime? CooldownUntil,
int FailureCount,
string? LastError,
DateTime? LastFailureTime);
@@ -0,0 +1,32 @@
namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions;
/// <summary>
/// Point-in-time runtime health of a historian data source. Returned by
/// <see cref="IHistorianDataSource.GetHealthSnapshot"/> and projected onto the
/// server status dashboard.
/// </summary>
/// <param name="TotalQueries">Lifetime count of read calls received.</param>
/// <param name="TotalSuccesses">Subset of <paramref name="TotalQueries"/> that completed without error.</param>
/// <param name="TotalFailures">Subset of <paramref name="TotalQueries"/> that ended in error.</param>
/// <param name="ConsecutiveFailures">Failures since the last success — non-zero means the source is currently degraded.</param>
/// <param name="LastSuccessTime">UTC of the most recent successful read; null if none yet.</param>
/// <param name="LastFailureTime">UTC of the most recent failed read; null if none yet.</param>
/// <param name="LastError">Diagnostic text from the most recent failure; null when no failures recorded.</param>
/// <param name="ProcessConnectionOpen">True when the source's process-data connection is currently established.</param>
/// <param name="EventConnectionOpen">True when the source's event-data connection is currently established. Some backends share one connection — implementations may report the same value here as <paramref name="ProcessConnectionOpen"/>.</param>
/// <param name="ActiveProcessNode">Cluster node currently serving process reads; null when no node is active or the backend is non-clustered.</param>
/// <param name="ActiveEventNode">Cluster node currently serving event reads; null when no node is active or the backend is non-clustered.</param>
/// <param name="Nodes">Per-cluster-node state. Empty when the backend is non-clustered.</param>
public sealed record HistorianHealthSnapshot(
long TotalQueries,
long TotalSuccesses,
long TotalFailures,
int ConsecutiveFailures,
DateTime? LastSuccessTime,
DateTime? LastFailureTime,
string? LastError,
bool ProcessConnectionOpen,
bool EventConnectionOpen,
string? ActiveProcessNode,
string? ActiveEventNode,
IReadOnlyList<HistorianClusterNodeState> Nodes);
@@ -0,0 +1,74 @@
namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions;
/// <summary>
/// Server-side historian data source. Registered with the server's history router
/// and resolved per OPC UA namespace, independent of any driver's lifecycle.
/// </summary>
/// <remarks>
/// Distinct from <see cref="IHistoryProvider"/>:
/// <list type="bullet">
/// <item><see cref="IHistoryProvider"/> is a *driver capability* — the server
/// dispatches to it via the driver instance.</item>
/// <item><see cref="IHistorianDataSource"/> is a *server registration* — the
/// server resolves it via namespace and calls it directly, so a single
/// historian (e.g. Wonderware) can serve many drivers' nodes, and drivers can
/// restart without dropping history availability.</item>
/// </list>
/// All values returned use the shared <see cref="DataValueSnapshot"/> /
/// <see cref="HistoricalEvent"/> shapes; backend-specific quality / type encodings
/// are translated to OPC UA <c>StatusCode</c> uints inside the data source.
/// </remarks>
public interface IHistorianDataSource : IDisposable
{
/// <summary>
/// Read raw historical samples for a single tag over a time range.
/// </summary>
Task<HistoryReadResult> ReadRawAsync(
string fullReference,
DateTime startUtc,
DateTime endUtc,
uint maxValuesPerNode,
CancellationToken cancellationToken);
/// <summary>
/// Read processed (interval-bucketed) samples — average / min / max / count / etc.
/// A bucket with no source data returns a sample whose
/// <see cref="DataValueSnapshot.StatusCode"/> indicates BadNoData.
/// </summary>
Task<HistoryReadResult> ReadProcessedAsync(
string fullReference,
DateTime startUtc,
DateTime endUtc,
TimeSpan interval,
HistoryAggregateType aggregate,
CancellationToken cancellationToken);
/// <summary>
/// Read one sample per requested timestamp — OPC UA HistoryReadAtTime service.
/// Implementations interpolate or return prior-boundary samples per their
/// backend's policy. The returned list MUST be the same length and order as
/// <paramref name="timestampsUtc"/>; gaps are returned as Bad-quality snapshots.
/// </summary>
Task<HistoryReadResult> ReadAtTimeAsync(
string fullReference,
IReadOnlyList<DateTime> timestampsUtc,
CancellationToken cancellationToken);
/// <summary>
/// Read historical alarm / event records — OPC UA HistoryReadEvents service.
/// Distinct from any live event stream; sources here come from the historian's
/// event log. <paramref name="sourceName"/> is null to return all sources.
/// </summary>
Task<HistoricalEventsResult> ReadEventsAsync(
string? sourceName,
DateTime startUtc,
DateTime endUtc,
int maxEvents,
CancellationToken cancellationToken);
/// <summary>
/// Point-in-time health snapshot for diagnostics and dashboards. Pure
/// observation; never blocks on backend I/O.
/// </summary>
HistorianHealthSnapshot GetHealthSnapshot();
}
@@ -62,10 +62,41 @@ public interface IVariableHandle
/// <param name="SourceName">Human-readable alarm name used for the <c>SourceName</c> event field.</param>
/// <param name="InitialSeverity">Severity at address-space build time; updates arrive via <see cref="IAlarmConditionSink"/>.</param>
/// <param name="InitialDescription">Initial description; updates arrive via <see cref="IAlarmConditionSink"/>.</param>
/// <param name="InAlarmRef">
/// Driver-side full reference for the boolean attribute that toggles when the
/// alarm condition becomes active. Consumed by the server-level alarm-condition
/// service to subscribe to active/inactive transitions. Null when the driver
/// reports alarm transitions through some other channel.
/// </param>
/// <param name="PriorityRef">
/// Driver-side full reference for the integer attribute carrying the alarm's
/// current priority / severity. Live updates flow through the same subscription
/// pipeline as <paramref name="InAlarmRef"/>. Null when the driver does not
/// expose live priority changes.
/// </param>
/// <param name="DescAttrNameRef">
/// Driver-side full reference for the string attribute carrying the human-readable
/// description / message. Null when the driver does not expose a live description.
/// </param>
/// <param name="AckedRef">
/// Driver-side full reference for the boolean attribute that toggles when the
/// alarm is acknowledged. Null when acknowledgement is not observable on the
/// driver side.
/// </param>
/// <param name="AckMsgWriteRef">
/// Driver-side full reference the server writes to acknowledge the condition,
/// typically the alarm's <c>.AckMsg</c> attribute. Null when the driver does not
/// accept acknowledgement writes (or routes them through a separate API).
/// </param>
public sealed record AlarmConditionInfo(
string SourceName,
AlarmSeverity InitialSeverity,
string? InitialDescription);
string? InitialDescription,
string? InAlarmRef = null,
string? PriorityRef = null,
string? DescAttrNameRef = null,
string? AckedRef = null,
string? AckMsgWriteRef = null);
/// <summary>
/// Sink a concrete address-space builder returns from <see cref="IVariableHandle.MarkAsAlarmCondition"/>.
@@ -269,6 +269,15 @@ public sealed class ScriptedAlarmEngine : IDisposable
AlarmState state, AlarmConditionState seed, DateTime nowUtc, CancellationToken ct)
{
var inputs = BuildReadCache(state.Inputs);
// Cold-start guard — skip the predicate when any referenced upstream tag has no
// cached value yet (the upstream subscription hasn't delivered its first push).
// Without this, predicates that cast `(double)ctx.GetTag(path).Value` throw NRE on
// every tick until the cache fills, spamming the log with identical stack traces.
// Bad quality is treated the same: the input isn't available at the predicate's
// expected type, so the only defensible move is to hold the prior condition state.
if (!AreInputsReady(inputs)) return seed;
var context = new AlarmPredicateContext(inputs, state.Logger, _clock);
bool predicateTrue;
@@ -305,6 +314,27 @@ public sealed class ScriptedAlarmEngine : IDisposable
return d;
}
/// <summary>
/// Returns true when every entry in <paramref name="cache"/> has a non-null value
/// and a Good-quality <see cref="DataValueSnapshot.StatusCode"/>. A false here lets
/// callers short-circuit script evaluation — predicates that unconditionally cast
/// <c>ctx.GetTag(path).Value</c> to a numeric type would otherwise throw
/// <see cref="NullReferenceException"/> until the upstream subscription delivers
/// its first push.
/// </summary>
private static bool AreInputsReady(IReadOnlyDictionary<string, DataValueSnapshot> cache)
{
foreach (var kv in cache)
{
if (kv.Value is null || kv.Value.Value is null) return false;
// OPC UA Part 4 StatusCode: bit 31 = severity 10 (Bad). Treat Good + Uncertain
// as "ready"; Uncertain carries a value the script can still inspect + make a
// qualified decision from. Only outright Bad is skipped.
if ((kv.Value.StatusCode & 0x80000000u) != 0) return false;
}
return true;
}
private void EmitEvent(AlarmState state, AlarmConditionState condition, EmissionKind kind)
{
// Suppressed kind means shelving ate the emission — we don't fire for subscribers
@@ -228,6 +228,14 @@ public sealed class VirtualTagEngine : IDisposable
try
{
var ctxCache = BuildReadCache(state.Reads);
// Cold-start guard — hold the prior value when any upstream input is still
// unset or Bad-quality. Evaluating with nulls would throw inside the script
// (scripts cast ctx.GetTag(path).Value directly) and produce a persistent
// BadInternalError result until the upstream cache fills. Keeping the prior
// snapshot is more honest: the virtual tag simply hasn't been computed yet.
if (!AreInputsReady(ctxCache)) return;
var context = new VirtualTagContext(
ctxCache,
(p, v) => OnScriptSetVirtualTag(p, v),
@@ -278,6 +286,23 @@ public sealed class VirtualTagEngine : IDisposable
return map;
}
/// <summary>
/// Returns true when every entry in <paramref name="cache"/> has a non-null value
/// and a Good/Uncertain-quality <see cref="DataValueSnapshot.StatusCode"/>. Scripts
/// cast <c>ctx.GetTag(path).Value</c> unconditionally; short-circuiting before the
/// script runs keeps a not-yet-cached upstream from faulting the virtual tag with
/// <c>BadInternalError</c>.
/// </summary>
private static bool AreInputsReady(IReadOnlyDictionary<string, DataValueSnapshot> cache)
{
foreach (var kv in cache)
{
if (kv.Value is null || kv.Value.Value is null) return false;
if ((kv.Value.StatusCode & 0x80000000u) != 0) return false;
}
return true;
}
private void OnScriptSetVirtualTag(string path, object? value)
{
if (!_tags.ContainsKey(path))
@@ -19,6 +19,8 @@ public sealed class DriverFactoryRegistry
{
private readonly Dictionary<string, Func<string, string, IDriver>> _factories
= new(StringComparer.OrdinalIgnoreCase);
private readonly Dictionary<string, DriverTier> _tiers
= new(StringComparer.OrdinalIgnoreCase);
private readonly object _lock = new();
/// <summary>
@@ -33,7 +35,15 @@ public sealed class DriverFactoryRegistry
/// itself — the bootstrapper calls it via <see cref="DriverHost.RegisterAsync"/>
/// so the host's per-driver retry semantics apply uniformly.
/// </param>
public void Register(string driverType, Func<string, string, IDriver> factory)
/// <param name="tier">
/// Stability tier per <c>docs/v2/driver-stability.md</c>. Defaults to
/// <see cref="DriverTier.A"/> (in-process, lowest recycle footprint); Tier C
/// drivers SHOULD pass this explicitly so the scheduled-recycle wiring in
/// <c>DriverInstanceBootstrapper</c> can skip in-process drivers by tier check
/// rather than by driver-type allow-list.
/// </param>
public void Register(string driverType, Func<string, string, IDriver> factory,
DriverTier tier = DriverTier.A)
{
ArgumentException.ThrowIfNullOrWhiteSpace(driverType);
ArgumentNullException.ThrowIfNull(factory);
@@ -43,6 +53,7 @@ public sealed class DriverFactoryRegistry
throw new InvalidOperationException(
$"DriverType '{driverType}' factory already registered for this process");
_factories[driverType] = factory;
_tiers[driverType] = tier;
}
}
@@ -57,6 +68,17 @@ public sealed class DriverFactoryRegistry
lock (_lock) return _factories.GetValueOrDefault(driverType);
}
/// <summary>
/// Look up the tier recorded alongside the factory. Returns <see cref="DriverTier.A"/>
/// for unknown driver types — a missing registration is already a skipped-bootstrap
/// case upstream; we don't double-surface that failure here.
/// </summary>
public DriverTier GetTier(string driverType)
{
ArgumentException.ThrowIfNullOrWhiteSpace(driverType);
lock (_lock) return _tiers.GetValueOrDefault(driverType, DriverTier.A);
}
public IReadOnlyCollection<string> RegisteredTypes
{
get { lock (_lock) return [.. _factories.Keys]; }
@@ -1,3 +1,4 @@
using System.Text.Json;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
@@ -157,7 +158,7 @@ public static class EquipmentNodeWalker
private static void AddTagVariable(IAddressSpaceBuilder equipmentBuilder, Tag tag)
{
var attr = new DriverAttributeInfo(
FullName: tag.TagConfig,
FullName: ExtractFullName(tag.TagConfig),
DriverDataType: ParseDriverDataType(tag.DataType),
IsArray: false,
ArrayDim: null,
@@ -166,6 +167,38 @@ public static class EquipmentNodeWalker
equipmentBuilder.Variable(tag.Name, tag.Name, attr);
}
/// <summary>
/// Cross-driver TagConfig convention — the Config DB's <c>CK_Tag_TagConfig_IsJson</c>
/// check constraint requires TagConfig to be a JSON object, and every shipped driver
/// (Galaxy / Modbus / AB CIP / S7 / FOCAS / TwinCAT / ABLegacy) stores the wire-level
/// address in a top-level <c>FullName</c> field. Extracting it here keeps the walker
/// driver-agnostic while giving the driver the plain address string its backend
/// expects at read-time — the raw JSON would otherwise be passed verbatim to
/// <c>IReadable.ReadAsync</c> and the driver would fail to resolve the tag.
/// </summary>
/// <remarks>
/// Falls back to the raw <paramref name="tagConfig"/> if it doesn't parse as JSON or
/// the <c>FullName</c> field is absent. This preserves the pre-refactor behaviour for
/// any legacy row that slipped past the check constraint or any future driver that
/// wants an opaque non-JSON reference.
/// </remarks>
internal static string ExtractFullName(string tagConfig)
{
if (string.IsNullOrWhiteSpace(tagConfig)) return tagConfig;
try
{
using var doc = JsonDocument.Parse(tagConfig);
if (doc.RootElement.ValueKind == JsonValueKind.Object
&& doc.RootElement.TryGetProperty("FullName", out var fullName)
&& fullName.ValueKind == JsonValueKind.String)
{
return fullName.GetString() ?? tagConfig;
}
}
catch (JsonException) { /* fall through */ }
return tagConfig;
}
/// <summary>
/// Parse <see cref="Tag.DataType"/> (stored as the <see cref="DriverDataType"/> enum
/// name string, decision #138) into the enum value. Unknown names fall back to
@@ -28,6 +28,16 @@ public sealed record DriverResilienceOptions
/// </summary>
public int BulkheadMaxQueue { get; init; } = 64;
/// <summary>
/// Periodic scheduled recycle interval for Tier C out-of-process hosts, in seconds.
/// Null (the default) = no scheduled recycle; the driver's Host process keeps running
/// indefinitely unless a memory breach or operator action triggers a recycle. Only
/// respected for <see cref="DriverTier.C"/>; Tier A/B recycle would tear down every
/// OPC UA session, so the loader ignores non-null values for those tiers + logs a
/// warning (per decisions #74 / #145).
/// </summary>
public int? RecycleIntervalSeconds { get; init; }
/// <summary>
/// Look up the effective policy for a capability, falling back to tier defaults when no
/// override is configured. Never returns null.
@@ -91,12 +91,27 @@ public static class DriverResilienceOptionsParser
}
}
// Scheduled recycle is Tier C only — reject a configured interval on Tier A/B as a
// misconfiguration surface rather than silently honouring it (recycling an in-process
// driver would kill every OPC UA session + every co-hosted driver, per decision #74).
int? recycleIntervalSeconds = null;
if (shape.RecycleIntervalSeconds is int secs)
{
if (secs <= 0)
parseDiagnostic ??= $"RecycleIntervalSeconds must be positive; got {secs} — ignoring.";
else if (tier != DriverTier.C)
parseDiagnostic ??= $"RecycleIntervalSeconds is Tier C only; Tier {tier} driver will not scheduled-recycle.";
else
recycleIntervalSeconds = secs;
}
return new DriverResilienceOptions
{
Tier = tier,
CapabilityPolicies = merged,
BulkheadMaxConcurrent = shape.BulkheadMaxConcurrent ?? baseOptions.BulkheadMaxConcurrent,
BulkheadMaxQueue = shape.BulkheadMaxQueue ?? baseOptions.BulkheadMaxQueue,
RecycleIntervalSeconds = recycleIntervalSeconds,
};
}
@@ -104,6 +119,7 @@ public static class DriverResilienceOptionsParser
{
public int? BulkheadMaxConcurrent { get; set; }
public int? BulkheadMaxQueue { get; set; }
public int? RecycleIntervalSeconds { get; set; }
public Dictionary<string, CapabilityPolicyShape>? CapabilityPolicies { get; set; }
}
@@ -5,11 +5,11 @@ using ZB.MOM.WW.OtOpcUa.Driver.Cli.Common;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Cli.Commands;
/// <summary>
/// Probes a Fanuc CNC: opens a FOCAS session + reads one PMC address. No public
/// simulator exists — this command only produces meaningful results against a real
/// CNC with Fwlib32.dll present. Against a dev box it surfaces
/// <c>BadCommunicationError</c> (DLL missing) which is still a useful signal that
/// the CLI wire-up is correct.
/// Probes a Fanuc CNC: opens a FOCAS session + reads one PMC address. Uses the managed
/// <c>WireFocasClient</c> on TCP:8193. Against an unreachable endpoint it surfaces
/// <c>BadCommunicationError</c> which is still a useful signal that the CLI wire-up is
/// correct. Also runs cleanly against the focas-mock Docker fixture in
/// <c>tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/</c>.
/// </summary>
[Command("probe", Description = "Verify the CNC is reachable + a sample FOCAS read succeeds.")]
public sealed class ProbeCommand : FocasCommandBase
@@ -38,10 +38,9 @@ public abstract class FocasCommandBase : DriverCommandBase
/// <summary>
/// Build a <see cref="FocasDriverOptions"/> with the CNC target this base collected
/// + the tag list a subclass supplies. Probe disabled; the default
/// <see cref="FwlibFocasClientFactory"/> attempts <c>Fwlib32.dll</c> P/Invoke, which
/// throws <see cref="DllNotFoundException"/> at first call when the DLL is absent —
/// surfaced through the driver as <c>BadCommunicationError</c>.
/// + the tag list a subclass supplies. Probe disabled; the driver's default managed
/// wire client opens a TCP:8193 session to the CNC and surfaces unreachable endpoints
/// as <c>BadCommunicationError</c>.
/// </summary>
protected FocasDriverOptions BuildOptions(IReadOnlyList<FocasTagDefinition> tags) => new()
{
@@ -4,9 +4,9 @@ return await new CliApplicationBuilder()
.AddCommandsFromThisAssembly()
.SetExecutableName("otopcua-focas-cli")
.SetDescription(
"OtOpcUa FOCAS test-client — ad-hoc probe + PMC/param/macro reads/writes + polled " +
"subscriptions against Fanuc CNCs via the FOCAS/2 protocol. Requires a real CNC + a " +
"licensed Fwlib32.dll on PATH (or next to the executable) — no public simulator " +
"exists. Addresses use FocasAddressParser syntax: R100, X0.0, PARAM:1815/0, MACRO:500.")
"OtOpcUa FOCAS test-client — ad-hoc probe + PMC/param/macro reads + polled " +
"subscriptions against Fanuc CNCs via the FOCAS/2 protocol. Uses the managed " +
"WireFocasClient on TCP:8193 directly; no native dependencies. Addresses use " +
"FocasAddressParser syntax: R100, X0.0, PARAM:1815/0, MACRO:500.")
.Build()
.RunAsync(args);
@@ -22,4 +22,7 @@
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Driver.FOCAS\ZB.MOM.WW.OtOpcUa.Driver.FOCAS.csproj"/>
</ItemGroup>
<!-- CLI runs the managed WireFocasClient and talks to the CNC over TCP:8193
directly — no Fwlib64.dll copy step needed. -->
</Project>
@@ -1,122 +0,0 @@
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
using ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Backend;
/// <summary>
/// In-memory <see cref="IFocasBackend"/> for tests + an operational stub mode when
/// <c>OTOPCUA_FOCAS_BACKEND=fake</c>. Keeps per-address values keyed by a canonical
/// string; RMW semantics honor PMC bit-writes against the containing byte so the
/// <c>PmcBitWriteRequest</c> path can be exercised end-to-end without hardware.
/// </summary>
public sealed class FakeFocasBackend : IFocasBackend
{
private readonly object _gate = new();
private long _nextSessionId;
private readonly HashSet<long> _openSessions = [];
private readonly Dictionary<string, byte[]> _pmcValues = [];
private readonly Dictionary<string, byte[]> _paramValues = [];
private readonly Dictionary<string, byte[]> _macroValues = [];
public Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest request, CancellationToken ct)
{
lock (_gate)
{
var id = ++_nextSessionId;
_openSessions.Add(id);
return Task.FromResult(new OpenSessionResponse { Success = true, SessionId = id });
}
}
public Task CloseSessionAsync(CloseSessionRequest request, CancellationToken ct)
{
lock (_gate) { _openSessions.Remove(request.SessionId); }
return Task.CompletedTask;
}
public Task<ReadResponse> ReadAsync(ReadRequest request, CancellationToken ct)
{
lock (_gate)
{
if (!_openSessions.Contains(request.SessionId))
return Task.FromResult(new ReadResponse { Success = false, StatusCode = 0x80020000u, Error = "session-not-open" });
var store = StoreFor(request.Address.Kind);
var key = CanonicalKey(request.Address);
store.TryGetValue(key, out var value);
return Task.FromResult(new ReadResponse
{
Success = true,
StatusCode = 0,
ValueBytes = value ?? MessagePackSerializer.Serialize((int)0),
ValueTypeCode = request.DataType,
SourceTimestampUtcUnixMs = System.DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
});
}
}
public Task<WriteResponse> WriteAsync(WriteRequest request, CancellationToken ct)
{
lock (_gate)
{
if (!_openSessions.Contains(request.SessionId))
return Task.FromResult(new WriteResponse { Success = false, StatusCode = 0x80020000u, Error = "session-not-open" });
var store = StoreFor(request.Address.Kind);
store[CanonicalKey(request.Address)] = request.ValueBytes ?? [];
return Task.FromResult(new WriteResponse { Success = true, StatusCode = 0 });
}
}
public Task<PmcBitWriteResponse> PmcBitWriteAsync(PmcBitWriteRequest request, CancellationToken ct)
{
lock (_gate)
{
if (!_openSessions.Contains(request.SessionId))
return Task.FromResult(new PmcBitWriteResponse { Success = false, StatusCode = 0x80020000u, Error = "session-not-open" });
if (request.BitIndex is < 0 or > 7)
return Task.FromResult(new PmcBitWriteResponse { Success = false, StatusCode = 0x803C0000u, Error = "bit-out-of-range" });
var key = CanonicalKey(request.Address);
_pmcValues.TryGetValue(key, out var current);
current ??= MessagePackSerializer.Serialize((byte)0);
var b = MessagePackSerializer.Deserialize<byte>(current);
var mask = (byte)(1 << request.BitIndex);
b = request.Value ? (byte)(b | mask) : (byte)(b & ~mask);
_pmcValues[key] = MessagePackSerializer.Serialize(b);
return Task.FromResult(new PmcBitWriteResponse { Success = true, StatusCode = 0 });
}
}
public Task<ProbeResponse> ProbeAsync(ProbeRequest request, CancellationToken ct)
{
lock (_gate)
{
return Task.FromResult(new ProbeResponse
{
Healthy = _openSessions.Contains(request.SessionId),
ObservedAtUtcUnixMs = System.DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
});
}
}
private Dictionary<string, byte[]> StoreFor(int kind) => kind switch
{
0 => _pmcValues,
1 => _paramValues,
2 => _macroValues,
_ => _pmcValues,
};
private static string CanonicalKey(FocasAddressDto addr) =>
addr.Kind switch
{
0 => $"{addr.PmcLetter}{addr.Number}",
1 => $"P{addr.Number}",
2 => $"M{addr.Number}",
_ => $"?{addr.Number}",
};
}
@@ -1,24 +0,0 @@
using System.Threading;
using System.Threading.Tasks;
using ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Backend;
/// <summary>
/// The Host's view of a FOCAS session. One implementation wraps the real
/// <c>Fwlib32.dll</c> via P/Invoke (lands with the real Fwlib32 integration follow-up,
/// since no hardware is available today); a second implementation —
/// <see cref="FakeFocasBackend"/> — is used by tests.
/// Both live on .NET 4.8 x86 so the Host can be deployed in either mode without
/// changing the pipe server.
/// Invoked via <c>FwlibFrameHandler</c> in the Ipc namespace.
/// </summary>
public interface IFocasBackend
{
Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest request, CancellationToken ct);
Task CloseSessionAsync(CloseSessionRequest request, CancellationToken ct);
Task<ReadResponse> ReadAsync(ReadRequest request, CancellationToken ct);
Task<WriteResponse> WriteAsync(WriteRequest request, CancellationToken ct);
Task<PmcBitWriteResponse> PmcBitWriteAsync(PmcBitWriteRequest request, CancellationToken ct);
Task<ProbeResponse> ProbeAsync(ProbeRequest request, CancellationToken ct);
}
@@ -1,37 +0,0 @@
using System.Threading;
using System.Threading.Tasks;
using ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Backend;
/// <summary>
/// Safe default when the deployment hasn't configured a real Fwlib32 backend.
/// Returns structured failure responses instead of throwing so the Proxy can map the
/// error to <c>BadDeviceFailure</c> and surface a clear operator message pointing at
/// <c>docs/v2/focas-deployment.md</c>. Used when <c>OTOPCUA_FOCAS_BACKEND</c> is unset
/// or set to <c>unconfigured</c>.
/// </summary>
public sealed class UnconfiguredFocasBackend : IFocasBackend
{
private const uint BadDeviceFailure = 0x80550000u;
private const string Reason =
"FOCAS Host is running without a real Fwlib32 backend. Set OTOPCUA_FOCAS_BACKEND=fwlib32 " +
"and ensure Fwlib32.dll is on PATH — see docs/v2/focas-deployment.md.";
public Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest request, CancellationToken ct) =>
Task.FromResult(new OpenSessionResponse { Success = false, Error = Reason, ErrorCode = "NoFwlibBackend" });
public Task CloseSessionAsync(CloseSessionRequest request, CancellationToken ct) => Task.CompletedTask;
public Task<ReadResponse> ReadAsync(ReadRequest request, CancellationToken ct) =>
Task.FromResult(new ReadResponse { Success = false, StatusCode = BadDeviceFailure, Error = Reason });
public Task<WriteResponse> WriteAsync(WriteRequest request, CancellationToken ct) =>
Task.FromResult(new WriteResponse { Success = false, StatusCode = BadDeviceFailure, Error = Reason });
public Task<PmcBitWriteResponse> PmcBitWriteAsync(PmcBitWriteRequest request, CancellationToken ct) =>
Task.FromResult(new PmcBitWriteResponse { Success = false, StatusCode = BadDeviceFailure, Error = Reason });
public Task<ProbeResponse> ProbeAsync(ProbeRequest request, CancellationToken ct) =>
Task.FromResult(new ProbeResponse { Healthy = false, Error = Reason, ObservedAtUtcUnixMs = System.DateTimeOffset.UtcNow.ToUnixTimeMilliseconds() });
}
@@ -1,111 +0,0 @@
using System;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared;
using ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Ipc;
/// <summary>
/// Real FOCAS frame handler. Deserializes each request DTO, delegates to
/// <see cref="IFocasBackend"/>, re-serializes the response. The backend owns the
/// Fwlib32 handle + STA thread — the handler is pure dispatch.
/// </summary>
public sealed class FwlibFrameHandler : IFrameHandler
{
private readonly IFocasBackend _backend;
private readonly ILogger _logger;
public FwlibFrameHandler(IFocasBackend backend, ILogger logger)
{
_backend = backend ?? throw new ArgumentNullException(nameof(backend));
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
}
public async Task HandleAsync(FocasMessageKind kind, byte[] body, FrameWriter writer, CancellationToken ct)
{
try
{
switch (kind)
{
case FocasMessageKind.Heartbeat:
{
var hb = MessagePackSerializer.Deserialize<Heartbeat>(body);
await writer.WriteAsync(FocasMessageKind.HeartbeatAck,
new HeartbeatAck
{
MonotonicTicks = hb.MonotonicTicks,
HostUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
}, ct).ConfigureAwait(false);
return;
}
case FocasMessageKind.OpenSessionRequest:
{
var req = MessagePackSerializer.Deserialize<OpenSessionRequest>(body);
var resp = await _backend.OpenSessionAsync(req, ct).ConfigureAwait(false);
await writer.WriteAsync(FocasMessageKind.OpenSessionResponse, resp, ct).ConfigureAwait(false);
return;
}
case FocasMessageKind.CloseSessionRequest:
{
var req = MessagePackSerializer.Deserialize<CloseSessionRequest>(body);
await _backend.CloseSessionAsync(req, ct).ConfigureAwait(false);
return;
}
case FocasMessageKind.ReadRequest:
{
var req = MessagePackSerializer.Deserialize<ReadRequest>(body);
var resp = await _backend.ReadAsync(req, ct).ConfigureAwait(false);
await writer.WriteAsync(FocasMessageKind.ReadResponse, resp, ct).ConfigureAwait(false);
return;
}
case FocasMessageKind.WriteRequest:
{
var req = MessagePackSerializer.Deserialize<WriteRequest>(body);
var resp = await _backend.WriteAsync(req, ct).ConfigureAwait(false);
await writer.WriteAsync(FocasMessageKind.WriteResponse, resp, ct).ConfigureAwait(false);
return;
}
case FocasMessageKind.PmcBitWriteRequest:
{
var req = MessagePackSerializer.Deserialize<PmcBitWriteRequest>(body);
var resp = await _backend.PmcBitWriteAsync(req, ct).ConfigureAwait(false);
await writer.WriteAsync(FocasMessageKind.PmcBitWriteResponse, resp, ct).ConfigureAwait(false);
return;
}
case FocasMessageKind.ProbeRequest:
{
var req = MessagePackSerializer.Deserialize<ProbeRequest>(body);
var resp = await _backend.ProbeAsync(req, ct).ConfigureAwait(false);
await writer.WriteAsync(FocasMessageKind.ProbeResponse, resp, ct).ConfigureAwait(false);
return;
}
default:
await writer.WriteAsync(FocasMessageKind.ErrorResponse,
new ErrorResponse { Code = "unknown-kind", Message = $"Kind {kind} is not handled by the Host" },
ct).ConfigureAwait(false);
return;
}
}
catch (OperationCanceledException) { throw; }
catch (Exception ex)
{
_logger.Error(ex, "FwlibFrameHandler error processing {Kind}", kind);
await writer.WriteAsync(FocasMessageKind.ErrorResponse,
new ErrorResponse { Code = "backend-exception", Message = ex.Message },
ct).ConfigureAwait(false);
}
}
public IDisposable AttachConnection(FrameWriter writer) => IFrameHandler.NoopAttachment.Instance;
}
@@ -1,31 +0,0 @@
using System;
using System.Threading;
using System.Threading.Tasks;
using ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared;
using ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Ipc;
/// <summary>
/// Dispatches a single IPC frame to the backend. Implementations own the FOCAS session
/// state and translate request DTOs into Fwlib32 calls.
/// </summary>
public interface IFrameHandler
{
Task HandleAsync(FocasMessageKind kind, byte[] body, FrameWriter writer, CancellationToken ct);
/// <summary>
/// Called once per accepted connection after the Hello handshake. Lets the handler
/// attach server-pushed event sinks (data-change notifications, runtime-status
/// changes) to the connection's <paramref name="writer"/>. Returns an
/// <see cref="IDisposable"/> the pipe server disposes when the connection closes —
/// backends use it to unsubscribe from their push sources.
/// </summary>
IDisposable AttachConnection(FrameWriter writer);
public sealed class NoopAttachment : IDisposable
{
public static readonly NoopAttachment Instance = new();
public void Dispose() { }
}
}
@@ -1,41 +0,0 @@
using System;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
using ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared;
using ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Ipc;
/// <summary>
/// Placeholder handler that returns <c>ErrorResponse{Code=not-implemented}</c> for every
/// FOCAS data-plane request. Exists so PR B can ship the pipe server + ACL + handshake
/// plumbing before PR C moves the Fwlib32 calls. Heartbeats are handled fully so the
/// supervisor's liveness detector stays happy.
/// </summary>
public sealed class StubFrameHandler : IFrameHandler
{
public Task HandleAsync(FocasMessageKind kind, byte[] body, FrameWriter writer, CancellationToken ct)
{
if (kind == FocasMessageKind.Heartbeat)
{
var hb = MessagePackSerializer.Deserialize<Heartbeat>(body);
return writer.WriteAsync(FocasMessageKind.HeartbeatAck,
new HeartbeatAck
{
MonotonicTicks = hb.MonotonicTicks,
HostUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
}, ct);
}
return writer.WriteAsync(FocasMessageKind.ErrorResponse,
new ErrorResponse
{
Code = "not-implemented",
Message = $"Kind {kind} is stubbed — Fwlib32 lift lands in PR C",
},
ct);
}
public IDisposable AttachConnection(FrameWriter writer) => IFrameHandler.NoopAttachment.Instance;
}
@@ -1,72 +0,0 @@
using System;
using System.Security.Principal;
using System.Threading;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Ipc;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host;
/// <summary>
/// Entry point for the <c>OtOpcUaFocasHost</c> Windows service / console host. The
/// supervisor (Proxy-side) spawns this process per FOCAS driver instance and passes the
/// pipe name, allowed-SID, and per-process shared secret as environment variables. In
/// PR B the backend is <see cref="StubFrameHandler"/> — PR C swaps in the real
/// Fwlib32-backed handler once the session state + STA thread move out of the .NET 10
/// driver.
/// </summary>
public static class Program
{
public static int Main(string[] args)
{
Log.Logger = new LoggerConfiguration()
.MinimumLevel.Information()
.WriteTo.File(
@"%ProgramData%\OtOpcUa\focas-host-.log".Replace("%ProgramData%", Environment.GetFolderPath(Environment.SpecialFolder.CommonApplicationData)),
rollingInterval: RollingInterval.Day)
.CreateLogger();
try
{
var pipeName = Environment.GetEnvironmentVariable("OTOPCUA_FOCAS_PIPE") ?? "OtOpcUaFocas";
var allowedSidValue = Environment.GetEnvironmentVariable("OTOPCUA_ALLOWED_SID")
?? throw new InvalidOperationException(
"OTOPCUA_ALLOWED_SID not set — the FOCAS Proxy supervisor must pass the server principal SID");
var sharedSecret = Environment.GetEnvironmentVariable("OTOPCUA_FOCAS_SECRET")
?? throw new InvalidOperationException(
"OTOPCUA_FOCAS_SECRET not set — the FOCAS Proxy supervisor must pass the per-process secret at spawn time");
var allowedSid = new SecurityIdentifier(allowedSidValue);
using var server = new PipeServer(pipeName, allowedSid, sharedSecret, Log.Logger);
using var cts = new CancellationTokenSource();
Console.CancelKeyPress += (_, e) => { e.Cancel = true; cts.Cancel(); };
Log.Information("OtOpcUaFocasHost starting — pipe={Pipe} allowedSid={Sid}",
pipeName, allowedSidValue);
var backendKind = (Environment.GetEnvironmentVariable("OTOPCUA_FOCAS_BACKEND") ?? "unconfigured")
.ToLowerInvariant();
IFocasBackend backend = backendKind switch
{
"fake" => new FakeFocasBackend(),
"unconfigured" => new UnconfiguredFocasBackend(),
"fwlib32" => new UnconfiguredFocasBackend(), // real Fwlib32 backend lands with hardware integration follow-up
_ => new UnconfiguredFocasBackend(),
};
Log.Information("OtOpcUaFocasHost backend={Backend}", backendKind);
var handler = new FwlibFrameHandler(backend, Log.Logger);
server.RunAsync(handler, cts.Token).GetAwaiter().GetResult();
Log.Information("OtOpcUaFocasHost stopped cleanly");
return 0;
}
catch (Exception ex)
{
Log.Fatal(ex, "OtOpcUaFocasHost fatal");
return 2;
}
finally { Log.CloseAndFlush(); }
}
}
@@ -1,133 +0,0 @@
using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Text;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Stability;
/// <summary>
/// Ring-buffer of the last N IPC operations, written into a memory-mapped file. On a
/// hard crash the Proxy-side supervisor reads the MMF after the corpse is gone to see
/// what was in flight at the moment the Host died. Single-writer (the Host), multi-reader
/// (the supervisor) — the file format is identical to the Galaxy Tier-C
/// <c>PostMortemMmf</c> so a single reader tool can work both.
/// </summary>
/// <remarks>
/// File layout:
/// <code>
/// [16-byte header: magic(4) | version(4) | capacity(4) | writeIndex(4)]
/// [capacity × 256-byte entries: each is [8-byte utcUnixMs | 8-byte opKind | 240-byte UTF-8 message]]
/// </code>
/// Magic is 'OFPC' (0x4F46_5043) to distinguish a FOCAS file from the Galaxy MMF.
/// </remarks>
public sealed class PostMortemMmf : IDisposable
{
private const int Magic = 0x4F465043; // 'OFPC'
private const int Version = 1;
private const int HeaderBytes = 16;
public const int EntryBytes = 256;
private const int MessageOffset = 16;
private const int MessageCapacity = EntryBytes - MessageOffset;
public int Capacity { get; }
public string Path { get; }
private readonly MemoryMappedFile _mmf;
private readonly MemoryMappedViewAccessor _accessor;
private readonly object _writeGate = new();
public PostMortemMmf(string path, int capacity = 1000)
{
if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
Capacity = capacity;
Path = path;
var fileBytes = HeaderBytes + capacity * EntryBytes;
Directory.CreateDirectory(System.IO.Path.GetDirectoryName(path)!);
var fs = new FileStream(path, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.Read);
fs.SetLength(fileBytes);
_mmf = MemoryMappedFile.CreateFromFile(fs, null, fileBytes,
MemoryMappedFileAccess.ReadWrite, HandleInheritability.None, leaveOpen: false);
_accessor = _mmf.CreateViewAccessor(0, fileBytes, MemoryMappedFileAccess.ReadWrite);
if (_accessor.ReadInt32(0) != Magic)
{
_accessor.Write(0, Magic);
_accessor.Write(4, Version);
_accessor.Write(8, capacity);
_accessor.Write(12, 0);
}
}
public void Write(long opKind, string message)
{
lock (_writeGate)
{
var idx = _accessor.ReadInt32(12);
var offset = HeaderBytes + idx * EntryBytes;
_accessor.Write(offset + 0, DateTimeOffset.UtcNow.ToUnixTimeMilliseconds());
_accessor.Write(offset + 8, opKind);
var msgBytes = Encoding.UTF8.GetBytes(message ?? string.Empty);
var copy = Math.Min(msgBytes.Length, MessageCapacity - 1);
_accessor.WriteArray(offset + MessageOffset, msgBytes, 0, copy);
_accessor.Write(offset + MessageOffset + copy, (byte)0);
var next = (idx + 1) % Capacity;
_accessor.Write(12, next);
}
}
public PostMortemEntry[] ReadAll()
{
var magic = _accessor.ReadInt32(0);
if (magic != Magic) return new PostMortemEntry[0];
var capacity = _accessor.ReadInt32(8);
var writeIndex = _accessor.ReadInt32(12);
var entries = new PostMortemEntry[capacity];
var count = 0;
for (var i = 0; i < capacity; i++)
{
var slot = (writeIndex + i) % capacity;
var offset = HeaderBytes + slot * EntryBytes;
var ts = _accessor.ReadInt64(offset + 0);
if (ts == 0) continue;
var op = _accessor.ReadInt64(offset + 8);
var msgBuf = new byte[MessageCapacity];
_accessor.ReadArray(offset + MessageOffset, msgBuf, 0, MessageCapacity);
var nulTerm = Array.IndexOf<byte>(msgBuf, 0);
var msg = Encoding.UTF8.GetString(msgBuf, 0, nulTerm < 0 ? MessageCapacity : nulTerm);
entries[count++] = new PostMortemEntry(ts, op, msg);
}
Array.Resize(ref entries, count);
return entries;
}
public void Dispose()
{
_accessor.Dispose();
_mmf.Dispose();
}
}
public readonly struct PostMortemEntry
{
public long UtcUnixMs { get; }
public long OpKind { get; }
public string Message { get; }
public PostMortemEntry(long utcUnixMs, long opKind, string message)
{
UtcUnixMs = utcUnixMs;
OpKind = opKind;
Message = message;
}
}
@@ -1,40 +0,0 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>net48</TargetFramework>
<!-- Fwlib32.dll is 32-bit only — x86 target is mandatory. Matches the Galaxy.Host
bitness constraint but for a different native library. -->
<PlatformTarget>x86</PlatformTarget>
<Prefer32Bit>true</Prefer32Bit>
<Nullable>enable</Nullable>
<LangVersion>latest</LangVersion>
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
<GenerateDocumentationFile>true</GenerateDocumentationFile>
<NoWarn>$(NoWarn);CS1591</NoWarn>
<RootNamespace>ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host</RootNamespace>
<AssemblyName>OtOpcUa.Driver.FOCAS.Host</AssemblyName>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="System.IO.Pipes.AccessControl" Version="5.0.0"/>
<PackageReference Include="System.Memory" Version="4.5.5"/>
<PackageReference Include="System.Threading.Tasks.Extensions" Version="4.5.4"/>
<PackageReference Include="Serilog" Version="4.2.0"/>
<PackageReference Include="Serilog.Sinks.File" Version="7.0.0"/>
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared\ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.csproj"/>
</ItemGroup>
<ItemGroup>
<InternalsVisibleTo Include="ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Tests"/>
</ItemGroup>
<ItemGroup>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>
</ItemGroup>
</Project>
@@ -1,39 +0,0 @@
using MessagePack;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Contracts;
/// <summary>
/// Wire shape for a parsed FOCAS address. Mirrors <c>FocasAddress</c> in the driver
/// package but lives in Shared so the Host (.NET 4.8) can decode without taking a
/// reference to the .NET 10 driver assembly. The Proxy serializes from its own
/// <c>FocasAddress</c>; the Host maps back to its local equivalent.
/// </summary>
[MessagePackObject]
public sealed class FocasAddressDto
{
/// <summary>0 = Pmc, 1 = Parameter, 2 = Macro. Matches <c>FocasAreaKind</c> enum order.</summary>
[Key(0)] public int Kind { get; set; }
/// <summary>PMC letter — null for Parameter / Macro.</summary>
[Key(1)] public string? PmcLetter { get; set; }
[Key(2)] public int Number { get; set; }
/// <summary>Optional bit index (0-7 for PMC, 0-31 for Parameter).</summary>
[Key(3)] public int? BitIndex { get; set; }
}
/// <summary>
/// 0 = Bit, 1 = Byte, 2 = Int16, 3 = Int32, 4 = Float32, 5 = Float64, 6 = String.
/// Matches <c>FocasDataType</c> enum order so both sides can cast <c>(int)</c>.
/// </summary>
public static class FocasDataTypeCode
{
public const int Bit = 0;
public const int Byte = 1;
public const int Int16 = 2;
public const int Int32 = 3;
public const int Float32 = 4;
public const int Float64 = 5;
public const int String = 6;
}
@@ -1,57 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Contracts;
/// <summary>
/// Length-prefixed framing. Each IPC frame is:
/// <c>[4-byte big-endian length][1-byte message kind][MessagePack body]</c>.
/// Length is the body size only; the kind byte is not part of the prefixed length.
/// Mirrors the Galaxy Tier-C framing so operators see one wire format across hosts.
/// </summary>
public static class Framing
{
public const int LengthPrefixSize = 4;
public const int KindByteSize = 1;
/// <summary>
/// Maximum permitted body length (16 MiB). Protects the receiver from a hostile or
/// misbehaving peer sending an oversized length prefix.
/// </summary>
public const int MaxFrameBodyBytes = 16 * 1024 * 1024;
}
/// <summary>
/// Wire identifier for each contract. Values are stable — new contracts append, never
/// reuse. Ranges kept aligned with Galaxy so an operator reading a hex dump doesn't have
/// to context-switch between drivers.
/// </summary>
public enum FocasMessageKind : byte
{
Hello = 0x01,
HelloAck = 0x02,
Heartbeat = 0x03,
HeartbeatAck = 0x04,
OpenSessionRequest = 0x10,
OpenSessionResponse = 0x11,
CloseSessionRequest = 0x12,
ReadRequest = 0x30,
ReadResponse = 0x31,
WriteRequest = 0x32,
WriteResponse = 0x33,
PmcBitWriteRequest = 0x34,
PmcBitWriteResponse = 0x35,
SubscribeRequest = 0x40,
SubscribeResponse = 0x41,
UnsubscribeRequest = 0x42,
OnDataChangeNotification = 0x43,
ProbeRequest = 0x70,
ProbeResponse = 0x71,
RuntimeStatusChange = 0x72,
RecycleHostRequest = 0xF0,
RecycleStatusResponse = 0xF1,
ErrorResponse = 0xFE,
}
@@ -1,63 +0,0 @@
using MessagePack;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Contracts;
/// <summary>
/// First frame of every FOCAS Proxy -> Host connection. Advertises protocol major/minor
/// and the per-process shared secret the Proxy passed to the Host at spawn time. Major
/// mismatch is fatal; minor is advisory.
/// </summary>
[MessagePackObject]
public sealed class Hello
{
public const int CurrentMajor = 1;
public const int CurrentMinor = 0;
[Key(0)] public int ProtocolMajor { get; set; } = CurrentMajor;
[Key(1)] public int ProtocolMinor { get; set; } = CurrentMinor;
[Key(2)] public string PeerName { get; set; } = string.Empty;
/// <summary>
/// Per-process shared secret verified on the Host side against the value passed by the
/// supervisor at spawn time. Protects against a local attacker connecting to the pipe
/// after authenticating via the pipe ACL.
/// </summary>
[Key(3)] public string SharedSecret { get; set; } = string.Empty;
[Key(4)] public string[] Features { get; set; } = System.Array.Empty<string>();
}
[MessagePackObject]
public sealed class HelloAck
{
[Key(0)] public int ProtocolMajor { get; set; } = Hello.CurrentMajor;
[Key(1)] public int ProtocolMinor { get; set; } = Hello.CurrentMinor;
/// <summary>True if the Host accepted the hello; false + <see cref="RejectReason"/> filled if not.</summary>
[Key(2)] public bool Accepted { get; set; }
[Key(3)] public string? RejectReason { get; set; }
[Key(4)] public string HostName { get; set; } = string.Empty;
}
[MessagePackObject]
public sealed class Heartbeat
{
[Key(0)] public long MonotonicTicks { get; set; }
}
[MessagePackObject]
public sealed class HeartbeatAck
{
[Key(0)] public long MonotonicTicks { get; set; }
[Key(1)] public long HostUtcUnixMs { get; set; }
}
[MessagePackObject]
public sealed class ErrorResponse
{
/// <summary>Stable symbolic code — e.g. <c>InvalidAddress</c>, <c>SessionNotFound</c>, <c>Fwlib32Crashed</c>.</summary>
[Key(0)] public string Code { get; set; } = string.Empty;
[Key(1)] public string Message { get; set; } = string.Empty;
}
@@ -1,47 +0,0 @@
using MessagePack;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Contracts;
/// <summary>Lightweight connectivity probe — maps to <c>cnc_rdcncstat</c> on the Host.</summary>
[MessagePackObject]
public sealed class ProbeRequest
{
[Key(0)] public long SessionId { get; set; }
[Key(1)] public int TimeoutMs { get; set; } = 2000;
}
[MessagePackObject]
public sealed class ProbeResponse
{
[Key(0)] public bool Healthy { get; set; }
[Key(1)] public string? Error { get; set; }
[Key(2)] public long ObservedAtUtcUnixMs { get; set; }
}
/// <summary>Per-host runtime status — fan-out target when the Host observes the CNC going unreachable without the Proxy asking.</summary>
[MessagePackObject]
public sealed class RuntimeStatusChangeNotification
{
[Key(0)] public long SessionId { get; set; }
/// <summary>Running | Stopped | Unknown.</summary>
[Key(1)] public string RuntimeStatus { get; set; } = string.Empty;
[Key(2)] public long ObservedAtUtcUnixMs { get; set; }
}
[MessagePackObject]
public sealed class RecycleHostRequest
{
/// <summary>Soft | Hard. Soft drains subscriptions first; Hard kills immediately.</summary>
[Key(0)] public string Kind { get; set; } = "Soft";
[Key(1)] public string Reason { get; set; } = string.Empty;
}
[MessagePackObject]
public sealed class RecycleStatusResponse
{
[Key(0)] public bool Accepted { get; set; }
[Key(1)] public int GraceSeconds { get; set; } = 15;
[Key(2)] public string? Error { get; set; }
}
@@ -1,85 +0,0 @@
using MessagePack;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Contracts;
/// <summary>
/// Read one FOCAS address. Multi-read is the Proxy's responsibility — it batches
/// per-tag reads into parallel <see cref="ReadRequest"/> frames the Host services on its
/// STA thread. Keeping the IPC read single-address keeps the Host side trivial; FOCAS
/// itself has no multi-read primitive that spans area kinds.
/// </summary>
[MessagePackObject]
public sealed class ReadRequest
{
[Key(0)] public long SessionId { get; set; }
[Key(1)] public FocasAddressDto Address { get; set; } = new();
[Key(2)] public int DataType { get; set; }
[Key(3)] public int TimeoutMs { get; set; } = 2000;
}
[MessagePackObject]
public sealed class ReadResponse
{
[Key(0)] public bool Success { get; set; }
[Key(1)] public string? Error { get; set; }
/// <summary>OPC UA status code mapped by the Host via <c>FocasStatusMapper</c> — 0 = Good.</summary>
[Key(2)] public uint StatusCode { get; set; }
/// <summary>MessagePack-serialized boxed value. <c>null</c> when <see cref="Success"/> is false.</summary>
[Key(3)] public byte[]? ValueBytes { get; set; }
/// <summary>Matches <see cref="FocasDataTypeCode"/> so the Proxy knows how to deserialize.</summary>
[Key(4)] public int ValueTypeCode { get; set; }
[Key(5)] public long SourceTimestampUtcUnixMs { get; set; }
}
[MessagePackObject]
public sealed class WriteRequest
{
[Key(0)] public long SessionId { get; set; }
[Key(1)] public FocasAddressDto Address { get; set; } = new();
[Key(2)] public int DataType { get; set; }
[Key(3)] public byte[]? ValueBytes { get; set; }
[Key(4)] public int ValueTypeCode { get; set; }
[Key(5)] public int TimeoutMs { get; set; } = 2000;
}
[MessagePackObject]
public sealed class WriteResponse
{
[Key(0)] public bool Success { get; set; }
[Key(1)] public string? Error { get; set; }
/// <summary>OPC UA status code — 0 = Good.</summary>
[Key(2)] public uint StatusCode { get; set; }
}
/// <summary>
/// PMC bit read-modify-write. Handled as a first-class operation (not two separate
/// read+write round-trips) so the critical section stays on the Host — serializing
/// concurrent bit writers to the same parent byte is Host-side via
/// <c>SemaphoreSlim</c> keyed on <c>(PmcLetter, Number)</c>. Mirrors the in-process
/// pattern from <c>FocasPmcBitRmw</c>.
/// </summary>
[MessagePackObject]
public sealed class PmcBitWriteRequest
{
[Key(0)] public long SessionId { get; set; }
[Key(1)] public FocasAddressDto Address { get; set; } = new();
/// <summary>The bit index to set/clear. 0-7.</summary>
[Key(2)] public int BitIndex { get; set; }
[Key(3)] public bool Value { get; set; }
[Key(4)] public int TimeoutMs { get; set; } = 2000;
}
[MessagePackObject]
public sealed class PmcBitWriteResponse
{
[Key(0)] public bool Success { get; set; }
[Key(1)] public string? Error { get; set; }
[Key(2)] public uint StatusCode { get; set; }
}
@@ -1,31 +0,0 @@
using MessagePack;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Contracts;
/// <summary>
/// Open a FOCAS session against the CNC at <see cref="HostAddress"/>. One session per
/// configured device. The Host owns the Fwlib32 handle; the Proxy tracks only the
/// opaque <see cref="OpenSessionResponse.SessionId"/> returned on success.
/// </summary>
[MessagePackObject]
public sealed class OpenSessionRequest
{
[Key(0)] public string HostAddress { get; set; } = string.Empty;
[Key(1)] public int TimeoutMs { get; set; } = 2000;
[Key(2)] public int CncSeries { get; set; }
}
[MessagePackObject]
public sealed class OpenSessionResponse
{
[Key(0)] public bool Success { get; set; }
[Key(1)] public long SessionId { get; set; }
[Key(2)] public string? Error { get; set; }
[Key(3)] public string? ErrorCode { get; set; }
}
[MessagePackObject]
public sealed class CloseSessionRequest
{
[Key(0)] public long SessionId { get; set; }
}
@@ -1,61 +0,0 @@
using MessagePack;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Contracts;
/// <summary>
/// Subscribe the Host to polling a set of tags on behalf of the Proxy. FOCAS is
/// poll-only — there are no CNC-initiated callbacks — so the Host runs the poll loop and
/// pushes <see cref="OnDataChangeNotification"/> frames whenever a value differs from
/// the last observation. Delta-only + per-group interval keeps the wire quiet.
/// </summary>
[MessagePackObject]
public sealed class SubscribeRequest
{
[Key(0)] public long SessionId { get; set; }
[Key(1)] public long SubscriptionId { get; set; }
[Key(2)] public int IntervalMs { get; set; } = 1000;
[Key(3)] public SubscribeItem[] Items { get; set; } = System.Array.Empty<SubscribeItem>();
}
[MessagePackObject]
public sealed class SubscribeItem
{
/// <summary>Opaque correlation id the Proxy uses to route notifications back to the right OPC UA MonitoredItem.</summary>
[Key(0)] public long MonitoredItemId { get; set; }
[Key(1)] public FocasAddressDto Address { get; set; } = new();
[Key(2)] public int DataType { get; set; }
}
[MessagePackObject]
public sealed class SubscribeResponse
{
[Key(0)] public bool Success { get; set; }
[Key(1)] public string? Error { get; set; }
/// <summary>Items the Host refused (address mismatch, unsupported type). Empty on full success.</summary>
[Key(2)] public long[] RejectedMonitoredItemIds { get; set; } = System.Array.Empty<long>();
}
[MessagePackObject]
public sealed class UnsubscribeRequest
{
[Key(0)] public long SubscriptionId { get; set; }
}
[MessagePackObject]
public sealed class OnDataChangeNotification
{
[Key(0)] public long SubscriptionId { get; set; }
[Key(1)] public DataChange[] Changes { get; set; } = System.Array.Empty<DataChange>();
}
[MessagePackObject]
public sealed class DataChange
{
[Key(0)] public long MonitoredItemId { get; set; }
[Key(1)] public uint StatusCode { get; set; }
[Key(2)] public byte[]? ValueBytes { get; set; }
[Key(3)] public int ValueTypeCode { get; set; }
[Key(4)] public long SourceTimestampUtcUnixMs { get; set; }
}
@@ -1,23 +0,0 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>netstandard2.0</TargetFramework>
<Nullable>enable</Nullable>
<LangVersion>latest</LangVersion>
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
<GenerateDocumentationFile>true</GenerateDocumentationFile>
<NoWarn>$(NoWarn);CS1591</NoWarn>
<RootNamespace>ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared</RootNamespace>
</PropertyGroup>
<ItemGroup>
<!-- MessagePack for IPC. Netstandard 2.0 consumable by both .NET 10 (Proxy) + .NET 4.8 (Host). -->
<PackageReference Include="MessagePack" Version="2.5.187"/>
</ItemGroup>
<ItemGroup>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>
</ItemGroup>
</Project>
@@ -0,0 +1,195 @@
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS;
/// <summary>
/// Polls each device's CNC active-alarm list via <see cref="IFocasClient.ReadAlarmsAsync"/>
/// on a timer and translates raise / clear transitions into <see cref="IAlarmSource"/>
/// events on the owning <see cref="FocasDriver"/>. One poll loop per subscription; the
/// loop fans out across every configured device and diffs the (<c>AlarmNumber</c>,
/// <c>Type</c>) keyed active-alarm set between ticks.
/// </summary>
/// <remarks>
/// FOCAS alarms are flat per session — the CNC exposes a single active-alarm list via
/// <c>cnc_rdalmmsg2</c>, not per-node structures the way Galaxy / AbCip ALMD do. So the
/// projection ignores <c>sourceNodeIds</c> at the member level: every alarm event is
/// raised with <c>SourceNodeId=device-host-address</c>. Callers that want per-device
/// filtering can pass the specific host addresses as <c>sourceNodeIds</c> and the
/// projection will skip devices not listed.
/// </remarks>
internal sealed class FocasAlarmProjection : IAsyncDisposable
{
private readonly FocasDriver _driver;
private readonly TimeSpan _pollInterval;
private readonly Dictionary<long, Subscription> _subs = new();
private readonly Lock _subsLock = new();
private long _nextId;
public FocasAlarmProjection(FocasDriver driver, TimeSpan pollInterval)
{
_driver = driver;
_pollInterval = pollInterval;
}
public Task<IAlarmSubscriptionHandle> SubscribeAsync(
IReadOnlyList<string> sourceNodeIds, CancellationToken cancellationToken)
{
var id = Interlocked.Increment(ref _nextId);
var handle = new FocasAlarmSubscriptionHandle(id);
var cts = new CancellationTokenSource();
// Empty filter = listen to every configured device. Otherwise only devices whose
// host address appears in sourceNodeIds are polled.
var filter = sourceNodeIds.Count == 0
? null
: new HashSet<string>(sourceNodeIds, StringComparer.OrdinalIgnoreCase);
var sub = new Subscription(handle, filter, cts);
lock (_subsLock) _subs[id] = sub;
sub.Loop = Task.Run(() => RunPollLoopAsync(sub, cts.Token), cts.Token);
return Task.FromResult<IAlarmSubscriptionHandle>(handle);
}
public async Task UnsubscribeAsync(IAlarmSubscriptionHandle handle, CancellationToken cancellationToken)
{
if (handle is not FocasAlarmSubscriptionHandle h) return;
Subscription? sub;
lock (_subsLock)
{
if (!_subs.Remove(h.Id, out sub)) return;
}
try { sub.Cts.Cancel(); } catch { }
try { await sub.Loop.ConfigureAwait(false); } catch { }
sub.Cts.Dispose();
}
/// <summary>
/// FOCAS has no ack wire call — the CNC clears alarms on its own when the underlying
/// condition resolves. Swallow the request so capability negotiation succeeds, rather
/// than surfacing a confusing "not supported" error to the operator.
/// </summary>
public Task AcknowledgeAsync(
IReadOnlyList<AlarmAcknowledgeRequest> acknowledgements, CancellationToken cancellationToken) =>
Task.CompletedTask;
public async ValueTask DisposeAsync()
{
List<Subscription> snap;
lock (_subsLock) { snap = [.. _subs.Values]; _subs.Clear(); }
foreach (var sub in snap)
{
try { sub.Cts.Cancel(); } catch { }
try { await sub.Loop.ConfigureAwait(false); } catch { }
sub.Cts.Dispose();
}
}
/// <summary>
/// One poll-tick for one device. Diffs the new alarm list against the previous snapshot,
/// emits raise + clear events. Extracted so tests can drive a tick without spinning up
/// the full Task.Run loop.
/// </summary>
internal void Tick(Subscription sub, string deviceHostAddress, IReadOnlyList<FocasActiveAlarm> current)
{
var prev = sub.LastByDevice.GetValueOrDefault(deviceHostAddress) ?? [];
var nowKeys = current.Select(a => AlarmKey(a)).ToHashSet();
var prevKeys = prev.Select(a => AlarmKey(a)).ToHashSet();
foreach (var a in current)
{
if (prevKeys.Contains(AlarmKey(a))) continue;
_driver.InvokeAlarmEvent(new AlarmEventArgs(
sub.Handle,
SourceNodeId: deviceHostAddress,
ConditionId: $"{deviceHostAddress}#{AlarmKey(a)}",
AlarmType: MapAlarmType(a.Type),
Message: a.Message,
Severity: MapSeverity(a.Type),
SourceTimestampUtc: DateTime.UtcNow));
}
foreach (var a in prev)
{
if (nowKeys.Contains(AlarmKey(a))) continue;
_driver.InvokeAlarmEvent(new AlarmEventArgs(
sub.Handle,
SourceNodeId: deviceHostAddress,
ConditionId: $"{deviceHostAddress}#{AlarmKey(a)}",
AlarmType: MapAlarmType(a.Type),
Message: $"{a.Message} (cleared)",
Severity: MapSeverity(a.Type),
SourceTimestampUtc: DateTime.UtcNow));
}
sub.LastByDevice[deviceHostAddress] = [.. current];
}
private async Task RunPollLoopAsync(Subscription sub, CancellationToken ct)
{
while (!ct.IsCancellationRequested)
{
try
{
foreach (var (host, alarms) in await _driver.ReadActiveAlarmsAcrossDevicesAsync(sub.DeviceFilter, ct).ConfigureAwait(false))
{
Tick(sub, host, alarms);
}
}
catch (OperationCanceledException) when (ct.IsCancellationRequested) { break; }
catch { /* per-tick failures are non-fatal — next tick retries */ }
try { await Task.Delay(_pollInterval, ct).ConfigureAwait(false); }
catch (OperationCanceledException) { break; }
}
}
private static string AlarmKey(FocasActiveAlarm a) => $"{a.Type}:{a.AlarmNumber}";
/// <summary>Map FOCAS type to a human-readable category; falls back to the numeric type.</summary>
internal static string MapAlarmType(short type) => type switch
{
FocasAlarmType.Parameter => "Parameter",
FocasAlarmType.PulseCode => "PulseCode",
FocasAlarmType.Overtravel => "Overtravel",
FocasAlarmType.Overheat => "Overheat",
FocasAlarmType.Servo => "Servo",
FocasAlarmType.DataIo => "DataIo",
FocasAlarmType.MemoryCheck => "MemoryCheck",
FocasAlarmType.MacroAlarm => "MacroAlarm",
_ => $"Type{type}",
};
/// <summary>
/// Project FOCAS alarm types into the driver-agnostic 4-band severity. Overtravel /
/// Servo / Emergency-equivalents are Critical; Parameter + Macro are Medium; rest land
/// at High (everything else on a CNC is safety-relevant).
/// </summary>
internal static AlarmSeverity MapSeverity(short type) => type switch
{
FocasAlarmType.Overtravel => AlarmSeverity.Critical,
FocasAlarmType.Servo => AlarmSeverity.Critical,
FocasAlarmType.PulseCode => AlarmSeverity.Critical,
FocasAlarmType.Parameter => AlarmSeverity.Medium,
FocasAlarmType.MacroAlarm => AlarmSeverity.Medium,
_ => AlarmSeverity.High,
};
internal sealed class Subscription(
FocasAlarmSubscriptionHandle handle,
HashSet<string>? deviceFilter,
CancellationTokenSource cts)
{
public FocasAlarmSubscriptionHandle Handle { get; } = handle;
public HashSet<string>? DeviceFilter { get; } = deviceFilter;
public CancellationTokenSource Cts { get; } = cts;
public Task Loop { get; set; } = Task.CompletedTask;
public Dictionary<string, IReadOnlyList<FocasActiveAlarm>> LastByDevice { get; } =
new(StringComparer.OrdinalIgnoreCase);
}
}
/// <summary>Handle returned by <see cref="FocasAlarmProjection.SubscribeAsync"/>.</summary>
public sealed record FocasAlarmSubscriptionHandle(long Id) : IAlarmSubscriptionHandle
{
public string DiagnosticId => $"focas-alarm-sub-{Id}";
}
@@ -16,7 +16,7 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS;
/// fail fast.
/// </remarks>
public sealed class FocasDriver : IDriver, IReadable, IWritable, ITagDiscovery, ISubscribable,
IHostConnectivityProbe, IPerCallHostResolver, IDisposable, IAsyncDisposable
IHostConnectivityProbe, IPerCallHostResolver, IAlarmSource, IDisposable, IAsyncDisposable
{
private readonly FocasDriverOptions _options;
private readonly string _driverInstanceId;
@@ -24,10 +24,12 @@ public sealed class FocasDriver : IDriver, IReadable, IWritable, ITagDiscovery,
private readonly PollGroupEngine _poll;
private readonly Dictionary<string, DeviceState> _devices = new(StringComparer.OrdinalIgnoreCase);
private readonly Dictionary<string, FocasTagDefinition> _tagsByName = new(StringComparer.OrdinalIgnoreCase);
private FocasAlarmProjection? _alarmProjection;
private DriverHealth _health = new(DriverState.Unknown, null, null);
public event EventHandler<DataChangeEventArgs>? OnDataChange;
public event EventHandler<HostStatusChangedEventArgs>? OnHostStatusChanged;
public event EventHandler<AlarmEventArgs>? OnAlarmEvent;
public FocasDriver(FocasDriverOptions options, string driverInstanceId,
IFocasClientFactory? clientFactory = null)
@@ -35,7 +37,7 @@ public sealed class FocasDriver : IDriver, IReadable, IWritable, ITagDiscovery,
ArgumentNullException.ThrowIfNull(options);
_options = options;
_driverInstanceId = driverInstanceId;
_clientFactory = clientFactory ?? new FwlibFocasClientFactory();
_clientFactory = clientFactory ?? new Wire.WireFocasClientFactory();
_poll = new PollGroupEngine(
reader: ReadAsync,
onChange: (handle, tagRef, snapshot) =>
@@ -85,6 +87,30 @@ public sealed class FocasDriver : IDriver, IReadable, IWritable, ITagDiscovery,
_ = Task.Run(() => ProbeLoopAsync(state, ct), ct);
}
}
if (_options.HandleRecycle.Enabled)
{
foreach (var state in _devices.Values)
{
state.RecycleCts = new CancellationTokenSource();
var ct = state.RecycleCts.Token;
_ = Task.Run(() => RecycleLoopAsync(state, ct), ct);
}
}
if (_options.AlarmProjection.Enabled)
_alarmProjection = new FocasAlarmProjection(this, _options.AlarmProjection.PollInterval);
if (_options.FixedTree.Enabled)
{
foreach (var state in _devices.Values)
{
state.FixedTreeCts = new CancellationTokenSource();
var ct = state.FixedTreeCts.Token;
_ = Task.Run(() => FixedTreeLoopAsync(state, ct), ct);
}
}
_health = new DriverHealth(DriverState.Healthy, DateTime.UtcNow, null);
}
catch (Exception ex)
@@ -104,11 +130,22 @@ public sealed class FocasDriver : IDriver, IReadable, IWritable, ITagDiscovery,
public async Task ShutdownAsync(CancellationToken cancellationToken)
{
await _poll.DisposeAsync().ConfigureAwait(false);
if (_alarmProjection is { } proj)
{
await proj.DisposeAsync().ConfigureAwait(false);
_alarmProjection = null;
}
foreach (var state in _devices.Values)
{
try { state.ProbeCts?.Cancel(); } catch { }
state.ProbeCts?.Dispose();
state.ProbeCts = null;
try { state.RecycleCts?.Cancel(); } catch { }
state.RecycleCts?.Dispose();
state.RecycleCts = null;
try { state.FixedTreeCts?.Cancel(); } catch { }
state.FixedTreeCts?.Dispose();
state.FixedTreeCts = null;
state.DisposeClient();
}
_devices.Clear();
@@ -136,6 +173,16 @@ public sealed class FocasDriver : IDriver, IReadable, IWritable, ITagDiscovery,
for (var i = 0; i < fullReferences.Count; i++)
{
var reference = fullReferences[i];
// Fixed-tree T1 — fixed-tree references are synthesized from the cached
// dynamic snapshot + sysinfo; no P/Invoke per Read since the poll loop
// already fires them on cadence.
if (_options.FixedTree.Enabled && TryReadFixedTree(reference, now) is { } fx)
{
results[i] = fx;
continue;
}
if (!_tagsByName.TryGetValue(reference, out var def))
{
results[i] = new DataValueSnapshot(null, FocasStatusMapper.BadNodeIdUnknown, null, now);
@@ -241,6 +288,80 @@ public sealed class FocasDriver : IDriver, IReadable, IWritable, ITagDiscovery,
{
var label = device.DeviceName ?? device.HostAddress;
var deviceFolder = root.Folder(device.HostAddress, label);
// Fixed-tree T1 — Identity + Axes subtrees, populated once per session
// from cnc_sysinfo + cnc_rdaxisname at init time and kept in DeviceState.
if (_options.FixedTree.Enabled
&& _devices.TryGetValue(device.HostAddress, out var state)
&& state.FixedTreeCache is { } cache)
{
var identity = deviceFolder.Folder("Identity", "Identity");
EmitIdentityVariable(identity, device.HostAddress, "SeriesNumber", FocasDriverDataType.String);
EmitIdentityVariable(identity, device.HostAddress, "Version", FocasDriverDataType.String);
EmitIdentityVariable(identity, device.HostAddress, "MaxAxes", FocasDriverDataType.Int32);
EmitIdentityVariable(identity, device.HostAddress, "CncType", FocasDriverDataType.String);
EmitIdentityVariable(identity, device.HostAddress, "MtType", FocasDriverDataType.String);
EmitIdentityVariable(identity, device.HostAddress, "AxisCount", FocasDriverDataType.Int32);
var axesFolder = deviceFolder.Folder("Axes", "Axes");
foreach (var axis in cache.Axes)
{
var axisFolder = axesFolder.Folder(axis.Display, axis.Display);
EmitAxisVariable(axisFolder, device.HostAddress, axis.Display, "AbsolutePosition");
EmitAxisVariable(axisFolder, device.HostAddress, axis.Display, "MachinePosition");
EmitAxisVariable(axisFolder, device.HostAddress, axis.Display, "RelativePosition");
EmitAxisVariable(axisFolder, device.HostAddress, axis.Display, "DistanceToGo");
if (cache.Capabilities.ServoLoad)
EmitAxisVariable(axisFolder, device.HostAddress, axis.Display, "ServoLoad");
}
EmitAxisVariable(axesFolder, device.HostAddress, "FeedRate", "Actual");
EmitAxisVariable(axesFolder, device.HostAddress, "SpindleSpeed", "Actual");
// Spindle subtree — one folder per discovered spindle, suppressed
// entirely on series that don't export cnc_rdspdlname. Per-spindle
// Load + MaxRpm each gated on their own capability probe.
if (cache.Capabilities.Spindles)
{
var spindleRoot = deviceFolder.Folder("Spindle", "Spindle");
for (var i = 0; i < cache.Spindles.Count; i++)
{
var s = cache.Spindles[i];
var name = string.IsNullOrEmpty(s.Display) ? $"S{i + 1}" : s.Display;
var spindleFolder = spindleRoot.Folder(name, name);
if (cache.Capabilities.SpindleLoad)
EmitFixedVariable(spindleFolder, device.HostAddress, $"Spindle/{name}", "Load", DriverDataType.Int32);
if (cache.Capabilities.SpindleMaxRpm && i < cache.SpindleMaxRpms.Count)
EmitFixedVariable(spindleFolder, device.HostAddress, $"Spindle/{name}", "MaxRpm", DriverDataType.Int32);
}
}
// Fixed-tree T2 — Program + OperationMode subtrees (gated on capability).
if (cache.Capabilities.ProgramInfo)
{
var program = deviceFolder.Folder("Program", "Program");
EmitFixedVariable(program, device.HostAddress, "Program", "Name", DriverDataType.String);
EmitFixedVariable(program, device.HostAddress, "Program", "ONumber", DriverDataType.Int32);
EmitFixedVariable(program, device.HostAddress, "Program", "Number", DriverDataType.Int32);
EmitFixedVariable(program, device.HostAddress, "Program", "MainNumber", DriverDataType.Int32);
EmitFixedVariable(program, device.HostAddress, "Program", "Sequence", DriverDataType.Int32);
EmitFixedVariable(program, device.HostAddress, "Program", "BlockCount", DriverDataType.Int32);
var opMode = deviceFolder.Folder("OperationMode", "OperationMode");
EmitFixedVariable(opMode, device.HostAddress, "OperationMode", "Mode", DriverDataType.Int32);
EmitFixedVariable(opMode, device.HostAddress, "OperationMode", "ModeText", DriverDataType.String);
}
// Fixed-tree T3 — Timers subtree (power-on / operating / cutting / cycle).
if (cache.Capabilities.Timers)
{
var timers = deviceFolder.Folder("Timers", "Timers");
EmitFixedVariable(timers, device.HostAddress, "Timers", "PowerOnSeconds", DriverDataType.Float64);
EmitFixedVariable(timers, device.HostAddress, "Timers", "OperatingSeconds", DriverDataType.Float64);
EmitFixedVariable(timers, device.HostAddress, "Timers", "CuttingSeconds", DriverDataType.Float64);
EmitFixedVariable(timers, device.HostAddress, "Timers", "CycleSeconds", DriverDataType.Float64);
}
}
var tagsForDevice = _options.Tags.Where(t =>
string.Equals(t.DeviceHostAddress, device.HostAddress, StringComparison.OrdinalIgnoreCase));
foreach (var tag in tagsForDevice)
@@ -261,6 +382,72 @@ public sealed class FocasDriver : IDriver, IReadable, IWritable, ITagDiscovery,
return Task.CompletedTask;
}
private enum FocasDriverDataType { String, Int32, Float64 }
private static void EmitIdentityVariable(
IAddressSpaceBuilder folder, string deviceHost, string field, FocasDriverDataType type)
{
var fullName = FixedTreeReference(deviceHost, $"Identity/{field}");
folder.Variable(field, field, new DriverAttributeInfo(
FullName: fullName,
DriverDataType: type switch
{
FocasDriverDataType.Int32 => DriverDataType.Int32,
FocasDriverDataType.Float64 => DriverDataType.Float64,
_ => DriverDataType.String,
},
IsArray: false,
ArrayDim: null,
SecurityClass: SecurityClassification.ViewOnly,
IsHistorized: false,
IsAlarm: false,
WriteIdempotent: false));
}
private static void EmitAxisVariable(
IAddressSpaceBuilder folder, string deviceHost, string axisName, string field)
{
var fullName = FixedTreeReference(deviceHost, $"Axes/{axisName}/{field}");
folder.Variable(field, field, new DriverAttributeInfo(
FullName: fullName,
DriverDataType: DriverDataType.Float64,
IsArray: false,
ArrayDim: null,
SecurityClass: SecurityClassification.ViewOnly,
IsHistorized: false,
IsAlarm: false,
WriteIdempotent: false));
}
/// <summary>
/// Emit a variable under a named fixed-tree folder (Program, OperationMode,
/// …). Full-reference shape is <c>{deviceHost}/{folderPath}/{field}</c>.
/// </summary>
private static void EmitFixedVariable(
IAddressSpaceBuilder folder, string deviceHost, string folderPath,
string field, DriverDataType type)
{
var fullName = FixedTreeReference(deviceHost, $"{folderPath}/{field}");
folder.Variable(field, field, new DriverAttributeInfo(
FullName: fullName,
DriverDataType: type,
IsArray: false,
ArrayDim: null,
SecurityClass: SecurityClassification.ViewOnly,
IsHistorized: false,
IsAlarm: false,
WriteIdempotent: false));
}
/// <summary>
/// Canonical full-reference shape for a fixed-tree node. Keeps the device
/// host as a prefix so multi-device configs don't collide, and the rest is
/// the path inside the tree. Matches what poll-loop snapshots publish +
/// what <see cref="ReadAsync"/> looks up.
/// </summary>
internal static string FixedTreeReference(string deviceHost, string path) =>
$"{deviceHost}/{path}";
// ---- ISubscribable (polling overlay via shared engine) ----
public Task<ISubscriptionHandle> SubscribeAsync(
@@ -298,6 +485,310 @@ public sealed class FocasDriver : IDriver, IReadable, IWritable, ITagDiscovery,
}
}
/// <summary>
/// Per-device fixed-tree poll loop. First tick resolves sysinfo + axis names
/// (once) so <see cref="DiscoverAsync"/> can render the subtree on its next
/// invocation; every tick thereafter fires a <c>cnc_rddynamic2</c> per axis
/// and publishes OnDataChange for the axis positions + feed rate + spindle
/// speed.
/// </summary>
private async Task FixedTreeLoopAsync(DeviceState state, CancellationToken ct)
{
// Bootstrap: identity + axis names + per-optional-API capability probe.
// Each optional call is attempted once; failures (EW_FUNC / EW_NOOPT / EW_VERSION)
// record the capability as unsupported and suppress the corresponding nodes
// in DiscoverAsync + the poll loop.
while (!ct.IsCancellationRequested && state.FixedTreeCache is null)
{
try
{
var client = await EnsureConnectedAsync(state, ct).ConfigureAwait(false);
var sys = await client.GetSysInfoAsync(ct).ConfigureAwait(false);
var axes = await client.GetAxisNamesAsync(ct).ConfigureAwait(false);
// Optional-API probes — each returns empty / throws when unsupported.
var spindles = await SafeProbe(() => client.GetSpindleNamesAsync(ct), []);
var spindleMaxRpms = await SafeProbe(() => client.GetSpindleMaxRpmsAsync(ct), []);
var servoLoads = await SafeProbe(() => client.GetServoLoadsAsync(ct), []);
var programInfo = await SafeTryProbe(() => client.GetProgramInfoAsync(ct));
var timer = await SafeTryProbe(() => client.GetTimerAsync(FocasTimerKind.PowerOn, ct));
var spindleLoad = await SafeProbe(() => client.GetSpindleLoadsAsync(ct), []);
var caps = new FocasFixedTreeCapabilities(
Spindles: spindles.Count > 0,
SpindleLoad: spindleLoad.Count > 0,
SpindleMaxRpm: spindleMaxRpms.Count > 0,
ServoLoad: servoLoads.Count > 0,
ProgramInfo: programInfo is not null,
Timers: timer is not null);
state.FixedTreeCache = new FocasFixedTreeCache(
sys, [.. axes], [.. spindles], [.. spindleMaxRpms], caps);
}
catch (OperationCanceledException) when (ct.IsCancellationRequested) { return; }
catch
{
try { await Task.Delay(TimeSpan.FromSeconds(2), ct).ConfigureAwait(false); }
catch (OperationCanceledException) { return; }
}
}
// Prime the spindle-loads cache from bootstrap if supported — avoids a
// "tree is there but reads say BadNodeIdUnknown" window on startup.
if (state.FixedTreeCache?.Capabilities is { SpindleLoad: true })
{
try
{
var client2 = await EnsureConnectedAsync(state, ct).ConfigureAwait(false);
var loads = await client2.GetSpindleLoadsAsync(ct).ConfigureAwait(false);
for (var i = 0; i < loads.Count; i++) state.LastSpindleLoads[i] = loads[i];
}
catch { /* first-tick poll will retry */ }
}
var programPollDue = DateTime.MinValue;
var timerPollDue = DateTime.MinValue;
while (!ct.IsCancellationRequested)
{
try
{
var client = await EnsureConnectedAsync(state, ct).ConfigureAwait(false);
var cache = state.FixedTreeCache;
if (cache is null) break;
FocasDynamicSnapshot? firstAxisSnap = null;
for (var i = 0; i < cache.Axes.Count; i++)
{
var axisIndex = i + 1; // FOCAS uses 1-based axis indexing
var axis = cache.Axes[i];
var snap = await client.ReadDynamicAsync(axisIndex, ct).ConfigureAwait(false);
PublishAxisSnapshot(state, axis, snap);
if (i == 0) { firstAxisSnap = snap; PublishRateSnapshot(state, snap); }
}
// Servo loads + spindle loads — both return bulk arrays, so folding
// into the axis cadence is cheap. Each is gated by the bootstrap
// capability probe — unsupported on this series = silent skip.
if (cache.Capabilities.ServoLoad)
{
try
{
var loads = await client.GetServoLoadsAsync(ct).ConfigureAwait(false);
PublishServoLoads(state, loads);
}
catch { /* transient — next tick retries */ }
}
if (cache.Capabilities.SpindleLoad)
{
try
{
var loads = await client.GetSpindleLoadsAsync(ct).ConfigureAwait(false);
for (var i = 0; i < loads.Count; i++) state.LastSpindleLoads[i] = loads[i];
}
catch { /* transient */ }
}
// Program-info poll runs on its own cadence — much slower than the axis
// poll because program / mode transitions are operator-driven.
var programInterval = _options.FixedTree.ProgramPollInterval;
if (cache.Capabilities.ProgramInfo
&& programInterval > TimeSpan.Zero && DateTime.UtcNow >= programPollDue)
{
try
{
var program = await client.GetProgramInfoAsync(ct).ConfigureAwait(false);
state.LastProgramInfo = program;
if (firstAxisSnap is { } s) state.LastProgramAxisRef = s;
}
catch { /* transient — next tick retries */ }
programPollDue = DateTime.UtcNow + programInterval;
}
// Timers — slowest cadence. Fires 4 FWLIB calls per tick (one per kind).
var timerInterval = _options.FixedTree.TimerPollInterval;
if (cache.Capabilities.Timers
&& timerInterval > TimeSpan.Zero && DateTime.UtcNow >= timerPollDue)
{
foreach (FocasTimerKind kind in Enum.GetValues<FocasTimerKind>())
{
try
{
var t = await client.GetTimerAsync(kind, ct).ConfigureAwait(false);
state.LastTimers[kind] = t;
}
catch { /* per-kind failures are non-fatal */ }
}
timerPollDue = DateTime.UtcNow + timerInterval;
}
}
catch (OperationCanceledException) when (ct.IsCancellationRequested) { break; }
catch { /* next tick retries — transient blips are expected */ }
try { await Task.Delay(_options.FixedTree.PollInterval, ct).ConfigureAwait(false); }
catch (OperationCanceledException) { break; }
}
}
/// <summary>
/// Cache a fresh axis snapshot. The poll loop doesn't fire <c>OnDataChange</c>
/// directly — subscribers go through the normal <c>SubscribeAsync</c> →
/// <see cref="PollGroupEngine"/> → <see cref="ReadAsync"/> path, which hits
/// <see cref="TryReadFixedTree"/> and returns these cached values.
/// </summary>
private static void PublishAxisSnapshot(DeviceState state, FocasAxisName axis, FocasDynamicSnapshot snap)
{
var host = state.Options.HostAddress;
state.LastFixedSnapshots[FixedTreeReference(host, $"Axes/{axis.Display}/AbsolutePosition")] = snap.AbsolutePosition;
state.LastFixedSnapshots[FixedTreeReference(host, $"Axes/{axis.Display}/MachinePosition")] = snap.MachinePosition;
state.LastFixedSnapshots[FixedTreeReference(host, $"Axes/{axis.Display}/RelativePosition")] = snap.RelativePosition;
state.LastFixedSnapshots[FixedTreeReference(host, $"Axes/{axis.Display}/DistanceToGo")] = snap.DistanceToGo;
}
private static void PublishRateSnapshot(DeviceState state, FocasDynamicSnapshot snap)
{
var host = state.Options.HostAddress;
state.LastFixedSnapshots[FixedTreeReference(host, "Axes/FeedRate/Actual")] = snap.ActualFeedRate;
state.LastFixedSnapshots[FixedTreeReference(host, "Axes/SpindleSpeed/Actual")] = snap.ActualSpindleSpeed;
}
/// <summary>
/// Cache servo-load percentages keyed by axis name. Stored separately from
/// <c>LastFixedSnapshots</c> (which is int-typed) so the double-valued load
/// values don't need casting on every read.
/// </summary>
private static void PublishServoLoads(DeviceState state, IReadOnlyList<FocasServoLoad> loads)
{
foreach (var load in loads)
state.LastServoLoads[load.AxisName] = load.LoadPercent;
}
private static object? TimerValue(DeviceState state, FocasTimerKind kind) =>
state.LastTimers.TryGetValue(kind, out var t) ? (object)t.TotalSeconds : null;
/// <summary>
/// Call an optional probe that returns a collection; swallow any exception
/// and return <paramref name="fallback"/>. Used by bootstrap to capture
/// per-series capability without letting one failed probe take down the
/// entire bootstrap sequence.
/// </summary>
private static async Task<IReadOnlyList<T>> SafeProbe<T>(
Func<Task<IReadOnlyList<T>>> probe, IReadOnlyList<T> fallback)
{
try { return await probe().ConfigureAwait(false); }
catch { return fallback; }
}
/// <summary>
/// Nullable variant — probe returns a single object or null on failure.
/// </summary>
private static async Task<T?> SafeTryProbe<T>(Func<Task<T>> probe) where T : class
{
try { return await probe().ConfigureAwait(false); }
catch { return null; }
}
/// <summary>
/// Read cached last-fixed-tree snapshots. Returns the projected value when
/// the reference looks like a fixed-tree FullName; null when it doesn't
/// (callers fall through to the user-authored tag path).
/// </summary>
private DataValueSnapshot? TryReadFixedTree(string reference, DateTime now)
{
foreach (var state in _devices.Values)
{
if (!reference.StartsWith(state.Options.HostAddress + "/", StringComparison.OrdinalIgnoreCase)) continue;
if (state.LastFixedSnapshots.TryGetValue(reference, out var raw))
return new DataValueSnapshot((double)raw, FocasStatusMapper.Good, now, now);
// Servo-load match: reference shape is "{host}/Axes/{name}/ServoLoad"
var suffixFull = reference[(state.Options.HostAddress.Length + 1)..];
if (suffixFull.StartsWith("Axes/", StringComparison.Ordinal) && suffixFull.EndsWith("/ServoLoad", StringComparison.Ordinal))
{
var axisName = suffixFull["Axes/".Length..^"/ServoLoad".Length];
if (state.LastServoLoads.TryGetValue(axisName, out var load))
return new DataValueSnapshot(load, FocasStatusMapper.Good, now, now);
}
// Spindle matches: "{host}/Spindle/{name}/Load" + "{host}/Spindle/{name}/MaxRpm"
if (suffixFull.StartsWith("Spindle/", StringComparison.Ordinal)
&& state.FixedTreeCache is { } spindleCache)
{
var tail = suffixFull["Spindle/".Length..];
var slash = tail.IndexOf('/');
if (slash > 0)
{
var spindleName = tail[..slash];
var field = tail[(slash + 1)..];
var idx = -1;
for (var i = 0; i < spindleCache.Spindles.Count; i++)
{
var s = spindleCache.Spindles[i];
var display = string.IsNullOrEmpty(s.Display) ? $"S{i + 1}" : s.Display;
if (string.Equals(display, spindleName, StringComparison.OrdinalIgnoreCase)) { idx = i; break; }
}
if (idx >= 0)
{
object? value = field switch
{
"Load" => state.LastSpindleLoads.TryGetValue(idx, out var l) ? (object)l : null,
"MaxRpm" => idx < spindleCache.SpindleMaxRpms.Count ? (object)spindleCache.SpindleMaxRpms[idx] : null,
_ => null,
};
if (value is not null)
return new DataValueSnapshot(value, FocasStatusMapper.Good, now, now);
}
}
}
// Identity strings + program / op-mode fields aren't cached as doubles —
// re-derive from the struct caches.
if (state.FixedTreeCache is { } cache)
{
var suffix = reference[(state.Options.HostAddress.Length + 1)..];
var value = suffix switch
{
"Identity/SeriesNumber" => (object)cache.SysInfo.Series,
"Identity/Version" => cache.SysInfo.Version,
"Identity/MaxAxes" => cache.SysInfo.MaxAxis,
"Identity/CncType" => cache.SysInfo.CncType,
"Identity/MtType" => cache.SysInfo.MtType,
"Identity/AxisCount" => cache.SysInfo.AxesCount,
"Program/Name" => (object?)state.LastProgramInfo?.Name,
"Program/ONumber" => state.LastProgramInfo?.ONumber,
"Program/BlockCount" => state.LastProgramInfo?.BlockCount,
"Program/Number" => state.LastProgramAxisRef?.ProgramNumber,
"Program/MainNumber" => state.LastProgramAxisRef?.MainProgramNumber,
"Program/Sequence" => state.LastProgramAxisRef?.SequenceNumber,
"OperationMode/Mode" => state.LastProgramInfo?.Mode,
"OperationMode/ModeText" => state.LastProgramInfo is { } pi
? FocasOpMode.ToText(pi.Mode) : null,
"Timers/PowerOnSeconds" => TimerValue(state, FocasTimerKind.PowerOn),
"Timers/OperatingSeconds" => TimerValue(state, FocasTimerKind.Operating),
"Timers/CuttingSeconds" => TimerValue(state, FocasTimerKind.Cutting),
"Timers/CycleSeconds" => TimerValue(state, FocasTimerKind.Cycle),
_ => null,
};
if (value is not null)
return new DataValueSnapshot(value, FocasStatusMapper.Good, now, now);
}
}
return null;
}
private async Task RecycleLoopAsync(DeviceState state, CancellationToken ct)
{
while (!ct.IsCancellationRequested)
{
try { await Task.Delay(_options.HandleRecycle.Interval, ct).ConfigureAwait(false); }
catch (OperationCanceledException) { break; }
// Close the current handle — the next Read / Write / Probe call triggers
// EnsureConnectedAsync, which reopens a fresh one. We don't block here on
// reconnect because the goal is just to release the FWLIB handle slot; a
// readable tick one probe cycle later is an acceptable cost.
try { state.DisposeClient(); }
catch { /* already disposed or race — next EnsureConnected recovers */ }
}
}
private void TransitionDeviceState(DeviceState state, HostState newState)
{
HostState old;
@@ -312,6 +803,50 @@ public sealed class FocasDriver : IDriver, IReadable, IWritable, ITagDiscovery,
new HostStatusChangedEventArgs(state.Options.HostAddress, old, newState));
}
// ---- IAlarmSource ----
public Task<IAlarmSubscriptionHandle> SubscribeAlarmsAsync(
IReadOnlyList<string> sourceNodeIds, CancellationToken cancellationToken)
{
if (_alarmProjection is null)
throw new NotSupportedException(
"FOCAS alarm projection is disabled — set FocasDriverOptions.AlarmProjection.Enabled=true to opt in.");
return _alarmProjection.SubscribeAsync(sourceNodeIds, cancellationToken);
}
public Task UnsubscribeAlarmsAsync(IAlarmSubscriptionHandle handle, CancellationToken cancellationToken) =>
_alarmProjection is { } p ? p.UnsubscribeAsync(handle, cancellationToken) : Task.CompletedTask;
public Task AcknowledgeAsync(
IReadOnlyList<AlarmAcknowledgeRequest> acknowledgements, CancellationToken cancellationToken) =>
_alarmProjection is { } p ? p.AcknowledgeAsync(acknowledgements, cancellationToken) : Task.CompletedTask;
internal void InvokeAlarmEvent(AlarmEventArgs args) => OnAlarmEvent?.Invoke(this, args);
/// <summary>
/// Poll every configured device's active-alarm list in one pass. Used by the alarm
/// projection — kept <c>internal</c> rather than <c>public</c> because callers that
/// want alarm events should subscribe through <c>IAlarmSource</c> instead.
/// </summary>
internal async Task<IReadOnlyList<(string HostAddress, IReadOnlyList<FocasActiveAlarm> Alarms)>>
ReadActiveAlarmsAcrossDevicesAsync(HashSet<string>? deviceFilter, CancellationToken ct)
{
var result = new List<(string, IReadOnlyList<FocasActiveAlarm>)>();
foreach (var state in _devices.Values)
{
if (deviceFilter is not null && !deviceFilter.Contains(state.Options.HostAddress)) continue;
try
{
var client = await EnsureConnectedAsync(state, ct).ConfigureAwait(false);
var alarms = await client.ReadAlarmsAsync(ct).ConfigureAwait(false);
result.Add((state.Options.HostAddress, alarms));
}
catch (OperationCanceledException) when (ct.IsCancellationRequested) { break; }
catch { /* surface a device-local fault on the next tick */ }
}
return result;
}
// ---- IPerCallHostResolver ----
public string ResolveHost(string fullReference)
@@ -341,6 +876,33 @@ public sealed class FocasDriver : IDriver, IReadable, IWritable, ITagDiscovery,
public void Dispose() => DisposeAsync().AsTask().GetAwaiter().GetResult();
public async ValueTask DisposeAsync() => await ShutdownAsync(CancellationToken.None).ConfigureAwait(false);
/// <summary>
/// Per-device fixed-tree cache populated once at first successful connect and
/// read-only thereafter. Used by <see cref="DiscoverAsync"/> to render the
/// tree + by <see cref="TryReadFixedTree"/> for synchronous Identity/* reads.
/// </summary>
internal sealed record FocasFixedTreeCache(
FocasSysInfo SysInfo,
IReadOnlyList<FocasAxisName> Axes,
IReadOnlyList<FocasSpindleName> Spindles,
IReadOnlyList<int> SpindleMaxRpms,
FocasFixedTreeCapabilities Capabilities);
/// <summary>
/// Per-device optional-API capability flags — which of the "this may or may not
/// exist on this CNC series" calls succeeded at bootstrap. Drives per-series
/// node suppression so a 16i that doesn't export <c>cnc_rdspmaxrpm</c> simply
/// doesn't get a <c>Spindle/{name}/MaxRpm</c> node (instead of surfacing
/// <c>BadDeviceFailure</c> on every read).
/// </summary>
internal sealed record FocasFixedTreeCapabilities(
bool Spindles, // cnc_rdspdlname returned 1+ spindle names
bool SpindleLoad, // cnc_rdspload bootstrap probe succeeded
bool SpindleMaxRpm, // cnc_rdspmaxrpm bootstrap probe succeeded
bool ServoLoad, // cnc_rdsvmeter bootstrap probe returned data
bool ProgramInfo, // cnc_exeprgname2 + cnc_rdblkcount + cnc_rdopmode work
bool Timers); // cnc_rdtimer works for at least PowerOn
internal sealed class DeviceState(FocasHostAddress parsedAddress, FocasDeviceOptions options)
{
public FocasHostAddress ParsedAddress { get; } = parsedAddress;
@@ -351,6 +913,16 @@ public sealed class FocasDriver : IDriver, IReadable, IWritable, ITagDiscovery,
public HostState HostState { get; set; } = HostState.Unknown;
public DateTime HostStateChangedUtc { get; set; } = DateTime.UtcNow;
public CancellationTokenSource? ProbeCts { get; set; }
public CancellationTokenSource? RecycleCts { get; set; }
public CancellationTokenSource? FixedTreeCts { get; set; }
public FocasFixedTreeCache? FixedTreeCache { get; set; }
public Dictionary<string, int> LastFixedSnapshots { get; } = new(StringComparer.OrdinalIgnoreCase);
public FocasProgramInfo? LastProgramInfo { get; set; }
/// <summary>Cached first-axis dynamic snapshot — feeds Program/Number, /MainNumber, /Sequence.</summary>
public FocasDynamicSnapshot? LastProgramAxisRef { get; set; }
public Dictionary<FocasTimerKind, FocasTimer> LastTimers { get; } = [];
public Dictionary<string, double> LastServoLoads { get; } = new(StringComparer.OrdinalIgnoreCase);
public Dictionary<int, int> LastSpindleLoads { get; } = [];
public void DisposeClient()
{
@@ -1,32 +1,28 @@
using System.Text.Json;
using System.Text.Json.Serialization;
using ZB.MOM.WW.OtOpcUa.Core.Hosting;
using ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Ipc;
using ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Wire;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS;
/// <summary>
/// Static factory registration helper for <see cref="FocasDriver"/>. Server's Program.cs
/// calls <see cref="Register"/> once at startup; the bootstrapper (task #248) then
/// materialises FOCAS DriverInstance rows from the central config DB into live driver
/// instances. Mirrors <c>GalaxyProxyDriverFactoryExtensions</c>; no dependency on
/// Microsoft.Extensions.DependencyInjection so the driver project stays DI-free.
/// Static factory registration helper for <see cref="FocasDriver"/>. Server's
/// Program.cs calls <see cref="Register"/> once at startup; the bootstrapper
/// then materialises FOCAS DriverInstance rows from the central config DB
/// into live driver instances.
/// </summary>
/// <remarks>
/// The DriverConfig JSON selects the <see cref="IFocasClientFactory"/> backend:
/// <list type="bullet">
/// <item><c>"Backend": "ipc"</c> (default) — wires <see cref="IpcFocasClientFactory"/>
/// against a named-pipe <see cref="FocasIpcClient"/> talking to a separate
/// <c>Driver.FOCAS.Host</c> process (Tier-C isolation). Requires <c>PipeName</c> +
/// <c>SharedSecret</c>.</item>
/// <item><c>"Backend": "fwlib"</c> — direct in-process Fwlib32.dll P/Invoke via
/// <see cref="FwlibFocasClientFactory"/>. Use only when the main server is licensed
/// for FOCAS and you accept the native-crash blast-radius trade-off.</item>
/// <item><c>"Backend": "unimplemented"</c> — returns the no-op factory; useful for
/// scaffolding DriverInstance rows before the Host is deployed so the server boots.</item>
/// <item><c>"Backend": "wire"</c> (default) — pure-managed FOCAS2 wire
/// client (<see cref="WireFocasClientFactory"/>) speaking directly to
/// the CNC on TCP:8193.</item>
/// <item><c>"Backend": "unimplemented"</c> / <c>"none"</c> / <c>"stub"</c>
/// — returns the no-op factory; useful for scaffolding DriverInstance
/// rows before the CNC endpoint is reachable.</item>
/// </list>
/// Devices / Tags / Probe / Timeout / Series come from the same JSON and feed directly
/// into <see cref="FocasDriverOptions"/>.
/// Devices / Tags / Probe / Timeout / Series come from the same JSON and
/// feed directly into <see cref="FocasDriverOptions"/>.
/// </remarks>
public static class FocasDriverFactoryExtensions
{
@@ -92,45 +88,19 @@ public static class FocasDriverFactoryExtensions
internal static IFocasClientFactory BuildClientFactory(
FocasDriverConfigDto dto, string driverInstanceId)
{
var backend = (dto.Backend ?? "ipc").Trim().ToLowerInvariant();
var backend = (dto.Backend ?? "wire").Trim().ToLowerInvariant();
return backend switch
{
"ipc" => BuildIpcFactory(dto, driverInstanceId),
"fwlib" or "fwlib32" => new FwlibFocasClientFactory(),
"wire" => new WireFocasClientFactory(),
"unimplemented" or "none" or "stub" => new UnimplementedFocasClientFactory(),
_ => throw new InvalidOperationException(
$"FOCAS driver config for '{driverInstanceId}' has unknown Backend '{dto.Backend}'. " +
"Expected one of: ipc, fwlib, unimplemented."),
"Expected one of: wire, unimplemented. " +
"(The legacy 'ipc' / 'fwlib' backends were retired in the Wire migration — " +
"see docs/drivers/FOCAS.md.)"),
};
}
private static IpcFocasClientFactory BuildIpcFactory(
FocasDriverConfigDto dto, string driverInstanceId)
{
var pipeName = dto.PipeName
?? throw new InvalidOperationException(
$"FOCAS driver config for '{driverInstanceId}' missing required PipeName (Tier-C ipc backend)");
var sharedSecret = dto.SharedSecret
?? throw new InvalidOperationException(
$"FOCAS driver config for '{driverInstanceId}' missing required SharedSecret (Tier-C ipc backend)");
var connectTimeout = TimeSpan.FromMilliseconds(dto.ConnectTimeoutMs ?? 10_000);
var series = ParseSeries(dto.Series);
// Each IFocasClientFactory.Create() call opens a fresh pipe to the Host — matches the
// driver's one-client-per-device invariant. FocasIpcClient.ConnectAsync is awaited
// synchronously via GetAwaiter().GetResult() because IFocasClientFactory.Create is a
// sync contract; the blocking call lands inside FocasDriver.EnsureConnectedAsync,
// which immediately awaits IFocasClient.ConnectAsync afterwards so the perceived
// latency is identical to a fully-async factory.
return new IpcFocasClientFactory(
ipcClientFactory: () => FocasIpcClient.ConnectAsync(
pipeName: pipeName,
sharedSecret: sharedSecret,
connectTimeout: connectTimeout,
ct: CancellationToken.None).GetAwaiter().GetResult(),
series: series);
}
private static FocasCncSeries ParseSeries(string? raw)
{
if (string.IsNullOrWhiteSpace(raw)) return FocasCncSeries.Unknown;
@@ -162,9 +132,6 @@ public static class FocasDriverFactoryExtensions
internal sealed class FocasDriverConfigDto
{
public string? Backend { get; init; }
public string? PipeName { get; init; }
public string? SharedSecret { get; init; }
public int? ConnectTimeoutMs { get; init; }
public string? Series { get; init; }
public int? TimeoutMs { get; init; }
public List<FocasDeviceDto>? Devices { get; init; }
@@ -11,6 +11,77 @@ public sealed class FocasDriverOptions
public IReadOnlyList<FocasTagDefinition> Tags { get; init; } = [];
public FocasProbeOptions Probe { get; init; } = new();
public TimeSpan Timeout { get; init; } = TimeSpan.FromSeconds(2);
public FocasAlarmProjectionOptions AlarmProjection { get; init; } = new();
public FocasHandleRecycleOptions HandleRecycle { get; init; } = new();
public FocasFixedTreeOptions FixedTree { get; init; } = new();
}
/// <summary>
/// Fixed-node tree exposed by FOCAS per <c>docs/v2/driver-specs.md §7</c> —
/// <c>Identity/</c>, <c>Axes/{name}/</c>, etc. populated from
/// <c>cnc_sysinfo</c> / <c>cnc_rdaxisname</c> / <c>cnc_rddynamic2</c>. Disabled by
/// default so existing configs that only use user-authored tags don't grow new
/// nodes on upgrade.
/// </summary>
public sealed class FocasFixedTreeOptions
{
/// <summary>Enable the fixed-node tree for every configured device.</summary>
public bool Enabled { get; init; } = false;
/// <summary>
/// Poll cadence for <c>cnc_rddynamic2</c>. Each tick calls the API once per
/// configured axis + publishes OnDataChange for the axis subtree. Real CNCs
/// serve ~100ms loops comfortably; the default is conservative.
/// </summary>
public TimeSpan PollInterval { get; init; } = TimeSpan.FromMilliseconds(250);
/// <summary>
/// Poll cadence for program + operation-mode info. Slower than the axis
/// poll because program / mode transitions happen on operator timescales.
/// Zero / negative disables the program poll entirely.
/// </summary>
public TimeSpan ProgramPollInterval { get; init; } = TimeSpan.FromSeconds(1);
/// <summary>
/// Poll cadence for timers (power-on / operating / cutting / cycle).
/// These change at human timescales — default is 30s. Zero / negative
/// disables the timer poll entirely.
/// </summary>
public TimeSpan TimerPollInterval { get; init; } = TimeSpan.FromSeconds(30);
}
/// <summary>
/// Proactive session-recycle cadence. Fanuc CNCs have a finite FWLIB handle pool
/// (~510 concurrent connections) and certain series have documented handle-leak bugs
/// that manifest after long uptime. When <see cref="Enabled"/> is <c>true</c> the
/// driver closes + reopens each device's session on the <see cref="Interval"/> cadence,
/// forcing FWLIB to release its handle slot back to the pool. Reads / writes during
/// recycle wait for the reconnect rather than failing — worst case an operator sees a
/// brief read latency spike once per cadence.
/// </summary>
/// <remarks>
/// Disabled by default because a healthy CNC + driver doesn't need it. Enable when
/// field experience shows handle exhaustion against a specific series / firmware.
/// Typical tuning: 30 min for sites running multiple OtOpcUa instances against the
/// same CNC (they share the pool); 6 h for a single-client deployment.
/// </remarks>
public sealed class FocasHandleRecycleOptions
{
public bool Enabled { get; init; } = false;
public TimeSpan Interval { get; init; } = TimeSpan.FromHours(1);
}
/// <summary>
/// Controls the CNC active-alarm polling projection that surfaces FOCAS alarms via
/// <c>IAlarmSource</c>. Disabled by default — operators opt in by setting
/// <see cref="Enabled"/> in <c>appsettings.json</c>.
/// </summary>
public sealed class FocasAlarmProjectionOptions
{
public bool Enabled { get; init; } = false;
/// <summary>Poll cadence. One <c>cnc_rdalmmsg2</c> call per device per tick.</summary>
public TimeSpan PollInterval { get; init; } = TimeSpan.FromSeconds(2);
}
/// <summary>
@@ -1,328 +0,0 @@
using System.Buffers.Binary;
using System.Collections.Concurrent;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS;
/// <summary>
/// <see cref="IFocasClient"/> implementation backed by Fanuc's licensed
/// <c>Fwlib32.dll</c> via <see cref="FwlibNative"/> P/Invoke. The DLL is NOT shipped with
/// OtOpcUa; the deployment places it next to the server executable or on <c>PATH</c>
/// (per Fanuc licensing — see <c>docs/v2/focas-deployment.md</c>).
/// </summary>
/// <remarks>
/// <para>Construction is licence-safe — .NET P/Invoke is lazy, so instantiating this class
/// does NOT load <c>Fwlib32.dll</c>. The DLL only loads on the first wire call (Connect /
/// Read / Write / Probe). When missing, those calls throw <see cref="DllNotFoundException"/>
/// which the driver surfaces as <c>BadCommunicationError</c> through the normal exception
/// mapping.</para>
///
/// <para>Session-scoped handle — <c>cnc_allclibhndl3</c> opens one FWLIB handle per CNC;
/// all PMC / parameter / macro reads on that device go through the same handle. Dispose
/// calls <c>cnc_freelibhndl</c>.</para>
/// </remarks>
internal sealed class FwlibFocasClient : IFocasClient
{
private ushort _handle;
private bool _connected;
// Per-PMC-byte RMW lock registry. Bit writes to the same byte get serialised so two
// concurrent bit updates don't lose one another's modification. Key = "{addrType}:{byteAddr}".
private readonly ConcurrentDictionary<string, SemaphoreSlim> _rmwLocks = new();
private SemaphoreSlim GetRmwLock(short addrType, int byteAddr) =>
_rmwLocks.GetOrAdd($"{addrType}:{byteAddr}", _ => new SemaphoreSlim(1, 1));
public bool IsConnected => _connected;
public Task ConnectAsync(FocasHostAddress address, TimeSpan timeout, CancellationToken cancellationToken)
{
if (_connected) return Task.CompletedTask;
var timeoutMs = (int)Math.Max(1, timeout.TotalMilliseconds);
var ret = FwlibNative.AllcLibHndl3(address.Host, (ushort)address.Port, timeoutMs, out var handle);
if (ret != 0)
throw new InvalidOperationException(
$"FWLIB cnc_allclibhndl3 failed with EW_{ret} connecting to {address}.");
_handle = handle;
_connected = true;
return Task.CompletedTask;
}
public Task<(object? value, uint status)> ReadAsync(
FocasAddress address, FocasDataType type, CancellationToken cancellationToken)
{
if (!_connected) return Task.FromResult<(object?, uint)>((null, FocasStatusMapper.BadCommunicationError));
cancellationToken.ThrowIfCancellationRequested();
return address.Kind switch
{
FocasAreaKind.Pmc => Task.FromResult(ReadPmc(address, type)),
FocasAreaKind.Parameter => Task.FromResult(ReadParameter(address, type)),
FocasAreaKind.Macro => Task.FromResult(ReadMacro(address)),
_ => Task.FromResult<(object?, uint)>((null, FocasStatusMapper.BadNotSupported)),
};
}
public async Task<uint> WriteAsync(
FocasAddress address, FocasDataType type, object? value, CancellationToken cancellationToken)
{
if (!_connected) return FocasStatusMapper.BadCommunicationError;
cancellationToken.ThrowIfCancellationRequested();
return address.Kind switch
{
FocasAreaKind.Pmc when type == FocasDataType.Bit && address.BitIndex is int =>
await WritePmcBitAsync(address, Convert.ToBoolean(value), cancellationToken).ConfigureAwait(false),
FocasAreaKind.Pmc => WritePmc(address, type, value),
FocasAreaKind.Parameter => WriteParameter(address, type, value),
FocasAreaKind.Macro => WriteMacro(address, value),
_ => FocasStatusMapper.BadNotSupported,
};
}
/// <summary>
/// Read-modify-write one bit within a PMC byte. Acquires a per-byte semaphore so
/// concurrent bit writes against the same byte serialise and neither loses its update.
/// </summary>
private async Task<uint> WritePmcBitAsync(
FocasAddress address, bool newValue, CancellationToken cancellationToken)
{
var addrType = FocasPmcAddrType.FromLetter(address.PmcLetter ?? "") ?? (short)0;
var bit = address.BitIndex ?? 0;
if (bit is < 0 or > 7)
throw new InvalidOperationException(
$"PMC bit index {bit} out of range (0-7) for {address.Canonical}.");
var rmwLock = GetRmwLock(addrType, address.Number);
await rmwLock.WaitAsync(cancellationToken).ConfigureAwait(false);
try
{
// Read the parent byte.
var readBuf = new FwlibNative.IODBPMC { Data = new byte[40] };
var readRet = FwlibNative.PmcRdPmcRng(
_handle, addrType, FocasPmcDataType.Byte,
(ushort)address.Number, (ushort)address.Number, 8 + 1, ref readBuf);
if (readRet != 0) return FocasStatusMapper.MapFocasReturn(readRet);
var current = readBuf.Data[0];
var updated = newValue
? (byte)(current | (1 << bit))
: (byte)(current & ~(1 << bit));
// Write the updated byte.
var writeBuf = new FwlibNative.IODBPMC
{
TypeA = addrType,
TypeD = FocasPmcDataType.Byte,
DatanoS = (ushort)address.Number,
DatanoE = (ushort)address.Number,
Data = new byte[40],
};
writeBuf.Data[0] = updated;
var writeRet = FwlibNative.PmcWrPmcRng(_handle, 8 + 1, ref writeBuf);
return writeRet == 0 ? FocasStatusMapper.Good : FocasStatusMapper.MapFocasReturn(writeRet);
}
finally
{
rmwLock.Release();
}
}
public Task<bool> ProbeAsync(CancellationToken cancellationToken)
{
if (!_connected) return Task.FromResult(false);
var buf = new FwlibNative.ODBST();
var ret = FwlibNative.StatInfo(_handle, ref buf);
return Task.FromResult(ret == 0);
}
// ---- PMC ----
private (object? value, uint status) ReadPmc(FocasAddress address, FocasDataType type)
{
var addrType = FocasPmcAddrType.FromLetter(address.PmcLetter ?? "")
?? throw new InvalidOperationException($"Unknown PMC letter '{address.PmcLetter}'.");
var dataType = FocasPmcDataType.FromFocasDataType(type);
var length = PmcReadLength(type);
var buf = new FwlibNative.IODBPMC { Data = new byte[40] };
var ret = FwlibNative.PmcRdPmcRng(
_handle, addrType, dataType,
(ushort)address.Number, (ushort)address.Number, (ushort)length, ref buf);
if (ret != 0) return (null, FocasStatusMapper.MapFocasReturn(ret));
var value = type switch
{
FocasDataType.Bit => ExtractBit(buf.Data[0], address.BitIndex ?? 0),
FocasDataType.Byte => (object)(sbyte)buf.Data[0],
FocasDataType.Int16 => (object)BinaryPrimitives.ReadInt16LittleEndian(buf.Data),
FocasDataType.Int32 => (object)BinaryPrimitives.ReadInt32LittleEndian(buf.Data),
FocasDataType.Float32 => (object)BinaryPrimitives.ReadSingleLittleEndian(buf.Data),
FocasDataType.Float64 => (object)BinaryPrimitives.ReadDoubleLittleEndian(buf.Data),
_ => (object)buf.Data[0],
};
return (value, FocasStatusMapper.Good);
}
private uint WritePmc(FocasAddress address, FocasDataType type, object? value)
{
var addrType = FocasPmcAddrType.FromLetter(address.PmcLetter ?? "") ?? (short)0;
var dataType = FocasPmcDataType.FromFocasDataType(type);
var length = PmcWriteLength(type);
var buf = new FwlibNative.IODBPMC
{
TypeA = addrType,
TypeD = dataType,
DatanoS = (ushort)address.Number,
DatanoE = (ushort)address.Number,
Data = new byte[40],
};
EncodePmcValue(buf.Data, type, value, address.BitIndex);
var ret = FwlibNative.PmcWrPmcRng(_handle, (ushort)length, ref buf);
return ret == 0 ? FocasStatusMapper.Good : FocasStatusMapper.MapFocasReturn(ret);
}
private (object? value, uint status) ReadParameter(FocasAddress address, FocasDataType type)
{
var buf = new FwlibNative.IODBPSD { Data = new byte[32] };
var length = ParamReadLength(type);
var ret = FwlibNative.RdParam(_handle, (ushort)address.Number, axis: 0, (short)length, ref buf);
if (ret != 0) return (null, FocasStatusMapper.MapFocasReturn(ret));
var value = type switch
{
FocasDataType.Bit when address.BitIndex is int bit => ExtractBit(buf.Data[0], bit),
FocasDataType.Byte => (object)(sbyte)buf.Data[0],
FocasDataType.Int16 => (object)BinaryPrimitives.ReadInt16LittleEndian(buf.Data),
FocasDataType.Int32 => (object)BinaryPrimitives.ReadInt32LittleEndian(buf.Data),
_ => (object)BinaryPrimitives.ReadInt32LittleEndian(buf.Data),
};
return (value, FocasStatusMapper.Good);
}
private uint WriteParameter(FocasAddress address, FocasDataType type, object? value)
{
var buf = new FwlibNative.IODBPSD
{
Datano = (short)address.Number,
Type = 0,
Data = new byte[32],
};
var length = ParamReadLength(type);
EncodeParamValue(buf.Data, type, value);
var ret = FwlibNative.WrParam(_handle, (short)length, ref buf);
return ret == 0 ? FocasStatusMapper.Good : FocasStatusMapper.MapFocasReturn(ret);
}
private (object? value, uint status) ReadMacro(FocasAddress address)
{
var buf = new FwlibNative.ODBM();
var ret = FwlibNative.RdMacro(_handle, (short)address.Number, length: 8, ref buf);
if (ret != 0) return (null, FocasStatusMapper.MapFocasReturn(ret));
// Macro value = mcr_val / 10^dec_val. Convert to double so callers get the correct
// scaled value regardless of the decimal-point count the CNC reports.
var scaled = buf.McrVal / Math.Pow(10.0, buf.DecVal);
return (scaled, FocasStatusMapper.Good);
}
private uint WriteMacro(FocasAddress address, object? value)
{
// Write as integer + 0 decimal places — callers that need decimal precision can extend
// this via a future WriteMacroScaled overload. Consistent with what most HMIs do today.
var intValue = Convert.ToInt32(value);
var ret = FwlibNative.WrMacro(_handle, (short)address.Number, length: 8, intValue, decimalPointCount: 0);
return ret == 0 ? FocasStatusMapper.Good : FocasStatusMapper.MapFocasReturn(ret);
}
public void Dispose()
{
if (_connected)
{
try { FwlibNative.FreeLibHndl(_handle); } catch { }
_connected = false;
}
}
// ---- helpers ----
private static int PmcReadLength(FocasDataType type) => type switch
{
FocasDataType.Bit or FocasDataType.Byte => 8 + 1, // 8-byte header + 1 byte payload
FocasDataType.Int16 => 8 + 2,
FocasDataType.Int32 => 8 + 4,
FocasDataType.Float32 => 8 + 4,
FocasDataType.Float64 => 8 + 8,
_ => 8 + 1,
};
private static int PmcWriteLength(FocasDataType type) => PmcReadLength(type);
private static int ParamReadLength(FocasDataType type) => type switch
{
FocasDataType.Bit or FocasDataType.Byte => 4 + 1,
FocasDataType.Int16 => 4 + 2,
FocasDataType.Int32 => 4 + 4,
_ => 4 + 4,
};
private static bool ExtractBit(byte word, int bit) => (word & (1 << bit)) != 0;
internal static void EncodePmcValue(byte[] data, FocasDataType type, object? value, int? bitIndex)
{
switch (type)
{
case FocasDataType.Bit:
// PMC Bit writes with a non-null bitIndex go through WritePmcBitAsync's RMW path
// upstream. This branch only fires when a caller passes Bit with no bitIndex —
// treat the value as a whole-byte boolean (non-zero / zero).
data[0] = Convert.ToBoolean(value) ? (byte)1 : (byte)0;
break;
case FocasDataType.Byte:
data[0] = (byte)(sbyte)Convert.ToSByte(value);
break;
case FocasDataType.Int16:
BinaryPrimitives.WriteInt16LittleEndian(data, Convert.ToInt16(value));
break;
case FocasDataType.Int32:
BinaryPrimitives.WriteInt32LittleEndian(data, Convert.ToInt32(value));
break;
case FocasDataType.Float32:
BinaryPrimitives.WriteSingleLittleEndian(data, Convert.ToSingle(value));
break;
case FocasDataType.Float64:
BinaryPrimitives.WriteDoubleLittleEndian(data, Convert.ToDouble(value));
break;
default:
throw new NotSupportedException($"FocasDataType {type} not writable via PMC.");
}
_ = bitIndex; // bit-in-byte handled above
}
internal static void EncodeParamValue(byte[] data, FocasDataType type, object? value)
{
switch (type)
{
case FocasDataType.Byte:
data[0] = (byte)(sbyte)Convert.ToSByte(value);
break;
case FocasDataType.Int16:
BinaryPrimitives.WriteInt16LittleEndian(data, Convert.ToInt16(value));
break;
case FocasDataType.Int32:
BinaryPrimitives.WriteInt32LittleEndian(data, Convert.ToInt32(value));
break;
default:
BinaryPrimitives.WriteInt32LittleEndian(data, Convert.ToInt32(value));
break;
}
}
}
/// <summary>Default <see cref="IFocasClientFactory"/> — produces a fresh <see cref="FwlibFocasClient"/> per device.</summary>
public sealed class FwlibFocasClientFactory : IFocasClientFactory
{
public IFocasClient Create() => new FwlibFocasClient();
}
@@ -1,190 +0,0 @@
using System.Runtime.InteropServices;
namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS;
/// <summary>
/// P/Invoke surface for Fanuc FWLIB (<c>Fwlib32.dll</c>). Declarations extracted from
/// <c>fwlib32.h</c> in the strangesast/fwlib repo; the licensed DLL itself is NOT shipped
/// with OtOpcUa — the deployment places <c>Fwlib32.dll</c> next to the server executable
/// or on <c>PATH</c>.
/// </summary>
/// <remarks>
/// Deliberately narrow — only the calls <see cref="FwlibFocasClient"/> actually makes.
/// FOCAS has 800+ functions in <c>fwlib32.h</c>; pulling in every one would bloat the
/// P/Invoke surface + signal more coverage than this driver provides. Expand as capabilities
/// are added.
/// </remarks>
internal static class FwlibNative
{
private const string Library = "Fwlib32.dll";
// ---- Handle lifetime ----
/// <summary>Open an Ethernet FWLIB handle. Returns EW_OK (0) on success; handle written out.</summary>
[DllImport(Library, EntryPoint = "cnc_allclibhndl3", CharSet = CharSet.Ansi, ExactSpelling = true)]
public static extern short AllcLibHndl3(
[MarshalAs(UnmanagedType.LPStr)] string ipaddr,
ushort port,
int timeout,
out ushort handle);
[DllImport(Library, EntryPoint = "cnc_freelibhndl", ExactSpelling = true)]
public static extern short FreeLibHndl(ushort handle);
// ---- PMC ----
/// <summary>PMC range read. <paramref name="addrType"/> is the ADR_* enum; <paramref name="dataType"/> is 0 byte / 1 word / 2 long.</summary>
[DllImport(Library, EntryPoint = "pmc_rdpmcrng", ExactSpelling = true)]
public static extern short PmcRdPmcRng(
ushort handle,
short addrType,
short dataType,
ushort startNumber,
ushort endNumber,
ushort length,
ref IODBPMC buffer);
[DllImport(Library, EntryPoint = "pmc_wrpmcrng", ExactSpelling = true)]
public static extern short PmcWrPmcRng(
ushort handle,
ushort length,
ref IODBPMC buffer);
// ---- Parameters ----
[DllImport(Library, EntryPoint = "cnc_rdparam", ExactSpelling = true)]
public static extern short RdParam(
ushort handle,
ushort number,
short axis,
short length,
ref IODBPSD buffer);
[DllImport(Library, EntryPoint = "cnc_wrparam", ExactSpelling = true)]
public static extern short WrParam(
ushort handle,
short length,
ref IODBPSD buffer);
// ---- Macro variables ----
[DllImport(Library, EntryPoint = "cnc_rdmacro", ExactSpelling = true)]
public static extern short RdMacro(
ushort handle,
short number,
short length,
ref ODBM buffer);
[DllImport(Library, EntryPoint = "cnc_wrmacro", ExactSpelling = true)]
public static extern short WrMacro(
ushort handle,
short number,
short length,
int macroValue,
short decimalPointCount);
// ---- Status ----
[DllImport(Library, EntryPoint = "cnc_statinfo", ExactSpelling = true)]
public static extern short StatInfo(ushort handle, ref ODBST buffer);
// ---- Structs ----
/// <summary>
/// IODBPMC — PMC range I/O buffer. 8-byte header + 40-byte union. We marshal the union
/// as a fixed byte buffer + interpret per <see cref="FocasDataType"/> on the managed side.
/// </summary>
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct IODBPMC
{
public short TypeA;
public short TypeD;
public ushort DatanoS;
public ushort DatanoE;
// 40-byte union: cdata[5] / idata[5] / ldata[5] / fdata[5] / dbdata[5] — dbdata is the widest.
[MarshalAs(UnmanagedType.ByValArray, SizeConst = 40)]
public byte[] Data;
}
/// <summary>
/// IODBPSD — CNC parameter I/O buffer. Axis-aware; for non-axis parameters pass axis=0.
/// Union payload is bytes / shorts / longs — we marshal 32 bytes as the widest slot.
/// </summary>
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct IODBPSD
{
public short Datano;
public short Type; // axis index (0 for non-axis)
[MarshalAs(UnmanagedType.ByValArray, SizeConst = 32)]
public byte[] Data;
}
/// <summary>ODBM — macro variable read buffer. Value = <c>McrVal / 10^DecVal</c>.</summary>
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct ODBM
{
public short Datano;
public short Dummy;
public int McrVal; // long in C; 32-bit signed
public short DecVal; // decimal-point count
}
/// <summary>ODBST — CNC status info. Machine state, alarm flags, automatic / edit mode.</summary>
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct ODBST
{
public short Dummy;
public short TmMode;
public short Aut;
public short Run;
public short Motion;
public short Mstb;
public short Emergency;
public short Alarm;
public short Edit;
}
}
/// <summary>
/// PMC address-letter → FOCAS <c>ADR_*</c> numeric code. Per Fanuc FOCAS/2 spec the codes
/// are: G=0, F=1, Y=2, X=3, A=4, R=5, T=6, K=7, C=8, D=9, E=10. Exposed internally +
/// tested so the FwlibFocasClient translation is verifiable without the DLL loaded.
/// </summary>
internal static class FocasPmcAddrType
{
public static short? FromLetter(string letter) => letter.ToUpperInvariant() switch
{
"G" => 0,
"F" => 1,
"Y" => 2,
"X" => 3,
"A" => 4,
"R" => 5,
"T" => 6,
"K" => 7,
"C" => 8,
"D" => 9,
"E" => 10,
_ => null,
};
}
/// <summary>PMC data-type numeric codes per FOCAS/2: 0 = byte, 1 = word, 2 = long, 4 = float, 5 = double.</summary>
internal static class FocasPmcDataType
{
public const short Byte = 0;
public const short Word = 1;
public const short Long = 2;
public const short Float = 4;
public const short Double = 5;
public static short FromFocasDataType(FocasDataType t) => t switch
{
FocasDataType.Bit or FocasDataType.Byte => Byte,
FocasDataType.Int16 => Word,
FocasDataType.Int32 => Long,
FocasDataType.Float32 => Float,
FocasDataType.Float64 => Double,
_ => Byte,
};
}
@@ -5,15 +5,14 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS;
/// configured device; lifetime matches the device.
/// </summary>
/// <remarks>
/// <para><b>No default wire implementation ships with this assembly.</b> FWLIB
/// (<c>Fwlib32.dll</c>) is Fanuc-proprietary and requires a valid customer license — it
/// cannot legally be redistributed. The deployment team supplies an
/// <see cref="IFocasClientFactory"/> that wraps the licensed <c>Fwlib32.dll</c> via
/// P/Invoke and registers it at server startup.</para>
/// <para>The default implementation is <see cref="Wire.WireFocasClient"/> — a pure-managed
/// FOCAS/2 Ethernet client that speaks the wire protocol directly on TCP:8193. No
/// P/Invoke, no native DLLs, no out-of-process isolation.</para>
///
/// <para>The default <see cref="UnimplementedFocasClientFactory"/> throws with a pointer at
/// the deployment docs so misconfigured servers fail fast with a clear error rather than
/// mysteriously hanging.</para>
/// <para><see cref="UnimplementedFocasClientFactory"/> is a scaffolding backend that
/// throws on <see cref="IFocasClientFactory.Create"/> — selected by
/// <c>"Backend": "unimplemented"</c> so a DriverInstance row can be seeded before the CNC
/// endpoint is reachable without silently reading stale data.</para>
/// </remarks>
public interface IFocasClient : IDisposable
{
@@ -48,8 +47,208 @@ public interface IFocasClient : IDisposable
/// responds with any valid status.
/// </summary>
Task<bool> ProbeAsync(CancellationToken cancellationToken);
/// <summary>
/// Read active alarm messages from the CNC via <c>cnc_rdalmmsg2</c>. Returns
/// zero-or-more active alarms. Null / empty list means "no alarms currently
/// active". IAlarmSource projection polls this at a configurable interval +
/// emits transitions (raise / clear) on the driver's <c>OnAlarmEvent</c>.
/// </summary>
Task<IReadOnlyList<FocasActiveAlarm>> ReadAlarmsAsync(CancellationToken cancellationToken);
// ---- Fixed-tree T1 (identity + axis discovery + fast-poll dynamic bundle) ----
/// <summary>
/// Read CNC identity via <c>cnc_sysinfo</c>. Populates the <c>Identity/*</c>
/// subtree of the fixed-node surface. Callable once at session open; the
/// values don't change across the session.
/// </summary>
Task<FocasSysInfo> GetSysInfoAsync(CancellationToken cancellationToken);
/// <summary>
/// Read the CNC's configured axis names via <c>cnc_rdaxisname</c>. The driver
/// uses these to build the <c>Axes/{name}/</c> subtree and to index
/// <see cref="ReadDynamicAsync"/> calls.
/// </summary>
Task<IReadOnlyList<FocasAxisName>> GetAxisNamesAsync(CancellationToken cancellationToken);
/// <summary>
/// Read the CNC's configured spindle names via <c>cnc_rdspdlname</c>. Drives
/// the <c>Spindle/{name}/</c> subtree.
/// </summary>
Task<IReadOnlyList<FocasSpindleName>> GetSpindleNamesAsync(CancellationToken cancellationToken);
/// <summary>
/// Read the fast-poll dynamic bundle for one axis via <c>cnc_rddynamic2</c>.
/// Returns the current position quadruple (absolute / machine / relative /
/// distance-to-go) plus actual feed rate + actual spindle speed + alarm
/// flags + program / sequence numbers — one network round-trip per call.
/// </summary>
Task<FocasDynamicSnapshot> ReadDynamicAsync(int axisIndex, CancellationToken cancellationToken);
// ---- Fixed-tree T2 (program + operation mode) ----
/// <summary>
/// Aggregate program + operation-mode snapshot. One wire round-trip per
/// underlying FWLIB call — <c>cnc_rdblkcount</c>, <c>cnc_exeprgname2</c>,
/// <c>cnc_rdopmode</c>. The driver polls this on a slower cadence than
/// <see cref="ReadDynamicAsync"/> since program / mode transitions happen
/// on human-operator timescales.
/// </summary>
Task<FocasProgramInfo> GetProgramInfoAsync(CancellationToken cancellationToken);
// ---- Fixed-tree T3 (timers) ----
/// <summary>
/// Read one CNC cumulative timer. Kind selects PowerOn / Operating / Cutting /
/// Cycle. Values are seconds — the managed side already converted the native
/// minute+msec representation so downstream nodes display uniform units.
/// </summary>
Task<FocasTimer> GetTimerAsync(FocasTimerKind kind, CancellationToken cancellationToken);
// ---- Fixed-tree T3.5 (servo meters) ----
/// <summary>
/// Read the servo-load meter percentages across all configured axes.
/// Values are percentages (scaled by <c>10^Dec</c>). Empty list on a
/// disconnected session or unsupported CNC.
/// </summary>
Task<IReadOnlyList<FocasServoLoad>> GetServoLoadsAsync(CancellationToken cancellationToken);
// ---- Fixed-tree T3.6 (spindle meters) ----
/// <summary>
/// Read per-spindle load percentages. Result list index corresponds to
/// spindle index from <see cref="GetSpindleNamesAsync"/>. Empty list on a
/// disconnected session or when the CNC doesn't support the call (older
/// series like 16i may return EW_FUNC).
/// </summary>
Task<IReadOnlyList<int>> GetSpindleLoadsAsync(CancellationToken cancellationToken);
/// <summary>
/// Read per-spindle maximum RPM values. Static configuration, fetched once at
/// bootstrap. Index alignment as per <see cref="GetSpindleLoadsAsync"/>.
/// </summary>
Task<IReadOnlyList<int>> GetSpindleMaxRpmsAsync(CancellationToken cancellationToken);
}
/// <summary>One servo-meter entry — one axis's current load percentage.</summary>
public sealed record FocasServoLoad(string AxisName, double LoadPercent);
/// <summary>Which cumulative counter <see cref="IFocasClient.GetTimerAsync"/> reads.</summary>
public enum FocasTimerKind
{
/// <summary>Machine power-on hours — resets never.</summary>
PowerOn = 0,
/// <summary>Cycle operating time — resets when the operator clears the counter.</summary>
Operating = 1,
/// <summary>Cutting time — only counts while in cutting feed.</summary>
Cutting = 2,
/// <summary>Cycle time since the last program start.</summary>
Cycle = 3,
}
/// <summary>One cumulative timer reading. <see cref="TotalSeconds"/> is the canonical unit.</summary>
public sealed record FocasTimer(FocasTimerKind Kind, int Minutes, int Milliseconds)
{
/// <summary>Cumulative time in seconds — <c>Minutes * 60 + Milliseconds / 1000</c>.</summary>
public double TotalSeconds => Minutes * 60.0 + Milliseconds / 1000.0;
}
/// <summary>
/// CNC identity snapshot from <c>cnc_sysinfo</c>. Strings are trimmed ASCII.
/// </summary>
public sealed record FocasSysInfo(
int AddInfo,
int MaxAxis,
string CncType, // "M" (mill) / "T" (lathe)
string MtType,
string Series, // e.g. "30i"
string Version, // e.g. "A1.0"
int AxesCount);
/// <summary>One configured axis name (e.g. "X", "X1").</summary>
public sealed record FocasAxisName(string Name, string Suffix)
{
/// <summary>
/// Display name — name + suffix concatenated, trimmed. Empty suffix yields
/// just the name (the common case on single-channel CNCs).
/// </summary>
public string Display => string.IsNullOrEmpty(Suffix) ? Name : $"{Name}{Suffix}";
}
/// <summary>One configured spindle name (e.g. "S1").</summary>
public sealed record FocasSpindleName(string Name, string Suffix1, string Suffix2, string Suffix3)
{
public string Display
{
get
{
var s = Name + Suffix1 + Suffix2 + Suffix3;
return s.TrimEnd('\0', ' ');
}
}
}
/// <summary>
/// Fast-poll bundle for one axis. Position values are scaled integers; the caller
/// divides by <c>10^DecimalPlaces</c> to get the decimal value. DecimalPlaces is
/// currently left to the caller to supply (via device config or a future
/// <c>cnc_getfigure</c> path once that export lands).
/// </summary>
/// <summary>
/// Program + operation-mode snapshot. Name is the currently-executing
/// program filename (e.g. "O0001.NC"); ONumber is its Fanuc O-number (1-9999).
/// Mode is the numeric code from <c>cnc_rdopmode</c> — see <see cref="FocasOpMode"/>.
/// </summary>
public sealed record FocasProgramInfo(
string Name,
int ONumber,
int BlockCount,
int Mode);
/// <summary>Human-readable text for the <see cref="FocasProgramInfo.Mode"/> integer.</summary>
public static class FocasOpMode
{
public static string ToText(int mode) => mode switch
{
0 => "MDI",
1 => "AUTO",
2 => "TJOG",
3 => "EDIT",
4 => "HANDLE",
5 => "JOG",
6 => "TEACH_IN_HANDLE",
7 => "REFERENCE",
8 => "REMOTE",
9 => "TEST",
_ => $"Mode{mode}",
};
}
public sealed record FocasDynamicSnapshot(
int AxisIndex,
int AlarmFlags,
int ProgramNumber,
int MainProgramNumber,
int SequenceNumber,
int ActualFeedRate,
int ActualSpindleSpeed,
int AbsolutePosition,
int MachinePosition,
int RelativePosition,
int DistanceToGo);
/// <summary>
/// One active alarm surfaced by <see cref="IFocasClient.ReadAlarmsAsync"/>. Shape
/// mirrors <c>ODBALMMSG2</c> but normalises the message bytes to a .NET string.
/// </summary>
public sealed record FocasActiveAlarm(
int AlarmNumber,
short Type,
short Axis,
string Message);
/// <summary>Factory for <see cref="IFocasClient"/>s. One client per configured device.</summary>
public interface IFocasClientFactory
{
@@ -57,14 +256,32 @@ public interface IFocasClientFactory
}
/// <summary>
/// Default factory that throws at construction time — the deployment must register a real
/// factory. Keeps the driver assembly licence-clean while still allowing the skeleton to
/// compile + the abstraction tests to run.
/// Scaffolding factory throws on <see cref="Create"/> so a DriverInstance row can be
/// seeded ahead of the CNC endpoint being reachable without silently reading stale data.
/// Select via <c>"Backend": "unimplemented"</c> in driver config. Flip to
/// <c>"Backend": "wire"</c> once the CNC is provisioned.
/// </summary>
public sealed class UnimplementedFocasClientFactory : IFocasClientFactory
{
public IFocasClient Create() => throw new NotSupportedException(
"FOCAS driver has no wire client configured. Register a real IFocasClientFactory at " +
"server startup wrapping the licensed Fwlib32.dll — see docs/v2/focas-deployment.md. " +
"Fanuc licensing forbids shipping Fwlib32.dll in the OtOpcUa package.");
"FOCAS driver backend is 'unimplemented'. Switch to 'Backend: \"wire\"' in driver config " +
"once the CNC is provisioned — see docs/drivers/FOCAS.md.");
}
/// <summary>
/// Well-known FOCAS alarm types from <c>fwlib32.h</c> <c>ALM_TYPE_*</c>. Narrow subset —
/// the full list is ~15 types per model; these cover the universally-present categories.
/// </summary>
public static class FocasAlarmType
{
/// <summary>Pass to <see cref="IFocasClient.ReadAlarmsAsync"/>-equivalent to mean "any type".</summary>
public const int All = -1;
public const int Parameter = 0; // ALM_P
public const int PulseCode = 1; // ALM_Y (servo)
public const int Overtravel = 2; // ALM_O
public const int Overheat = 3; // ALM_H
public const int Servo = 4; // ALM_S
public const int DataIo = 5; // ALM_T
public const int MemoryCheck = 6; // ALM_M
public const int MacroAlarm = 13; // ALM_MC — used by #3006 etc.
}

Some files were not shown because too many files have changed in this diff Show More