Compare commits

65 Commits

Author SHA1 Message Date
Joseph Doherty 837172ab39 PR 5.8 — Per-platform ScanState probe parity scenarios
Closes Phase 5 scenario coverage. Both
GalaxyRuntimeProbeManager (legacy) and PerPlatformProbeWatcher (PR 4.7)
must surface the same per-host status stream:

- GetHostStatuses_emits_same_host_set_after_Discover — drives Discover
  on both backends, waits 1.5s for the probe watcher's first push, then
  asserts the platform-host set agrees (transport-entry names differ
  by design — legacy uses the Galaxy.Host process identity, mxgw uses
  MxAccess.ClientName, so we strip those before comparing).
- GetHostStatuses_state_per_platform_matches_across_backends — for
  every overlapping platform host, the HostState must be identical.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:31:09 -04:00
Joseph Doherty 80a0ca2651 PR 5.7 — Reconnect / disruption parity scenarios
- Reinitialize_returns_both_backends_to_Healthy — drives
  ReinitializeAsync on each backend, asserts DriverState.Healthy
  afterwards, then re-reads a 3-tag sample to confirm the runtime
  surface is back. Recovery latency isn't pinned tightly (legacy = pipe
  + MxAccess COM client, mxgw = re-Register gw session — different
  cadences are expected).
- Health_state_diverges_only_when_one_backend_is_in_recovery — soft
  pin that both backends sit in Healthy or Degraded after init.

A tighter fault-injection scenario (toxiproxy-style) is the 5.7
follow-up; it lands once the parity rig grows that capability.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:29:44 -04:00
Joseph Doherty 8d042c631b PR 5.6 — History-read parity scenarios
Galaxy history reads route through the server-owned HistoryRouter
(Phase 1, PR 1.3) — neither Galaxy backend implements IHistoryProvider
directly. Parity surface here is the routing decision:

- Discover_emits_same_historized_attribute_set_for_both_backends — the
  IsHistorized attribute set must match exactly (empty symmetric difference); that's what
  HistoryRouter consumes when deciding whether to route a HistoryRead to
  the Wonderware historian sidecar.
- Neither_Galaxy_backend_implements_IHistoryProvider_directly — pins
  the architectural decision so a regression that re-introduces a
  per-driver history path fires.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:29:01 -04:00
Joseph Doherty bbdbdf8afb PR 5.5 — Alarm transition parity scenarios
- Discover_emits_same_AlarmConditionInfo_per_alarm_attribute — both
  backends produce the same alarm-condition source-node-id set, with
  matching SourceName / InitialSeverity / InAlarmRef / DescAttrNameRef
  per condition. Skips when the rig's Galaxy carries no alarm-marked
  attributes.
- Discover_marks_at_least_one_alarm_attribute_when_dev_Galaxy_has_alarms
  — IsAlarm-marked variable count parity, soft-pinned (count must
  match across backends but doesn't have to be non-zero).

Alarm-event persistence (the SQLite store-and-forward → Wonderware
historian event store path) is exercised in PR 5.6 against the
historian sidecar.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:28:13 -04:00
Joseph Doherty 982771df9a PR 5.4 — Write-by-classification parity scenarios
Both backends route a write through the same path keyed off the attribute's
SecurityClassification, so a single write request must produce the same
StatusCode on each:

- FreeAccess_or_Operate_write_returns_same_StatusCode_on_both_backends
  picks the first numeric FreeAccess/Operate attribute and writes 0.0.
- Configure_class_write_routes_through_secured_path_on_both_backends
  picks a Configure/Tune attribute, writes through the secured path,
  asserts StatusCode parity (the test doesn't care whether the write
  succeeds — only that both backends produce the same outcome).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:26:57 -04:00
Joseph Doherty 9db6da9c20 PR 5.3 — Subscribe + event-rate parity scenarios
- Subscribe_returns_a_handle_for_each_backend — both backends accept
  the same full-reference list and return a non-null handle, with
  symmetric Unsubscribe cleanup.
- Subscribe_event_rate_within_tolerance_for_a_3s_window — counts
  OnDataChange invocations on each backend across a 3s window and
  asserts the mxgw/legacy ratio sits in [0.5, 1.5]. Skips when the
  sampled tags don't change in the window (configuration-only Galaxy).
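
The ratio check in the second scenario can be sketched as below. This is an
illustrative Python sketch, not the project's C# test code: the function name,
the use of None to signal "skip", and the handling of a zero legacy count are
assumptions. Only the [0.5, 1.5] band and the skip-on-no-change rule come from
the description above.

```python
from typing import Optional


def event_rate_within_tolerance(legacy_count: int, mxgw_count: int,
                                low: float = 0.5,
                                high: float = 1.5) -> Optional[bool]:
    """Return None for 'skip', else whether mxgw/legacy falls in [low, high]."""
    if legacy_count == 0 and mxgw_count == 0:
        return None  # sampled tags did not change in the window: skip
    if legacy_count == 0:
        return False  # ratio undefined; treated as a failure here (assumption)
    ratio = mxgw_count / legacy_count
    return low <= ratio <= high
```

A wide band like this tolerates the two backends batching updates differently
while still catching a backend that delivers almost nothing.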

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:25:42 -04:00
Joseph Doherty 71443ecbf3 PR 5.2 — Browse + read parity scenarios
Three scenarios using ParityHarness.RequireBoth:

- Discover_emits_same_variable_set_for_both_backends — symmetric set diff
  on the full-reference set must be empty.
- Discover_emits_same_DataType_and_SecurityClass_per_attribute — meta
  triple (DriverDataType, SecurityClass, IsHistorized) must match per
  attribute.
- Read_returns_same_value_and_status_for_a_sampled_attribute — samples
  the first 5 discovered variables, reads through both backends, asserts
  StatusCode equality and value-CLR-type equality (raw values may drift
  between the two reads on a live Galaxy).
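
The first two scenarios reduce to a symmetric set difference plus a per-key
tuple comparison. A minimal Python sketch under stated assumptions: the
function names and the dictionary shape are hypothetical, and the real tests
are C# against discovered address-space nodes.

```python
def symmetric_diff(legacy_refs, mxgw_refs):
    """Return (only_in_legacy, only_in_mxgw); both empty means parity."""
    legacy, mxgw = set(legacy_refs), set(mxgw_refs)
    return sorted(legacy - mxgw), sorted(mxgw - legacy)


def meta_mismatches(legacy_meta, mxgw_meta):
    """Compare (data_type, security_class, is_historized) per shared reference."""
    shared = legacy_meta.keys() & mxgw_meta.keys()
    return sorted(ref for ref in shared if legacy_meta[ref] != mxgw_meta[ref])
```

Reporting the two one-sided differences separately, rather than a single
merged set, makes it obvious which backend is missing a reference.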

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:24:36 -04:00
Joseph Doherty 82cdf460c5 PR 5.1 — Driver.Galaxy.ParityTests project shell + ParityHarness
Side-by-side fixture that boots both backends against the same dev Galaxy:

- Legacy GalaxyProxyDriver against an out-of-process Galaxy.Host EXE
  (skipped when ZB SQL on localhost:1433 isn't reachable or when the EXE
  hasn't been built).
- New in-process GalaxyDriver against an mxaccessgw gateway at
  http://localhost:5120 by default (skipped when the gateway isn't
  reachable). Endpoint, API key, and client name are env-var overridable
  for the central parity host.

Per-backend availability is independent — each scenario decides whether
to RequireBoth, GetDriver(specific), or use RunOnAvailableAsync to drive
both with the same closure and diff snapshots. PR 5.2–5.8 land scenarios
on top of this shell.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:22:04 -04:00
Joseph Doherty 21cac4c8c4 PR 4.W — Galaxy:Backend wiring + server-side factory registration
- GalaxyDriver.InitializeAsync now builds the production gw runtime (MxGatewayClient,
  GalaxyMxSession, GatewayGalaxySubscriber, GatewayGalaxyDataWriter,
  ReconnectSupervisor, HostConnectivityForwarder, PerPlatformProbeWatcher) when no
  test seams are pre-injected; Dispose tears the chain down in order.
- GetHealth surfaces supervisor.IsDegraded as DriverState.Degraded so a transport
  drop is observable without polling the supervisor directly.
- DiscoverAsync now refreshes the per-platform probe watcher's membership against
  $WinPlatform / $AppEngine objects after every discovery pass.
- OnPumpDataChange routes ScanState changes through the probe watcher in addition
  to fanning out OnDataChange to ISubscribable consumers.
- Server registers GalaxyDriver under "GalaxyMxGateway" alongside the legacy
  "Galaxy" GalaxyProxyDriver factory so DriverInstance rows can opt in.
- Bumped Server.Tests' Microsoft.Extensions.Logging.Abstractions to 10.0.7 to
  resolve the downgrade pulled in transitively via MxGateway.Client.
- Lifecycle factory tests switched to the internal seam-injection ctor so they
  no longer attempt a real gRPC connect during InitializeAsync.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:10:31 -04:00
Joseph Doherty dae520b9c0 PR 4.7 — Host-connectivity probes (IHostConnectivityProbe scaffold)
HostStatusAggregator merges transport + per-platform host entries with
change-event diffing (re-asserting same state is a no-op so a stable
ScanState=Running burst doesn't fan out duplicates). PerPlatformProbeWatcher
ports the legacy GalaxyRuntimeProbeManager state machine onto the gw
subscription path: SubscribeBulk for `<tag>.ScanState`, idempotent
SyncPlatformsAsync (subscribe new, unsubscribe dropped), and a
DecodeState helper pinning bool/int/string ScanState values + bad-quality
fallback. HostConnectivityForwarder is the skeleton for the gw-6
StreamSessionHealth signal — until that mxaccessgw RPC ships, PR 4.5's
ReconnectSupervisor pushes transport state by calling SetTransport on
session connect/disconnect.
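
The change-event diffing described for HostStatusAggregator can be sketched
roughly as follows. This is a Python illustration, not the C# class: method
names, the tuple-shaped change event, and the plain-string states are
assumptions. The Unknown predecessor, same-state silence, and case-insensitive
host names follow the test list below.

```python
class HostStatusAggregator:
    """Tracks one state per host and reports only real transitions."""

    def __init__(self):
        self._states = {}  # lower-cased host name -> last known state

    def update(self, host, state):
        """Return (host, previous, new) on a transition, else None."""
        key = host.lower()  # host names compare case-insensitively
        previous = self._states.get(key, "Unknown")
        self._states[key] = state
        if previous == state:
            return None  # re-asserting the same state is a no-op
        return (host, previous, state)

    def remove(self, host):
        """Return True if the host was tracked, False otherwise."""
        return self._states.pop(host.lower(), None) is not None

    def snapshot(self):
        return dict(self._states)
```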

GalaxyDriver wiring (implement IHostConnectivityProbe, route OnDataChange
to PerPlatformProbeWatcher, expose GetHostStatuses() / OnHostStatusChanged,
push transport from supervisor) is deferred to PR 4.W to avoid conflict
with the rest of the Phase 4 deferred wiring (4.5 supervisor + 4.6
DeployWatcher).

Tests: 19 new
- HostStatusAggregatorTests (9): empty snapshot, new-host change with
  Unknown predecessor, same-state silence, transition diff, snapshot
  reflects every host, case-insensitive host names, Remove returns true
  for tracked, Remove false for unknown, concurrent updates don't corrupt.
- HostConnectivityForwarderTests (5): SetTransport routes under client
  name, transitions fire change, repeated same-state silent, empty client
  name throws, post-dispose throws.
- PerPlatformProbeWatcherTests (5 + theory pinning DecodeState's full
  truth table): subscribe N platforms, idempotent re-sync, removed
  platforms unsubscribed + dropped from aggregator, OnProbeValueChanged
  routing for Running/Stopped/bad-quality/foreign-ref, Dispose
  unsubscribes everything.

NOTE: build is currently broken because mxaccessgw/clients/dotnet/ has
been removed from C:\Users\dohertj2\Desktop\mxaccessgw — this PR's source
is internally consistent and isolated from the missing dependency, but the
existing Driver.Galaxy code (PRs 4.1–4.6) can't compile until the .NET
client is restored. Once it is, expect 116 + 19 = 135 tests in the
Driver.Galaxy.Tests project.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:47:13 -04:00
Joseph Doherty 123e3e48b9 PR 4.5 — ReconnectSupervisor
State machine that drives GalaxyDriver's recovery from gw transport
failure. Healthy → TransportLost → Reopening → Replaying → Healthy. Drivers
report failure signals; the supervisor runs reopen + replay with capped
exponential backoff (default 500ms → 30s) until both succeed.
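
For reference, a capped exponential backoff schedule with those defaults might
look like this. The sketch is Python rather than the project's C#, and the
doubling factor is an assumption; the message only states the 500ms start and
the 30s cap.

```python
from itertools import islice


def backoff_delays_ms(initial_ms=500, cap_ms=30_000, factor=2):
    """Yield retry delays that grow by `factor` and never exceed `cap_ms`."""
    delay = initial_ms
    while True:
        yield delay
        delay = min(delay * factor, cap_ms)


# First eight delays under the assumed doubling factor.
first_delays = list(islice(backoff_delays_ms(), 8))
```

With doubling, the cap is reached on the seventh attempt, after which every
retry waits the full 30s.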

Files:
- Runtime/ReconnectSupervisor.cs — state machine with snapshot, change
  event, last-error tracking, and a one-attempt-at-a-time recovery loop.
  Idempotent ReportTransportFailure: repeated failure reports during an
  in-flight recovery do not spawn parallel loops. Reopen + replay are
  caller-supplied callbacks (the driver injects them in the wire-up PR);
  reopen re-Registers the gw session, replay re-establishes every active
  subscription via gw's ReplaySubscriptionsCommand (mxaccessgw issue gw-3)
  or the SubscribeBulk fallback. Dispose cancels the loop cleanly.
- Public StateTransition record + IsDegraded predicate the driver maps
  to DriverState.Degraded for health snapshots.

Wiring (GalaxyDriver subscribes the supervisor to its EventPump's
transport-failure signal, exposes IsDegraded through GetHealth(), routes
reopen/replay callbacks through GalaxyMxSession + SubscriptionRegistry)
lands in PR 4.W to avoid conflict with the parallel host-probe track
(PR 4.7) and align the wire-up with the rest of Phase 4's plumbing.

9 supervisor tests (full state-machine traversal, retry-until-success on
both reopen and replay failures, idempotent failure reports, last-error
propagation, Dispose mid-recovery, post-dispose throws, fast-path Healthy
WaitForHealthy).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:39:21 -04:00
Joseph Doherty 7922e573b1 PR 4.6 — DeployWatcher (IRediscoverable scaffold)
DeployWatcher consumes GalaxyRepositoryClient.WatchDeployEventsAsync,
suppresses the bootstrap event, and raises RediscoveryEventArgs whenever
time_of_last_deploy actually changes. Reconnect-on-error with capped
exponential backoff. GalaxyDriver wiring (IRediscoverable.OnRediscoveryNeeded
event + StartAsync inside InitializeAsync) lands in a follow-up so this PR
doesn't conflict with the parallel runtime track.
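
The bootstrap suppression and change detection can be illustrated with a small
generator. This is a hedged Python sketch: the function name is hypothetical
and the stream is modeled as a plain iterable of time_of_last_deploy values,
whereas the real watcher consumes an async gRPC stream in C#.

```python
def rediscovery_triggers(deploy_times):
    """Yield each time_of_last_deploy value that should raise a rediscovery."""
    last_seen = None
    bootstrapped = False
    for value in deploy_times:
        if not bootstrapped:
            # The first event only establishes the baseline; it is suppressed.
            bootstrapped = True
            last_seen = value
            continue
        if value != last_seen:
            last_seen = value
            yield value
```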

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:33:37 -04:00
Joseph Doherty ce004c80ab PR 4.4 — ISubscribable + EventPump
Subscription path online. GalaxyDriver implements ISubscribable; subscribes
batches via gw SubscribeBulkAsync, runs a single shared EventPump consumer
of StreamEventsAsync, fans out OnDataChange events to every driver
subscription that observes the changed gw item handle.

Files:
- Runtime/GalaxySubscriptionHandle.cs — record implementing ISubscriptionHandle.
- Runtime/SubscriptionRegistry.cs — bookkeeping with forward (subscriptionId
  → bindings) and reverse (itemHandle → list of subscriptionIds) maps. The
  reverse map is the fan-out index so a single OnDataChange dispatches to
  every subscription that observes the changed handle.
- Runtime/IGalaxySubscriber.cs — driver-side seam: SubscribeBulk +
  UnsubscribeBulk + StreamEventsAsync. Production wraps GalaxyMxSession;
  tests substitute a fake driving synthetic MxEvents.
- Runtime/GatewayGalaxySubscriber.cs — production. Forwards to
  MxGatewaySession; bufferedUpdateIntervalMs is captured for now and
  becomes a SetBufferedUpdateInterval call once gw issue #102 / gw-9 lands
  (PR 6.3 picks this up).
- Runtime/EventPump.cs — long-running background consumer of
  StreamEventsAsync. Decodes MxValue + maps quality byte/MxStatusProxy via
  StatusCodeMap. Fan-out per subscriber resolves through the registry;
  exceptions thrown by handlers are caught and logged and never break the dispatch loop.
  Filters out non-OnDataChange families (write-complete and operation-
  complete come back via InvokeAsync's reply path, not the event stream).
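
The forward/reverse bookkeeping in SubscriptionRegistry can be sketched as
below. This is an illustrative Python model, not the C# implementation: method
names and the (tag, item_handle) binding shape are assumptions. Treating
handle 0 as a failed gateway entry follows the GalaxyDriver notes in this
commit.

```python
from collections import defaultdict
from itertools import count


class SubscriptionRegistry:
    def __init__(self):
        self._next_id = count(1)           # monotonic subscription ids
        self._forward = {}                 # subscription id -> bindings
        self._reverse = defaultdict(list)  # item handle -> subscription ids

    def register(self, bindings):
        """Store (tag, item_handle) bindings and index them for fan-out."""
        sub_id = next(self._next_id)
        self._forward[sub_id] = list(bindings)
        for _tag, handle in self._forward[sub_id]:
            if handle != 0:  # handle 0 marks a failed entry: never dispatched
                self._reverse[handle].append(sub_id)
        return sub_id

    def remove(self, sub_id):
        for _tag, handle in self._forward.pop(sub_id, []):
            subscribers = self._reverse.get(handle)
            if subscribers and sub_id in subscribers:
                subscribers.remove(sub_id)
                if not subscribers:
                    del self._reverse[handle]

    def subscribers_of(self, handle):
        """Fan-out lookup used when a data change arrives for `handle`."""
        return list(self._reverse.get(handle, []))
```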

GalaxyDriver:
- Adds ISubscribable. SubscribeAsync allocates a subscription id,
  SubscribeBulks, builds the binding list (failed gw entries get
  ItemHandle=0 + a per-tag warn log), registers, and returns the handle.
  EventPump is started lazily on first subscribe; one pump per driver
  shared across all subscriptions.
- UnsubscribeAsync removes from the registry first (so stale events are
  filtered immediately) then calls UnsubscribeBulk best-effort. Foreign
  handles throw ArgumentException.
- ReadAsync's NotSupportedException message updated: it no longer points at
  PR 4.4, because the production reader is deferred to a small follow-up
  that wraps the pump as a one-shot reader.
- Dispose tears down the pump first, then the repository client, then
  clears state.
- Internal ctor extended with optional subscriber parameter.

Tests (15 new, 109 Galaxy total):
- SubscriptionRegistryTests: monotonic id allocation, single+multi
  subscription fan-out, failed-handle exclusion, removal isolation, count
  invariants.
- GalaxyDriverSubscribeTests: handle allocation + value-change dispatch,
  multi-subscription fan-out, failed-tag silence, unsubscribe drops gw
  handle and stops dispatch, foreign handle throws, no-subscriber throws,
  empty-tag-list returns handle without calling gw.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:33:27 -04:00
Joseph Doherty a617086da1 PR 4.3 — IWritable + secured-write routing
Write path online. GalaxyDriver implements IWritable; routes by
SecurityClassification — SecuredWrite / VerifiedWrite tags go through
MxCommandKind.WriteSecured, everything else through MxGatewaySession.
WriteAsync. Per-tag classifications are captured during ITagDiscovery via a
SecurityCapturingBuilder wrapper that intercepts Variable() calls without
the discoverer needing to know about the driver's internal state.
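
The routing rule reads roughly like this in code. A minimal Python sketch with
hypothetical names and string-valued classifications; the real driver is C#
and uses the SecurityClassification enum. The FreeAccess default for unknown
tags follows the GalaxyDriver notes below.

```python
SECURED_CLASSES = {"SecuredWrite", "VerifiedWrite"}


def route_write(full_ref, security_by_full_ref):
    """Pick the write path from the classification captured during discovery."""
    # Tags never seen during discovery default to FreeAccess.
    classification = security_by_full_ref.get(full_ref, "FreeAccess")
    return "WriteSecured" if classification in SECURED_CLASSES else "Write"
```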

Files:
- Runtime/MxValueEncoder.cs — boxed CLR value → MxValue. Covers seven Galaxy
  scalar types (bool/int8-32/uint8-32 → Int32, int64/uint64 → Int64, float,
  double, string, DateTime/DateTimeOffset → Timestamp) and 1-D array
  variants. Inverse of MxValueDecoder; round-trip pinned by tests.
  DateTimeKind.Local values convert to UTC; unsupported types throw ArgumentException.
- Runtime/IGalaxyDataWriter.cs — driver-side seam. Tests inject a fake to
  capture routing decisions; production path uses GatewayGalaxyDataWriter.
- Runtime/GatewayGalaxyDataWriter.cs — production. Lazy-AddItem caches
  itemHandles, encodes value, routes Write vs WriteSecured, translates
  MxCommandReply (ProtocolStatus → BadCommunicationError; first
  MxStatusProxy in statuses[] via StatusCodeMap.FromMxStatus). Per-tag
  exception isolation: one bad write doesn't fail the batch.
- GalaxyDriver: now implements IWritable. Discovery wraps the supplied
  IAddressSpaceBuilder in SecurityCapturingBuilder which records each
  attribute's SecurityClass into _securityByFullRef before delegating.
  WriteAsync resolves classification per tag (FreeAccess default for
  unknown tags — matches the legacy backend), routes through the injected
  writer. Throws NotSupportedException with PR 4.4 pointer when no writer
  is wired (production path requires GalaxyMxSession.Connect from PR 4.4).

Tests (32 new, 94 Galaxy total):
- MxValueEncoder: every scalar type, narrowing checks (sbyte/short/byte/
  ushort fit Int32; uint within Int32 range; ulong within Int64),
  DateTime.Local → UTC conversion, array variants for bool/double/string/
  DateTime, Dimensions populated, unsupported-type throws ArgumentException,
  encoder/decoder round-trip pin.
- GalaxyDriverWriteTests: WriteAsync routes through fake writer with
  values intact; theory exercises every SecurityClassification value through
  the discovery-then-write path; unknown-tag defaults to FreeAccess; empty-
  request short-circuit; no-writer fail-loud; post-dispose throws.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:24:22 -04:00
Joseph Doherty 85bdf0d58b PR 4.2 — IReadable abstraction + StatusCodeMap + MxValueDecoder
Read path scaffold + the byte→uint quality mapping table that the parity
matrix (PR 5.x) pins. PR 4.4 supplies the production GW-backed reader; this
PR ships the abstraction and the supporting infrastructure so 4.4 just
plugs the implementation in.

Files:
- Runtime/StatusCodeMap.cs — explicit OPC DA quality byte → OPC UA
  StatusCode uint mapping. Extends the legacy Galaxy.Host
  HistorianQualityMapper with named constants (Good / GoodLocalOverride,
  Uncertain + 4 substatuses, Bad + 7 substatuses, BadInternalError) and an
  MxStatusProxy → uint helper that honors success flag → detail byte →
  detected_by transport-error fallback. Unknown bytes fall back to category
  bucket with a once-per-session diagnostic log so field captures can
  extend the table.
- Runtime/MxValueDecoder.cs — gateway MxValue → boxed CLR value for the
  seven Galaxy data types (Boolean, Int32, Int64, Float32, Float64, String,
  DateTime) plus their array variants. Honors MxValue.IsNull and
  RawValue passthrough.
- Runtime/IGalaxyDataReader.cs — driver-side seam for one-shot reads. PR
  4.4 ships the production wrapper around MxGatewaySession.SubscribeBulk +
  StreamEvents + UnsubscribeBulk; this PR exposes the contract so
  GalaxyDriver.ReadAsync wires through it.
- Runtime/GalaxyMxSession.cs — wrapper around MxGatewaySession that owns
  the Register handle. ConnectAsync opens session + Register; AttachForTests
  lets tests bypass real gw construction. PR 4.3/4.4/4.5 add write,
  subscribe, and reconnect surfaces.
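
The category-bucket fallback for unknown quality bytes can be illustrated as
follows. This Python sketch covers only the fallback, not the full named
table. It relies on the OPC DA convention that the top two quality bits select
Good (11), Uncertain (01), or Bad (00), and on the OPC UA severity values
Good = 0x00000000, Uncertain = 0x40000000, Bad = 0x80000000. Mapping the
reserved pattern (10) to Bad is an assumption.

```python
GOOD, UNCERTAIN, BAD = 0x00000000, 0x40000000, 0x80000000


def category_bucket(quality: int) -> int:
    """Map an OPC DA quality byte to its OPC UA severity category."""
    major = quality & 0xC0  # top two bits carry the quality category
    if major == 0xC0:
        return GOOD
    if major == 0x40:
        return UNCERTAIN
    return BAD  # 0x00 is Bad; reserved 0x80 is also treated as Bad (assumption)
```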

GalaxyDriver:
- Implements IReadable. ReadAsync routes through the injected
  IGalaxyDataReader (test seam) when present; production path throws
  NotSupportedException pointing at PR 4.4 — protects deployments running
  this PR from silent wrong reads while signaling that the legacy-host
  backend (Galaxy:Backend=legacy-host) handles reads in the meantime.
- Internal ctor extended with optional dataReader parameter (default null,
  preserves PR 4.0/4.1 callers).

Tests: 42 new — exhaustive byte→uint table for StatusCodeMap (15 known
codes + category-bucket fallback for unknowns + MxStatusProxy precedence
rules + OPC UA top-byte invariants), every MxValue oneof case for the
decoder (bool/int32/int64/float/double/string/timestamp/3 array variants/
raw bytes/null), GalaxyDriver IReadable wiring (route-through, empty-
request, no-reader-throws, post-dispose-throws, status-code preservation).
62 Galaxy tests total pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:15:42 -04:00
Joseph Doherty ecba5cedf9 PR 4.1 — ITagDiscovery via GalaxyRepositoryClient + AlarmRefBuilder
Browse path online. GalaxyDriver now implements ITagDiscovery against the
gateway's GalaxyRepositoryClient (PR 0.1's mxaccessgw browse RPC) and feeds
the address-space builder one folder per gobject + one variable per dynamic
attribute, with alarm-bearing attributes carrying all five sub-attribute refs
the server-level AlarmConditionService (PR 2.2) needs.

Files:
- Browse/IGalaxyHierarchySource.cs — driver-side seam between the discoverer
  and the gateway. Test fakes return canned hierarchies so the discoverer's
  translation logic is exercised without a real gRPC channel.
- Browse/GatewayGalaxyHierarchySource.cs — production wrapper around
  GalaxyRepositoryClient.DiscoverHierarchyAsync (paged internally).
- Browse/GalaxyDiscoverer.cs — translates GalaxyObject → IAddressSpaceBuilder
  calls. Browse name = contained_name (falls back to tag_name); full
  reference = attr.full_tag_reference when set, else tag_name + "." +
  attribute_name. Skips objects/attributes with empty identity.
- Browse/DataTypeMap.cs — mx_data_type → DriverDataType (port from legacy
  GalaxyProxyDriver.MapDataType, same fallback to String for unknown codes).
- Browse/SecurityMap.cs — security_classification → SecurityClassification
  (port from legacy GalaxyProxyDriver.MapSecurity).
- Browse/AlarmRefBuilder.cs — populates the five sub-attribute refs by
  Galaxy convention (.InAlarm/.Priority/.DescAttrName/.Acked/.AckMsg). The
  same convention the legacy GalaxyAlarmTracker hard-coded; concentrated
  here so PR 2.2's service receives complete AlarmConditionInfo rows.
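
The reference-naming rules above can be sketched as below. This is Python for
illustration only; the function names are hypothetical, and appending each
suffix to the alarm attribute's full reference is an assumption about how the
Galaxy convention is applied.

```python
ALARM_SUFFIXES = ("InAlarm", "Priority", "DescAttrName", "Acked", "AckMsg")


def full_reference(tag_name, attribute_name, full_tag_reference=""):
    """Prefer the gateway-supplied reference, else tag_name.attribute_name."""
    return full_tag_reference or f"{tag_name}.{attribute_name}"


def alarm_refs(alarm_full_ref):
    """Build the five sub-attribute references for one alarm attribute."""
    return {suffix: f"{alarm_full_ref}.{suffix}" for suffix in ALARM_SUFFIXES}
```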

GalaxyDriver:
- Added internal ctor accepting IGalaxyHierarchySource? for test injection.
  Default lazily builds GatewayGalaxyHierarchySource around a
  GalaxyRepositoryClient constructed from options on first DiscoverAsync.
- Owned GalaxyRepositoryClient disposed in Dispose.
- ApiKey resolution is currently a passthrough of ApiKeySecretRef — PR 4.W
  (or follow-up) wires DPAPI-backed secret resolution.

csproj: path-based ProjectReference to mxaccessgw (the user is shipping
that repo on a parallel track; both repos sit side-by-side on the dev box).
Tests project also references MxGateway.Contracts directly to construct
GalaxyObject / GalaxyAttribute fixtures.

Tests: 10 new in Browse/GalaxyDiscovererTests.cs covering folder-per-object,
variable-per-attribute, full-ref defaulting + gw-supplied override, browse-
name fallback, every metadata field propagation, alarm sub-attribute ref
population, non-alarm rows skip MarkAsAlarmCondition, empty-identity skips,
empty-attribute-name skips, end-to-end through GalaxyDriver.DiscoverAsync.
20 total Galaxy tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:06:02 -04:00
Joseph Doherty f6a4f919e2 PR 4.0 — Driver.Galaxy project skeleton + factory
New in-process .NET 10 driver project at
src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/. The Tier-A replacement for
Driver.Galaxy.Host + Driver.Galaxy.Proxy. PR 4.0 ships only the IDriver
shape + factory + options; capability bodies (browse, read, write,
subscribe, deploy-watch, host probes) land in PRs 4.1–4.7.

Files:
- Driver.Galaxy.csproj — net10 x64, AnyCPU+x64 platforms, references
  Core.Abstractions + Core. No MxGatewayClient ProjectReference yet — that
  comes in PR 4.2 once the gw NuGet package is wired (the user is
  shipping mxaccessgw on a parallel track).
- Config/GalaxyDriverOptions.cs — nested record hierarchy
  (Gateway/MxAccess/Repository/Reconnect) mirroring the JSON shape spelled
  out in lmx_mxgw_impl.md PR 4.0 acceptance section.
- GalaxyDriver.cs — minimal IDriver impl. Initialize/Shutdown toggle
  DriverHealth between Healthy/Unknown; Reinitialize bumps the timestamp;
  GetMemoryFootprint=0 (PR 4.4 wires SubscriptionRegistry size);
  FlushOptionalCachesAsync no-op. Logs intent on lifecycle calls so
  partial deployments are diagnosable.
- GalaxyDriverFactoryExtensions.cs — JSON parser, default fill-ins,
  validation throw on missing required fields. Driver type name
  "GalaxyMxGateway" intentionally distinct from legacy "Galaxy" so both
  factories coexist during parity testing (Phase 5). PR 4.W's
  Galaxy:Backend switch picks one or the other.

Tests:
- 10 tests in Driver.Galaxy.Tests covering minimal-config defaults, full
  override path, three required-field error cases, factory registration
  via DriverFactoryRegistry.TryGet, lifecycle health transitions
  (Init → Shutdown → Reinit), Dispose idempotency, and post-disposal
  ObjectDisposedException.

slnx: registers the new Driver.Galaxy + Driver.Galaxy.Tests projects.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:57:31 -04:00
Joseph Doherty 854827090a PR 3.W — Phase 3 wire-up: Wonderware sidecar DI registration
Solution + DI plumbing to complete Phase 3. With this PR the .NET 10 server
can boot with the Wonderware historian sidecar in the loop, gated by config
so existing deployments are unaffected.

slnx: registers Driver.Historian.Wonderware (net48 sidecar),
Driver.Historian.Wonderware.Client (net10 client), and both test projects.

Server.csproj: adds ProjectReference to the .NET 10 client.

Program.cs: reads Historian:Wonderware:* configuration. When Enabled=true,
constructs a WonderwareHistorianClient singleton and:
  - Registers it as IAlarmHistorianWriter so the SqliteStoreAndForwardSink
    drain (task #248) can pick it up.
  - Registers a WonderwareHistorianBootstrap hosted service that, on
    StartAsync, calls IHistoryRouter.Register(prefix, client) under the
    configured DriverInstancePrefix (default "galaxy") — lets the
    HistoryRead* dispatch in DriverNodeManager find the sidecar via
    longest-prefix-match resolution.

When Enabled=false (the default), DriverNodeManager keeps using its
internal LegacyDriverHistoryAdapter for the read path and the existing
NullAlarmHistorianSink stays in place — drop-in compatible with every
deployment that hasn't moved off Galaxy.Host yet.
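
Longest-prefix-match resolution, as used to find the sidecar, can be sketched
like this. The Python below is illustrative only: the function name, the
dictionary of routes, and the case-sensitive comparison are assumptions, not
the HistoryRouter API.

```python
def resolve_source(routes, node_key):
    """Return the source registered under the longest prefix of node_key."""
    matches = [prefix for prefix in routes if node_key.startswith(prefix)]
    if not matches:
        return None  # no registered source; caller falls back
    return routes[max(matches, key=len)]
```

Choosing the longest match lets a more specific registration override a
broader one without either needing to know about the other.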

42 server integration tests + 10 client tests pass. Full solution build
clean (0/0).

Note: scripts/install/Install-Services.ps1 and
src/.../Server/appsettings.json carry intermixed user WIP and are NOT
committed in this PR. Equivalent edits applied locally:

  Install-Services.ps1: new -InstallWonderwareHistorian switch installs the
  OtOpcUaWonderwareHistorian service alongside OtOpcUaGalaxyHost;
  generates a fresh historian shared secret; OtOpcUa service depends on
  both when historian sidecar is installed.

  Server/appsettings.json: new Historian.Wonderware section with
  Enabled=false default, PipeName/SharedSecret/PeerName/
  DriverInstancePrefix/ConnectTimeoutSeconds/CallTimeoutSeconds keys.

Both pieces should land in a follow-up commit once the user's WIP on those
files clears.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:48:47 -04:00
Joseph Doherty 14947fde51 PR 3.4 — Wonderware historian sidecar .NET 10 client
New project Driver.Historian.Wonderware.Client (net10 x64) implements both
Core.Abstractions.IHistorianDataSource (read paths consumed by the server's
IHistoryRouter) and Core.AlarmHistorian.IAlarmHistorianWriter (alarm-event
drain consumed by SqliteStoreAndForwardSink) against the sidecar's PR 3.3
pipe protocol.

Wire-format files (Framing/MessageKind, Hello, Contracts, FrameReader,
FrameWriter) are byte-identical mirrors of the sidecar's net48 originals —
the sidecar can't be referenced as a ProjectReference because of the
runtime/bitness gap, so we duplicate and pin the wire bytes via tests.

PipeChannel owns one bidirectional NamedPipeClientStream, performs the Hello
handshake, and serializes calls: one call is in flight at a time (semaphore),
and a transport failure triggers a single reconnect-and-retry before the error
propagates. Connect is abstracted behind a Func<CancellationToken, Task<Stream>>
so tests can inject in-process pipes.

WonderwareHistorianClient maps:
- HistorianSampleDto.Quality (raw OPC DA byte) → OPC UA StatusCode uint via
  QualityMapper (port of HistorianQualityMapper from sidecar).
- HistorianAggregateSampleDto.Value=null → BadNoData (0x800E0000).
- WriteAlarmEventsReply.PerEventOk[i]=true → Ack, false → RetryPlease.
  Whole-call failure or transport exception → RetryPlease for every event in
  the batch (drain worker handles backoff).
- AlarmHistorianEvent → AlarmHistorianEventDto with severity bucketed via
  AlarmSeverity-to-ushort mapping (Low=250, Medium=500, High=700, Crit=900).
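
The write-outcome and severity mappings above can be illustrated as follows.
This is a Python sketch with hypothetical names; modeling a whole-call failure
as `per_event_ok=None` and treating a length mismatch as a retry are
assumptions.

```python
SEVERITY_TO_USHORT = {"Low": 250, "Medium": 500, "High": 700, "Crit": 900}


def per_event_outcomes(event_count, per_event_ok=None):
    """Translate a WriteAlarmEvents reply into one outcome per event."""
    if per_event_ok is None or len(per_event_ok) != event_count:
        # Whole-call failure or transport exception: retry the entire batch.
        return ["RetryPlease"] * event_count
    return ["Ack" if ok else "RetryPlease" for ok in per_event_ok]
```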

GetHealthSnapshot tracks transport success + sidecar-reported failure
separately; ConsecutiveFailures rises on operation-level errors, not just
transport drops.

10 round-trip tests via FakeSidecarServer (in-process net10 fake using the
client's own framing): byte→uint quality mapping, null-bucket BadNoData,
at-time order preservation, event-field round-trip, sidecar error surfacing,
WriteBatch per-event status, whole-call retry-please mapping, Hello
shared-secret rejection, transport-drop reconnect-and-retry, health snapshot
counters.

PR 3.W will register this client as IHistorianDataSource + IAlarmHistorianWriter
in OpcUaServerService DI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:40:56 -04:00
Joseph Doherty 9f7a4ac769 PR 3.3 — Wonderware sidecar pipe protocol + dispatcher
Sidecar now serves a length-prefixed, kind-tagged MessagePack pipe protocol
mirroring Galaxy.Host's: 4-byte BE length + 1-byte MessageKind + body, 16 MiB
cap. Hello handshake validates per-process shared secret + protocol major
version + caller SID via ImpersonateNamedPipeClient before any work frame
runs.
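
The frame layout can be sketched as below. This is an illustrative Python
encoder/decoder, not the C# FrameReader/FrameWriter. Whether the length field
counts the body only and whether the 16 MiB cap applies to the body alone are
assumptions.

```python
import struct

MAX_FRAME_BYTES = 16 * 1024 * 1024  # 16 MiB cap


def encode_frame(kind: int, body: bytes) -> bytes:
    """4-byte big-endian length, 1-byte kind, then the body."""
    if len(body) > MAX_FRAME_BYTES:
        raise ValueError("frame body exceeds 16 MiB cap")
    return struct.pack(">IB", len(body), kind) + body


def decode_frame(data: bytes):
    """Return (kind, body), rejecting oversized or truncated frames."""
    length, kind = struct.unpack_from(">IB", data, 0)
    if length > MAX_FRAME_BYTES:
        raise ValueError("declared length exceeds 16 MiB cap")
    body = data[5:5 + length]
    if len(body) != length:
        raise ValueError("truncated frame")
    return kind, body
```

Checking the declared length before reading the body keeps a malformed or
hostile peer from forcing a large allocation.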

Five contract pairs ship in this PR:

  ReadRawRequest          ↔ ReadRawReply
  ReadProcessedRequest    ↔ ReadProcessedReply
  ReadAtTimeRequest       ↔ ReadAtTimeReply
  ReadEventsRequest       ↔ ReadEventsReply
  WriteAlarmEventsRequest ↔ WriteAlarmEventsReply

Timestamps cross the wire as DateTime ticks (long) to dodge MessagePack's
DateTime kind/timezone quirks; both sides convert with DateTime(ticks, Utc).
Sample values cross as MessagePack-serialized byte[] so the .NET 10 client
(PR 3.4) deserializes per the tag's mx_data_type without the sidecar needing
to know OPC UA types.
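
For readers outside .NET, the tick encoding can be reproduced as follows.
DateTime ticks are 100-nanosecond intervals counted from 0001-01-01T00:00:00.
The Python below is only an illustration of that arithmetic; the function
names are hypothetical, and sub-microsecond ticks are truncated because
Python datetimes stop at microsecond precision.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)  # tick zero


def to_ticks(value: datetime) -> int:
    """Convert an aware datetime to .NET-style UTC ticks."""
    delta = value.astimezone(timezone.utc) - _EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 10_000_000 + delta.microseconds * 10


def from_ticks(ticks: int) -> datetime:
    """Convert UTC ticks back to an aware datetime (microsecond precision)."""
    return _EPOCH + timedelta(microseconds=ticks // 10)
```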

HistorianFrameHandler dispatches by MessageKind to IHistorianDataSource (the
PR 3.2 lifted interface) for reads, and to a new IAlarmEventWriter strategy
for the alarm-event persistence path. Per-call exceptions surface as
Success=false replies so a single bad request doesn't kill the connection.
WriteAlarmEvents replies carry per-event success flags; the SQLite
store-and-forward sink retries failed slots on the next drain tick.

Program.cs spins the pipe server when OTOPCUA_HISTORIAN_ENABLED=true. Pipe-
only mode (default false) preserves PR 3.1's smoke-test behaviour: the host
still validates env vars and waits for Ctrl-C, but doesn't initialize the
Wonderware SDK.

Sidecar test project gains 8 round-trip tests (37 total now): every contract
pair round-trips through FrameReader/FrameWriter via in-memory streams, the
handler surfaces historian exceptions cleanly, WriteAlarmEvents per-event
status flows through, and the no-writer-configured path returns a clean
error reply.

Added MessagePack 2.5.187 to the sidecar csproj.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:27:17 -04:00
Joseph Doherty bc7ec746c5 PR 1+2.W — Wire HistoryRouter + AlarmConditionService into DI
Server-side singletons threaded through OpcUaApplicationHost → OtOpcUaServer
→ DriverNodeManager construction. New ctor parameters are last-position
optional with null defaults so every existing test construction site
(OpcUaServerIntegrationTests, AlarmSubscribeIntegrationTests, etc.) keeps
working unchanged.

Program.cs:
  AddSingleton<IHistoryRouter, HistoryRouter>();
  AddSingleton<AlarmConditionService>();

The router stays empty after this PR. DriverNodeManager's internal
LegacyDriverHistoryAdapter handles every driver that still implements
IHistoryProvider; PR 3.W will register the Wonderware sidecar as a router
source; PR 7.2 retires the legacy fallback entirely.

44 alarm + history + integration tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:13:51 -04:00
Joseph Doherty 9365beb966 PR 3.2 — Lift Wonderware Historian SDK code to sidecar
Move all historian implementation files from Driver.Galaxy.Host/Backend/Historian/
to Driver.Historian.Wonderware/Backend/. Sidecar now owns the aahClientManaged /
aahClientCommon SDK references; Galaxy.Host project-references the sidecar so
MxAccessGalaxyBackend keeps building until PR 7.2 retires Galaxy.Host entirely.

10 source files moved (preserving git history via git mv):
  IHistorianDataSource, HistorianDataSource, HistorianClusterEndpointPicker,
  HistorianClusterNodeState, HistorianConfiguration, HistorianEventDto,
  HistorianHealthSnapshot, HistorianQualityMapper, HistorianSample,
  IHistorianConnectionFactory.

2 historian test classes moved alongside (HistorianClusterEndpointPickerTests,
HistorianQualityMapperTests). Sidecar test project now hosts 29 tests (1 PR 3.1
smoke + 28 moved historian tests, all passing).

Galaxy.Host's remaining 6 historian-flavored tests (in HistorianWiringTests,
HistoryReadAtTimeTests, HistoryReadEventsTests, HistoryReadProcessedTests)
keep passing via the project reference; their using directives were updated
to reach the new namespace.

Sidecar deliberately speaks no Core.Abstractions — its surface is the legacy
List<HistorianSample> shape; PR 3.4's .NET 10 client translates to the
Core.Abstractions shapes added in PR 1.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:13:13 -04:00
Joseph Doherty ef22a61c39 v2 mxgw migration — Phase 1+2+3.1 wiring (7 PRs)
Foundational PRs from lmx_mxgw_impl.md, all green. Bodies only — DI/wiring
deferred to PR 1+2.W (combined wire-up) and PR 3.W.

PR 1.1 — IHistorianDataSource lifted to Core.Abstractions/Historian/
  Reuses existing DataValueSnapshot + HistoricalEvent shapes; sidecar (PR
  3.4) translates byte-quality → uint StatusCode internally.

PR 1.2 — IHistoryRouter + HistoryRouter on the server
  Longest-prefix-match resolution, case-insensitive, ObjectDisposed-guarded,
  swallow-on-shutdown disposal of misbehaving sources.

PR 1.3 — DriverNodeManager.HistoryRead* dispatch through IHistoryRouter
  Per-tag resolution with LegacyDriverHistoryAdapter wrapping
  `_driver as IHistoryProvider` so existing tests + drivers keep working
  until PR 7.2 retires the fallback.

PR 2.1 — AlarmConditionInfo extended with five sub-attribute refs
  InAlarmRef / PriorityRef / DescAttrNameRef / AckedRef / AckMsgWriteRef.
  Optional defaulted parameters preserve all existing 3-arg call sites.

PR 2.2 — AlarmConditionService state machine in Server/Alarms/
  Driver-agnostic port of GalaxyAlarmTracker. Sub-attribute refs come from
  AlarmConditionInfo, values arrive as DataValueSnapshot, ack writes route
  through IAlarmAcknowledger. State machine preserves Active/Acknowledged/
  Inactive transitions, Acked-on-active reset, post-disposal silence.

PR 2.3 — DriverNodeManager wires AlarmConditionService
  MarkAsAlarmCondition registers each alarm-bearing variable with the
  service; DriverWritableAcknowledger routes ack-message writes through
  the driver's IWritable + CapabilityInvoker. Service-raised transitions
  route via OnAlarmServiceTransition → matching ConditionSink. Legacy
  IAlarmSource path unchanged for null service.

PR 3.1 — Driver.Historian.Wonderware shell project (net48 x86)
  Console host shell + smoke test; SDK references + code lift come in
  PR 3.2.
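
The longest-prefix-match, case-insensitive resolution described for PR 1.2 can be sketched like this. It is a Python model under stated assumptions, not the server's HistoryRouter: `PrefixRouter`, `register`, and `resolve` are hypothetical names, and returning None stands in for whatever fallback the caller applies.

```python
from typing import Dict, Optional


class PrefixRouter:
    """Maps tag-name prefixes to sources; the longest matching prefix wins."""

    def __init__(self) -> None:
        self._sources: Dict[str, object] = {}

    def register(self, prefix: str, source: object) -> None:
        # Store the prefix case-folded so lookups ignore case.
        self._sources[prefix.casefold()] = source

    def resolve(self, tag: str) -> Optional[object]:
        key = tag.casefold()
        best: Optional[str] = None
        for prefix in self._sources:
            if key.startswith(prefix) and (best is None or len(prefix) > len(best)):
                best = prefix
        return None if best is None else self._sources[best]
```

A linear scan is enough for a handful of registered sources; a trie would only pay off with many prefixes.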

Tests: 9 (PR 1.1) + 5 (PR 2.1) + 10 (PR 1.2) + 19 (PR 2.2) + 1 (PR 3.1)
all pass. Existing AlarmSubscribeIntegrationTests + HistoryReadIntegrationTests
unchanged.

Plan + audit docs (lmx_backend.md, lmx_mxgw.md, lmx_mxgw_impl.md)
included so parallel subagent worktrees can read them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:03:36 -04:00
Joseph Doherty 012c42a846 Task #156 — TagsTab: per-tag advanced Modbus fields (Deadband, UnitId, CoalesceProhibited)
#155 wired the basic tag form (Name / Driver / Equipment / DataType / Access /
WriteIdempotent + ModbusAddressEditor for the address). The per-tag knobs added
across #141 / #142 / #143 still required operators to hand-edit TagConfig JSON.
This commit exposes them through an "Advanced" expander.

UI changes (TagsTab.razor):

- Collapsible "▶ Advanced (Deadband / UnitId override / CoalesceProhibited)"
  button below the address editor, visible only when the selected driver is
  Modbus. Collapsed by default — basic form covers the typical edit workflow.
- Three numeric / checkbox inputs with inline help text explaining each knob's
  purpose and when to use it.
- _showAdvanced auto-opens on Edit when any of the advanced fields are present
  in the existing TagConfig — operators see immediately what's been configured.

Save-side serialization:

- New RefreshTagConfigJson serializes the address + advanced fields into a
  structured JSON object using a Dictionary<string, object?>. Fields with
  default / empty values are omitted to keep diffs in the existing draft-diff
  viewer minimal — a tag with only an address still produces
  `{"addressString":"40001:F"}` and not a full superset object with nulls.
- OnAddressChanged + OnAdvancedChanged both delegate to RefreshTagConfigJson
  so any input change keeps TagConfig in sync.
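
A rough Python sketch of the omit-defaults serialization, assuming the field names quoted in this commit (addressString, deadband, unitId, coalesceProhibited). `build_tag_config` is a hypothetical stand-in for RefreshTagConfigJson, and which values count as "default" is an assumption.

```python
import json
from typing import Optional


def build_tag_config(address: str,
                     deadband: Optional[float] = None,
                     unit_id: Optional[int] = None,
                     coalesce_prohibited: bool = False) -> str:
    """Serialize only the fields that carry a non-default value."""
    fields = {
        "addressString": address or None,
        "deadband": deadband,
        "unitId": unit_id,
        # False is the default, so it is treated as "absent" here.
        "coalesceProhibited": coalesce_prohibited or None,
    }
    compact = {name: value for name, value in fields.items() if value is not None}
    return json.dumps(compact, separators=(",", ":"))
```

Keeping defaults out of the JSON means a tag that was only ever given an address produces the same one-key object before and after an edit session, so the draft diff stays empty.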

Read-side hydration:

- New HydrateModbusFromTagConfig parses an existing TagConfig JSON and
  populates _modbusAddress + the three advanced fields. Falls back to empty
  defaults on malformed JSON. ResetAdvanced is called before hydration on
  every form open so leftover state from a previous edit doesn't leak.

ResetAdvanced helper introduced + called from StartAdd so a fresh "New tag"
form starts with everything cleared.

Tests (1 new in TagServiceTests):
- TagConfig_With_Advanced_Modbus_Fields_RoundTrips_Through_Factory — creates a
  tag whose TagConfig carries addressString + deadband + unitId +
  coalesceProhibited, persists via TagService, reloads, asserts every field
  survives. Then constructs a wrapping driver-config JSON and feeds it to
  ModbusDriverFactoryExtensions.CreateInstance — confirms the field NAMES the
  UI emits match what BuildTag's DTO consumes. If the UI's JSON shape ever
  drifts from the factory's expected DTO, this test catches it before users do.

119 + 1 = 120 Admin tests green. Solution build clean.
2026-04-25 04:22:50 -04:00
Joseph Doherty ec57df1009 Task #155 — TagService + TagsTab CRUD UI for Modbus tags
Closes the remaining loop on user-visible Modbus tag editing. Pre-#155 tags
arrived only via SQL seeding or runtime ITagDiscovery; the Admin UI had no
interactive surface for creating / editing / deleting tag rows.

Changes:

- TagService.cs (Admin/Services/) — CRUD wrapper around OtOpcUaConfigDbContext.Tags.
  ListAsync supports optional driver / equipment filters; CreateAsync auto-derives
  TagId; UpdateAsync persists editable fields; DeleteAsync removes the row. Mirrors
  the EquipmentService shape.
- TagsTab.razor (Components/Pages/Clusters/) — list + filter + add/edit/remove form.
  The address/config editor is conditional: when the selected DriverInstance is
  Modbus, ModbusAddressEditor (#145) renders with live-parse preview; otherwise a
  generic JSON textarea (matches the DriversTab pattern from #147). Save-side
  serializes the address-string into TagConfig as `{"addressString":"..."}` JSON.
- ClusterDetail.razor — new "Tags" tab in the cluster-detail nav strip + the routing
  switch.
- Program.cs — TagService registered as a scoped DI service.

Drive-by fix: ModbusDriverFactoryExtensions.CreateInstance promoted from internal
to public — Admin.Tests was using it via reflection-friendly internal access that
broke under the #153 logger overload addition. Public is the right access modifier
anyway since the Server-side bootstrapper calls it from a different assembly.

Drive-by fix #2: ModbusDriverConfigDto was missing MaxReadGap (#143) — surfaced by
the #147 round-trip test that flips MaxReadGap=12 in the view model and asserts
it lands on the resolved options. Added the field + binding line. Confirms #143's
DriverConfig JSON binding was incomplete since the original commit; no production
deployment configured this knob through JSON until now so the gap stayed hidden.

Tests (4 new TagServiceTests):
- Create_And_List_Surfaces_The_Tag — CreateAsync auto-assigns TagId; list returns
  the row.
- List_Filters_By_DriverInstance — driver-scoped filter works.
- Update_Persists_Editable_Fields — Name / DataType / AccessLevel / TagConfig all
  persist through Update.
- Delete_Removes_The_Row — basic delete verification.

113 + 4 (TagService) + 2 (DriversTab round-trip restored after compile fix) = 119
Admin tests green. Solution build clean.

Caveat: bUnit-style render tests for TagsTab still aren't included — Admin.Tests
doesn't have bUnit set up. The TagService logic is fully covered; the razor
component's parser/save glue is exercised by hand at runtime for now.
2026-04-25 01:51:02 -04:00
Joseph Doherty 802366c2c6 Task #154 — driver-diagnostics RPC: HTTP endpoint + Admin client
Foundation for surfacing per-driver runtime state from the Server process to
the Admin UI. #152 shipped GetAutoProhibitedRanges() as an in-process
accessor; #154 makes it reachable across processes.

Server side (HealthEndpointsHost):
- New URL family: /diagnostics/drivers/{driverInstanceId}/{driverType}/{topic}
- First wired topic: /diagnostics/drivers/{id}/modbus/auto-prohibited
- Driver-agnostic at the URL level — future driver types add their own
  segments[3] cases (e.g. /diagnostics/drivers/{id}/s7/dropped-pdus).
- 404 when the driver instance doesn't exist; 400 when the driver exists
  but isn't a Modbus driver (the per-type endpoint is wrong for this row).
- Response shape is flat JSON (unitId / region / startAddress / endAddress /
  lastProbedUtc / bisectionPending) so consumers don't have to reference the
  Driver.Modbus assembly's ModbusAutoProhibition record.
- Re-uses the existing HttpListener bound to localhost:4841 — same auth /
  reachability story as /healthz and /readyz.
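
The status-code rules above can be modelled in a few lines. This Python sketch is illustrative only: `route_diagnostics` and the `drivers` registry (instance id to driver type name) are hypothetical, and the real handler lives in HealthEndpointsHost.

```python
from typing import Dict, Tuple


def route_diagnostics(path: str, drivers: Dict[str, str]) -> Tuple[int, str]:
    """Return (status, description) for a /diagnostics/drivers/... request."""
    segments = [s for s in path.split("/") if s]
    if len(segments) != 5 or segments[:2] != ["diagnostics", "drivers"]:
        return 404, "unknown route"
    driver_id, driver_type, topic = segments[2:]
    if driver_id not in drivers:
        return 404, "driver instance not found"
    # segments[3] selects the per-driver-type branch.
    if driver_type == "modbus" and topic == "auto-prohibited":
        if drivers[driver_id].lower() != "modbus":
            return 400, "driver exists but is not a Modbus driver"
        return 200, "auto-prohibited ranges"
    return 404, "unknown topic"
```

Checking instance existence before driver type is what separates the 404 case from the 400 case.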

Admin side:
- DriverDiagnosticsClient (Services/) — HttpClient wrapper that fetches the
  per-driver Modbus prohibition list. Returns null on 404/400 (driver
  missing or wrong type); throws on transport failures.
- ModbusAutoProhibitionsResponse + ModbusAutoProhibitionRow flat DTOs —
  client doesn't take a dep on Driver.Modbus.
- ModbusDiagnostics.razor at /modbus/diagnostics/{driverInstanceId} —
  table view with BISECTING (warning yellow) / ISOLATED (danger red)
  badges, relative timestamps (e.g. "5m ago"), Refresh button. Errors
  surface inline rather than swallowing.
- HttpClient registration in Program.cs reads
  DriverDiagnostics:ServerBaseUrl from appsettings.json (default
  http://localhost:4841/ for same-host deployments).

Tests (3 new in HealthEndpointsHostTests):
- Diagnostics_ReturnsModbusAutoProhibitions_ForLiveDriver — registers a
  Modbus driver with a programmable transport that protects register 102,
  records the prohibition via a coalesced ReadAsync, hits the endpoint,
  asserts the returned JSON matches (unitId / region / start / end / pending).
- Diagnostics_404_When_Driver_Not_Found
- Diagnostics_400_When_Driver_Is_Wrong_Type

Architecture note: the Admin-side bUnit-style component test isn't included
because Admin.Tests doesn't have bUnit set up. The DriverDiagnosticsClient
is unit-testable on its own with a mock HandlerStub if needed — left as a
follow-up alongside the broader bUnit setup task.

The diagnostic page is now reachable at /modbus/diagnostics/{driverId} from
any Admin instance pointing at a Server endpoint URL. Future driver types
(S7, AbCip) plug into the same channel by adding their own URL segments
in HealthEndpointsHost.WriteDriverDiagnosticsAsync.
2026-04-25 01:32:21 -04:00
Joseph Doherty 8004394892 Task #153 — ModbusDriver: inject ILogger so prohibition events reach a sink
#152 left a hook for structured logging when an auto-prohibition first
fires; this commit completes the wiring.

Changes:
- ModbusDriver constructor takes an optional ILogger<ModbusDriver> (defaults
  to NullLogger). Existing standalone callers stay compile-clean.
- RecordAutoProhibition logs LogWarning on first-fire only (re-fires of the
  same range stay quiet via the existing isNew de-dupe). Format includes
  DriverInstanceId, UnitId, Region, Start, End, Span — log aggregators can
  filter / count by any field.
- New LogProhibitionCleared helper called by both StraightReprobeAsync (when
  the re-probe succeeds on a single-register range) and BisectAndReprobeAsync
  (per-half clearing + a single combined line when both halves succeed).
- ModbusDriverFactoryExtensions.Register accepts an optional ILoggerFactory.
  Captured at registration time and used in the factory closure to construct
  a per-driver logger. Server bootstrap code that already has an ILoggerFactory
  in DI threads it through with a single argument addition; old call sites
  (Register(registry)) keep working with a null logger.
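
The first-fire de-dupe can be sketched as below, in Python with the standard logging module standing in for ILogger. `ProhibitionRecorder` and its message wording are hypothetical; only the "warn once, stay quiet on re-fire, log on clear" shape comes from this commit.

```python
import logging
from typing import Set, Tuple

Key = Tuple[int, str, int, int]  # (unit, region, start, end)


class ProhibitionRecorder:
    def __init__(self, logger: logging.Logger) -> None:
        self._logger = logger
        self._seen: Set[Key] = set()

    def record(self, key: Key) -> bool:
        """Remember a prohibition; warn only the first time it is seen."""
        is_new = key not in self._seen
        self._seen.add(key)
        if is_new:
            unit, region, start, end = key
            self._logger.warning(
                "auto-prohibited unit=%s region=%s start=%s end=%s span=%s",
                unit, region, start, end, end - start + 1)
        return is_new

    def clear(self, key: Key) -> None:
        """Forget a prohibition and log that it was cleared."""
        if key in self._seen:
            self._seen.discard(key)
            self._logger.info("prohibition cleared for %s", key)
```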

Tests (2 new ModbusLoggerInjectionTests):
- First_Failure_Emits_Single_Warning_Subsequent_Refire_Stays_Quiet — pins
  the de-dupe behaviour. First scan logs one warning with the expected
  structured fields; second scan with the same prohibition stays silent.
- Reprobe_Clearing_Prohibition_Emits_Information_Log — protected register
  unlocked between record and re-probe; re-probe success emits an info log
  containing "cleared".

CapturingLogger test harness is purpose-built (xUnit doesn't ship a logger
mock by default and adding Moq is overkill for two tests).

240 + 2 = 242 unit tests green.
2026-04-25 01:26:20 -04:00
Joseph Doherty b8df230eb8 Task #152 — Modbus coalescing: surface auto-prohibitions through diagnostics
Auto-prohibited ranges (#148) were previously visible only through an
internal AutoProhibitedRangeCount accessor used by tests. Production
operators had no way to see what the planner had learned without pulling
logs or inspecting driver state.

Changes:

- New public record `ModbusAutoProhibition(UnitId, Region, StartAddress,
  EndAddress, LastProbedUtc, BisectionPending)` — operator-facing snapshot
  shape. Lives in the addressing assembly's logical namespace alongside
  the other public types.
- `ModbusDriver.GetAutoProhibitedRanges()` returns
  `IReadOnlyList<ModbusAutoProhibition>` — a copy of the live prohibition
  map. Lock-protected snapshot so consumers don't race with the re-probe
  loop.
- RecordAutoProhibition tracks first-fire vs re-fire via the dictionary
  insert path, leaving a hook to add structured logging once an ILogger
  is plumbed through (currently elided to keep the constructor minimal
  for testability — a future change can wire ILogger and emit a single
  warning per first-fire).

Tests (1 new, additive to the 6 in ModbusCoalescingAutoRecoveryTests):
- GetAutoProhibitedRanges_Surfaces_Operator_Visible_Snapshot — confirms
  the snapshot shape: empty before any failure, populated with correct
  UnitId/Region/Start/End/BisectionPending after a failed coalesced read,
  LastProbedUtc within the recent past.

Docs:
- docs/v2/modbus-addressing.md — new "Coalescing auto-recovery" subsection
  consolidates the #148/#150/#151/#152 surface in one place. Documents
  the diagnostic accessor + flags the in-process consumption pattern
  (Server health endpoints today; Admin UI when an RPC channel exists).

239 + 1 = 240 unit tests green.

Caveat: the Admin UI surfacing (table render, "clear all prohibitions"
button) is intentionally NOT shipped here. Admin can't reach a live
ModbusDriver instance without a driver-diagnostics RPC channel that
doesn't exist yet — that's a larger architectural piece. For now the
data is queryable in-process by the Server's health endpoints; once an
RPC channel lands, Admin can wire the existing GetAutoProhibitedRanges
into a Blazor table without further driver changes.
2026-04-25 01:19:10 -04:00
Joseph Doherty f823c81c96 Task #150 — Modbus coalescing: bisection-style range narrowing
Pre-#150 a coalesced read failure recorded the FULL failed range as
permanently prohibited. Healthy registers around the actual protected
register stayed in per-tag mode forever (until ReinitializeAsync). The
re-probe loop shipped in #151 retried the whole range as a single block,
which would either succeed (clearing everything) or fail (changing
nothing).

Post-#150 the re-probe loop bisects multi-register prohibitions:

- _autoProhibited refactored from Dictionary<key, DateTime> to
  Dictionary<key, ProhibitionState> where ProhibitionState carries
  LastProbedUtc + SplitPending. Multi-register prohibitions enter with
  SplitPending=true; single-register prohibitions enter with
  SplitPending=false (already minimal).
- ReprobeLoopAsync delegates the per-pass work to
  RunReprobeOnceForTestAsync (also exposed for synchronous test driving).
  Each entry routes to BisectAndReprobeAsync (split-pending + multi-reg)
  or StraightReprobeAsync (single-reg / non-split-pending).
- Bisection: split (start, end) at mid = (start+end)/2. Try (start, mid)
  and (mid+1, end) as separate coalesced reads. Each FAILED half re-enters
  the prohibition map with SplitPending = (its end > its start). SUCCEEDED
  halves vanish, freeing the planner to coalesce across them on the next
  scan.
- Convergence: log2(span) re-probe ticks pin the prohibition to the
  actual single offending register(s). For a 100-register block with one
  protected address that's ~7 ticks.
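
One bisection pass can be sketched in Python as follows. `bisect_once` and `read_ok` are hypothetical names; `read_ok(start, end)` stands in for a coalesced read that returns False when the range still contains a protected register. The real driver also tracks SplitPending and timestamps, which this model leaves out.

```python
from typing import Callable, List, Tuple

Range = Tuple[int, int]  # inclusive (start, end)


def bisect_once(rng: Range, read_ok: Callable[[int, int], bool]) -> List[Range]:
    """Run one re-probe pass on a prohibition; return the ranges that still fail."""
    start, end = rng
    if start == end:
        # Already minimal: a straight re-probe either clears it or keeps it.
        return [] if read_ok(start, end) else [rng]
    mid = (start + end) // 2
    halves = [(start, mid), (mid + 1, end)]
    # Halves that read cleanly vanish; failing halves stay prohibited.
    return [half for half in halves if not read_ok(*half)]
```

With a single protected register at 105, feeding each result back in walks (100, 110) to (100, 105), then (103, 105), then (105, 105), which is the sequence listed in the first test below. With two protected registers at 102 and 108, the first pass returns both halves.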

Tests (3 new ModbusCoalescingBisectionTests):
- Bisection_Narrows_Multi_Register_Prohibition_Per_Reprobe — 11 tags
  100..110 with protected address 105. After 4 re-probe passes the
  prohibition collapses from (100..110) → (100..105) → (103..105) →
  (105..105).
- Bisection_Clears_When_Both_Halves_Are_Healthy — transient failure
  scenario; protection lifted before re-probe; both bisection halves
  succeed and the parent vanishes entirely.
- Bisection_Splits_Into_Two_When_Both_Halves_Still_Fail — TwoHoleTransport
  with protected addresses 102 + 108 in the same coalesced range. After
  bisection both halves still fail (each contains one of the protected
  addresses); the prohibition map grows to 2 entries.

236 + 3 = 239 unit tests green. Solution build clean.
2026-04-25 01:16:09 -04:00
Joseph Doherty 9e4aae350b Task #151 — Modbus coalescing: periodic re-probe of auto-prohibitions
#148 introduced auto-prohibited coalesced ranges that persist for the
driver lifetime. Long-running deployments with transient PLC permission
changes (firmware update unlocking a previously-protected register,
operator reconfiguring the device) had no recovery short of operator
restart.

Adds an opt-in background loop that re-probes each prohibition periodically:

- ModbusDriverOptions.AutoProhibitReprobeInterval (TimeSpan?, default null
  = disabled). Set to e.g. TimeSpan.FromHours(1) to opt in.
- _autoProhibited refactored from HashSet<key> to Dictionary<key, DateTime>
  so each entry tracks its last failure / last re-probe timestamp.
- ReprobeLoopAsync runs on the same Task.Run pattern as ProbeLoopAsync;
  cancelled by ShutdownAsync. Each tick snapshots the prohibition set
  and issues a one-shot coalesced read per range. Successful re-probes
  drop the prohibition; failed ones bump the timestamp + leave the
  prohibition in place.
- Communication failures during re-probe (transport-level) are treated
  the same as PLC-exception failures — the prohibition stays, but isn't
  upgraded to "permanent" since transports recover. The driver-instance
  health surface picks up the failure separately.
- ShutdownAsync explicitly clears the prohibition set so a manual restart
  via ReinitializeAsync starts with a clean slate (matches the old
  "restart to clear" semantics).
- Factory DTO + JSON binding extended with AutoProhibitReprobeMs field.

Tests (2 new, additive to the 3 in ModbusCoalescingAutoRecoveryTests):
- Reprobe_Clears_Prohibition_When_Range_Becomes_Healthy — protected
  register at 102 records prohibition; clearing the simulated protection
  + invoking the re-probe drops the prohibition.
- Reprobe_Leaves_Prohibition_When_Range_Is_Still_Bad — re-probe on a
  still-failing range keeps the prohibition in place.

Tests use a new internal RunReprobeOnceForTestAsync helper to fire one
re-probe pass synchronously, so the suite doesn't have to wait on the
background timer (the loop's timer behaviour is exercised implicitly via
the InitializeAsync wire-up + the synchronous helper sharing the actual
re-probe code path).

234 + 2 = 236 unit tests green.
2026-04-25 01:12:48 -04:00
Joseph Doherty 8de152df4f Task #149 — Modbus address-preview page + ImportEquipment help
The original task scope assumed a per-tag editor lived in EquipmentTab.razor
or a similar surface. Reading the codebase confirmed that's not the case:
tags are seeded via SQL (scripts/smoke/*) or arrive at runtime through
ITagDiscovery; the Admin UI has no per-tag CRUD page today. Equipment
import is for equipment metadata (Name / MachineCode / ZTag / SAPID /
Identification) — not tag rows.

Adjusted scope:

1. ModbusAddressPreview.razor — new standalone page at /modbus/address-preview.
   Hosts the ModbusAddressEditor component shipped in #145 + the family
   selector + a copy-pasteable grammar reference. Operators can sanity-check
   address-string syntax (40001:F:CDAB / HR1:I / V2000:F / D100:I etc.)
   without committing it to a config row first.

2. ImportEquipment.razor — appended a secondary alert banner clarifying
   that Modbus per-tag addressing isn't part of equipment import; points
   users at the Drivers tab + the new preview tool.

Builds clean against the existing Admin app. The actual per-tag CRUD UI is
still a separate piece of work — when it ships, it can drop in
ModbusAddressEditor directly. The preview page acts as the canonical
demonstration of how to use the component.

Razor caveat: the grammar reference uses literal `<...>` syntax tokens
that the Razor parser interprets as malformed elements when inlined in a
<pre> block. Held as a string field (_grammarReference) and rendered
through @ binding to sidestep the parser conflict.
2026-04-25 01:09:24 -04:00
Joseph Doherty 3b0e093002 Task #148 — Modbus block-coalescing: auto-recover from protected register holes
Pre-#148 behaviour: a coalesced FC03/FC04 read that crossed a write-only or
PLC-fault register marked every member tag Bad until the operator manually
flagged the offending tag with CoalesceProhibited. Healthy tags around the
hole stayed broken indefinitely.

Post-#148: two-stage recovery, no operator intervention needed.

1. Same-scan fallback: when a coalesced read fails with a Modbus exception
   (IllegalDataAddress, SlaveDeviceFailure, etc.), the planner does NOT
   mark members handled. The per-tag fallback in the same scan reads each
   member individually — non-protected members surface Good values
   immediately, and only the actual protected register stays Bad.

2. Cross-scan prohibition: the failed range (Unit, Region, Start, End) is
   recorded in a per-driver `_autoProhibited` set. On subsequent scans the
   planner checks each candidate merge against the set and refuses to
   re-form any block that overlaps a known-bad range. Net effect: after one
   scan with a failure, the protected range goes "per-tag mode" indefinitely
   while ranges around it keep coalescing normally.

Communication failures (timeouts, socket drops) are NOT auto-prohibited —
they're transport-level, not structural. The same coalesced read can succeed
once the transport recovers; recording it as "permanently bad" would defeat
coalescing for the whole driver instance.

Auto-prohibition state lives for the driver lifetime and clears on
ReinitializeAsync (operator restart). A periodic re-probe is a follow-up if
deployments need it without a restart.

Implementation:
- Added `_autoProhibited` HashSet<(byte, ModbusRegion, ushort, ushort)> +
  `_autoProhibitedLock` on ModbusDriver.
- `RangeIsAutoProhibited(unit, region, start, end)` overlap check called
  from the planner when forming blocks.
- `RecordAutoProhibition(...)` called from the catch (ModbusException)
  branch.
- The catch (Exception) branch (non-Modbus failures) keeps the pre-#148
  "mark all Bad in this scan, don't auto-prohibit" behaviour.
- Internal `AutoProhibitedRangeCount` accessor for tests.
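
The overlap check can be sketched in Python like this, assuming inclusive start/end bounds and a set keyed by (unit, region, start, end) as described above. `range_is_auto_prohibited` mirrors the C# method name, but the body is an illustration, not the driver's code.

```python
from typing import Set, Tuple

Key = Tuple[int, str, int, int]  # (unit, region, start, end), inclusive bounds


def range_is_auto_prohibited(prohibited: Set[Key], unit: int, region: str,
                             start: int, end: int) -> bool:
    """True when the candidate block overlaps any recorded bad range."""
    return any(
        u == unit and r == region and start <= p_end and p_start <= end
        for (u, r, p_start, p_end) in prohibited
    )
```

Two inclusive ranges overlap exactly when each one starts no later than the other ends, so a candidate block that merely touches the edge of a prohibited range is refused as well.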

Tests (3 new ModbusCoalescingAutoRecoveryTests):
- First_Failure_Falls_Back_To_PerTag_Same_Scan — three tags around a
  protected register at 102: T100 + T104 surface Good values via the
  per-tag fallback in the SAME scan; T102 surfaces the exception.
- Second_Scan_Skips_Coalesced_Read_Of_Prohibited_Range — confirms scan 2
  doesn't re-attempt the failed merge (no FC03 with quantity > 1 at the
  prohibited start).
- Tags_Outside_Prohibited_Range_Still_Coalesce — separate cluster at HR
  200..202 keeps coalescing normally even after the 100..104 cluster is
  prohibited.

234/234 unit tests green.

Follow-ups intentionally NOT shipped (smaller, independent changes):
- Bisection-style range narrowing — currently the prohibition range is the
  full failed block; the planner doesn't try to find the exact protected
  register. Operator-visible diagnostic + prohibition stays correct.
- Periodic re-probe to clear stale prohibitions.
- Surface auto-prohibited ranges through GetHostStatuses or a new
  diagnostic so the Admin UI can show what's been auto-isolated.
2026-04-25 01:01:42 -04:00
Joseph Doherty 0b7653d3b2 Task #147 — wire ModbusOptionsEditor into DriversTab
Branches the DriversTab driver-add form on driver type:
- For DriverType=Modbus, render the typed <ModbusOptionsEditor> component
  shipped in #145 instead of the generic JSON textarea.
- For other driver types, the existing textarea stays (other drivers ship
  their own typed editors per decision #94).

On Save, when type is Modbus, the form serialises ModbusOptionsViewModel
into the JSON DTO shape ModbusDriverFactoryExtensions consumes (host /
port / unitId / family / keepAlive / reconnect / max* / writeOnChangeOnly
/ etc.). Other types still pass the textarea contents verbatim.

Drive-by fix: the DriverType dropdown listed "ModbusTcp" but the actual
factory-registered name is "Modbus" — DriverInstanceBootstrapper would
silently skip a row created with the old label because the factory lookup
would miss. Renamed to match.

Tests (2 new in ModbusOptionsViewModelTests):
- DriversTab_Serialized_Defaults_RoundTrip_Through_Factory — unedited
  view-model serializes to a JSON the factory accepts; resulting
  ModbusDriverOptions matches the form defaults bit-for-bit.
- DriversTab_Serializes_Edited_Values_Correctly — flipping Host / Port /
  UnitId / Family / MaxReadGap / WriteOnChangeOnly in the view model
  surfaces in the constructed driver's options.

The serializer in the test mirrors DriversTab.razor's SerializeModbusOptions
helper. If the form's serialization shape drifts, both must be updated
together; that's the cost of testing through the JSON DTO without bUnit.

Follow-up still open: the per-tag editor (ModbusAddressEditor wiring into
EquipmentTab.razor + the bulk-import help-text update) — that's a separate
surface that touches the equipment-row CRUD flow; covered as a follow-up
when the equipment tag editor surface is next touched.
2026-04-25 00:58:03 -04:00
Joseph Doherty dfd027ebca Task #146 — Modbus addressing: align type codes with Wonderware DASMBTCP + Ignition
Web verification (2026-04-25) against current vendor docs surfaced concrete
grammar conflicts in the v1 suffix grammar shipped in #137. Hard cutover
before the Admin UI rolls out widely so users don't paste `:I` from a
Wonderware spreadsheet and silently get wrong-typed reads.

Sources:
- Wonderware DASMBTCP user guide
  https://cdn.logic-control.com/media/DASMBTCP.pdf
- Ignition Modbus addressing (8.1)
  https://www.docs.inductiveautomation.com/docs/8.1/ignition-modules/opc-ua/opc-ua-drivers/modbus/modbus-addressing

Type-code changes:

| Code      | Pre-#146 | Post-#146  | Vendor reference                |
|-----------|----------|------------|---------------------------------|
| `:S`      | (n/a)    | Int16      | Wonderware DASMBTCP `S`         |
| `:US`     | (n/a)    | UInt16     | Ignition `HRUS`                 |
| `:I`      | Int16    | **Int32**  | Wonderware `I` + Ignition `HRI` |
| `:UI`     | UInt16   | **UInt32** | Ignition `HRUI`                 |
| `:I_64`   | (n/a)    | Int64      | Ignition `HRI_64`               |
| `:UI_64`  | (n/a)    | UInt64     | Ignition `HRUI_64`              |
| `:BCD_32` | (n/a)    | BCD32      | Ignition `HRBCD_32`             |

Codes REMOVED (no clear vendor precedent + conflict with the new mapping):
`:DI`, `:L`, `:UDI`, `:UL`, `:LI`, `:ULI`, `:LBCD`. Pre-#146 configs that
use them get an "Unknown type code" diagnostic at parse time so users get
a fast surface-level error rather than silent wrong-typed reads.

Codes UNCHANGED (already vendor-aligned): `:BOOL`, `:F`, `:D`, `:BCD`,
`:STR<n>`. Modicon 5/6-digit + mnemonic regions (HR/IR/C/DI) + bit suffix
`.N` are also unchanged.

Defaults:
- Coils / DiscreteInputs → `BOOL` (unchanged)
- HoldingRegisters / InputRegisters with no explicit type → Int16 (matches
  Ignition's bare `HR` default)
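
A small Python sketch of the post-#146 lookup, covering only the codes in the table above plus `BOOL` and the region defaults. `resolve_data_type` is a hypothetical name; the unchanged codes `:F`, `:D`, `:BCD`, and `:STR<n>` are left out because their target types are not spelled out here, and the error wording is abbreviated.

```python
from typing import Optional

# Codes taken from the table above; BOOL is carried over unchanged.
POST_146_CODES = {
    "BOOL": "BOOL",
    "S": "Int16", "US": "UInt16",
    "I": "Int32", "UI": "UInt32",
    "I_64": "Int64", "UI_64": "UInt64",
    "BCD_32": "BCD32",
}
BIT_REGIONS = {"Coils", "DiscreteInputs"}


def resolve_data_type(region: str, code: Optional[str] = None) -> str:
    """Map an optional type-code suffix to a data type name."""
    if code is None:
        # No explicit suffix: bit regions default to BOOL, registers to Int16.
        return "BOOL" if region in BIT_REGIONS else "Int16"
    if code in POST_146_CODES:
        return POST_146_CODES[code]
    # Removed aliases such as DI or LBCD land here and fail at parse time.
    raise ValueError(f"Unknown type code ':{code}'")
```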

Byte-order mnemonics (`:ABCD` / `:CDAB` / `:BADC` / `:DCBA`) are kept but
documented as OtOpcUa-specific — they aren't in any major vendor's per-tag
address string. Ignition uses a `-R` suffix per prefix; Wonderware
configures word-order at the topic level.
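
For readers unfamiliar with the four mnemonics, here is a Python sketch of decoding a Float32 under each one. It assumes the common reading in which the letters name the wire order of the bytes of a big-endian value "ABCD"; `decode_float32` is a hypothetical helper, not the driver's API.

```python
import struct

# For each wire position, the index that byte occupies in the big-endian value.
_ORDERS = {
    "ABCD": (0, 1, 2, 3),  # big-endian, no swap
    "CDAB": (2, 3, 0, 1),  # word swap
    "BADC": (1, 0, 3, 2),  # byte swap within each word
    "DCBA": (3, 2, 1, 0),  # full reversal (little-endian)
}


def decode_float32(wire: bytes, order: str = "ABCD") -> float:
    """Reassemble four wire bytes into a float according to the mnemonic."""
    big_endian = bytearray(4)
    for wire_pos, value_pos in enumerate(_ORDERS[order]):
        big_endian[value_pos] = wire[wire_pos]
    return struct.unpack(">f", bytes(big_endian))[0]
```

For example, a device that sends the low word first delivers the bytes of 0x42F6E979 as E9 79 42 F6, and decoding with "CDAB" recovers the original value.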

Tests:
- 12 Type_Codes_Parse rows updated to assert the new mappings.
- New Removed_Aliases_Are_Rejected (×7) confirms each pre-#146 alias now
  fails fast with "Unknown type code".
- Worked_Example_Int16_Array uses the new `:S` code.
- New Worked_Example_Int32_Array_Via_I_Code documents the `:I = Int32`
  vendor-alignment intent so a future "fix" doesn't accidentally regress.
- Unknown_Type_Code_Rejected_With_Catalog updated to match the new error
  message ("Valid: BOOL, S, US, I, ...").

Docs:
- docs/v2/modbus-addressing.md — table replaced with the post-#146 codes,
  each row cites its Wonderware / Ignition reference. New "Codes removed
  in #146" subsection documents the cutover.
- docs/Driver.Modbus.Cli.md — example grammar list updated; explicit
  type-code reminder appended.

114 addressing tests + 231 driver tests still green. Solution build clean.
2026-04-25 00:51:50 -04:00
Joseph Doherty 5ea57d2d70 Task #138 — Modbus addressing grammar docs + e2e
Closes the docs/e2e end of the Modbus addressing line shipped across
#136-#145.

Docs:

- docs/v2/modbus-addressing.md (new) — full grammar reference.
  Region+offset (Modicon 5-digit / 6-digit / mnemonic), bit suffix,
  type codes (BOOL / I / UI / DI / UDI / LI / ULI / F / D / BCD / LBCD /
  STR<n>), all four byte-order mnemonics (ABCD / CDAB / BADC / DCBA),
  array-count semantics, family-native syntax (DL205 V/Y/C/X/SP and
  MELSEC D/M/X/Y with hex-vs-octal sub-family selection), driver-instance
  options (KeepAlive / Reconnect / IdleDisconnect, MaxCoilsPerRead and
  FC15/16 forcing, Deadband + WriteOnChangeOnly, MaxReadGap +
  CoalesceProhibited, multi-unit IPerCallHostResolver). Includes a worked
  JSON DTO example mixing AddressString + structured tag forms.

- docs/Driver.Modbus.Cli.md — appended a "v2 addressing grammar" section
  pointing users at the full reference, with quick-reference examples.

- Vendor-compatibility caveat documented: type codes and byte-order
  mnemonics were synthesised from training-era vendor docs (Wonderware
  DASMBTCP, Kepware KEPServerEX, Ignition, Matrikon, OAS) and should be
  verified against current vendor manuals before locking for production.

E2E tests (4 new AddressingGrammarTests in IntegrationTests):
- Modicon 5-digit and 6-digit forms map to identical wire offsets.
- Float32 + WordSwap (CDAB) round-trips end-to-end through the
  pymodbus simulator.
- Int16[5] array round-trips as a typed short[] surface.
- Block-read coalescing produces a wire-acceptable PDU when MaxReadGap=5
  bridges three nearby tags.

All tests skip gracefully when the pymodbus simulator at localhost:5020
is unreachable (matches the existing ModbusSimulatorFixture pattern).

Final test count across the Modbus addressing surface:
- 107 ModbusAddressing.Tests (parser + family + Modicon)
- 231 Driver.Modbus.Tests (driver, byte order, array, multi-unit, coalescing,
  protocol, subscribe, connection options)
- 110 Admin.Tests (incl. ModbusOptionsViewModel defaults pinning)
- 4 new AddressingGrammar integration tests (skip when sim down)
2026-04-25 00:32:27 -04:00
Joseph Doherty 858f300a61 Task #145 — Admin UI: expose new Modbus driver config
Two new Blazor components surface every Modbus knob added by #136-#144 so
users can configure the driver without hand-editing DriverConfig JSON.

ModbusAddressEditor.razor (live address-string parser preview):
- Bound to a string AddressString + a Family / MelsecSubFamily hint.
- On every input keystroke, runs ModbusAddressParser.TryParse and surfaces
  the resolved breakdown (Region, Offset, DataType, Bit, ByteOrder,
  ArrayCount, StringLength) inline as a green badge.
- On parse error, shows the parser's diagnostic in red.
- Re-uses the SAME parser the wire driver uses — grammar drift is
  impossible by construction.

ModbusOptionsEditor.razor (driver-instance options panel):
- Connection group (Host / Port / UnitId).
- Family group (#144) with conditional MelsecSubFamily dropdown.
- Keep-alive group (#139): Enabled / Time / Interval / RetryCount.
- Reconnect group (#139): InitialDelay / MaxDelay / BackoffMultiplier.
- Protocol group (#140): MaxRegistersPerRead / Write / Coils / ReadGap.
- Behaviour toggles (#140 + #141): UseFC15 / UseFC16 / WriteOnChangeOnly.
- Bound to ModbusOptionsViewModel — defaults match ModbusDriverOptions
  defaults so unedited rows produce the historical wire output verbatim.

Architecture:
- Admin project gains a ProjectReference to Driver.Modbus.Addressing
  (the shared parser assembly extracted in #136). Admin does NOT take a
  dep on Driver.Modbus itself — the addressing concerns are cleanly
  separated from the wire driver.
- Same-namespace shared assembly means components reference
  ModbusAddressParser / ModbusFamily / etc. without prefix gymnastics.

Tests:
- ModbusOptionsViewModelTests (1 test) — pins every default in the view
  model against the corresponding ModbusDriverOptions default. A
  regression that flips an unedited row to a non-default value gets
  caught here. (Test references both Admin and Driver.Modbus to make the
  cross-assembly comparison.)
- Live Blazor component testing requires bUnit, which isn't currently
  in the test setup; the parser logic the component wraps is fully
  covered by the 91 ModbusAddressParser tests in the addressing project,
  so the glue layer's behaviour is verifiable end-to-end already.

Caveat: the wiring into the existing DriverInstance edit page lives in
DriversTab.razor — that integration is left as a follow-up because it
touches the cluster-edit workflow specifically and the components in
this commit are framework-agnostic enough to drop in. The components
build clean against the existing Admin project; no behavioural change
to other tabs.
2026-04-25 00:26:43 -04:00
Joseph Doherty 366212417c Task #143 — Modbus block-read coalescing (with max-gap knob)
Adds a coalescing read planner that merges nearby tags into single FC03/FC04
PDUs, opt-in via ModbusDriverOptions.MaxReadGap. Default 0 = no coalescing
(every tag gets its own PDU — preserves pre-#143 wire output).

Worked example with MaxReadGap=10:
  T1 @ HR 100 (Int16, 1 reg)
  T2 @ HR 102 (Int16, 1 reg, gap 1 → joins block)
  T3 @ HR 110 (Float32, 2 regs, gap 7 → joins block)
  T4 @ HR 200 (Int16, 1 reg, gap 88 → splits, separate read)
  → 2 PDUs total: FC03 start=100 quantity=12 + FC03 start=200 quantity=1.

Planner:
- Eligible tags: known + register region (HR/IR) + scalar + not String /
  BitInRegister / array + not CoalesceProhibited.
- Groups by (UnitId, Region) — never coalesces across slaves or regions.
- Sorts by start address; merges when (next.start - last.end - 1) ≤ MaxReadGap
  AND the resulting span ≤ MaxRegistersPerRead. Otherwise opens a new block.
- Single-tag blocks are deferred to the per-tag path so WriteOnChangeOnly
  cache semantics stay correct without duplication.
- Per-block failure marks every member tag Bad and degrades health — same
  semantics the per-tag path has, but at the block granularity.
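
Illustrative sketch of the merge rule above, written in Python rather than the
driver's C#. `plan_blocks` and its tuple shapes are hypothetical. The
125-register default is the Modbus spec ceiling for FC03/FC04 and is only
assumed to stand in for MaxRegistersPerRead. Input is assumed to be
pre-filtered to eligible tags from a single (UnitId, Region) group; the
single-tag deferral to the per-tag path is not modelled.

```python
def plan_blocks(tags, max_read_gap, max_registers_per_read=125):
    """Merge (name, start, length) register tags into read blocks.

    Returns a list of (start, quantity, [tag names]) tuples.
    """
    blocks = []
    for name, start, length in sorted(tags, key=lambda t: t[1]):
        end = start + length - 1  # inclusive last register of this tag
        if blocks and max_read_gap > 0:  # MaxReadGap=0 means no coalescing
            block = blocks[-1]
            gap = start - block["end"] - 1
            new_end = max(end, block["end"])
            span = new_end - block["start"] + 1
            if gap <= max_read_gap and span <= max_registers_per_read:
                block["end"] = new_end
                block["tags"].append(name)
                continue
        # Gap too wide, span over the cap, or coalescing disabled: new block.
        blocks.append({"start": start, "end": end, "tags": [name]})
    return [(b["start"], b["end"] - b["start"] + 1, b["tags"]) for b in blocks]


example = [("T1", 100, 1), ("T2", 102, 1), ("T3", 110, 2), ("T4", 200, 1)]
plan = plan_blocks(example, max_read_gap=10)
```

For the worked example's four tags this yields two blocks: start=100
quantity=12 (T1, T2, T3) and start=200 quantity=1 (T4).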

Per-tag escape hatch ModbusTagDefinition.CoalesceProhibited (bool, default
false) — when true, the tag is read in isolation regardless of MaxReadGap.
For PLCs with protected register holes between adjacent tags.

Tests (7 new ModbusCoalescingTests):
- MaxReadGap=0 keeps the per-tag behavior (2 reads for 2 tags).
- MaxReadGap=2 merges 3 tags within 5 registers into 1 read of qty=5.
- MaxReadGap=10 splits T1+T2 from T3 when the gap exceeds the threshold.
- CoalesceProhibited tag reads alone even when neighbours are eligible.
- Coalescing never crosses UnitId boundaries (multi-slave gateway safety).
- MaxRegistersPerRead caps a would-be block; planner falls back to separate
  reads when the merged span would exceed the cap.
- Per-tag values surface independently after coalescing (slice-math sanity).

Existing 224 unit tests still green; total 231 pass with the new file (tests
are additive, no regressions).

Follow-up: auto-split-on-protected-hole isn't shipped — a coalesced read
that hits an Illegal Data Address right now marks every member Bad until
the operator sets CoalesceProhibited on the offending tag. Tracked
implicitly by #138's e2e drill against a pymodbus profile with a protected
hole mid-block.
2026-04-25 00:21:18 -04:00
Joseph Doherty ad7d811f69 Task #142 — Modbus multi-unit-ID per TCP connection (gateway support)
Lifts the previous "one driver = one slave" assumption so a single Modbus
driver instance can front N RTU slaves behind one Ethernet gateway (Anybus,
ProSoft, Lantronix style). Each tag carries an optional UnitId that drives
the MBAP unit-id byte per-PDU, and the IPerCallHostResolver contract surfaces
per-slave host strings so per-PLC circuit breakers fire per-slave (matches
the AB CIP template documented in docs/v2/multi-host-dispatch.md).

Changes:

- ModbusTagDefinition gains optional UnitId (byte?). Null = use driver-level
  ModbusDriverOptions.UnitId (preserves single-slave deployments verbatim).
- ResolveUnitId(tag) helper computed once per ReadOneAsync / WriteOneAsync
  call; passed through ReadRegisterBlockAsync / ReadBitBlockAsync /
  ReadRegisterBlockChunkedAsync / ReadBitBlockChunkedAsync explicitly. The
  probe loop continues using driver-level UnitId (the probe is a
  connection-health check, not slave-specific).
- ModbusDriver implements IPerCallHostResolver. ResolveHost(fullReference)
  returns "host:port/unitN" — distinct strings per slave so the resilience
  pipeline keys breakers on the right granularity. Unknown references fall
  back to the bare HostName (single-slave behaviour).
- BitInRegister RMW path also threads the per-tag UnitId through both the
  read and write halves so a multi-slave deployment stays correct under bit-
  level writes.
- Factory DTO + JSON binding extended with the per-tag UnitId field.
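
Rough sketch of the fallback and host-string rules listed above, in Python for
brevity (the driver is C#). The function names, the dict-based tag table, and
the choice of returning just the host name for unknown references are
assumptions made for illustration.

```python
def resolve_unit_id(tag_unit_id, driver_unit_id):
    """Per-tag UnitId wins; None falls back to the driver-level UnitId."""
    return driver_unit_id if tag_unit_id is None else tag_unit_id


def resolve_host(host, port, driver_unit_id, tags, full_reference):
    """Return a breaker key that is distinct per slave.

    `tags` maps a full reference to a dict with an optional "unit_id".
    """
    tag = tags.get(full_reference)
    if tag is None:
        return host  # unknown reference: bare host name fallback
    unit = resolve_unit_id(tag.get("unit_id"), driver_unit_id)
    return f"{host}:{port}/unit{unit}"


tags = {"Line1.Temp": {"unit_id": 3}, "Line2.Temp": {"unit_id": 7}, "Local": {}}
```

Two tags behind the same gateway then map to different keys
("gw:502/unit3" and "gw:502/unit7"), so a dead slave trips only its own
breaker.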

Tests (4 new ModbusMultiUnitTests):
- Per-tag UnitId routes to the correct slave in the MBAP header (driver-level
  UnitId=99 must NOT appear when both tags override).
- Tag without override falls back to driver-level UnitId.
- IPerCallHostResolver returns distinct "host:port/unitN" strings per slave.
- Unknown reference returns the bare HostName fallback.

Existing 220 unit tests + 107 addressing tests still green. Per-PLC breaker
isolation under simulated dead slaves is verifiable via the existing AB CIP
test infra; live coverage lands as an integration test in the #138 docs/e2e
refresh.
2026-04-25 00:16:41 -04:00
Joseph Doherty 4cf0b4eb73 Task #144 — Modbus family-native parser branch (DL205 / MELSEC)
Promotes DirectLogicAddress + MelsecAddress from "utility helpers an engineer
calls manually" to "first-class branch of ModbusAddressParser." Users can now
paste DL205-native (V2000, Y0, C100, X17, SP10) and MELSEC-native (D100, M50,
X20 hex/octal, Y0) addresses directly into TagConfig and the parser handles
the PLC-native → Modbus PDU translation.

Changes:

- Both helper files moved into the shared Driver.Modbus.Addressing assembly
  (same namespace, zero-churn for callers). Required because the parser
  needs to call them and the dependency direction is parser→helpers, not
  the other way.
- New ModbusFamily enum (Generic / DL205 / MELSEC) on
  ModbusDriverOptions.Family. Generic preserves pre-#144 behaviour exactly.
- ModbusDriverOptions.MelsecSubFamily picks the X/Y notation (Q_L_iQR hex
  vs F_iQF octal). Default Q_L_iQR.
- ModbusAddressParser.Parse now takes optional family + sub-family hints.
  When non-Generic, family-native parsing runs FIRST; on miss falls back to
  Modicon / mnemonic. Cross-family ambiguity (C100 = Modicon coil under
  Generic, DL205 control relay under DL205) cannot arise within one driver
  instance, because each instance carries a single Family.
- Suffix grammar composes with native addresses: V2000:F:CDAB:5 parses
  end-to-end as DL205 V-memory at PDU 1024 + Float32 + word-swap + array of 5.
- Bit suffix composes too: V2000.7 parses as bit 7 of HR[1024].
- Factory DTO fields Family / MelsecSubFamily flow through to BuildTag so
  the JSON binding can drive everything per-driver.
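
Minimal sketch of the V-memory translation implied by the V2000 → PDU 1024
example: the digits are read as octal. Python for brevity; `dl205_v_to_pdu` is
a hypothetical name, and the Y/C/X/SP banks (which need per-bank bases) are
deliberately left out.

```python
def dl205_v_to_pdu(address):
    """Translate a DL205 V-memory address (octal digits) to a PDU offset."""
    s = address.strip().upper()
    if not s.startswith("V") or len(s) < 2:
        raise ValueError(f"not a V-memory address: {address!r}")
    digits = s[1:]
    if any(c not in "01234567" for c in digits):
        raise ValueError(f"V-memory addresses are octal: {address!r}")
    return int(digits, 8)
```

Suffixes such as `.7` or `:F:CDAB:5` would be split off before this step and
applied to the resulting holding-register offset.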

Tests: 16 new ModbusFamilyParserTests covering DL205 V/Y/C/X/SP, MELSEC
D/M/X/Y, sub-family hex-vs-octal disambiguation, cross-family C100 ambiguity,
fallback to Modicon when native misses, and grammar composition with bit/
byte-order/array modifiers. Existing 91 parser tests still green; 220 driver
tests still green.

Caveat: bank-base offsets for MELSEC X/Y/M default to 0 in the grammar
string. Sites with non-zero "Modbus Device Assignment Parameter" bases must
use the structured tag form to override — addressed in the docs refresh
(#138).
2026-04-25 00:10:43 -04:00
Joseph Doherty 4bffe879c5 Task #141 — Modbus subscribe-side knobs (deadband + write-on-change)
Two driver-side filters that ≥5 of 6 surveyed vendors expose:

1. Per-tag Deadband (double?, on ModbusTagDefinition) — when set, the
   PollGroupEngine onChange callback suppresses publishes whose distance
   from the last-published value is below the threshold. Reduces wire
   traffic to OPC UA clients on noisy analog signals (flow meters,
   temperatures). Numeric scalar types only — Bool / BitInRegister / String
   / array tags publish unconditionally.

2. WriteOnChangeOnly (bool, on ModbusDriverOptions) — when true, the driver
   short-circuits writes whose value matches the most recent successful
   write to that tag. Saves PLC bandwidth on clients that re-publish the
   same setpoint every scan. Cache invalidates on any read that returns a
   different value, so HMI-side changes don't get masked.

Both default off so existing deployments see no behaviour change.

Implementation:
- ShouldPublish guard wraps the existing OnDataChange invocation. First sample
  always passes through (no baseline); subsequent samples compare via
  Convert.ToDouble for the cross-numeric-type math.
- IsRedundantWrite check at the top of WriteAsync; on success the cache is
  populated. Object.Equals handles boxed-numeric equality; arrays are
  excluded (reference-equality would never match anyway).
- ReadAsync invalidates the WriteOnChangeOnly cache when the new value
  differs from the cached last-written value.
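
A small sketch of both filters as described above, in Python rather than the
driver's C#. Class and method names are hypothetical; the comparison treats a
change equal to the deadband as publishable, which is an assumption consistent
with "below the threshold" being suppressed.

```python
class DeadbandFilter:
    """Suppress samples closer than `deadband` to the last published value."""

    def __init__(self, deadband=None):
        self.deadband = deadband
        self.last_published = None

    def should_publish(self, value):
        first = self.last_published is None
        if (self.deadband is None or first
                or abs(float(value) - float(self.last_published)) >= self.deadband):
            self.last_published = value
            return True
        return False


class WriteOnChangeCache:
    """Remember the last successful write per tag; reads can invalidate it."""

    def __init__(self):
        self._last = {}

    def is_redundant(self, tag, value):
        return tag in self._last and self._last[tag] == value

    def record_write(self, tag, value):
        self._last[tag] = value

    def observe_read(self, tag, value):
        # A differing read means something else changed the PLC value.
        if tag in self._last and self._last[tag] != value:
            del self._last[tag]


f = DeadbandFilter(5)
published = [v for v in (100, 102, 106, 107) if f.should_publish(v)]
```

With deadband=5 the sequence 100, 102, 106, 107 publishes 100 and 106 only,
matching the test description below.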

Tests (5 new ModbusSubscribeOptionsTests):
- Deadband suppresses sub-threshold changes (100 → 102 → 106 → 107 with
  deadband=5 publishes 100 and 106 only).
- Deadband=null still publishes every change.
- WriteOnChangeOnly suppresses 3 identical 42 writes (only first hits wire).
- WriteOnChangeOnly default false hits the wire every time.
- Read-divergence cache invalidation: external panel write to 99, our
  client's re-write of 42 must NOT be suppressed.

220/220 unit tests green; existing ProtocolOptions tests hardened against
probe-loop noise by disabling the probe in their fixtures.
2026-04-25 00:05:25 -04:00
Joseph Doherty 55f4044a69 Task #140 — Modbus protocol-behavior knobs
Adds ModbusDriverOptions knobs that ≥4 of 6 surveyed vendors expose:

1. MaxCoilsPerRead (ushort, default 2000) — separate from MaxRegistersPerRead
   because coil packing (1 bit per coil) and register packing (16 bits each)
   have different spec ceilings. Coil-array reads above the cap auto-chunk
   the same way register reads have always done. New ReadBitBlockChunkedAsync
   re-assembles per-chunk LSB-first bitmaps into one logical bitmap.

2. UseFC15ForSingleCoilWrites (default false) — forces FC15 (Write Multiple
   Coils with quantity=1) for single-coil writes instead of the default FC05
   (Write Single Coil). Safety / audit PLCs that only accept the multi-write
   codes need this.

3. UseFC16ForSingleRegisterWrites (default false) — same idea for FC16 vs
   FC06 on single holding-register writes.

4. DisableFC23 (default false) — placeholder no-op for the future block-read
   coalescing (#143) work that may opt into FC23 (Read/Write Multiple
   Registers). Lets deployments pre-disable FC23 for PLCs that won't accept
   it, before we ship the optimisation that emits it.

Defaults preserve the historical wire output bit-for-bit (FC05/FC06 for
singles, no chunking under 2000 coils, no FC23). Factory DTO + JSON-binding
extended with parallel fields.

6 new ModbusProtocolOptionsTests covering: defaults, FC05→FC15 forcing,
FC06→FC16 forcing, MaxCoilsPerRead chunking math (2500 coils / 2000 cap →
2 reads of 2000 + 500). Existing 209 unit tests still green.
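
Sketch of the chunking math and the LSB-first bitmap decode, in Python for
brevity. `chunk_ranges` and `unpack_bits` are hypothetical names, not the
driver's API; the bit layout (first coil in bit 0 of the first byte) follows
the Modbus application protocol.

```python
def chunk_ranges(start, quantity, cap):
    """Split one logical read into (start, quantity) chunks no larger than cap."""
    if cap <= 0:
        raise ValueError("cap must be positive")
    chunks = []
    offset = 0
    while offset < quantity:
        size = min(cap, quantity - offset)
        chunks.append((start + offset, size))
        offset += size
    return chunks


def unpack_bits(payload, count):
    """Decode `count` coils from an LSB-first packed Modbus bitmap."""
    return [(payload[i // 8] >> (i % 8)) & 1 for i in range(count)]


def read_coils_chunked(read_chunk, start, quantity, cap):
    """Re-assemble per-chunk bitmaps into one logical list of coil values.

    `read_chunk(start, quantity)` is assumed to return the packed bytes.
    """
    bits = []
    for chunk_start, chunk_qty in chunk_ranges(start, quantity, cap):
        bits.extend(unpack_bits(read_chunk(chunk_start, chunk_qty), chunk_qty))
    return bits
```

`chunk_ranges(0, 2500, 2000)` gives `[(0, 2000), (2000, 500)]`.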
2026-04-24 23:59:04 -04:00
Joseph Doherty 6cf20131fe Task #139 — Modbus connection-layer config knobs (keep-alive / idle / reconnect)
Promotes the previously hardcoded transport-layer settings to ModbusDriverOptions
so users can tune them through DriverConfig JSON without recompiling.

Three new option groups:

1. KeepAlive (ModbusKeepAliveOptions): Enabled / Time / Interval / RetryCount.
   Defaults preserve the historical PR 53 wire output exactly (Enabled=true,
   Time=30s, Interval=10s, RetryCount=3). Set Enabled=false for PLCs that
   reject SO_KEEPALIVE.

2. IdleDisconnectTimeout (TimeSpan?): when set, the transport tracks last-PDU-
   success and proactively closes + reconnects on the next request after the
   threshold. Defends against silent NAT / firewall socket reaping. Default
   null = disabled (no behaviour change).

3. Reconnect (ModbusReconnectOptions): InitialDelay / MaxDelay /
   BackoffMultiplier for the post-drop reconnect loop. Defaults
   (InitialDelay=0, MaxDelay=30s, BackoffMultiplier=2.0) preserve the historical
   immediate-retry behaviour for the first attempt and add geometric backoff
   only if the reconnect itself fails. Capped at 10 attempts before propagating.

ModbusTcpTransport ctor extended with optional keepAlive / idleDisconnect /
reconnect parameters; existing 4-arg call sites continue to compile. Factory
DTO gains parallel KeepAlive / IdleDisconnectMs / Reconnect fields with
default-aware binding.

5 new ModbusConnectionOptionsTests covering the default-preservation contract
(every default field matches pre-#139) and the JSON-binding round-trip for
each knob group. Existing 204 unit tests still green.
2026-04-24 23:53:26 -04:00
Joseph Doherty 850b816873 Task #137 — Modbus per-tag suffix grammar (type / bit / byte-order / array)
Adds the full Wonderware/Kepware/Ignition-style address suffix grammar so
users paste tag spreadsheets without per-tag manual translation:

  <region><offset>[.<bit>][:<type>[<len>]][:<order>][:<count>]

Examples that now parse end-to-end:
  40001                          HoldingRegisters[0], Int16
  400001                         same, 6-digit form
  40001.5                        bit 5 of HR[0]
  40001:F                        Float32 (HR[0..1])
  40001:F:CDAB                   word-swapped Float32
  40001:STR20                    20-char ASCII string
  HR1:DI                         Int32 via mnemonic region
  C100                           Coils[99] (mnemonic)
  40001:F:5                      Float32[5] array (3-field shorthand)
  40001:I:CDAB:10                Int16[10] word-swapped (4-field strict)
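
A rough sketch of how the suffix fields could be split, including the 3-field
shorthand where a numeric field after the type is read as the array count.
This is Python for illustration, not the ModbusAddressParser implementation;
`split_suffixes` and the returned dict keys are hypothetical, and region/offset
resolution is left out.

```python
import re

BYTE_ORDERS = {"ABCD", "CDAB", "BADC", "DCBA"}


def split_suffixes(address):
    """Split '<base>[.<bit>][:<type>[<len>]][:<order>][:<count>]' into fields."""
    head, *fields = address.strip().split(":")
    base, _, bit = head.partition(".")
    result = {"base": base, "bit": int(bit) if bit else None,
              "type": None, "length": None, "order": None, "count": None}
    if fields:
        match = re.fullmatch(r"([A-Za-z]+)(\d+)?", fields[0])
        if match is None:
            raise ValueError(f"bad type field: {fields[0]!r}")
        result["type"] = match.group(1).upper()
        result["length"] = int(match.group(2)) if match.group(2) else None
    rest = fields[1:]
    if rest and rest[0].upper() in BYTE_ORDERS:
        result["order"] = rest.pop(0).upper()
    if rest and rest[0].isdigit():
        result["count"] = int(rest.pop(0))
    if rest:
        raise ValueError(f"unexpected suffix field(s): {rest}")
    return result
```

`split_suffixes("40001:F:5")` leaves `order` unset and sets `count` to 5,
while `split_suffixes("40001:I:CDAB:10")` fills both.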

Driver-side plumbing:
- ModbusAddressParser + ParsedModbusAddress in the shared Addressing
  assembly. 91 parser tests (every grammar variant + malformed shapes).
- ModbusDataType / ModbusByteOrder moved to shared (with the same namespace
  so callers compile unchanged). ModbusByteOrder gains ByteSwap (BADC) and
  FullReverse (DCBA) alongside the existing BigEndian (ABCD) and WordSwap
  (CDAB).
- NormalizeWordOrder extended to honor all four orders for both 4-byte and
  8-byte values. Old WordSwap behavior preserved bit-for-bit.
- ModbusTagDefinition gains optional ArrayCount.
- ReadOneAsync / WriteOneAsync handle array fan-out: one FC03/04 read covers
  N consecutive register-typed elements, decoded into a typed array (short[],
  float[], etc.). Coil arrays use FC01 reads + FC15 writes (FakeTransport
  in tests gains FC15 support to match).
- DriverAttributeInfo IsArray / ArrayDim flow from ArrayCount so the OPC UA
  address space surfaces ValueRank=1 + ArrayDimensions to clients.
- ModbusDriverFactoryExtensions gains AddressString DTO field. When
  present, the parser drives Region/Address/DataType/ByteOrder/Bit/
  StringLength/ArrayCount; structured fields (Writable, WriteIdempotent,
  StringByteOrder) still come from the DTO. Existing structured tag rows
  keep working unchanged.
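
Sketch of the four byte orders for a 4-byte value, in Python for brevity. The
names here are hypothetical and this is not the driver's NormalizeWordOrder;
the mnemonic is read as the order in which bytes A..D of a big-endian value
appear on the wire. The 8-byte case is not covered.

```python
import struct

# Index of each big-endian byte (A, B, C, D) within the wire bytes.
PERMUTATIONS = {
    "ABCD": (0, 1, 2, 3),  # BigEndian
    "CDAB": (2, 3, 0, 1),  # WordSwap
    "BADC": (1, 0, 3, 2),  # ByteSwap
    "DCBA": (3, 2, 1, 0),  # FullReverse
}


def to_big_endian(wire, order):
    """Reorder 4 wire bytes laid out as `order` back into ABCD."""
    if len(wire) != 4:
        raise ValueError("this sketch handles 4-byte values only")
    return bytes(wire[i] for i in PERMUTATIONS[order])


def decode_float32(wire, order="ABCD"):
    return struct.unpack(">f", to_big_endian(wire, order))[0]


def decode_int32(wire, order="ABCD"):
    return struct.unpack(">i", to_big_endian(wire, order))[0]
```

For example, 1.5 is `3F C0 00 00` in ABCD, so the CDAB wire form
`00 00 3F C0` decodes back to 1.5 via `decode_float32(..., "CDAB")`.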

Tests: 91 parser unit tests (Driver.Modbus.Addressing.Tests, all green) +
204 driver tests including new ModbusByteOrderTests (BADC/DCBA roundtrips
across Int32/Float32/Float64) and ModbusArrayTests (Int16[5], Float32[3]
CDAB, Coil[10], length-mismatch error, IsArray/ArrayDim discovery).
Solution-wide build clean.

Caveat: grammar names (type codes, byte-order mnemonics, the :count
shorthand) were synthesized from training-era vendor docs. Verify against
current Kepware Modbus Ethernet Driver Help and Ignition Modbus Addressing
manuals before freezing for production deployments — naming may need a
back-compat layer if vendor wording has shifted.
2026-04-24 23:49:22 -04:00
Joseph Doherty 501d8f494b Task #136 — Modicon address-string parser (5/6-digit) + shared addressing assembly
Foundation for the Modbus addressing-grammar work tracked in #137-#145. Adds
ModbusModiconAddress.Parse / TryParse that turns classic Modicon strings
(40001 / 400001 / 30001 / 00001 / 10001) into (Region, ushort PduOffset).

Also extracts ModbusRegion to a new Driver.Modbus.Addressing assembly so the
Admin UI (#145) can reference the addressing surface without taking a dep on
the wire driver. The new assembly intentionally extends the same
ZB.MOM.WW.OtOpcUa.Driver.Modbus namespace as the driver — callers see the
type as if it lived in one place; only the project layout changes. No
existing call site needed editing (zero-churn move).

Behaviour:
- Single leading digit selects region (0=Coils, 1=DiscreteInputs,
  3=InputRegisters, 4=HoldingRegisters).
- 5-digit form: trailing 4 digits are 1-based register, supports 1..9999.
- 6-digit form: trailing 5 digits are 1-based register, supports 1..65536
  (full PDU address space).
- Strict 5-or-6 length check; whitespace trimmed; clear FormatException
  diagnostics for every malformed shape (wrong length, non-digit body,
  illegal leading digit, register zero, register overflow).
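
The rules above can be sketched as follows. This is Python for illustration,
not the ModbusModiconAddress code; `parse_modicon`, the region strings, and the
use of ValueError in place of FormatException are assumptions.

```python
REGIONS = {
    "0": "Coils",
    "1": "DiscreteInputs",
    "3": "InputRegisters",
    "4": "HoldingRegisters",
}


def parse_modicon(text):
    """Turn a 5- or 6-digit Modicon string into (region, zero-based PDU offset)."""
    s = text.strip()
    if len(s) not in (5, 6):
        raise ValueError(f"expected 5 or 6 digits, got {len(s)} characters")
    if not (s.isascii() and s.isdigit()):
        raise ValueError(f"non-digit character in {s!r}")
    region = REGIONS.get(s[0])
    if region is None:
        raise ValueError(f"illegal leading digit {s[0]!r}")
    register = int(s[1:])  # 1-based register number
    if register == 0:
        raise ValueError("register zero is not addressable")
    if register > 65536:
        raise ValueError("register exceeds 65536")
    return region, register - 1
```

Both `parse_modicon("40001")` and `parse_modicon("400001")` return
`("HoldingRegisters", 0)`.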

29/29 new unit tests pass. Full Driver.Modbus suite (182 tests) and the
solution-wide build still green after the ModbusRegion move.
2026-04-24 23:34:18 -04:00
Joseph Doherty fb760bc465 Task #135 — update integration-test NodeIds for path-based scheme
7 integration tests in Server.Tests were left behind by the path-based
NodeId rename (#134). Each was constructing test NodeIds in the old
"FullReference" shape ("TestFolder.Var1", "raw.var", "AlphaFolder.Var1",
"plcaddr-temperature"), which the node manager no longer mints — the new
shape is `{driverId}/{folder-path}/{browseName}` per OPC UA Part 3 §5.2.2
NodeId immutability.

Fixed by re-deriving each test NodeId from the actual browse path the test
fixture's driver registers:

- OpcUaServerIntegrationTests: "TestFolder.Var1" → "fake/TestFolder/Var1"
- HistoryReadIntegrationTests (4 tests): "raw.var" → "history-driver/raw",
  "proc.var" → "history-driver/proc" (×2), "atTime.var" → "history-driver/atTime"
- MultipleDriverInstancesIntegrationTests: "AlphaFolder.Var1" →
  "alpha/AlphaFolder/Var1"; "BetaFolder.Var1" → "beta/BetaFolder/Var1"
- OpcUaEquipmentWalkerIntegrationTests: "plcaddr-temperature" →
  "galaxy-prod/warsaw/line-a/oven-3/Temperature" (the walker uses Tag.Name
  as the browseName; the FullReference lives in TagConfig but no longer
  surfaces in the NodeId path)
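
For reference, a tiny sketch of the path shape the fixes above target. Python
for brevity; `build_node_id` is a hypothetical helper, not the node manager's
API, and it assumes folder segments are joined with "/" with empty paths
simply omitted.

```python
def build_node_id(driver_id, folder_path, browse_name):
    """Compose '{driverId}/{folder-path}/{browseName}' from its parts."""
    segments = [driver_id, *folder_path, browse_name]
    return "/".join(segment for segment in segments if segment)
```

`build_node_id("fake", ["TestFolder"], "Var1")` gives `fake/TestFolder/Var1`,
and a tag with no folder, such as `build_node_id("history-driver", [], "raw")`,
gives `history-driver/raw`.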

Server.Tests now 277/277 green excluding LiveLdap. Clears the regression
flagged during the #124 verification run.
2026-04-24 22:03:03 -04:00
Joseph Doherty 75c07149d4 Task #124 — Phase 6.2 multi-user authz interop matrix + close LdapGroups gap
The Phase 6.2 evaluator was wired but received no input in production:
RoleBasedIdentity (the IUserIdentity our LDAP path produces) implemented
IRoleBearer but not ILdapGroupsBearer, so AuthorizationGate.BuildSessionState
always returned null and the gate lax-mode-allowed every request. UserAuthResult
also never carried the resolved LDAP groups, only the role-mapped strings.

Closing the gap so the evaluator gets real data:

- UserAuthResult adds Groups alongside Roles. LdapUserAuthenticator now
  surfaces the raw RDN values (ReadOnly / WriteOperate / ...) it already
  collected during the directory query. Roles stay separate per decision #150
  (control-plane Admin role mapping vs data-plane NodeAcl key).
- RoleBasedIdentity implements ILdapGroupsBearer so AuthorizationGate sees
  the groups via the same seam unit tests already use.

ThreeUserInteropMatrixTests drives the closure end-to-end against the live
GLAuth dev directory:

- 5 distinct group memberships (readonly / writeop / writetune /
  writeconfig / alarmack) plus the multi-group admin user
- Each is bound through the real LdapUserAuthenticator
- Resolved groups feed an LdapBoundIdentity that goes through the strict-mode
  AuthorizationGate against a seeded TriePermissionEvaluator
- 31 InlineData rows assert the role × operation matrix; failures pinpoint
  the exact (user, op) cell

The remaining wire-level leg of #124 — a real OPC UA client driving UserName
tokens through an encrypted endpoint policy — still needs a deployment knob
and stays a manual cross-vendor smoke (#119 / #124 manual scope). The doc
audit note in admin-ui-phase-6-status.md is updated to reflect what's now
automated vs what stays manual.

33/33 new tests pass against live GLAuth; existing 270 non-LiveLdap tests
in Server.Tests still pass; Core.Tests 205/205, Admin.Tests 109/109. The 7
integration-test failures observed during this run pre-exist this commit
(NodeId-scheme regression from #134) and are tracked separately as #135.
2026-04-24 20:40:07 -04:00
Joseph Doherty d11d160395 Admin UI Phase 6 audit — close #128–#131 as already-shipped
Task-by-task audit of the Admin UI quartet shows every page listed in
the task descriptions is already built, routed, DI-wired, SignalR-live,
and covered by Admin.Tests (112/112 green):

- #128 /hosts — Hosts.razor 233 LOC with ConsecutiveFailures +
  LastCircuitBreakerOpenUtc + Stale/Faulted/Running cards
- #129 RoleGrants + AclsTab + Probe — RoleGrants.razor (192 LOC),
  AclsTab.razor (279 LOC) with the embedded Probe form at line 38
- #130 RedundancyTab — RedundancyTab.razor 175 LOC with peer
  reachability / ServiceLevel / apply-lease / failover button
- #131 Draft/Publish/Diff/Identification — DraftEditor (105 LOC) +
  Generations (73 LOC) + DiffViewer (87 LOC) + IdentificationFields
  (49 LOC), all wired to GenerationService / DraftValidationService

Shipping docs/v2/implementation/admin-ui-phase-6-status.md as the
canonical reference. Each task's required features are listed with the
exact file / LOC / routing + DI injection so future auditors don't
need to re-derive the status.

No code change in this commit — doc-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 19:07:05 -04:00
Joseph Doherty e5d1c9c9b9 Phase 6.1 multi-host dispatch — document shipped contract + per-driver status
Task #127 / decision #144. The resilience infrastructure for per-PLC
circuit breakers is shipped and fully tested — the task description's
"current pipeline keys on DriverInstanceId only" was stale. The actual
state:

- `DriverResiliencePipelineBuilder` keys on
  `(DriverInstanceId, HostName, DriverCapability)`.
- `CapabilityInvoker.ExecuteAsync` takes `hostName` per call.
- `IPerCallHostResolver` is the driver-side hook; AB CIP implements it.
- `PerCallHostResolverDispatchTests.DeadPlc_DoesNotOpenBreaker_For_HealthyPlc_With_Resolver`
  proves the end-to-end isolation.

Remaining work is per-driver adoption, not shared infrastructure:
- AB CIP: live + tested
- Galaxy / FOCAS / OPC UA Client / AB Legacy: 1 device per instance by
  design, trivially isolated
- Modbus / S7 / TwinCAT: single-device today; multi-device refactor is
  per-driver surgery (Device row + options + resolver + transport
  fan-out), not a shared-infra change

Shipping docs/v2/multi-host-dispatch.md as the canonical reference:
contract + driver-author checklist + current fleet-wide status table.
Future driver authors follow the AB CIP template.

No code change in this commit — doc-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 19:01:47 -04:00
Joseph Doherty bd6568bcbd Phase 6.1 Stream B.4 — wire ScheduledRecycleHostedService into bootstrap
Task #125 / #137. The hosted service + scheduler classes already shipped;
this commit connects them to the published-generation driver list so a
Tier C driver with `RecycleIntervalSeconds` in its `ResilienceConfig`
actually gets an armed scheduler at bootstrap.

Wiring:

- `DriverFactoryRegistry.Register` gains an optional `DriverTier`
  parameter (default Tier.A). Existing call sites unchanged —
  `GalaxyProxyDriverFactoryExtensions.Register` explicitly passes
  Tier.C so the bootstrapper can identify out-of-process drivers
  without a per-driver-type allow-list.
- `DriverResilienceOptions` + parser grow `RecycleIntervalSeconds`.
  Tier A/B values are rejected with a diagnostic (decision #74 —
  recycling an in-process driver would kill every OPC UA session).
  Non-positive values are rejected the same way.
- `DriverInstanceBootstrapper` auto-arms a `ScheduledRecycleScheduler`
  after a successful driver register when: (1) the registered tier is
  C, (2) the row's ResilienceConfig carries a positive recycle interval,
  (3) DI has an `IDriverSupervisor` keyed by that `DriverInstanceId`.
  Missing supervisor → warn + skip (no crash). That keeps the wiring
  harmless by default: no driver ships a supervisor today, so the
  hosted service runs with zero schedulers out of the box.
- `Program.cs` registers `ScheduledRecycleHostedService` as singleton
  (shared with `DriverInstanceBootstrapper`) + hosted service (drives
  the tick loop). Constructor changes on the bootstrapper ripple into
  DI resolution automatically.
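
The validation and auto-arm conditions above, sketched in Python rather than
the server's C#. Function names and the string tier labels are hypothetical;
only the three stated conditions and the two rejection rules are modelled.

```python
def parse_recycle_interval(tier, raw_seconds):
    """Validate RecycleIntervalSeconds; None means the option is not set."""
    if raw_seconds is None:
        return None
    if tier != "C":
        raise ValueError("RecycleIntervalSeconds is only valid for Tier C drivers")
    if raw_seconds <= 0:
        raise ValueError("RecycleIntervalSeconds must be positive")
    return raw_seconds


def should_arm_scheduler(tier, recycle_seconds, has_supervisor):
    """Arm only when all three bootstrap conditions hold."""
    return (tier == "C"
            and recycle_seconds is not None and recycle_seconds > 0
            and has_supervisor)
```

A Tier C row with a valid interval but no registered supervisor returns False
here, which corresponds to the warn-and-skip path.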

Tests: 4 new parser tests covering RecycleIntervalSeconds on Tier C
happy path, null default, Tier A/B rejection, non-positive rejection.
Existing 283 Server.Tests + 200 Core.Tests all still green.

No behavioural change for existing deployments: Galaxy driver + any
future Tier C driver gain the opt-in automatically; Tier A/B drivers
(FOCAS, Modbus, S7, AB CIP, AB Legacy, TwinCAT) are structurally
excluded.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 18:58:13 -04:00
Joseph Doherty a52086efc5 Refresh phase-7-e2e-smoke.md to match current wiring
The runbook shipped at phase-7 close (2026-04-20) described the original
`Doubled = Source × 2` virtual tag, Float64 seed, and flat TagId-shaped
NodeIds. Four commits later the wiring has moved:

- Seed now targets `TestMachine_001.TestHistoryValue` (Int32, writable,
  historized) — no placeholder to fill in for the dev box.
- VirtualTag is `MachineStatus` (Boolean, `Source > 0`, historized).
- NodeIds are path-based per OPC UA Part 3 §5.2.2
  (`{driverId}/{folder-path}/{browseName}`).
- Seed inserts the ClusterNodeCredential row — without it the Server
  bootstrap fails `Unauthorized: caller X is not bound to NodeId`.

Changes:

1. Step 3 — replace "edit the placeholder" instructions with the ZB
   Galaxy-Repository query that finds writable historized attributes
   (dpc CTE + HistoryExtension EXISTS + `security_classification > 0`).
2. New step 4a — LDAP + `SecurityProfile = Basic256Sha256-Sign` recipe
   for the reverse-bridge + alarm-fires stages. Anonymous sessions are
   denied writes against `Operate`-classified attributes (PR 26 gate);
   `writeop / writeop123` against the dev-box GLAuth clears it.
3. Step 6 validation commands updated to the new NodeIds + reference
   the path-based scheme's Part-3 rationale.
4. Drive-the-alarm snippet now calls `otopcua-cli write … -U writeop`
   so operators see the explicit auth step.
5. Acceptance checklist updated for the new tag names + the
   test-galaxy.ps1 `-Username` invocation.
6. Added a 2026-04-24 second-run evidence section alongside the original
   — documents the 3/7 anonymous ceiling and what's needed to reach 7/7.

No code or seed changes in this commit — doc-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 18:13:27 -04:00
Joseph Doherty ec1a5905bf Galaxy E2E — point at live writable historized attribute + MachineStatus
Pick a Galaxy attribute that actually exercises the full driver stack:
TestMachine_001.TestHistoryValue. Verified against the live dev-box ZB:
it's Int32, writable (security_classification = Operate), and historized
(HistoryExtension primitive). The query lives in
`gr/queries/attributes_extended.sql` — swap to any other writable
historized attribute via the same shape
(`WHERE is_historized = 1 AND security_classification > 0`).

Seed changes:
- Tag row: FullName = TestMachine_001.TestHistoryValue (Int32 / ReadWrite)
- VirtualTag renamed: `Doubled` → `MachineStatus` (Boolean), script returns
  `Source > 0`. Historized, so the write/subscribe exercise doubles as a
  historian-sink check once the alarm/write stages are enabled.
- Scripted alarm predicate reads the same Source and fires on `> 50`.
- Added ClusterNodeCredential(sa → p7-smoke-node) row so
  sp_GetCurrentGenerationForCluster's caller-binding check passes. Without
  this the server bootstrap fails with
  `Unauthorized: caller sa is not bound to NodeId p7-smoke-node`.

E2E script:
- Path-based NodeId defaults updated to match the new MachineStatus
  virtual tag.
- Added optional `-Username / -Password` parameters. Anonymous sessions
  still get denied against Operate-classified attributes (PR 26 /
  docs/Security.md); supplying `-Username writeop -Password writeop123`
  against the dev-box GLAuth exercises the reverse-bridge stage.
- Wired those credentials into every Invoke-Cli / Start-Process CLI
  invocation the script drives.

Anonymous smoke remains 3/7 pass (probe + source read + reverse-bridge
marked acl-expected INFO). A fuller run with
`-Username writeop -Password writeop123` requires also enabling LDAP +
a SecurityProfile that carries a UserName UserTokenPolicy — separate
config step tracked alongside #124 (3-user authz matrix).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 18:04:39 -04:00
Joseph Doherty 69e1d320ac Cold-start guard for script engines — skip evaluation with empty upstream
Both VirtualTagEngine and ScriptedAlarmEngine share a pattern: the
BuildReadCache helper iterates the script's declared input set, reading
from _valueCache with a fallback to _upstream.ReadTag. When an upstream
tag hasn't yet delivered its first subscription push, ReadTag returns a
DataValueSnapshot with a null Value and BadNotConnected quality. User
scripts then cast `(double)ctx.GetTag(path).Value` unconditionally and
throw NullReferenceException — once per evaluation tick until the cache
fills, spamming the log with identical stack traces. The existing catch
block recovered (kept the prior state) but didn't silence the churn.

Add AreInputsReady(cache) to both engines: return true only when every
entry has a non-null Value and a non-Bad StatusCode (Good + Uncertain
are both considered ready). Skip script evaluation when the check
returns false — the engine holds the prior state (alarm) or the prior
snapshot (virtual tag) until upstream delivers. Eliminates the cold-
start NRE spam at root without changing the script-engine contract.
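
The readiness check can be pictured with a minimal Python sketch (this is
not the C# engine code). `Snapshot`, `are_inputs_ready`, and `evaluate` are
hypothetical names, and quality is reduced to three string classes purely
for illustration:

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping, Optional

# Simplified quality classes; the real engines inspect an OPC UA StatusCode.
GOOD, UNCERTAIN, BAD = "Good", "Uncertain", "Bad"


@dataclass(frozen=True)
class Snapshot:
    value: Optional[Any]
    quality: str


def are_inputs_ready(cache: Mapping[str, Snapshot]) -> bool:
    """True only when every input has a value and a non-Bad quality."""
    return all(s.value is not None and s.quality != BAD for s in cache.values())


def evaluate(cache: Mapping[str, Snapshot],
             script: Callable[[Mapping[str, Snapshot]], Any],
             prior: Any) -> Any:
    """Run the script only when inputs are ready; otherwise hold the prior result."""
    if not are_inputs_ready(cache):
        return prior
    return script(cache)
```

With a cold cache the script body never runs, so an unconditional cast on a
null value cannot throw; the prior result is simply returned.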

Also: fix $changeLines.Count in test-galaxy.ps1 — PowerShell's
Set-StrictMode -Version 3.0 errors on .Count when Where-Object returns
0 or 1 items. Wrap in `@(...)` to force an array; same pattern the
sibling _common.ps1 already uses in Write-Summary for the same reason.

Task #112 — the Galaxy live E2E now passes 3/7 stages (probe + source
read + reverse-bridge-ACL). The remaining 4 stages (virtual-tag,
subscribe-sees-change, alarm-fires, history-read) are deployment-specific:
MoveInBatchID is idle in this Galaxy, its AccessLevel blocks writes, and
it is not historized. Cold-start behaviour is now correct, so once the
seed points at a live attribute those stages should pass.

Tests: 36/36 VirtualTags.Tests + 47/47 ScriptedAlarms.Tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 17:43:48 -04:00
Joseph Doherty 8be82e02c2 Path-based NodeIds — decouple client contract from driver address
The pre-refactor design minted OPC UA NodeIds directly from the driver's
FullReference (the native-address string). That had three long-term
problems:

1. OPC UA Part 3 §5.2.2 requires NodeIds to be immutable across a node's
   lifetime. A rename of the underlying device address — Galaxy attribute,
   S7 tag, Modbus register alias — changed the NodeId and broke every
   client that had pinned the previous identifier.
2. Two drivers with coincidentally-matching native addresses (e.g. `temp`
   in Modbus and `temp` in S7 under different Equipment rows) collided on
   the NodeId identifier.
3. TagConfig was being placed verbatim on the wire; for drivers whose
   TagConfig is JSON (every driver shipped today, per the
   CK_Tag_TagConfig_IsJson check constraint), clients saw the raw JSON
   blob as the NodeId string.

Refactor:

* DriverNodeManager.Variable now mints a stable path-based NodeId
  `{driverId}/{folder-path}/{browseName}` and records the driver-side
  FullReference in a new _fullRefByNodeId map. OnReadValue / OnWriteValue
  / ResolveFullRef look the FullReference up via that map instead of
  casting NodeId.Identifier. The old cast path is preserved as a
  fallback so any test fixture that still registers variables with
  FullRef-shaped NodeIds keeps working.

* EquipmentNodeWalker.AddTagVariable now extracts the cross-driver
  `FullName` field from Tag.TagConfig before handing the address to
  DriverAttributeInfo. Every shipped driver stores the wire reference in
  TagConfig[FullName]; falling back to the raw string covers any future
  driver that wants an opaque non-JSON address. ExtractFullName is
  exposed internal for unit coverage.

* scripts/e2e/test-galaxy.ps1 defaults updated to the new path-based
  NodeIds. Verified live against p7-smoke-galaxy on the dev box:
  `ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/Source` reads
  return Status=0x00000000 with a real Galaxy byte-array value.
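
As a rough illustration of the refactor, the Python sketch below mints a
path-based identifier and keeps the driver-side reference in a side map. It
is not the C# implementation: `NodeRegistry`, `add_variable`, and
`resolve_full_ref` are hypothetical names, and returning the raw string when
the JSON has no `FullName` field is an assumption.

```python
import json
from typing import Dict, List


def extract_full_name(tag_config: str) -> str:
    """Return TagConfig["FullName"] when present; otherwise the raw string."""
    try:
        parsed = json.loads(tag_config)
    except (TypeError, ValueError):
        return tag_config  # opaque non-JSON address
    if isinstance(parsed, dict) and isinstance(parsed.get("FullName"), str):
        return parsed["FullName"]
    return tag_config  # assumption: JSON without FullName is passed through


class NodeRegistry:
    def __init__(self) -> None:
        self._full_ref_by_node_id: Dict[str, str] = {}

    def add_variable(self, driver_id: str, folder_path: List[str],
                     browse_name: str, tag_config: str) -> str:
        """Mint {driverId}/{folder-path}/{browseName} and record the full reference."""
        node_id = "/".join([driver_id, *folder_path, browse_name])
        self._full_ref_by_node_id[node_id] = extract_full_name(tag_config)
        return node_id

    def resolve_full_ref(self, node_id: str) -> str:
        """Look up the map first; fall back to treating the id as the reference."""
        return self._full_ref_by_node_id.get(node_id, node_id)
```

The client-visible identifier now depends only on the configured path, so
renaming the underlying device address changes the map entry but not the
NodeId.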

Test suite: 195/195 Core.Tests + 283/283 Server.Tests green. Five new
ExtractFullName / FullName-passthrough tests added.

Task #112 GA-3 — golden-path read verified end-to-end; remaining E2E
script stages still blocked on pre-existing issues (ScriptedAlarm
predicate NRE on empty upstream cache, PowerShell $changeLines.Count
guard), tracked separately.
Task #134 — complete.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 16:57:20 -04:00
Joseph Doherty d11dd0520b Galaxy IPC unblock — live dev-box E2E path
Four root-cause fixes to get an elevated dev-box shell past session open
through to real MXAccess reads:

1. PipeAcl — drop BUILTIN\Administrators deny ACE. UAC's filtered token
   carries the Admins SID as deny-only, so the deny fired even from
   non-elevated admin-account shells. The per-connection SID check in
   PipeServer.VerifyCaller remains the real authorization boundary.

2. PipeServer — swap the Hello-read / VerifyCaller order. ImpersonateNamedPipeClient
   returns ERROR_CANNOT_IMPERSONATE until at least one frame has been read
   from the pipe; reading Hello first satisfies that rule. Previously the
   ACL deny-first path masked this race — removing the deny ACE exposed it.

3. GalaxyIpcClient — add a background reader + single pending-response
   slot. A RuntimeStatusChange event between OpenSessionRequest and
   OpenSessionResponse used to satisfy the caller's single ReadFrameAsync
   and fail CallAsync with "Expected OpenSessionResponse, got
   RuntimeStatusChange". The reader now routes response kinds (and
   ErrorResponse) to the pending TCS and everything else to a handler the
   driver registers in InitializeAsync. The Proxy was already set up to
   raise managed events from RaiseDataChange / RaiseAlarmEvent /
   OnHostConnectivityUpdate — those helpers had no caller until now.

4. RedundancyPublisherHostedService — swallow BadServerHalted while
   polling host.Server.CurrentInstance. StandardServer throws that code
   during startup rather than returning null, so the first poll attempt
   crashed the BackgroundService (and the host) before OnServerStarted
   ran. This race was latent behind the Galaxy init failure above.
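
The routing idea in fix 3 can be sketched in Python with asyncio (the real
client is C#). `FrameRouter`, `begin_call`, and `route` are hypothetical
names, `RESPONSE_KINDS` is an illustrative subset, and sending an unmatched
response to the event handler is an assumption:

```python
import asyncio
from typing import Any, Callable, Optional

# Illustrative subset of frame kinds that complete a pending call.
RESPONSE_KINDS = {"OpenSessionResponse", "ErrorResponse"}


class FrameRouter:
    def __init__(self, on_event: Callable[[str, Any], None]) -> None:
        self._pending: Optional[asyncio.Future] = None  # single pending-response slot
        self._on_event = on_event

    def begin_call(self, loop: asyncio.AbstractEventLoop) -> asyncio.Future:
        """Reserve the slot for one in-flight request."""
        if self._pending is not None:
            raise RuntimeError("a call is already in flight")
        self._pending = loop.create_future()
        return self._pending

    def route(self, kind: str, payload: Any) -> None:
        """Called by the background reader for every frame it reads."""
        if kind in RESPONSE_KINDS and self._pending is not None:
            fut, self._pending = self._pending, None
            fut.set_result((kind, payload))
        else:
            self._on_event(kind, payload)  # unsolicited event, or response with no caller


async def demo():
    events = []
    router = FrameRouter(lambda kind, payload: events.append(kind))
    fut = router.begin_call(asyncio.get_running_loop())
    router.route("RuntimeStatusChange", {})          # arrives before the reply
    router.route("OpenSessionResponse", {"ok": True})
    return events, await fut
```

In `demo`, the interleaved status event goes to the handler and the caller
still receives its response, which is the case that previously failed.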

Updates docs that described the Admins deny ACE + mandatory non-elevated
shells, and drops the admin-skip guards from every Galaxy integration +
E2E fixture that had them (IpcHandshakeIntegrationTests, EndToEndIpcTests,
ParityFixture, LiveStackFixture, HostSubprocessParityTests).

Adds GalaxyIpcClientRoutingTests covering the router's
request/response match, ErrorResponse, event-between-call, idle event,
and peer-close paths.

Verified live on the dev box against the p7-smoke cluster (gen 6):
driver registered=1 failedInit=0, Phase 7 bridge subscribed, OPC UA
server up on 4840, MXAccess read round-trip returns real data with
Status=0x00000000.

Task #112 — partial: Galaxy live stack is functional end-to-end. The
supplied test-galaxy.ps1 script still fails because the UNS walker
encodes TagConfig JSON as the tag's NodeId instead of the seeded TagId
(pre-existing; separate issue from this commit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 16:30:16 -04:00
Joseph Doherty fb6dd3478d Phase 6.2 Stream C wiring — AuthorizationBootstrap + OpcUaApplicationHost.SetAuthorization
Closes task #133 — the "authz gate is inert in production" blocker
surfaced during task #123. Before this commit, every ACL check on the
six dispatch surfaces (Read, Write, HistoryRead, Browse,
CreateMonitoredItems, Call) short-circuited to allow because Program.cs
constructed OpcUaApplicationHost without passing authzGate or
scopeResolver.

New pieces:

- `AuthorizationOptions` — bound to `Node:Authorization` in
  appsettings.json. `Enabled` (default false) is the master switch;
  `StrictMode` (default false) controls the anonymous / no-LDAP-groups
  fallback behaviour.
- `AuthorizationBootstrap` — singleton service that loads `NodeAcl`
  rows for the published generation, builds a `PermissionTrieCache` +
  `AuthorizationGate`, merges every registered driver's
  `EquipmentNamespaceContent` through `ScopePathIndexBuilder` into one
  full-path `NodeScopeResolver`. Returns `(null, null)` when disabled
  or when no generation is Published yet.
- `DriverEquipmentContentRegistry.Snapshot()` — new method returning a
  defensive copy of the driver → content map so the bootstrap can
  iterate without holding the lock.
- `OpcUaApplicationHost.SetAuthorization(gate, resolver)` — late-bind
  method matching the existing `SetPhase7Sources` pattern. Must run
  before `StartAsync`; rejects post-start rebinding with
  InvalidOperationException.
- `OpcUaServerService.ExecuteAsync` calls `AuthorizationBootstrap.BuildAsync`
  after `PopulateEquipmentContentAsync` and before `applicationHost.StartAsync`,
  in the same window that `SetPhase7Sources` runs.

Behaviour change
- Default (Enabled=false): no behaviour change — the gate stays null,
  all six dispatch surfaces run unchanged. Safe for any existing
  deployment on upgrade.
- Enabled=true with StrictMode=false: identities carrying LDAP groups
  are evaluated against the trie; anonymous / no-groups identities
  pass through (v1 legacy-client compatibility).
- Enabled=true with StrictMode=true: everything evaluates. Anonymous
  or no-groups identities are denied.
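
The three modes reduce to a small decision function. The Python sketch
below is only a model of the behaviour described above; `authorize` and its
parameters are hypothetical, and `trie_allows` stands in for the real
permission-trie evaluation:

```python
from typing import Callable, Sequence


def authorize(enabled: bool, strict_mode: bool, ldap_groups: Sequence[str],
              trie_allows: Callable[[Sequence[str]], bool]) -> bool:
    """Model of the Enabled / StrictMode switches for a single request."""
    if not enabled:
        return True                 # gate stays null: dispatch runs unchanged
    if not ldap_groups:             # anonymous or no-groups identity
        return not strict_mode      # lax mode passes through, strict mode denies
    return trie_allows(ldap_groups)  # identities with groups always hit the trie
```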

Follow-up not covered here: rebind the gate+resolver on generation
refresh (the `GenerationRefreshHostedService` that shipped earlier in
this session). Today the gate only reflects the bootstrap generation
— operators publishing new ACL changes need a process restart to see
them. Matches the current driver-hot-reload limitation and is tracked
in the existing 6.3 follow-up bullet.

Docs: v2-release-readiness.md Phase 6.2 Stream C.12 bullet flipped to
Closed with operator-facing config pointer (`Node:Authorization:Enabled`).

All 283/283 Server.Tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:35:46 -04:00
Joseph Doherty 1be0fb5a29 Phase 6.2 Stream C.12 — lock in ScopePathIndexBuilder semantics with tests
Closes task #123 (partial — builder semantics unit-tested; production
wiring is the new task #133).

ScopePathIndexBuilder + NodeScopeResolver indexed mode already exist —
they produce a full Cluster → Namespace → UnsArea → UnsLine → Equipment
→ Tag scope from the published generation's config rows. What was
missing: unit coverage of the Build semantics (the only consumers were
compile-time references) + explicit acknowledgement in the readiness
doc that the gate/resolver aren't yet wired into Program.cs.

Tests — 6 cases in ScopePathIndexBuilderTests.cs:
- Well-formed content emits full hierarchy.
- Tags with null EquipmentId skipped (SystemPlatform-namespace fallback).
- Tags with broken Equipment FK skipped (publish-time validation
  should have caught; builder is defensive).
- Equipment with broken Line FK skipped.
- Duplicate TagConfig throws InvalidOperationException.
- Resolver with index returns full-path scope; un-indexed ref falls
  through to cluster-only scope (pre-ADR-001 behaviour preserved).

Server.Tests 277 → 283.

Critical follow-up (task #133): Program.cs still constructs
OpcUaApplicationHost WITHOUT authzGate or scopeResolver, so all six
dispatch-layer gates (Read, Write, HistoryRead, Browse,
CreateMonitoredItems, Call) are currently inert in production. Wiring
them up — load NodeAcl + EquipmentNamespaceContent at bootstrap,
construct gate + resolver, pass into OpcUaApplicationHost, rebind on
generation refresh — is the last Phase 6.2 GA blocker.

Docs: v2-release-readiness.md Phase 6.2 Stream C hardening list marks
the scope-resolution bullet struck-through with a close-out note that
calls out the gate-inert-in-production gap + task #133.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:28:19 -04:00
Joseph Doherty ded292ecd7 Phase 6.2 Stream C — Call + Alarm Acknowledge/Confirm gating
Closes task #122 (Acknowledge + Confirm + generic Call — Shelve stays as
a follow-up pending per-instance method-NodeId resolution).

Before this commit any session with a connected channel could invoke
method nodes on driver-materialized equipment — including alarm
Acknowledge / Confirm. Combined with the Browse + CreateMonitoredItems
gates that landed earlier in Stream C, this was the last service-layer
entry point where a session could still affect state without passing
the authz trie.

Implementation on DriverNodeManager:
- `Call` override — pre-iterates methodsToCall, gates each through
  AuthorizationGate with the operation kind returned by
  MapCallOperation. Denied calls get errors[i] = BadUserAccessDenied
  before delegating to base.Call.
- `MapCallOperation(NodeId methodId)` — maps well-known Part 9 method
  NodeIds to dedicated operation kinds:
    MethodIds.AcknowledgeableConditionType_Acknowledge →
        OpcUaOperation.AlarmAcknowledge
    MethodIds.AcknowledgeableConditionType_Confirm →
        OpcUaOperation.AlarmConfirm
    everything else → OpcUaOperation.Call
  Lets the ACL distinguish "can acknowledge alarms" from "can invoke
  arbitrary methods" without conflating the two roles.
- Shelve dispatch goes through per-instance ShelvedStateMachine methods
  whose dynamic NodeIds can't be constant-matched, so it falls through
  to generic Call. Fine-grained OpcUaOperation.AlarmShelve is a
  follow-up for when the method-invocation path grows a "method-role"
  annotation.
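
A minimal Python sketch of the mapping and per-call gate, under stated
assumptions: the string constants are stand-ins for the Part 9 method
NodeIds, `granted` is a hypothetical set of operation kinds the session
holds, and `None` models a slot with no error yet.

```python
from typing import List, Optional, Set

# Hypothetical stand-ins for the well-known Part 9 method NodeIds.
ACKNOWLEDGE = "AcknowledgeableConditionType_Acknowledge"
CONFIRM = "AcknowledgeableConditionType_Confirm"


def map_call_operation(method_id: str) -> str:
    """Map a method id to a dedicated operation kind, defaulting to Call."""
    if method_id == ACKNOWLEDGE:
        return "AlarmAcknowledge"
    if method_id == CONFIRM:
        return "AlarmConfirm"
    return "Call"


def gate_call_requests(method_ids: List[str], errors: List[Optional[str]],
                       granted: Set[str]) -> List[Optional[str]]:
    """Mark each not-yet-failed call whose operation kind is not granted."""
    for i, method_id in enumerate(method_ids):
        if errors[i] is None and map_call_operation(method_id) not in granted:
            errors[i] = "BadUserAccessDenied"
    return errors
```

A session granted only `AlarmAcknowledge` can acknowledge alarms while its
Confirm and generic method calls in the same batch are denied.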

Extracted GateCallMethodRequests + MapCallOperation as static internal
for unit-testability. 8 new tests (MapCallOperation Acknowledge /
Confirm / generic; gate-null no-op, denied-Acknowledge, allowed-
Acknowledge, mixed-batch, pre-populated-error-preserved).
Server.Tests 269 → 277.

Known follow-ups:
- Shelve per-operation gating (see above).
- TranslateBrowsePathsToNodeIds gating (Browse follow-up from #120).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:22:19 -04:00
Joseph Doherty 6a6b0f56f2 Phase 6.2 Stream C — CreateMonitoredItems per-item gating
Closes task #121 (partial — creation-time gate; decision #153 per-item
revocation stamp is a follow-up).

Before this commit a session could subscribe to any node via
CreateMonitoredItems, even nodes where Read was denied — the
subscription would surface BadUserAccessDenied on each data-change
read, but the client saw a successful CreateMonitoredItems response
and held the subscription open, wasting resources and leaking the
address-space shape through the item metadata.

New override on DriverNodeManager.CreateMonitoredItems:
- Pre-iterates itemsToCreate, gates each through AuthorizationGate with
  OpcUaOperation.CreateMonitoredItems at the target node's scope.
- For denied slots: sets errors[i] = new ServiceResult(
  StatusCodes.BadUserAccessDenied). The OPC Foundation base stack
  honours pre-populated non-success errors and skips item creation for
  those slots — the subscription never holds a handle to a denied
  node.
- Preserves prior errors (e.g. BadNodeIdUnknown) — first diagnosis wins.
- Non-string-identifier references (stack-synthesized numeric ids)
  bypass the gate.

Extracted the pure filter logic into
GateMonitoredItemCreateRequests(items, errors, identity, gate,
scopeResolver) — static internal, unit-testable without the OPC UA
server stack.
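
The filter rules above can be modelled in a few lines of Python. This is a
sketch, not the extracted C# helper: `gate_monitored_item_requests` and
`is_allowed` are hypothetical, string ids model driver-materialized nodes,
and integers model stack-synthesized numeric ids.

```python
from typing import Callable, List, Optional, Union

NodeId = Union[str, int]


def gate_monitored_item_requests(
        node_ids: List[NodeId],
        errors: List[Optional[str]],
        is_allowed: Optional[Callable[[str], bool]]) -> List[Optional[str]]:
    """Pre-populate errors for denied slots so item creation is skipped for them."""
    if is_allowed is None:
        return errors                      # no gate configured: no-op
    for i, node_id in enumerate(node_ids):
        if errors[i] is not None:
            continue                       # first diagnosis wins
        if not isinstance(node_id, str):
            continue                       # numeric ids bypass the gate
        if not is_allowed(node_id):
            errors[i] = "BadUserAccessDenied"
    return errors
```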

Tests — 6 new in MonitoredItemGatingTests.cs (gate-null no-op,
denied-gets-BadUserAccessDenied, allowed-passes, mixed-batch-denies-
per-item, pre-populated-error-preserved, numeric-id-bypass). Server.Tests
263 → 269.

Known follow-ups:
- Per-item (AuthGenerationId, MembershipVersion) stamp (decision #153)
  for detecting revocation mid-subscription — needs subscription-layer
  plumbing.
- TransferSubscriptions not yet wired (same pattern, smaller scope).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:17:40 -04:00
Joseph Doherty e8b8541554 Phase 6.2 Stream C — Browse gating on DriverNodeManager
Closes task #120 (partial — strict point-check; ancestor-visibility
implication is a follow-up).

Before this commit DriverNodeManager exposed every materialized node to
every browsing session regardless of the user's ACL. Read + Write +
HistoryRead were already gated through AuthorizationGate in Phase 6.2
Stream C core; Browse was the one surface where the session could still
enumerate nodes it had no permission to touch, discovering structure
even when reads failed with BadUserAccessDenied.

Implementation
- New `Browse` override on DriverNodeManager that calls base.Browse
  first (lets the stack populate the reference list normally), then
  post-filters the IList<ReferenceDescription> so denied nodes are
  removed silently. OPC UA convention: Browse filtering is invisible to
  the client; no BadUserAccessDenied surfaces.
- Extracted the filter loop into the static internal
  `FilterBrowseReferences(references, userIdentity, gate, scopeResolver)`
  so the policy is unit-testable without standing up the full OPC UA
  server stack.
- Non-string NodeId identifiers (stack-synthesized standard-type
  references with numeric identifiers) bypass the gate — only driver-
  materialized nodes key into the authz trie.
- When AuthorizationGate or NodeScopeResolver is null, the filter is a
  no-op — preserves the pre-Phase-6.2 dispatch path for integration
  tests that construct DriverNodeManager without authz.
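
For comparison with the error-marking gates, the Browse post-filter removes
entries rather than flagging them. A Python sketch under the same modelling
assumptions (string ids are driver nodes, integers are numeric standard
ids; `filter_browse_references` and `is_allowed` are hypothetical names):

```python
from typing import Callable, List, Optional, Union


def filter_browse_references(references: List[Union[str, int]],
                             is_allowed: Optional[Callable[[str], bool]]) -> None:
    """Remove denied driver nodes in place; leave the list untouched without a gate."""
    if is_allowed is None:
        return
    references[:] = [r for r in references
                     if not isinstance(r, str) or is_allowed(r)]
```

Because denied entries simply disappear from the result, the client sees no
status code that would reveal the node exists.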

Tests — 6 new in BrowseGatingTests.cs (gate-null no-op, empty-list
no-op, denied-removed, allowed-passes-through, numeric-id bypass,
lax-mode null-identity keeps references). Server.Tests 257 → 263.

Known follow-up (tracked implicitly under #120 re-scope):
- Ancestor-visibility implication (acl-design.md §Browse line 111): a
  user with Read at `Line/Tag` should be able to Browse `Line` even
  without an explicit Browse grant. Current filter does a strict
  point-check. Proper fix needs TriePermissionEvaluator to expose a
  "subtree-has-any-grant" query.
- TranslateBrowsePathsToNodeIds not yet filtered (same extension
  pattern; small follow-up).

Docs: v2-release-readiness.md Phase 6.2 Stream C hardening list marks
the Browse bullet struck-through with "Partial" close-out note.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:11:19 -04:00
Joseph Doherty a23de2a7e4 Phase 6.3 A.2 + D.1 — GenerationRefreshHostedService: poll + lease-wrap apply
Closes tasks #132 + #118 (GA hardening backlog).

Before this commit, the Server only observed the generation in force at
process start (SealedBootstrap). Peer-published generations accumulated
in the shared config DB while the running node kept serving the
generation it had sealed on boot. Two consequences:

1. Operator role-swaps required a process restart — Admin publishes a
   new generation, but the Server's RedundancyCoordinator never re-read
   the topology.
2. ApplyLeaseRegistry had no apply to wrap. ServiceLevelBand sat at
   PrimaryHealthy (255) during every publish because nothing opened a
   lease; PrimaryMidApply (200) was effectively dead code.

New GenerationRefreshHostedService (src/.../Server/Hosting/):
- Polls sp_GetCurrentGenerationForCluster every 5s (tunable).
- On change: opens leases.BeginApplyLease(newGenerationId, Guid.NewGuid()),
  calls coordinator.RefreshAsync inside the `await using`, releases on
  scope exit (success / exception / cancellation via IAsyncDisposable).
- Diagnostic properties: LastAppliedGenerationId, TickCount, RefreshCount.
- Delegate-injected currentGenerationQuery for test drive-through; real
  path is the private static DefaultQueryCurrentGenerationAsync.
- Registered as HostedService in Program.cs alongside the Phase 6.3
  redundancy / peer-probe stack.
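
One tick of the poll-and-apply loop might look like the Python sketch
below. It is a model only: `RefreshLoop`, `tick`, and `open_leases` are
hypothetical names, the async context manager stands in for the
`await using` lease scope, and the 5 s delay between ticks is left to the
caller.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import Awaitable, Callable, Optional


class RefreshLoop:
    def __init__(self, query_current: Callable[[], Awaitable[Optional[int]]],
                 refresh: Callable[[int], Awaitable[None]]) -> None:
        self._query, self._refresh = query_current, refresh
        self.last_applied: Optional[int] = None
        self.tick_count = 0
        self.refresh_count = 0
        self.open_leases = 0

    @asynccontextmanager
    async def _lease(self, generation_id: int):
        self.open_leases += 1
        try:
            yield
        finally:
            self.open_leases -= 1  # released on success, exception, or cancellation

    async def tick(self) -> None:
        self.tick_count += 1
        current = await self._query()
        if current is None or current == self.last_applied:
            return  # nothing published yet, or already applied
        async with self._lease(current):
            await self._refresh(current)
        self.last_applied = current
        self.refresh_count += 1
```

If the refresh raises, the lease is still released and `last_applied` is
left unchanged, so the next tick retries the same generation.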

Scope intentionally narrow: only the coordinator refreshes today. Driver
re-init, virtual-tag re-bind, script-engine reload remain as follow-up
wiring. The lease wrap is the right seam for those subscribers to hook
once they grow hot-reload support — the doc comments say so.

Tests
- 5 new unit tests in GenerationRefreshHostedServiceTests (first-apply,
  identity no-op, change-triggers-refresh, null-generation-is-no-op,
  lease-is-released-on-exit). Stub generation-query delegate; real
  coordinator backed by EF InMemory DB.
- Server.Tests total 252 → 257.

Docs
- v2-release-readiness.md Phase 6.3 follow-ups list marks the
  sp_PublishGeneration lease wrap bullet struck-through with close-out
  note.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:02:33 -04:00
Joseph Doherty de77d42eab Phase 6.3 Stream B — peer-probe HostedServices populating PeerReachabilityTracker
Closes task #116 (GA hardening backlog). Before this commit the
RedundancyStatePublisher saw PeerReachability.Unknown for every peer
because the tracker had no writers — every healthy peer got
degraded to the Isolated-Primary band (230) even when fully reachable.
Not release-blocking (safe default), but not the full non-transparent-
redundancy UX either.

Two-layer probe model per docs/v2/implementation/phase-6-3-redundancy-runtime.md
§Stream B:

- PeerHttpProbeLoop (Stream B.1): fast-fail layer, 2 s interval / 1 s timeout.
  Hits each peer's http://{Host}:{DashboardPort}/healthz via an injected
  IHttpClientFactory. Writes the HTTP bit of PeerReachability while
  preserving the UA bit from the last UA probe so a transient HTTP blip
  doesn't clobber the authoritative UA reading.

- PeerUaProbeLoop (Stream B.2): authoritative layer, 10 s interval / 5 s
  timeout. Calls DiscoveryClient.GetEndpoints against opc.tcp://{Host}:
  {OpcUaPort} — cheap compared to a full Session.Create, no cert trust
  required. Short-circuits when the HTTP probe last reported the peer
  unhealthy (no wasted handshakes on a known-dead endpoint), clearing
  the stale UaHealthy bit in that case.
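
How the two loops share one reachability record can be sketched in Python.
This simplifies the real tracker to two booleans (there is no Unknown
state here), and `on_http_probe` / `on_ua_tick` are hypothetical names:

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class PeerReachability:
    http_healthy: bool = False
    ua_healthy: bool = False


def on_http_probe(prev: PeerReachability, http_ok: bool) -> PeerReachability:
    """HTTP loop: update only the HTTP bit, preserving the last UA reading."""
    return replace(prev, http_healthy=http_ok)


def on_ua_tick(prev: PeerReachability,
               probe_ua: Callable[[], bool]) -> PeerReachability:
    """UA loop: skip the handshake when HTTP last reported the peer unhealthy."""
    if not prev.http_healthy:
        return replace(prev, ua_healthy=False)  # clear the stale UA bit
    return replace(prev, ua_healthy=probe_ua())
```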

Both inherit from BackgroundService, follow the tick/delay/catch pattern
RedundancyPublisherHostedService + ResilienceStatusPublisherHostedService
established, and expose TickAsync() as internal for test drive-through.

New PeerProbeOptions class carries the four intervals/timeouts so
operators can tune cadence per site. Registered as singleton in Program.cs;
HTTP client registered by name so the OtOpcUa handler chain
(Serilog enrichers, potential future OpenTelemetry instrumentation) isn't
bypassed.

Tests — 9 new unit tests across PeerHttpProbeLoopTests (5) and
PeerUaProbeLoopTests (4). All pass. Server.Tests total 243 → 252.
Full solution build clean.

Docs: v2-release-readiness.md Phase 6.3 follow-ups list marks the
peer-probe bullet struck-through with a close-out note.

Still deferred in Phase 6.3:
  - OPC UA variable-node binding (task #117 — ServiceLevel + ServerUriArray)
  - sp_PublishGeneration lease wrap (task #118)
  - Client interop matrix (task #119)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:53:38 -04:00
Joseph Doherty 96918b148c Unblock phase-6 compliance meta-runner on task-galaxy-e2e
Two small fixes so `scripts/compliance/phase-6-all.ps1` exits 0 — this is
GA exit-criterion #1 from docs/v2/v2-release-readiness.md.

1. Admin csproj: bump OpenTelemetry.Extensions.Hosting 1.15.2 → 1.15.3 +
   OpenTelemetry.Exporter.Prometheus.AspNetCore 1.15.2-beta.1 →
   1.15.3-beta.1. Fixes NU1902 moderate-severity advisory
   (GHSA-g94r-2vxg-569j) on the transitive OpenTelemetry.Api 1.15.2 pull.
   TreatWarningsAsErrors on the Admin project promoted the advisory to an
   error and failed the whole `dotnet test` run at restore.

2. SchemaComplianceTests.All_expected_tables_exist: the expected-tables
   list drifted behind four Phase 7 migration additions — Script,
   ScriptedAlarm, ScriptedAlarmState, VirtualTag. The EF model + live
   migrations have carried these tables for a while; the compliance test
   just needed the four names added. Applied migrations against a scratch
   DB to confirm the list is exhaustive.

Verification: full solution test pass 2301 / 2301 (one tolerated
pre-existing CLI flake). Phase 6 aggregate compliance: all four phases
PASS with no test-count regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:36:20 -04:00
Joseph Doherty 69e0d02c72 task-galaxy-e2e branch — non-FOCAS work-in-progress snapshot
Catch-all commit for pending work on the task-galaxy-e2e branch that
wasn't part of the FOCAS migration. Grouping by topic so future per-topic
commits can be cherry-picked if needed.

TwinCAT
- src/.../Driver.TwinCAT/AdsTwinCATClient.cs + TwinCATDriverFactoryExtensions.cs:
  factory-registration extensions + ADS client refinements.
- src/.../Driver.TwinCAT.Cli/Commands/BrowseCommand.cs: new browse command
  for the TwinCAT test-client CLI.
- tests/.../Driver.TwinCAT.IntegrationTests/TwinCAT3SmokeTests.cs + TwinCatProject/:
  fixture scaffold with a minimal POU + README pointing at the TCBSD/ESXi
  VM for e2e.
- docs/Driver.TwinCAT.Cli.md + docs/drivers/TwinCAT-Test-Fixture.md:
  documentation for the above.
- docs/v3/twincat-backlog.md: forward-looking backlog seed.

Admin UI + fleet status
- src/.../Admin/Components/Pages/Clusters/DriversTab.razor + Hosts.razor:
  UI refresh for fleet-status rendering.
- src/.../Admin/Hubs/FleetStatusHub.cs + FleetStatusPoller.cs +
  Admin/Program.cs: SignalR hub + poller plumbing for live fleet data.
- tests/.../Admin.Tests/FleetStatusPollerTests.cs: poller coverage.

Server + redundancy runtime (Phase 6.3 follow-ups)
- src/.../Server/Hosting/RedundancyPublisherHostedService.cs: HostedService
  that owns the RedundancyStatePublisher lifecycle + wires peer reachability.
- src/.../Server/Redundancy/ServerRedundancyNodeWriter.cs: OPC UA
  variable-node writer binding ServiceLevel + ServerUriArray to the
  publisher's events.
- src/.../Server/Program.cs + Server.csproj: hosted-service registration.
- tests/.../Server.Tests/ServerRedundancyNodeWriterTests.cs +
  Server.Tests.csproj: coverage for the above.

Configuration
- src/.../Configuration/Validation/DraftValidator.cs +
  tests/.../Configuration.Tests/DraftValidatorTests.cs: draft-validation
  refinements.

E2E scripts (shared infrastructure)
- scripts/e2e/README.md + _common.ps1 + test-all.ps1: shared helpers + the
  all-drivers test-all runner.
- scripts/e2e/test-opcuaclient.ps1: OPC UA Client e2e runner.

Docs
- docs/v2/implementation/phase-6-{1,2,3,4}*.md + exit-gate-phase-{3,7}.md:
  phase-gate + implementation doc updates.
- docs/v2/plan.md: top-level plan refresh.
- docs/v2/redundancy-interop-playbook.md: client interop playbook for the
  Phase 6.3 redundancy-runtime work.

Two orphan FOCAS docs remain on disk but deliberately unstaged —
docs/v2/focas-deployment.md and docs/v2/implementation/focas-simulator-plan.md
describe the now-retired Tier-C topology and should either be rewritten
or deleted in a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:12:19 -04:00
Joseph Doherty 4b0664bd55 FOCAS — retire Tier-C split, inline managed wire client, make read-only
Migration closes the FOCAS Tier-C architecture. OtOpcUa previously had
`Driver.FOCAS.Host` (NSSM-wrapped Windows service loading Fwlib64.dll via
P/Invoke) + `Driver.FOCAS.Shared` (MessagePack IPC contracts) + a C shim
DLL stand-in for unit tests. All of it is deleted; the driver is now a
single in-process managed assembly talking the FOCAS/2 Ethernet binary
protocol directly on TCP:8193.

Architecture

- Pure-managed `FocasWireClient` inlined at `src/.../Driver.FOCAS/Wire/`
  (owner-imported — see Wire/FocasWireClient.cs for the full surface).
  Opens two TCP sockets, runs the initiate handshake, serialises requests
  on socket 2 through a semaphore, closes cleanly with PDU + socket
  teardown. Both sync `IDisposable` and async `IAsyncDisposable`.
- `WireFocasClient` (same folder) adapts the wire client to OtOpcUa's
  `IFocasClient` surface — fixed-tree reads, PARAM/MACRO/PMC addresses,
  alarms. Writes return `BadNotWritable` by design — OtOpcUa is read-only
  against FOCAS.
- `FocasDriverFactoryExtensions` now accepts `"Backend": "wire"` (default)
  and `"Backend": "unimplemented"`. Legacy `ipc` and `fwlib` backends are
  rejected at startup with a diagnostic pointing at the migration doc.
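
The backend selection rule is small enough to model directly. In this
Python sketch `select_backend` is a hypothetical name, the error text is
illustrative, and rejecting unknown values is an assumption (the commit
only states what happens to `ipc` and `fwlib`):

```python
from typing import Mapping

SUPPORTED = ("wire", "unimplemented")
RETIRED = ("ipc", "fwlib")


def select_backend(config: Mapping[str, str]) -> str:
    """Return the configured backend, defaulting to the managed wire client."""
    backend = config.get("Backend", "wire")
    if backend in SUPPORTED:
        return backend
    if backend in RETIRED:
        raise ValueError(f"Backend '{backend}' was retired; see the migration doc")
    raise ValueError(f"Unknown backend '{backend}'")  # assumption: unknown values fail
```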

Deletions

- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host/` — whole project + Ipc/,
  Backend/, Stability/, Program.cs.
- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared/` — Contracts/, FrameReader,
  FrameWriter, whole project.
- `tests/...Driver.FOCAS.Host.Tests/` + `.Shared.Tests/` — whole projects.
- `src/.../Driver.FOCAS/FwlibNative.cs` + `FwlibFocasClient.cs` — 21
  P/Invokes + 7 `Pack=1` marshalling structs + the Fwlib-backed
  `IFocasClient` implementation.
- `src/.../Driver.FOCAS/Ipc/` + `Supervisor/` — IPC client wrapper +
  Host-process supervisor (backoff, circuit breaker, heartbeat, post-
  mortem reader, process launcher).
- `scripts/install/Install-FocasHost.ps1` — NSSM service installer.
- `tests/.../Driver.FOCAS.Tests/{IpcFocasClientTests, IpcLoopback,
  FwlibNativeHelperTests, PostMortemReaderCompatibilityTests,
  SupervisorTests, FocasDriverFactoryExtensionsTests}.cs` — tests that
  exercised the retired surfaces.
- `tests/.../Driver.FOCAS.IntegrationTests/Shim/` — the zig-built C shim
  DLL that masqueraded as Fwlib64.dll.

Solution changes

- `ZB.MOM.WW.OtOpcUa.slnx` drops the 4 retired project refs.
- `src/.../Driver.FOCAS.csproj` drops the Shared ProjectReference, adds
  `Microsoft.Extensions.Logging.Abstractions` for the optional `ILogger`
  hook in `FocasWireClient`.
- `src/.../Driver.FOCAS.Cli.csproj` drops the six `<Content Include>`
  entries that copied `vendor/fanuc/*.dll` into the CLI bin. CLI now uses
  `WireFocasClient` directly.
- `FocasDriver` default factory flips to `Wire.WireFocasClientFactory`.

Integration tests

- New `tests/.../Driver.FOCAS.IntegrationTests/` project covering fixed-
  tree reads (identity, axes, dynamic, program, operation mode, timers,
  spindle load + max RPM, servo meters), user-authored PARAM / MACRO /
  PMC reads, `DiscoverAsync` emission, `SubscribeAsync` + `OnDataChange`,
  `IAlarmSource` raise/clear transitions, and `ProbeAsync` /
  `OnHostStatusChanged`. 9 e2e tests against the focas-mock fixture
  (Docker container with the vendored Python mock's native FOCAS/2
  Ethernet responder).
- `scripts/integration/run-focas.ps1` orchestrates compose up → tests →
  compose down. Dropped the shim-build stage + DLL-copy step + the split
  testhost workaround (the latter only existed because of native-DLL
  lifecycle bugs the shim tripped).
- Docker compose collapses from 11 per-series services to one `focas-sim`
  service. Tests seed per-series state via `mock_load_profile` at test
  start.
- Vendored focas-mock snapshot refreshed to pick up upstream's native
  FOCAS/2 Ethernet responder (was 660 lines, now 1018) — the
  pre-refresh snapshot only spoke the JSON admin protocol.

Tests

- 145/145 unit tests in `Driver.FOCAS.Tests` pass (was 208 pre-deletion;
  63 removed tests exercised the retired IPC/shim/supervisor/Fwlib
  surfaces).
- 9/9 integration tests pass against the refreshed mock.
- `FocasScaffoldingTests.Unimplemented_factory_throws_on_Create…` updated
  to assert the new diagnostic message pointing at
  `docs/drivers/FOCAS.md` rather than the now-gone `Fwlib64.dll`.

Docs

- `docs/drivers/FOCAS.md` rewritten for the managed wire topology —
  deployment collapses to one `"Backend": "wire"` config block, no
  separate service, no DLL deployment, no pipe ACL.
- `docs/drivers/FOCAS-Test-Fixture.md` updated — single TCP probe skip
  gate instead of TCP + shim probe; fewer moving parts.
- `docs/drivers/README.md` row for FOCAS reflects the Tier-A managed
  topology (previously listed Tier-C + `Fwlib64.dll` P/Invoke).
- `docs/Driver.FOCAS.Cli.md` drops the Tier-C architecture-note section.
- `docs/v2/implementation/focas-isolation-plan.md` marked historical —
  the plan it documents was executed then superseded by the wire client.
- `docs/v2/v2-release-readiness.md` re-audited 2026-04-24. Phase 5
  driver complement closed. FOCAS change-log entry added.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:10:59 -04:00
Joseph Doherty 404b54add0 FOCAS — commit previously-orphaned support files
Brings seven FOCAS-related files into git that shipped as part of earlier
FOCAS work but were never staged. Adding them now so the tree reflects the
compilable state + pre-empts dead references from the migration commit that
follows:

- src/.../Driver.FOCAS/FocasAlarmProjection.cs — raise/clear diffing + severity
  mapping surfaced via IAlarmSource on FocasDriver. Referenced by committed
  FocasDriver.cs; tests in FocasAlarmProjectionTests.cs.
- src/.../Admin/Services/FocasDriverDetailService.cs — Admin UI per-instance
  detail page data source.
- src/.../Admin/Components/Pages/Drivers/FocasDetail.razor — Blazor page
  rendering the above (from task #69).
- tests/.../Admin.Tests/FocasDriverDetailServiceTests.cs — exercises the
  detail service.
- tests/.../Driver.FOCAS.Tests/FocasAlarmProjectionTests.cs — raise/clear
  diff semantics against FakeFocasClient.
- tests/.../Driver.FOCAS.Tests/FocasHandleRecycleTests.cs — proactive recycle
  cadence test.
- docs/v2/implementation/focas-wire-protocol.md — captured FOCAS/2 Ethernet
  wire protocol reference. Useful going forward even though the Tier-C /
  simulator plan docs are historical.

No runtime behaviour change — these files compile today and the solution
build/test pass already depends on them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:09:51 -04:00
684 changed files with 45455 additions and 61367 deletions
+10 -4
@@ -12,14 +12,16 @@
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ZB.MOM.WW.OtOpcUa.Driver.Modbus.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Addressing/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Addressing.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.S7/ZB.MOM.WW.OtOpcUa.Driver.S7.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/ZB.MOM.WW.OtOpcUa.Driver.AbCip.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Client.Shared/ZB.MOM.WW.OtOpcUa.Client.Shared.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Client.CLI/ZB.MOM.WW.OtOpcUa.Client.CLI.csproj"/>
@@ -49,7 +51,12 @@
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.Tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Addressing.Tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Addressing.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common.Tests/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli.Tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli.Tests.csproj"/>
@@ -65,8 +72,7 @@
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Shared.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.AbCip.IntegrationTests/ZB.MOM.WW.OtOpcUa.Driver.AbCip.IntegrationTests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.Tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests.csproj"/>
+6 -33
@@ -183,46 +183,19 @@ otopcua-cli historyread -u opc.tcp://localhost:4840/OtOpcUa \
| `--start` | Start time, ISO 8601 or date string (default: 24 hours ago) |
| `--end` | End time, ISO 8601 or date string (default: now) |
| `--max` | Maximum number of values (default: 1000) |
| `--aggregate` | Aggregate function name (see catalog below). Case-insensitive. |
| `--aggregate` | Aggregate function: Average, Minimum, Maximum, Count, Start, End |
| `--interval` | Processing interval in milliseconds for aggregates (default: 3600000) |
#### Aggregate mapping
The CLI accepts the seven aggregates listed below — these are the
human-driven set the operator typically asks for from the command line.
| Name | OPC UA Node ID |
|------|---------------|
| `Average` (or `avg`) | `AggregateFunction_Average` |
| `Minimum` (or `min`) | `AggregateFunction_Minimum` |
| `Maximum` (or `max`) | `AggregateFunction_Maximum` |
| `Average` | `AggregateFunction_Average` |
| `Minimum` | `AggregateFunction_Minimum` |
| `Maximum` | `AggregateFunction_Maximum` |
| `Count` | `AggregateFunction_Count` |
| `Start` (or `first`) | `AggregateFunction_Start` |
| `End` (or `last`) | `AggregateFunction_End` |
| `StandardDeviation` (or `stddev` / `stdev`) | `AggregateFunction_StandardDeviationSample` |
The driver-side `IHistoryProvider.ReadProcessedAsync` API (used by the
OtOpcUa server's HistoryRead facade) supports the full OPC UA Part 13 §5
catalog — ~30 aggregates including `TimeAverage`, `Interpolative`, `Range`,
`PercentGood`, `Delta`, etc. See
[`docs/drivers/OpcUaClient.md`](drivers/OpcUaClient.md#historyread-aggregates-part-13-catalog)
for the full list. Adding a new CLI shorthand is a one-line change in
`HistoryReadCommand.ParseAggregateType` — file an issue if you need one
exposed.
#### Event-mode coverage
Drivers that implement the filter-aware
`IHistoryProvider.ReadEventsAsync(fullReference, EventHistoryRequest, ct)`
overload (currently the OPC UA Client gateway driver — Galaxy keeps the
fixed-field fallback) honour `EventFilter` SelectClauses and a `WhereClause`
when the server-side history facade forwards them. The CLI does not yet
expose a dedicated `--events` flag — clients that need filter-aware event
history call `HistoryReadEvents` through their own SDK; the CLI's
`historyread` command stays focused on the data-history (Raw / Processed /
AtTime) path. Adding `--events` is tracked as a follow-up — the wire path
on the driver side is in place (see
[`docs/drivers/OpcUaClient.md`](drivers/OpcUaClient.md#historyread-events)).
| `Start` | `AggregateFunction_Start` |
| `End` | `AggregateFunction_End` |
### alarms
-109
@@ -20,8 +20,6 @@ dotnet run --project src/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Cli -- --help
| `-g` / `--gateway` | **required** | Canonical `ab://host[:port]/cip-path` |
| `-f` / `--family` | `ControlLogix` | ControlLogix / CompactLogix / Micro800 / GuardLogix |
| `--timeout-ms` | `5000` | Per-operation timeout |
| `--addressing-mode` | `Auto` | `Auto` / `Symbolic` / `Logical` — see [AbCip-Performance §Addressing mode](drivers/AbCip-Performance.md#addressing-mode). `Logical` against Micro800 silently falls back to Symbolic with a warning. |
| `--partner` | _(unset)_ | PR abcip-5.1 — partner gateway URI for a ControlLogix HSBY pair (e.g. `ab://10.0.0.6/1,0`). When set, the driver runs a second role-probe loop against the partner and the [`hsby-status`](#hsby-status--which-chassis-is-active-now) command can surface which chassis is currently Active. See [AbCip-HSBY.md](drivers/AbCip-HSBY.md) for the full guide. |
| `--verbose` | off | Serilog debug output |
Family ↔ CIP-path cheat sheet:
@@ -57,21 +55,6 @@ otopcua-abcip-cli read -g ab://10.0.0.5/1,0 -t "Recipe[3]" --type Real
otopcua-abcip-cli read -g ab://10.0.0.5/1,0 -t "Motor01.Speed" --type Real
```
#### Diagnostic / system tags
PR abcip-4.3 exposes five read-only diagnostic variables per device under
`AbCip/<device>/_System/` in the OPC UA address space (see
[AbCip-Operability §System tags](drivers/AbCip-Operability.md#system-tags--_system-folder)
for the full table). These are not reachable through the AB CIP CLI — they
live on the OPC UA server side, not the libplctag wire — so to read one,
point the **OPC UA client** CLI at the running OtOpcUa server:
```powershell
# Read _ConnectionStatus for one device through the OPC UA server
otopcua-client-cli read -u opc.tcp://localhost:4840 \
-n "ns=2;s=AbCip/ab://10.0.0.5/1,0/_System/_ConnectionStatus"
```
### `write` — single Logix tag
Same shape as `read` plus `-v`. Values parse per `--type` using invariant
@@ -90,65 +73,6 @@ otopcua-abcip-cli write -g ab://10.0.0.5/1,0 -t StartCommand --type Bool -v true
otopcua-abcip-cli subscribe -g ab://10.0.0.5/1,0 -t Motor01_Speed --type Real -i 500
```
### `hsby-status` — which chassis is Active now?
PR abcip-5.1 — read the role tag (`WallClockTime.SyncStatus` by default,
`S:34` for legacy SLC500 / PLC-5 fronts) on a ControlLogix HSBY pair and
print which chassis is currently Active. Requires `--partner`.
```powershell
otopcua-abcip-cli hsby-status -g ab://10.0.0.5/1,0 --partner ab://10.0.0.6/1,0
# Custom role tag (legacy fronts) and more samples
otopcua-abcip-cli hsby-status -g ab://10.0.0.5/1,0 --partner ab://10.0.0.6/1,0 \
--role-tag S:34 --samples 5
```
| Flag | Default | Purpose |
|---|---|---|
| `--role-tag` | `WallClockTime.SyncStatus` | Address of the role tag. Use `S:34` for SLC500 / PLC-5. |
| `--samples` | `3` | Number of role-probe ticks to wait for before printing. |
The output prints the resolved roles + the address of whichever chassis the
driver currently considers Active. PR abcip-5.1 only **reports** the role —
PR abcip-5.2 will land the routing change so reads / writes flow to the
Active chassis automatically.
See [AbCip-HSBY.md](drivers/AbCip-HSBY.md) for the role-tag detection matrix
+ active-resolution rules + the feature-flag gate.
### `rebrowse` — force a controller-side `@tags` re-walk
PR abcip-2.5 (issue #233) added `RebrowseAsync` to drop the cached UDT
template shapes and re-run the symbol-table enumerator without restarting
the driver. The CLI variant builds a transient driver against the supplied
gateway, runs the rebrowse, and prints the freshly discovered tag names —
useful after a controller program-download to confirm the new tags are
visible on the wire before wiring them through the OtOpcUa server.
```powershell
otopcua-abcip-cli rebrowse -g ab://10.0.0.5/1,0
```
## Refreshing the tag DB
Two operator-facing surfaces drive the same `RebrowseAsync` plumbing — pick
the one that matches your context:
| Surface | When to use | Command |
|---|---|---|
| **CLI `rebrowse`** | Off-server validation. Spins up a transient driver against the gateway, prints the discovered tag list, no shared state with the live OtOpcUa server. | `otopcua-abcip-cli rebrowse -g ab://10.0.0.5/1,0` |
| **OPC UA write to `_RefreshTagDb`** | Production / Admin-UI button (PR abcip-4.4). Forces the **live** driver to re-walk + clear its template cache. The `AbCip.RefreshTriggers` driver-diagnostics counter increments per truthy write. | `otopcua-client-cli write -u opc.tcp://localhost:4840 -n "ns=2;s=AbCip/ab://10.0.0.5/1,0/_System/_RefreshTagDb" -v true --type Boolean` |
Read-back semantics: `_RefreshTagDb` always reads back as `false` (Kepware-
style "latches to idle the moment the dispatch returns") so a subscribed
client sees a stable shape regardless of how many refreshes have fired.
Falsy / unparseable writes are no-ops that still report `Good` so a UI
template that resets the trigger flag after firing it doesn't see a phantom
error. See
[AbCip-Operability §System tags](drivers/AbCip-Operability.md#refreshing-the-tag-db-via-opc-ua-write)
for the full semantics + the diagnostics counter wiring.
## Typical workflows
- **"Is the PLC reachable?"** → `probe`.
@@ -157,36 +81,3 @@ for the full semantics + the diagnostics counter wiring.
- **"Is this GuardLogix safety tag writable from non-safety?"** → `write` and
read the status code — safety tags surface `BadNotWritable` / CIP errors,
non-safety tags surface `Good`.
- **"Did my program download show up in the address space?"** → `rebrowse`
(off-server) or write `true` to the live server's `_RefreshTagDb` system
tag (in-server, PR abcip-4.4) — both drop the template cache + force a
fresh `@tags` walk.
## Connection Size
PR abcip-3.1 introduced a per-device `ConnectionSize` override on the driver
side (`AbCipDeviceOptions.ConnectionSize`, range `500..4002`). The CLI does
not expose a flag for it — every CLI invocation uses the family-default
Connection Size (4002 / 504 / 488 depending on `--family`). When a Forward
Open is rejected with a CIP error like `0x01/0x113` ("connection request
size invalid"), the symptom is almost always a **mismatch between the chosen
family default and the controller firmware**:
- **v19-and-earlier ControlLogix** caps at 504 — pick `--family CompactLogix`
on the CLI to fall back to that narrower default.
- **5069-L1/L2/L3 CompactLogix** narrow-buffer parts also cap at 504, which
is the family default already.
- **FW20+ ControlLogix** accepts the full 4002.
For the warning *"AbCip device 'X' family 'Y' uses a narrow-buffer profile
(default ConnectionSize Z); the configured ConnectionSize N exceeds the
511-byte legacy-firmware cap..."* see
[`docs/drivers/AbCip-Performance.md`](drivers/AbCip-Performance.md) — that
warning is fired by the driver host, not the CLI.
## Related operability knobs
- [`docs/drivers/AbCip-Operability.md`](drivers/AbCip-Operability.md) — Phase 4
per-tag knobs (per-tag scan rate, deadband, etc). The CLI does not expose
these knobs directly; they're set in driver config JSON and consumed by the
driver at subscribe time.
+1 -193
@@ -19,11 +19,7 @@ dotnet run --project src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Cli -- --help
|---|---|---|
| `-g` / `--gateway` | **required** | Canonical `ab://host[:port]/cip-path` |
| `-P` / `--plc-type` | `Slc500` | Slc500 / MicroLogix / Plc5 / LogixPccc |
| `--timeout-ms` | `5000` | Per-operation timeout — see precedence note below |
| `--retries` | `0` | Retry count on transient `BadCommunicationError` (PR 9 / #252) |
| `--demote-failure-threshold` | `3` | **PR ablegacy-12 / #255** — consecutive comm failures before the device is auto-demoted |
| `--demote-for-ms` | `30000` | **PR ablegacy-12 / #255** — auto-demote cool-down window in ms |
| `--no-demote` | off | **PR ablegacy-12 / #255** — disable auto-demote entirely (counters still tick) |
| `--timeout-ms` | `5000` | Per-operation timeout |
| `--verbose` | off | Serilog debug output |
Family ↔ CIP-path cheat sheet:
@@ -32,52 +28,6 @@ Family ↔ CIP-path cheat sheet:
with no backplane
- **LogixPccc** — `1,0` (Logix controller accessed via the PCCC compatibility
layer; rare)
- **PLC-5 via 1756-DHRIO bridge** — `1,<slot>,2,<station-octal>` (PLC-5 only).
See [drivers/AbLegacy-DH-Bridging.md](drivers/AbLegacy-DH-Bridging.md) for
the full DH+ syntax, octal-station reference (00..77 = 0..63), and manual
hardware smoke procedure.
#### DHRIO worked example (PR ablegacy-13 / #256)
PLC-5 on DH+ node 7 (octal 07), DHRIO module in chassis slot 3,
EtherNet/IP gateway 192.168.1.10:
```powershell
otopcua-ablegacy-cli read `
-g ab://192.168.1.10/1,3,2,07 `
-P Plc5 -a N7:10 -t Int
```
The parser validates `1,<slot>,2,<station>`: port-1 must be the backplane,
slot must be 0..16, port-3 must be `2` (DH+), station must be octal 0..77 (so
`80`, `90`, etc. are rejected). Combining a DH+ bridge path with a non-PLC-5
family at startup throws `InvalidOperationException("DHRIO bridging is
PLC-5-only")`.
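
The path checks listed above can be sketched as follows. This is a minimal Python illustration, not the driver's actual parser: `parse_dhrio_path` is a hypothetical name, and the sketch assumes that "port-1 must be the backplane" means the first segment is the literal `1`.

```python
import re


def parse_dhrio_path(cip_path: str) -> tuple[int, int]:
    """Validate a `1,<slot>,2,<station-octal>` DH+ bridge path.

    Hypothetical helper for illustration only. Returns (slot, station),
    with the station converted from octal to its decimal value.
    """
    parts = [p.strip() for p in cip_path.split(",")]
    if len(parts) != 4:
        raise ValueError("expected 1,<slot>,2,<station>")
    port1, slot_text, port3, station_text = parts

    # Assumption: the backplane port is always written as 1.
    if port1 != "1":
        raise ValueError("first segment must be 1 (backplane)")
    if not re.fullmatch(r"[0-9]{1,2}", slot_text) or int(slot_text) > 16:
        raise ValueError("slot must be 0..16")
    if port3 != "2":
        raise ValueError("third segment must be 2 (DH+)")
    # Octal digits only, so 80 and 90 are rejected here.
    if not re.fullmatch(r"[0-7]{1,2}", station_text):
        raise ValueError("station must be octal 0..77")

    return int(slot_text), int(station_text, 8)
```

With the worked example's path, `parse_dhrio_path("1,3,2,07")` yields slot 3 and station 7, while a station of `80` raises because `8` is not an octal digit.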
### Per-device timeout / retry tuning (#252, PR 9)
The CLI's `--timeout-ms` is the **driver-wide default** when launched as a
one-shot test client. In production (server-side, multi-device deployment)
each `AbLegacyDeviceOptions` row carries its own optional `Timeout` /
`Retries` that override the driver-wide value.
Precedence (highest → lowest): per-device override → driver-wide default →
hard-coded fallback (2000 ms / 0 retries).
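
As a rough illustration of that precedence chain, the following Python sketch resolves one setting at a time. The function and constant names are hypothetical, and it assumes "unset" is represented by `None`, so that an explicit `0` retries on a device is honoured rather than treated as missing.

```python
from typing import Optional

# Hard-coded fallbacks quoted in the text above.
FALLBACK_TIMEOUT_MS = 2000
FALLBACK_RETRIES = 0


def resolve_setting(per_device: Optional[int],
                    driver_wide: Optional[int],
                    fallback: int) -> int:
    """Return the first configured value, highest precedence first."""
    for candidate in (per_device, driver_wide):
        # Compare against None, not truthiness: 0 is a valid explicit value.
        if candidate is not None:
            return candidate
    return fallback


# A device row with no override inherits the driver-wide timeout.
timeout_ms = resolve_setting(None, 5000, FALLBACK_TIMEOUT_MS)
# Nothing configured anywhere falls through to the hard-coded retry count.
retries = resolve_setting(None, None, FALLBACK_RETRIES)
```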
Tuning cheat sheet — start here, measure, then trim:
| Family | Recommended `Timeout` | Notes |
|---|---|---|
| SLC 5/01 (RS-232 / DH+ bridge) | **5000 ms** | Slowest of the bunch; serial round-trip plus DH+ hop |
| SLC 5/02 / 5/03 (DH+) | 3000 ms | Bridged Ethernet → DH+ adds ~1 s |
| **SLC 5/04 / 5/05** (Ethernet) | **2000 ms** | Fastest of the SLC family — direct EIP/PCCC |
| MicroLogix 1100 / 1400 | **3000 ms** | Single-CPU, slow scan; no backplane |
| PLC-5 (Ethernet I/F) | 2500 ms | Comparable to SLC 5/05 over EIP |
| LogixPccc compat layer | 2000 ms | Logix CPU is fast; PCCC layer is the floor |
A small `--retries 1` (or `2` for slow chassis) is generally safe — the retry
loop only fires on transient `BadCommunicationError`; terminal errors
(`BadNodeIdUnknown`, `BadTypeMismatch`, …) surface on the first attempt.
## PCCC address primer
@@ -108,37 +58,6 @@ otopcua-ablegacy-cli probe -g ab://192.168.1.20/1,0
otopcua-ablegacy-cli probe -g ab://192.168.1.30/ -P MicroLogix -a S:0
```
`probe` output (PR ablegacy-12 / #255) reports both `Health` (driver health
state) and `Host state`. The latter is sourced from `IHostConnectivityProbe`
and surfaces `Demoted` when the auto-demote threshold has tripped — a fast
visual signal that the CLI is short-circuiting future reads against this
device until the cool-down expires:
```text
Gateway: ab://192.168.1.20/1,0
PLC type: Slc500
Health: Degraded
Host state: Demoted
Last error: libplctag status -33 reading N7:0
```
### Auto-demote knobs
```powershell
# Trip after just one comm failure, hold for 60s.
otopcua-ablegacy-cli read -g ab://192.168.1.20/1,0 -a N7:0 -t Int `
--demote-failure-threshold 1 --demote-for-ms 60000
# Opt out of auto-demote — stresses the link without short-circuiting.
otopcua-ablegacy-cli read -g ab://192.168.1.20/1,0 -a N7:0 -t Int --no-demote
```
The CLI is a one-shot test client — auto-demote primarily matters in the
server-side multi-device deployment, where a single demoted PLC can no
longer block reads against its healthy peers. Use the CLI flags to
reproduce a flapping-link scenario locally before tuning the server-side
`appsettings.json` `Demote` block.
### `read`
```powershell
@@ -156,17 +75,8 @@ otopcua-ablegacy-cli read -g ab://192.168.1.20/1,0 -a L19:0 -t Long
# Timer ACC
otopcua-ablegacy-cli read -g ab://192.168.1.20/1,0 -a T4:0.ACC -t TimerElement
# Diagnostic counter (PR ablegacy-10 / #253). The seven _Diagnostics/<name>
# addresses live alongside user tags — short-circuit serves them straight from
# the in-process counter store, so no PCCC frame is sent to the PLC.
otopcua-ablegacy-cli read -g ab://192.168.1.20/1,0 --address _Diagnostics/RequestCount
```
The diagnostic surface auto-emits per device — no config required. See
`docs/drivers/AbLegacy-Diagnostics.md` for the full counter table + reset
semantics + collision-rejection rules.
### `write`
```powershell
@@ -185,108 +95,6 @@ PLC-managed — use with caution.
otopcua-ablegacy-cli subscribe -g ab://192.168.1.20/1,0 -a N7:10 -t Int -i 500
```
#### Deadband
PR 8 — per-tag absolute / percent change filter on top of the polled subscription. The driver
caches the last *published* value per tag and suppresses `OnDataChange` notifications until the
new sample crosses the configured threshold.
| Flag | Effect |
|---|---|
| `--deadband-absolute <value>` | Suppress until `|new - prev| >= value`. |
| `--deadband-percent <value>` | Suppress until `|new - prev| >= |prev * value / 100|`. `prev == 0` always publishes (avoids div-by-zero). |
Booleans bypass the filter entirely (every transition publishes); strings + status changes
always publish; first-seen always publishes; both flags set → either passing triggers a
publish (Kepware-style logical OR).
```powershell
# Float — drop sub-0.5 jitter from the noisy load-cell address.
otopcua-ablegacy-cli subscribe -g ab://192.168.1.20/1,0 -a F8:0 -t Float -i 500 `
--deadband-absolute 0.5
# Integer — only fire on >= 5% deviation from the last reported value.
otopcua-ablegacy-cli subscribe -g ab://192.168.1.20/1,0 -a N7:10 -t Int -i 500 `
--deadband-percent 5
```
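
The deadband rules in this section can be sketched as a single predicate. This is a Python illustration under stated assumptions, not the driver's code: `should_publish` is a hypothetical name, `prev` is the last *published* value (`None` when nothing has been published yet), status-change handling is left out, and with neither flag set the sketch simply publishes.

```python
def should_publish(prev, new, absolute=None, percent=None):
    """Decide whether a new sample passes the deadband filter."""
    # First-seen values always publish.
    if prev is None:
        return True
    # Booleans and strings bypass the filter entirely.
    if isinstance(new, (bool, str)) or isinstance(prev, (bool, str)):
        return True
    # Assumption: no filter configured means every sample is passed through.
    if absolute is None and percent is None:
        return True

    delta = abs(new - prev)
    if absolute is not None and delta >= absolute:
        return True
    if percent is not None:
        # prev == 0 always publishes, which sidesteps a zero threshold.
        if prev == 0:
            return True
        if delta >= abs(prev * percent / 100):
            return True
    # Both thresholds (if set) were missed; logical OR means suppress.
    return False
```

For example, with `--deadband-percent 5` and a last published value of 100, a sample of 104 is suppressed and a sample of 105 is published.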
## Array reads
PR 7 — one PCCC frame can carry up to ~120 words. Address an array tag with either the
Rockwell-native `,N` suffix or the libplctag-native `[N]` suffix on the word number; both
forms canonicalise to `[N]` when the driver hands the tag to libplctag, and the parser
caps `N` at 120.
```powershell
# Rockwell `,N` form — "10 consecutive words starting at N7:0"
otopcua-ablegacy-cli read -g ab://192.168.1.20/1,0 -a "N7:0,10" -t Int
# libplctag `[N]` form — same wire result
otopcua-ablegacy-cli read -g ab://192.168.1.20/1,0 -a "N7:0[10]" -t Int
# Float / Long arrays — same suffix syntax, narrower frame ceiling on Float (~60 elements)
# and Long (~60 elements) because each element is 4 bytes vs Int's 2.
otopcua-ablegacy-cli read -g ab://192.168.1.20/1,0 -a "F8:0,4" -t Float
otopcua-ablegacy-cli read -g ab://192.168.1.20/1,0 -a "L19:0,4" -t Long
# --array-length override — pin the element count from config rather than the address
# suffix. Wins over the parsed `,N` / `[N]` value when both are set; useful for keeping the
# address string compact while bumping the element count from a tags config file.
otopcua-ablegacy-cli read -g ab://192.168.1.20/1,0 -a "N7:0" --array-length 10 -t Int
```
Array tags reject sub-element references (`T4:0,5.ACC`) and bit suffixes (`N7:0,10/3`) at
parse time — both combinations are semantically meaningless against a contiguous block.
For `B`-files the Rockwell convention is "one BOOL per word, not per bit": `B3:0,10`
returns `bool[10]` (one per word's non-zero state), not `bool[160]`.
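
The suffix handling described in this section can be sketched as below. This is a simplified Python illustration: `canonicalise_array_address` is a hypothetical name, the regular expression only covers plain `<file>:<word>` addresses, and the lower bound of 1 element is an assumption (the text only states the cap of 120).

```python
import re

# <file letters><file number>:<word>, then either ",N" or "[N]", then any trailing text.
_ARRAY_RE = re.compile(r"([A-Z]+[0-9]*:[0-9]+)(?:,([0-9]+)|\[([0-9]+)\])(.*)")


def canonicalise_array_address(address: str, max_elements: int = 120) -> str:
    """Rewrite `N7:0,10` or `N7:0[10]` to the `[N]` form; reject bad combinations."""
    match = _ARRAY_RE.fullmatch(address)
    if match is None:
        # No array suffix present: leave the address untouched.
        return address

    base, comma_count, bracket_count, trailing = match.groups()
    if trailing:
        # Sub-element (.ACC) or bit (/3) suffixes make no sense on a block read.
        raise ValueError(f"array address cannot carry suffix {trailing!r}")

    count = int(comma_count if comma_count is not None else bracket_count)
    if not 1 <= count <= max_elements:
        raise ValueError(f"element count must be 1..{max_elements}")
    return f"{base}[{count}]"
```

Both `N7:0,10` and `N7:0[10]` come out as `N7:0[10]`, while `T4:0,5.ACC` and `N7:0,10/3` raise at parse time.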
### `import-rslogix`
ablegacy-11 / [#254](https://github.com/dohertj2/lmxopcua/issues/254) — bulk-import RSLogix
500 / 5 CSV symbol exports into an `appsettings.json` tag fragment. Avoids hand-typing every
`N7:0` / `F8:12` / `B3:0/5` row of a several-hundred-tag PLC. Binary `.RSS` / `.RSP` project
files are out of scope; export to CSV first.
```powershell
# Default: emit JSON fragment to stdout
otopcua-ablegacy-cli import-rslogix `
--file C:\plc\plc-export.csv `
--device ab://192.168.1.20/1,0
# Write the fragment to a file + print a summary line to stdout
otopcua-ablegacy-cli import-rslogix `
--file C:\plc\plc-export.csv `
--device ab://192.168.1.20/1,0 `
--output tags.json
# Filter by Scope column — only import Local:1 program-scoped tags
otopcua-ablegacy-cli import-rslogix `
--file C:\plc\plc-export.csv `
--device ab://192.168.1.20/1,0 `
--scope Local:1
# Summary mode — one-line counter for CI / health checks
otopcua-ablegacy-cli import-rslogix `
--file C:\plc\plc-export.csv `
--device ab://192.168.1.20/1,0 `
--emit summary
```
| Flag | Default | Purpose |
|---|---|---|
| `-f` / `--file` | **required** | RSLogix CSV path |
| `-d` / `--device` | **required** | `ab://host[:port]/cip-path` every imported tag binds to |
| `--emit` | `appsettings-fragment` | `appsettings-fragment` (JSON) or `summary` (one-line counter) |
| `-o` / `--output` | stdout | Optional output file path |
| `--scope` | none | Scope filter — `Global` / `Local:N` (case-insensitive); empty Scope counts as Global |
| `--max-rows` | unlimited | Defensive cap on rows imported |
| `--strict` | off | Fail-fast on first malformed row (default permissive: skip + log) |
See [drivers/AbLegacy-RSLogix-Import.md](drivers/AbLegacy-RSLogix-Import.md) for the full
column reference, file-letter → `AbLegacyDataType` mapping, and the API surface
(`IRsLogixImporter`, `AbLegacyDriverOptions.AddRsLogixImport`).
## Known caveat — ab_server upstream gap
The integration-fixture `ab_server` Docker container accepts TCP but its PCCC
+18 -81
View File
@@ -5,27 +5,22 @@ protocol. Uses the **same** `FocasDriver` the OtOpcUa server does — PMC R/G/F
file registers, axis bits, parameters, and macro variables — all through
`FocasAddressParser` syntax.
Sixth of the driver test-client CLIs, added alongside the Tier-C isolation
work tracked in task #220.
Sixth of the driver test-client CLIs.
## Architecture note
FOCAS is a Tier-C driver: `Fwlib32.dll` is a proprietary 32-bit Fanuc library
with a documented habit of crashing its hosting process on network errors.
The target runtime deployment splits the driver into an in-process
`FocasProxyDriver` (.NET 10 x64) and an out-of-process `Driver.FOCAS.Host`
(.NET 4.8 x86 Windows service) that owns the DLL — see
[v2/implementation/focas-isolation-plan.md](v2/implementation/focas-isolation-plan.md)
and
[v2/implementation/phase-6-1-resilience-and-observability.md](v2/implementation/phase-6-1-resilience-and-observability.md)
for topology + supervisor / respawn / back-pressure design.
FOCAS is an in-process driver. The pure-managed `WireFocasClient`
speaks the FOCAS2 binary protocol directly over TCP:8193, removing the
Tier-C process-isolation split that the historical P/Invoke + out-of-
process Host arrangement required. The CLI loads `FocasDriver` with
`WireFocasClientFactory` and talks to the CNC without any native
components.
The CLI skips the proxy and loads `FocasDriver` directly (via
`FwlibFocasClientFactory`, which P/Invokes `Fwlib32.dll` in the CLI's own
process). There is **no public simulator** for FOCAS; a meaningful probe
requires a real CNC + a licensed `Fwlib32.dll` on `PATH` (or next to the
executable). On a dev box without the DLL, every wire call surfaces as
`BadCommunicationError` — still useful as a "CLI wire-up is correct" signal.
A dev-friendly mock is available — start
`tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/docker-compose.yml`
and point `--cnc-host` at `localhost` for end-to-end CLI exercises
without a real CNC. See
[drivers/FOCAS-Test-Fixture.md](drivers/FOCAS-Test-Fixture.md).
## Build + run
@@ -51,7 +46,6 @@ Every command accepts:
| `-p` / `--cnc-port` | `8193` | FOCAS TCP port (FOCAS-over-EIP default) |
| `-s` / `--series` | `Unknown` | CNC series — `Unknown` / `Zero_i_D` / `Zero_i_F` / `Zero_i_MF` / `Zero_i_TF` / `Sixteen_i` / `Thirty_i` / `ThirtyOne_i` / `ThirtyTwo_i` / `PowerMotion_i` |
| `--timeout-ms` | `2000` | Per-operation timeout |
| `--cnc-password` | (none) | **F4-d (issue #271)** — optional CNC connection-level password emitted via `cnc_wrunlockparam` on connect. Required only by controllers that gate parameter writes / selected reads behind a password switch (16i + some 30i firmwares with parameter-protect on). **PASSWORD INVARIANT: never logged.** The CLI's Serilog config does not destructure this flag and `FocasDeviceOptions.ToString` redacts the value. See [`v2/focas-deployment.md`](v2/focas-deployment.md) § "FOCAS password handling". |
| `--verbose` | off | Serilog debug output |
## Addressing
@@ -111,74 +105,16 @@ Values parse per `--type` with invariant culture. Booleans accept
```powershell
otopcua-focas-cli write -h 192.168.1.50 -a R100 -t Int16 -v 42
otopcua-focas-cli write -h 192.168.1.50 -a G50.3 -t Bit -v on
# MACRO: write — recipe / setpoint surface (server-side WriteOperate ACL)
otopcua-focas-cli write -h 192.168.1.50 -a MACRO:500 -t Int32 -v 42
# PARAM: write — commissioning surface (server-side WriteConfigure ACL,
# CNC must be in MDI mode + parameter-write switch enabled, else EW_PASSWD
# surfaces as BadUserAccessDenied)
otopcua-focas-cli write -h 192.168.1.50 -a PARAM:1815 -t Int32 -v 100
otopcua-focas-cli write -h 192.168.1.50 -a MACRO:500 -t Float64 -v 3.14
```
> **WARNING — `write -a G50.3 -t Bit -v on` is a read-modify-write.**
> The wire call `pmc_wrpmcrng` is byte-addressed; the driver reads the
> parent byte at `G50` first, sets bit 3, and writes the byte back. Other
> bits in `G50` that the ladder is concurrently updating may be clobbered
> by the byte we read a millisecond ago. Coordinate via a ladder-side
> handshake when this matters. **PMC writes also bypass the ladder's
> normal MDI-mode protection** — a misdirected bit can move motion or
> latch a feedhold the moment it lands. Verify e-stop is live and the
> machine is in JOG mode before issuing the first PMC write of a
> session. See [`docs/drivers/FOCAS.md`](drivers/FOCAS.md) "PMC bit-write
> read-modify-write semantics" for the full RMW flow.
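
The read-modify-write sequence in the warning above can be illustrated with a small Python sketch. It is not the driver's implementation: `set_bit_rmw` is a hypothetical name, and the `read_byte` / `write_byte` callables stand in for the byte-addressed wire calls.

```python
def set_bit_rmw(read_byte, write_byte, byte_address, bit, value):
    """Set or clear one bit by reading the parent byte and writing it back.

    Anything else that changes the byte between the read and the write
    is overwritten, which is the hazard the warning describes.
    """
    if not 0 <= bit <= 7:
        raise ValueError("bit must be 0..7")
    current = read_byte(byte_address)      # 1. read the parent byte
    mask = 1 << bit
    if value:
        updated = current | mask           # 2a. set the target bit
    else:
        updated = current & ~mask & 0xFF   # 2b. clear it, staying within one byte
    write_byte(byte_address, updated)      # 3. write the whole byte back
    return updated


# Dict-backed stand-in for PMC memory: G50 starts with only bit 0 set.
memory = {"G50": 0b0000_0001}
after_set = set_bit_rmw(memory.__getitem__, memory.__setitem__, "G50", 3, True)
```

Here `G50.3` is set while bit 0 survives only because nothing else touched the byte between the read and the write; on a live machine the ladder offers no such guarantee.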
PMC G/R writes land on a running machine — be careful which file you hit.
Parameter writes may require the CNC to be in MDI mode with the
parameter-write switch enabled.
#### Server-enforced ACL — issue #269, plan PR F4-b
When the same write flows through the OtOpcUa server (rather than the CLI's
direct-to-CNC path), the server-layer ACL gates by tag kind:
- `PARAM:` writes require **`WriteConfigure`** group membership — heavier
ACL because a misdirected parameter write can put the CNC in a bad
state.
- `MACRO:` writes require **`WriteOperate`** — matches the standard HMI
recipe / setpoint surface.
- PMC R/G/F writes require **`WriteOperate`**.
The classification is declared by the FOCAS driver per tag and enforced by
`DriverNodeManager`; the driver itself never inspects user identity. See
[`docs/security.md`](security.md) for the full LDAP-group → permission
mapping, [`docs/v2/acl-design.md`](v2/acl-design.md) for the design, and
[`docs/v2/focas-deployment.md`](v2/focas-deployment.md) "Write safety" for
the operator pre-check runbook (MDI mode, parameter-write switch).
**Writes are non-idempotent by default** — a timeout after the CNC already
applied the write will NOT auto-retry (plan decisions #44 + #45).
#### Server-side `Writes` enforcement (issue #268 F4-a + #269 F4-b + #270 F4-c)
The OtOpcUa server gates every FOCAS write behind multiple independent
opt-ins: `FocasDriverOptions.Writes.Enabled` (driver-level master switch),
`Writes.AllowParameter` (PARAM kill switch — F4-b), `Writes.AllowMacro`
(MACRO kill switch — F4-b), `Writes.AllowPmc` (PMC kill switch — F4-c),
and `FocasTagDefinition.Writable` (per-tag). All default `false`; any one
off short-circuits the server-side `WriteAsync` to `BadNotWritable` before
the wire client is touched. See [`docs/drivers/FOCAS.md`](drivers/FOCAS.md)
"Writes (opt-in, off by default)" subsection +
[`docs/v2/decisions.md`](v2/decisions.md) for the decision record.
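
The "any one off short-circuits" rule can be modelled as a simple conjunction. The sketch below is a Python approximation only: the field names mirror the options listed above, while the `Writes` dataclass, `write_status`, and the `"Good"` return value (standing in for "proceed to the wire client") are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Writes:
    # Every switch defaults to off, matching the documented defaults.
    enabled: bool = False
    allow_parameter: bool = False
    allow_macro: bool = False
    allow_pmc: bool = False


# Which kill switch guards which tag kind.
KIND_SWITCH = {
    "PARAM": "allow_parameter",
    "MACRO": "allow_macro",
    "PMC": "allow_pmc",
}


def write_status(writes, tag_kind, tag_writable):
    """Return BadNotWritable unless the master, per-kind, and per-tag opt-ins are all on."""
    allowed = (
        writes.enabled
        and getattr(writes, KIND_SWITCH[tag_kind])
        and tag_writable
    )
    return "Good" if allowed else "BadNotWritable"
```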
**The CLI bypasses the server-side flag.** `otopcua-focas-cli write` is a
per-invocation operator tool — it sets `Writes.Enabled = true` locally for
the lifetime of one process and creates the synthesised tag with
`Writable = true`. This is intentional: the CLI is the operator's
direct-to-CNC fallback, not a long-lived process bound to the central
config DB. Configuring the server still requires both opt-ins to be set
explicitly in the DriverInstance JSON.
### `subscribe` — watch an address until Ctrl+C
FOCAS has no push model; the shared `PollGroupEngine` handles the tick
@@ -211,7 +147,8 @@ fails.
**"Why did this macro flip?"** → `subscribe` to the macro, let the
operator reproduce the cycle, watch the HH:mm:ss.fff timeline.
**"Is the Fwlib32 DLL wired up?"** → `probe` against any host. A
`DllNotFoundException` surfacing as `BadCommunicationError` with a
matching `Last error` line means the driver is loading but the DLL is
missing; anything else means a transport-layer problem.
**"Can I reach the CNC on TCP:8193?"** → `probe` against any host. A
`BadCommunicationError` means the wire client couldn't open a socket
(firewall / wrong host / FOCAS Ethernet option unlicensed on the CNC).
`BadDeviceFailure` after a successful connect means the CNC is rejecting
the session setup — check the CNC's FOCAS option and password settings.
+34
@@ -119,3 +119,37 @@ address.
**"What's the right byte order for this family?"** → `read` with
`--byte-order BigEndian`, then with `--byte-order WordSwap`. The one that
gives plausible values is the correct one for that device.
## v2 addressing grammar
The driver accepts the industry-standard tag-address grammar so you can
paste tag spreadsheets from Wonderware / Kepware / Ignition without
per-row manual translation. Full reference + grammar rules:
[`docs/v2/modbus-addressing.md`](v2/modbus-addressing.md).
Quick examples:
```
40001 HoldingRegisters[0], Int16
400001 same, 6-digit form
40001:F Float32
40001:F:CDAB Float32 word-swapped
40001:STR20 20-char ASCII string
40001:S:5 Int16[5] array (3-field shorthand)
40001:F:CDAB:10 Float32[10] with explicit word-swap (4-field strict)
40001.5 bit 5 of HR[0]
HR1:I Int32 via mnemonic region prefix (matches Wonderware)
C100 Coil 100 (mnemonic, 1-based)
V2000:F:CDAB DL205 V-memory at PDU 1024 + Float32 + word-swap (Family=DL205)
D100:I MELSEC D-register 100, Int32 (Family=MELSEC)
```
**Type-code reminder** (post-#146): `:I` is **Int32** (matches Wonderware
DASMBTCP + Ignition `HRI`). The explicit Int16 code is `:S`. Bare HR/IR
with no type still defaults to Int16. Pre-#146 codes `:DI` / `:L` /
`:UDI` / `:UL` / `:LI` / `:ULI` / `:LBCD` are removed; configs that use
them get a clear "Unknown type code" diagnostic at parse time.
In `DriverConfig` JSON, set the per-tag `addressString` field instead of
the structured `region` + `address` + `dataType` fields. Both styles can
coexist within one driver instance.
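
To make the quick examples concrete, here is a deliberately small Python sketch that parses only the numeric 5- and 6-digit forms with the `:F` / `:I` / `:S` type codes. It is not the driver's parser: the `parse_address` name, the result dictionary, and the restriction to leading digits `3` and `4` are assumptions, and mnemonic prefixes (`HR1`, `C100`, `V2000`, `D100`) and `:STR20` are left out.

```python
import re

TYPE_CODES = {"F": "Float32", "I": "Int32", "S": "Int16"}
REGIONS = {"4": "HoldingRegisters", "3": "InputRegisters"}


def parse_address(text):
    """Parse e.g. '40001:F:CDAB:10' into region, zero-based offset, type, order, count."""
    head, *fields = text.split(":")
    match = re.fullmatch(r"(\d{5,6})(?:\.(\d+))?", head)
    if not match:
        raise ValueError(f"Unsupported address: {head}")
    digits, bit = match.groups()
    region = REGIONS.get(digits[0])
    offset = int(digits[1:]) - 1            # 40001 and 400001 both map to index 0
    if region is None or offset < 0:
        raise ValueError(f"Unsupported address: {head}")
    result = {
        "region": region,
        "offset": offset,
        "bit": int(bit) if bit is not None else None,
        "type": "Int16",                    # bare registers default to Int16
        "byte_order": None,
        "count": None,
    }
    if fields:
        code = fields[0]
        if code not in TYPE_CODES:
            raise ValueError(f"Unknown type code: {code}")
        result["type"] = TYPE_CODES[code]
        for extra in fields[1:]:
            if extra.isdigit():
                result["count"] = int(extra)     # array length
            else:
                result["byte_order"] = extra     # e.g. CDAB for word-swap
    return result
```

Under these assumptions, a removed pre-#146 code such as `:DI` raises the "Unknown type code" error rather than silently falling back to a default type.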
-134
@@ -22,11 +22,6 @@ dotnet run --project src/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli -- --help
| `--rack` | `0` | Hardware rack (S7-400 distributed setups only) |
| `--slot` | `0` | CPU slot (S7-300 = 2, S7-400 = 2 or 3, S7-1200/1500 = 0) |
| `--timeout-ms` | `5000` | Per-operation timeout |
| `--tsap-mode` | `Auto` | ISO-on-TCP connection class: `Auto` / `Pg` / `Op` / `S7Basic` / `Other`. Hardened S7-1500 / ET 200SP CPUs may require `Op` or `S7Basic`. See [s7.md TSAP / Connection Type](v2/s7.md#tsap--connection-type). |
| `--local-tsap` | (unset) | Optional 16-bit local TSAP override (e.g. `0x0200`). Required when `--tsap-mode Other`; wins over class default under Pg/Op/S7Basic. |
| `--remote-tsap` | (unset) | Optional 16-bit remote TSAP override. Required when `--tsap-mode Other`; wins over class default under Pg/Op/S7Basic. |
| `--password` | (unset) | Connection-level password sent right after `OpenAsync`. Used by hardened S7-300/400 (protection levels 1-3) and S7-1200/1500 (TIA Portal *Connection Mechanism* gate). Never logged. NB: S7netplus 0.20 doesn't expose `SendPassword`; the CLI prints a one-line warning and continues. See [s7.md "PLC password / protection levels"](v2/s7.md#plc-password--protection-levels). |
| `--protection-level` | `Auto` | Declarative hint: `Auto` / `None` / `Level1` / `Level2` / `Level3` (S7-300/400) / `ConnectionMechanism` (S7-1200/1500). Diagnostic only — the wire-side unlock is driven by `--password`. |
| `--verbose` | off | Serilog debug output |
## PUT/GET must be enabled
@@ -36,28 +31,6 @@ Enable it in TIA Portal: *Device config → Protection & Security → Connection
mechanisms → "Permit access with PUT/GET communication from remote partner"*.
Without it the CLI's first read will surface `BadNotSupported`.
### Pre-flight PUT/GET enablement (PR-S7-C5)
The driver issues a tiny 2-byte read against `Probe.ProbeAddress` (default
`MW0`) immediately after `OpenAsync` and **fails `InitializeAsync` with a
typed `S7PutGetDisabledException`** when the PLC rejects the read with the
wire-level "function not allowed" response. The exception message names the
exact TIA Portal toggle to flip — operators see the configuration fix at
init time, not after the first per-tag read produces `BadDeviceFailure`.
Two opt-out knobs on the JSON `Probe` block:
- `ProbeAddress` — set to `""` (empty string) to skip the pre-flight read
entirely. Useful when no fingerprint address has been wired.
- `SkipPreflight` — set to `true` to defer the check to runtime while
keeping the background liveness loop. Per-tag reads still surface
`BadDeviceFailure` until PUT/GET is enabled, but Init succeeds and the
driver becomes visible in the Admin UI.
See [s7.md "Pre-flight PUT/GET enablement"](v2/s7.md#pre-flight-putget-enablement)
for the full rationale, classifier behaviour, and the wire-level
`ErrorCode` matching.
## S7 address grammar cheat sheet
| Form | Meaning |
@@ -97,17 +70,6 @@ otopcua-s7-cli read -h 192.168.1.30 -a M0.0 -t Bool
# 80-char S7 string
otopcua-s7-cli read -h 192.168.1.30 -a DB10.STRING[0] -t String --string-length 80
# CPU diagnostics (SZL) — virtual @System.* addresses (PR-S7-E1).
# Requires ExposeSystemTags = true on the driver instance; surfaces as
# BadNotSupported until S7netplus exposes a public ReadSzlAsync (or we ship
# a raw-PDU helper). See docs/v2/s7.md "CPU diagnostics (SZL)" for the full
# table and the snap7 / S7netplus 0.20 caveat.
otopcua-s7-cli read -h 192.168.1.30 -a @System.CpuType -t String
otopcua-s7-cli read -h 192.168.1.30 -a @System.Firmware -t String
otopcua-s7-cli read -h 192.168.1.30 -a @System.OrderNo -t String
otopcua-s7-cli read -h 192.168.1.30 -a @System.CycleMs.Min -t Float64
otopcua-s7-cli read -h 192.168.1.30 -a "@System.DiagBuffer.Entry[0]" -t String
```
### `write`
@@ -121,63 +83,6 @@ otopcua-s7-cli write -h 192.168.1.30 -a M0.0 -t Bool -v true
**Writes to M / Q are real** — they drive the PLC program. Be careful what you
flip on a running machine.
### Hardened CPU — forcing OP-class TSAP
```powershell
# Probe a hardened S7-1500 that rejects PG class but accepts OP.
otopcua-s7-cli probe -h 10.50.12.30 --tsap-mode Op
# Read against the same CPU.
otopcua-s7-cli read -h 10.50.12.30 --tsap-mode Op -a DB1.DBW0 -t Int16
# Manual TSAP override (e.g. site with a fixed proprietary TSAP gateway).
otopcua-s7-cli probe -h 10.50.12.30 --tsap-mode Other --local-tsap 0x4D57 --remote-tsap 0x4D58
```
Without `--tsap-mode`, the CLI uses S7netplus's CpuType-derived default (PG
class for almost everything). The same connection-refused failure shape that a
wrong `--slot` produces also shows up when the CPU rejects PG class — try
`--tsap-mode Op` first when the handshake is failing on otherwise-correct
endpoint config. See [s7.md TSAP / Connection Type](v2/s7.md#tsap--connection-type)
for the byte table and motivation.
### Hardened CPU — supplying a connection-level password
```powershell
# S7-300 protection-level 2 — read+write protected without unlock.
otopcua-s7-cli read -h 192.168.1.31 -c S7300 --slot 2 `
--password "tia-portal-set-password" `
--protection-level Level2 `
-a DB1.DBW0 -t Int16
# S7-1500 ConnectionMechanism — TIA Portal Protection & Security pane gate.
otopcua-s7-cli probe -h 10.50.12.30 `
--tsap-mode Op `
--password "tia-portal-set-password" `
--protection-level ConnectionMechanism
```
The password is emitted to the PLC immediately after `OpenAsync` succeeds and
before the pre-flight PUT/GET probe runs (the same probe that would otherwise
be the first operation a hardened CPU refuses). Never logged in any form;
identifier-only success line is `S7 password sent for {Host}`.
**S7netplus 0.20 does not yet expose a public `SendPassword`** — the driver
discovers the method reflectively, so a future minor release will be picked
up automatically. Until then, configuring `--password` on a hardened CPU
emits this warning at Init:
```
[Warning] S7 password is set on driver '<id>' against host '<host>', but
the linked S7netplus library does not expose SendPassword; password is
being ignored at the wire.
```
Init still completes (the COTP handshake itself doesn't require the
password) but the first read against a hardened CPU will surface
`BadDeviceFailure`. See [s7.md "PLC password / protection levels"](v2/s7.md#plc-password--protection-levels)
for the full motivation, the no-log invariant, and the workaround matrix.
### `subscribe`
```powershell
@@ -186,42 +91,3 @@ otopcua-s7-cli subscribe -h 192.168.1.30 -a DB1.DBW0 -t Int16 -i 500
S7comm has no native push — the CLI polls through `PollGroupEngine` just like
Modbus / AB.
### `import-symbols`
PR-S7-D1 / [#299](https://github.com/dohertj2/lmxopcua/issues/299) — read a TIA
Portal CSV ("Show all tags" export) or STEP 7 Classic `.AWL` file and emit a
JSON tag fragment for `appsettings.json`, or a one-line summary. Mirrors the
AB Legacy `import-rslogix` CLI in shape.
```powershell
# TIA Portal CSV — emit JSON fragment to stdout
otopcua-s7-cli import-symbols --file plc-export.csv --format tia
# STEP 7 Classic AWL — emit summary line
otopcua-s7-cli import-symbols --file classic.awl --format awl --emit summary
# DE-locale CSV — auto-detected; output to file
otopcua-s7-cli import-symbols `
--file plc-de.csv `
--format tia `
--emit appsettings-fragment `
--output tags.json
# Strict mode — fail-fast on the first malformed row (CI lint)
otopcua-s7-cli import-symbols --file plc.csv --format tia --strict
```
| Flag | Default | Purpose |
|---|---|---|
| `-f` / `--file` | **required** | Path to the TIA CSV or `.AWL` file |
| `--format` | `tia` | `tia` (CSV) or `awl` (STEP 7 Classic) |
| `-d` / `--device` | none | Optional documentation tag (held for symmetry with `import-rslogix`) |
| `--emit` | `appsettings-fragment` | `appsettings-fragment` (JSON) or `summary` (one-line counter) |
| `-o` / `--output` | stdout | Optional path; when set the JSON fragment is written there + summary line goes to stdout |
| `--max-rows` | unlimited | Defensive cap on rows imported |
| `--strict` | off | Fail-fast on the first malformed row (default permissive: skip + log) |
UDT-typed rows import as placeholder tags (data type forced to `Byte`); see
[S7-TIA-Import.md](drivers/S7-TIA-Import.md) for the full format reference,
locale auto-detection, and AWL position-based addressing rules.
+32 -149
@@ -1,9 +1,9 @@
# `otopcua-twincat-cli` — Beckhoff TwinCAT test client
Ad-hoc probe / read / write / subscribe tool for Beckhoff TwinCAT 2 / TwinCAT 3
runtimes via ADS. Uses the **same** `TwinCATDriver` the OtOpcUa server does
(`Beckhoff.TwinCAT.Ads` package). Native ADS notifications by default;
`--poll-only` falls back to the shared `PollGroupEngine`.
Ad-hoc probe / read / write / subscribe / browse tool for Beckhoff TwinCAT 2 /
TwinCAT 3 runtimes via ADS. Uses the **same** `TwinCATDriver` the OtOpcUa
server does (`Beckhoff.TwinCAT.Ads` package). Native ADS notifications by
default; `--poll-only` falls back to the shared `PollGroupEngine`.
Fifth (final) of the driver test-client CLIs.
@@ -28,56 +28,6 @@ sessions. Pick one:
The CLI compiles + runs without a router, but every wire call fails with a
transport error until one is reachable.
## UDT decomposition
PR 4.1 (issue #315) replaces the old "skip non-atomic symbols" behaviour
of `BrowseSymbolsAsync` with a recursive type walker
(`TwinCATTypeWalker`). When the OtOpcUa server's TwinCAT driver runs
discovery with `EnableControllerBrowse=true`, struct / UDT / function-block
typed symbols flatten into one OPC UA variable per atomic leaf. Browse
addresses use the same dotted-instance form the PLC exposes:
| PLC declaration | OPC UA browse paths surfaced |
|---|---|
| `MAIN.bStart : BOOL` | `MAIN.bStart` |
| `GVL.stMotor : ST_Motor` | `GVL.stMotor.bRunning`, `GVL.stMotor.nState`, `GVL.stMotor.rTemperature`, … |
| `GVL.aRecipe : ARRAY[1..10] OF DINT` | `GVL.aRecipe[1]` … `GVL.aRecipe[10]` |
| `GVL.aPairs : ARRAY[0..2] OF ST_Pair` | `GVL.aPairs[0].nCount`, `GVL.aPairs[0].rValue`, `GVL.aPairs[1].…` |
| `GVL.aBig : ARRAY[1..5000] OF DINT` | `GVL.aBig` (single whole-array root — over the cap) |
The CLI's `read` / `write` / `subscribe` commands take dotted paths
directly:
```powershell
# Read a struct member
otopcua-twincat-cli read -n 192.168.1.40.1.1 -s GVL.stMotor.rTemperature -t Real
# Read an array element
otopcua-twincat-cli read -n 192.168.1.40.1.1 -s "GVL.aRecipe[3]" -t DInt
```
### Array expansion bound
`TwinCATDriverOptions.MaxArrayExpansion` (default `1024`) caps how many
elements an array contributes to the discovered address space. Arrays
whose total element count exceeds the cap surface as a single
whole-array root with `IsArrayRoot=true` instead of one variable per
element. Raise the bound when operators routinely care about individual
elements of large recipe / lookup tables; lower it to keep discovery
cheap for symbol tables that ship multi-thousand-element scratch
arrays. Pre-declared whole-array tags from the `Tags` config bypass the
walker entirely — set `ArrayDimensions` on a `TwinCATTagDefinition` to
keep array reads on the existing PR 1.4 read-array path.
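
The cap behaviour can be illustrated with a short Python sketch. The `expand_array` helper is hypothetical and handles one-dimensional arrays only; it simply mirrors the rule that an array whose element count exceeds `MaxArrayExpansion` collapses to a single whole-array root.

```python
def expand_array(path, lower, upper, max_expansion=1024):
    """Return one path per element, or just the root when the array is over the cap."""
    count = upper - lower + 1
    if count > max_expansion:
        return [path]                       # whole-array root, not expanded
    return [f"{path}[{i}]" for i in range(lower, upper + 1)]
```

With the default cap, `expand_array("GVL.aRecipe", 1, 10)` yields ten element paths, while `expand_array("GVL.aBig", 1, 5000)` yields only `GVL.aBig`.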
### Cycle / depth guard
The walker tracks the visited-type set + a hard depth cap of 8 levels
so a self-pointer (`POINTER TO ST_Self`) or pathological alias chain
terminates rather than spinning. POINTER / REFERENCE members are
skipped at the type-graph level — surfacing them would require
dereferencing through the AMS routing layer which has its own access
patterns.
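
A minimal Python sketch of such a guarded walk follows. It is not `TwinCATTypeWalker`: the `walk` function, the `structs` dictionary shape, and the small `ATOMIC` set are assumptions, and the sketch tracks the types on the current path rather than a global visited set, which may differ from the real walker.

```python
MAX_DEPTH = 8
ATOMIC = {"BOOL", "INT", "DINT", "UDINT", "REAL"}


def walk(path, type_name, structs, visiting=(), depth=0):
    """Flatten a symbol into atomic leaf paths, skipping pointers and cycles."""
    if type_name.startswith(("POINTER TO", "REFERENCE TO")):
        return []                           # never dereferenced
    if type_name in ATOMIC:
        return [path]
    if depth >= MAX_DEPTH or type_name in visiting or type_name not in structs:
        return []                           # depth cap, cycle, or unknown type
    leaves = []
    for member, member_type in structs[type_name]:
        leaves += walk(
            f"{path}.{member}",
            member_type,
            structs,
            visiting + (type_name,),
            depth + 1,
        )
    return leaves
```

For example, a struct whose only non-atomic member is `POINTER TO ST_Self` flattens to its atomic members and stops there.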
## Common flags
| Flag | Default | Purpose |
@@ -100,6 +50,13 @@ caller interpret semantics.
### `probe`
Per-command flags:
| Flag | Default | Purpose |
|---|---|---|
| `-s` / `--symbol` | **required** | Symbol path to probe (e.g. `MAIN.bRunning`) |
| `--type` | `DInt` | Declared data type — see the [Data types](#data-types) list |
```powershell
# Local TwinCAT 3, probe a canonical global
otopcua-twincat-cli probe -n 127.0.0.1.1.1 -s "TwinCAT_SystemInfoVarList._AppInfo.OnlineChangeCnt"
@@ -108,35 +65,6 @@ otopcua-twincat-cli probe -n 127.0.0.1.1.1 -s "TwinCAT_SystemInfoVarList._AppInf
otopcua-twincat-cli probe -n 192.168.1.40.1.1 -s MAIN.bRunning --type Bool
```
#### Health probe
The OtOpcUa server's TwinCAT driver runs an internal probe loop (PR 3.2, issue #314)
that — alongside the cheap `ReadStateAsync` reachability check — samples four
well-known system symbols once per probe interval and surfaces the result through
the cross-driver `driver-diagnostics` RPC (added for Modbus, task #154). The same
symbols can be probed directly via the CLI for ad-hoc troubleshooting:
```powershell
# Cycle time (UDINT, 100 ns ticks → ÷10000 for ms)
otopcua-twincat-cli probe -n 192.168.1.40.1.1 -s "TwinCAT_SystemInfoVarList._TaskInfo[1].CycleTime" --type UDInt
# Last task execution wall-clock (UDINT, 100 ns ticks → ÷10000 for ms)
otopcua-twincat-cli probe -n 192.168.1.40.1.1 -s "TwinCAT_SystemInfoVarList._TaskInfo[1].LastExecTime" --type UDInt
# Online-change count — increments on every accepted online change
otopcua-twincat-cli probe -n 192.168.1.40.1.1 -s "TwinCAT_SystemInfoVarList._AppInfo.OnlineChangeCnt" --type UDInt
# Loaded PLC project name (STRING(80))
otopcua-twincat-cli probe -n 192.168.1.40.1.1 -s "TwinCAT_SystemInfoVarList._AppInfo.AppName" --type String
```
Within the running OtOpcUa server these four signals land on
`DeviceState.LastDiagnostics` as a `TwinCATDeviceDiagnostics` record + are folded
into `DriverHealth.Diagnostics` keyed `TwinCAT.CycleTimeMs`, `TwinCAT.LastExecTimeMs`,
`TwinCAT.JitterMs` (computed `LastExecTimeMs - CycleTimeMs`),
`TwinCAT.OnlineChangeCnt`, and `TwinCAT.OnlineChangeIncrements`. See
`docs/drivers/TwinCAT-Test-Fixture.md §Diagnostics` for the full mapping.
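
The unit conversion and the jitter figure are simple arithmetic; a Python sketch is shown here for reference. The helper names are made up for the example, and only three of the documented keys are reproduced.

```python
TICKS_PER_MS = 10_000   # one tick is 100 ns, so 10,000 ticks make 1 ms


def ticks_to_ms(ticks):
    """Convert a UDINT count of 100 ns ticks to milliseconds."""
    return ticks / TICKS_PER_MS


def diagnostics(cycle_time_ticks, last_exec_ticks):
    """Build the cycle, last-exec, and jitter values from raw tick counts."""
    cycle_ms = ticks_to_ms(cycle_time_ticks)
    last_exec_ms = ticks_to_ms(last_exec_ticks)
    return {
        "TwinCAT.CycleTimeMs": cycle_ms,
        "TwinCAT.LastExecTimeMs": last_exec_ms,
        "TwinCAT.JitterMs": last_exec_ms - cycle_ms,
    }
```

A `CycleTime` reading of `100000` ticks therefore corresponds to a 10 ms task cycle.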
### `read`
```powershell
@@ -156,14 +84,6 @@ otopcua-twincat-cli read -n 192.168.1.40.1.1 -s "Recipe[3]" -t Real
otopcua-twincat-cli read -n 192.168.1.40.1.1 -s GVL.sMessage -t WString
```
ADS variable handles for `read` / `write` symbols are cached transparently
inside the CLI's underlying `AdsTwinCATClient`. The first read of a symbol
resolves a handle; repeats reuse the cached handle for smaller AMS payloads
and skipped name resolution. The cache wipes on reconnect, on
`DeviceSymbolVersionInvalid` (with a one-shot retry), and on CLI exit. See
`docs/drivers/TwinCAT-Test-Fixture.md §Handle caching` for the full story
including the staleness caveat after an online change.
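
The caching behaviour described above (resolve once, reuse, wipe and retry once on an invalid-handle error) can be sketched as follows. This is a generic Python model, not `AdsTwinCATClient`: `HandleCache`, `StaleHandleError`, and the injected `resolve` / `read_by_handle` callables are all hypothetical names.

```python
class StaleHandleError(Exception):
    """Stand-in for a DeviceSymbolVersionInvalid-style failure."""


class HandleCache:
    def __init__(self, resolve, read_by_handle):
        self._resolve = resolve             # symbol name -> handle
        self._read = read_by_handle         # handle -> value
        self._handles = {}

    def _handle(self, symbol):
        # Resolve the name only on first use; later reads reuse the handle.
        if symbol not in self._handles:
            self._handles[symbol] = self._resolve(symbol)
        return self._handles[symbol]

    def clear(self):
        """Wipe every cached handle (reconnect, invalid version, exit)."""
        self._handles.clear()

    def read(self, symbol):
        try:
            return self._read(self._handle(symbol))
        except StaleHandleError:
            # One-shot retry: drop the cache, re-resolve, and read again.
            self.clear()
            return self._read(self._handle(symbol))
```

A second stale-handle failure during the retry propagates to the caller, which keeps the retry bounded to one attempt.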
### `write`
```powershell
@@ -176,78 +96,41 @@ Structure writes refused — drop to driver config JSON for those.
### `subscribe`
Per-command flags:
| Flag | Default | Purpose |
|---|---|---|
| `-s` / `--symbol` | **required** | Symbol path — same format as `read` |
| `-t` / `--type` | `DInt` | Declared data type |
| `-i` / `--interval-ms` | `1000` | Publishing interval in **milliseconds** — native mode passes this as the ADS `NotificationSettings.CycleTime` |
```powershell
# Native ADS notifications (default) — PLC pushes on its own cycle
otopcua-twincat-cli subscribe -n 192.168.1.40.1.1 -s GVL.Counter -t DInt -i 500
# Fall back to polling for runtimes where native notifications are constrained
otopcua-twincat-cli subscribe -n 192.168.1.40.1.1 -s GVL.Counter -t DInt -i 500 --poll-only
# Coalesce bursty changes — runtime buffers up to 500 ms before dispatch
otopcua-twincat-cli subscribe -n 192.168.1.40.1.1 -s GVL.Counter -t DInt -i 50 --max-delay-ms 500
```
| Flag | Default | Purpose |
|---|---|---|
| `-s` / `--symbol` | **required** | Symbol path — same format as `read` |
| `-t` / `--type` | `DInt` | IEC type (see Data types section) |
| `-i` / `--interval-ms` | `1000` | **Cycle time** — minimum interval between change checks the PLC runtime applies |
| `--max-delay-ms` | `0` | **Max coalescing window** — upper bound on how long the runtime buffers change events before dispatch. `0` = fire ASAP, no coalescing |
| `--poll-only` | off | Disable native notifications, use `PollGroupEngine` instead |
`-i` / `--interval-ms` and `--max-delay-ms` are different things and both flow
into the Beckhoff `NotificationSettings` ctor:
- **`--interval-ms`** is the *cycle*: the runtime checks for value changes at
most this often. Smaller = lower latency, higher CPU.
- **`--max-delay-ms`** is the *coalescing ceiling*: once a change is detected,
the runtime can hold it for up to this long before dispatching, which lets
it batch a burst of changes into a single callback. Default `0` means
every detected change fires immediately — same as the pre-PR-3.1 behaviour.
For high-frequency signals (a counter incrementing every 10 ms PLC cycle),
pair a small `-i` (so latency stays bounded) with a non-zero `--max-delay-ms`
(so the OPC UA queue downstream doesn't flood). For slow signals just leave
`--max-delay-ms` at `0`.
The subscribe banner announces which mechanism is in play — "ADS notification"
or "polling" — and includes the `max-delay` value when set, so it's obvious
in screen-recorded bug reports.
or "polling" — so it's obvious in screen-recorded bug reports.
`--poll-only` polls go through the same cached-handle path as `read`, so
repeated polls of the same symbol carry only a 4-byte handle on the wire
rather than the full symbolic path.
### `browse`
### `alarms` (PR 5.1 / #316)
Stream TC3 EventLogger alarms via the driver's `IAlarmSource` bridge.
Subscribes against AMS port 110 (`AMSPORT_EVENTLOG`) on the same target,
prints each event with timestamp / source / severity / message until
Ctrl+C.
```powershell
# All alarms — every event the EventLogger surfaces
otopcua-twincat-cli alarms -n 192.168.1.40.1.1
# Filter by source — only events whose source name matches (case-insensitive)
otopcua-twincat-cli alarms -n 192.168.1.40.1.1 --source Conveyor1.MotorOverload
# Multiple sources — repeat the flag
otopcua-twincat-cli alarms -n 192.168.1.40.1.1 --source Conveyor1 --source Pump3
```
Walks the controller's symbol table via ADS `SymbolLoaderFactory` (same path
`TwinCATDriver.DiscoverAsync` takes when `EnableControllerBrowse = true`).
Output filters to symbols whose type maps onto the driver's atomic surface —
UDTs / function-block instances don't appear.
| Flag | Default | Purpose |
|---|---|---|
| `--source` | (none) | Optional source filter; repeat for multiple |
| `--prefix` | _(none)_ | Case-sensitive instance-path prefix filter (e.g. `GVL_Fixture`) |
| `--max` | `500` | Max symbols to print. `0` = unbounded |
Output format (one line per event):
```powershell
# Everything under a single GVL
otopcua-twincat-cli browse -n 192.168.1.40.1.1 --prefix GVL_Fixture
# Full dump (beware: flat-mode walks on a real controller can top 10k symbols)
otopcua-twincat-cli browse -n 192.168.1.40.1.1 --max 0
```
[HH:mm:ss.fff] <source> sev=<Low|Medium|High|Critical> type=<event-class> cond=<condition-id> "<message>"
```
The verb forces `EnableAlarms=true` on the underlying driver; the
default driver config keeps it off so deployments without an
EventLogger configured pay no cost. See
[`docs/drivers/TwinCAT.md` §Alarms](drivers/TwinCAT.md) for the
full bridge architecture and decode caveats.
-13
@@ -67,19 +67,6 @@ their flag values to the already-shipped driver.
then the other. The plausible result identifies the correct setting
for that device family. (Modbus, S7.)
## Family-specific commands
Most drivers ship the four shared verbs and nothing else. AB Legacy adds a
fifth family-specific verb for bulk symbol-table import:
| Driver | Extra verb | Doc |
|---|---|---|
| AB Legacy | `import-rslogix` — read RSLogix 500/5 CSV symbol exports + emit a JSON tag fragment | [drivers/AbLegacy-RSLogix-Import.md](drivers/AbLegacy-RSLogix-Import.md) |
Binary RSLogix project files (`.RSS` / `.RSP`) are out of scope for v1 — the
format is proprietary and undocumented; no parser ships in libplctag or any
community library. Export to CSV first.
## Known gaps
- **AB Legacy cip-path quirk** — libplctag's ab_server requires a
-4
@@ -98,10 +98,6 @@ Role swaps, stand-alone promotions, and base-level adjustments all happen throug
The OtOpcUa Client CLI at `src/ZB.MOM.WW.OtOpcUa.Client.CLI` supports `-F` / `--failover-urls` for automatic client-side failover; for long-running subscriptions the CLI monitors session KeepAlive and reconnects to the next available server, recreating the subscription on the new endpoint. See [`Client.CLI.md`](Client.CLI.md) for the command reference.
## vs. upstream-side redundancy
The mechanics on this page describe **OtOpcUa as a redundant server** — two of our instances clustered behind one OPC UA address space, exposing `ServerUriArray` + dynamic `ServiceLevel` to downstream clients. The mirror-image scenario — **the OPC UA Client driver consuming an upstream redundant pair** — is documented separately in [`drivers/OpcUaClient.md` § Upstream redundancy](drivers/OpcUaClient.md#upstream-redundancy-serverarray). Both rely on the same OPC UA Part 4 § 6.6.2 model (non-transparent warm/hot via `RedundancySupport` + `ServerUriArray` + `ServiceLevel`); they sit at opposite ends of the gateway pipeline. A deployment can wire either, both, or neither.
## Depth reference
For the full decision trail and implementation plan — topology invariants, peer-probe cadence, recovery-dwell policy, compliance-script guard against enum-value drift — see `docs/v2/plan.md` §Phase 6.3.
+1 -1
@@ -83,7 +83,7 @@ The host spins up `StaPump` (the STA thread with message pump), creates the MXAc
### Pipe security
`PipeServer` builds a `PipeAcl` from the provided `SecurityIdentifier` + uses `NamedPipeServerStream` with `maxNumberOfServerInstances: 1`. The handshake requires a matching shared secret in the first Hello frame; callers whose SID doesn't match `OTOPCUA_ALLOWED_SID` are rejected before any frame is processed. **By design the pipe ACL denies BUILTIN\Administrators** — live smoke tests must therefore run from a non-elevated shell that matches the allowed principal. The installed dev host (`OtOpcUaGalaxyHost`) runs as `dohertj2` with the secret at `.local/galaxy-host-secret.txt`.
`PipeServer` builds a `PipeAcl` from the provided `SecurityIdentifier` + uses `NamedPipeServerStream` with `maxNumberOfServerInstances: 1`. The handshake requires a matching shared secret in the first Hello frame; callers whose SID doesn't match `OTOPCUA_ALLOWED_SID` are rejected before any frame is processed via `NamedPipeServerStream.RunAsClient` + a SID comparison against the configured allow list. The DACL grants `ReadWrite | Synchronize` only to the allowed SID and denies `LocalSystem`. The installed dev host (`OtOpcUaGalaxyHost`) runs as `dohertj2` with the secret at `.local/galaxy-host-secret.txt`.
### Installation
-332
@@ -1,332 +0,0 @@
# AbCip — ControlLogix HSBY paired-IP support
PR abcip-5.1 + 5.2 ship **non-transparent** HSBY (Hot-Standby) awareness
to the AB CIP driver. Each device may declare a partner gateway; when both
gateways are up the driver concurrently probes a role tag on each chassis,
reports which one is currently Active, and routes reads / writes through
that chassis automatically.
- **PR abcip-5.1** — gathers + reports the role of each chassis through
driver diagnostics. See [Role-tag detection matrix](#role-tag-detection-matrix)
+ [Active-resolution rules](#active-resolution-rules).
- **PR abcip-5.2** — wires the resolved active address into
`AbCipDriver.ResolveHost` and the runtime-cache lifecycle. See
[Failover behaviour](#failover-behaviour-pr-52) +
[Failure-mode walkthrough](#failure-mode-walkthrough).
## When to use HSBY paired IPs
You have a redundant **ControlLogix** chassis pair (1756-RM redundancy
module, two CPUs, one acting + one standby) and the SCADA / OPC UA layer
needs to keep talking to *whichever chassis is currently Active* without an
operator manually re-pointing the connection.
Pre-5.1 the driver only knew about a single `HostAddress`. After a
hot-standby switch-over, the standby (now Active) carried a **different IP**
and the driver kept probing the dead-but-was-Active address until someone
edited the config.
PR abcip-5.1 closes the visibility half of that gap by reading the role tag
on both chassis. PR abcip-5.2 closes the routing half by re-pointing
`ResolveHost` at the Active address each tick + invalidating the per-tag
runtime cache + write-coalescer state on every flip.
## Configuration
```jsonc
{
"Devices": [
{
"HostAddress": "ab://10.0.0.5/1,0",
"PartnerHostAddress": "ab://10.0.0.6/1,0",
"Hsby": {
"Enabled": true,
"RoleTagAddress": "WallClockTime.SyncStatus",
"ProbeIntervalMs": 2000
}
}
]
}
```
| Field | Default | Notes |
|---|---|---|
| `PartnerHostAddress` | `null` | Canonical `ab://gateway[:port]/cip-path` of the partner chassis. `null` = no HSBY pair; the driver behaves exactly like every pre-5.1 build. |
| `Hsby.Enabled` | `false` | Master switch. When `false` (or `Hsby` omitted) no role probing happens, even if `PartnerHostAddress` is set. |
| `Hsby.RoleTagAddress` | `WallClockTime.SyncStatus` | Address of the role tag on each chassis. See [role-tag detection matrix](#role-tag-detection-matrix). |
| `Hsby.ProbeIntervalMs` | `2000` | How often each chassis is sampled. 2 s is a good default — tight enough to detect a switch-over within one Admin-UI refresh, loose enough to leave headroom for the regular probe loop. |
## Feature-flag gate (`Redundancy.Hsby.Enabled`)
`Hsby.Enabled = false` (the default) is the off-switch for the entire
feature. The role-probe loop never starts, the diagnostics keys are not
emitted, and the driver behaves identically to a pre-5.1 build. This is the
gate to flip when an operator wants to roll the feature out cautiously
across a fleet — set `Hsby.Enabled = true` per-device in driver config (no
build flag, no env var).
When the gate is on but the partner gateway is unreachable, the role-probe
loop reports `HsbyRole.Unknown` for the partner each tick. The primary's
role still drives the active-chassis resolution; the operator sees the
partner's role as Unknown in the Admin UI / driver diagnostics, which is the
correct surface for "we can't reach the standby chassis right now."
## Role-tag detection matrix
| Firmware / fronts | Address | Decode |
|---|---|---|
| **v20 / v24 / v32+ ControlLogix HSBY** | `WallClockTime.SyncStatus` (DINT) | `0` = Standby, `1` = Synchronized / Active, `2` = Disqualified, anything else = Unknown |
| **PLC-5 / SLC500 status-byte fallback** | `S:34` Module Status word | bit 0 = "this chassis is Active". Bit set → `Active`; clear → `Standby` |
| **Custom user role tag** | any DINT-typed CIP path | Same matrix as `WallClockTime.SyncStatus` (0 / 1 / 2). Out-of-range values → Unknown. |
`AbCipHsbyRoleProber.MapValueToRole` is the value-to-role mapper; unit tests
in `tests/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Tests/AbCipHsbyTests.cs` pin every
row of the matrix.
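
For readers who want the matrix in executable form, here is a Python approximation of the decode rules. It is not the source of `AbCipHsbyRoleProber.MapValueToRole`; the function names are invented for the sketch, and the enum values follow the `(int)HsbyRole` encoding documented under "What gets reported".

```python
from enum import IntEnum


class HsbyRole(IntEnum):
    UNKNOWN = 0
    ACTIVE = 1
    STANDBY = 2
    DISQUALIFIED = 3


def map_sync_status(value):
    """Decode a WallClockTime.SyncStatus-style DINT (0 / 1 / 2, else Unknown)."""
    return {
        0: HsbyRole.STANDBY,
        1: HsbyRole.ACTIVE,
        2: HsbyRole.DISQUALIFIED,
    }.get(value, HsbyRole.UNKNOWN)


def map_status_word(word):
    """Decode the S:34 fallback: bit 0 set means this chassis is Active."""
    return HsbyRole.ACTIVE if word & 0x1 else HsbyRole.STANDBY
```

Note that the raw tag value and the reported enum value differ: a raw `0` decodes to Standby, which is reported as `2`.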
## What gets reported
The driver surfaces three diagnostics counters per HSBY-enabled device,
plus a driver-wide failover counter added in PR 5.2 (all visible via the
`driver-diagnostics` RPC + the Admin UI):
| Counter | Value |
|---|---|
| `AbCip.HsbyActive` | `1` if primary is Active, `2` if partner is Active, `0` if neither (or HSBY off) |
| `AbCip.HsbyPrimaryRole` | `(int)HsbyRole`: `0` = Unknown, `1` = Active, `2` = Standby, `3` = Disqualified |
| `AbCip.HsbyPartnerRole` | Same encoding as `HsbyPrimaryRole`, observed on the partner chassis |
| `AbCip.HsbyFailoverCount` (PR 5.2) | Total number of `ActiveAddress` transitions the probe loop has observed across every HSBY-enabled device on this driver. Each increment maps to one runtime-cache invalidation + write-coalescer reset. |
When more than one HSBY pair is configured on the same driver instance the
flat keys are scoped per primary host: `AbCip.HsbyActive[ab://10.0.0.5/1,0]`,
etc.
The `DeviceState.ActiveAddress` field (internal; surfaced via
`HsbyActive` diagnostics) is the address PR 5.2 routes through
`ResolveHost` + uses to scope the per-host bulkhead / breaker key.
See [Failover behaviour](#failover-behaviour-pr-52) for the runtime
implications.
### Active-resolution rules
| Primary role | Partner role | `ActiveAddress` resolution |
|---|---|---|
| Active | Standby / Disqualified / Unknown | primary |
| Standby / Disqualified / Unknown | Active | partner |
| Active | Active (split-brain) | **primary wins**, warning logged |
| Standby | Standby | `null`; PR 5.2's `ResolveHost` falls back to the configured primary, and the existing dial flow surfaces `BadCommunicationError` if the primary is also down. See [Both-stuck](#both-stuck-no-chassis-active). |
| Unknown | Unknown | `null` (same fallback as Standby + Standby) |
Split-brain (both chassis claim Active simultaneously) is a real
production failure mode — typically a redundancy-module misconfiguration or
a partial network split. The driver picks primary deterministically + emits
a warning through `AbCipDriverOptions.OnWarning` so operators see it in the
log.
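The resolution table above can be read as a small pure function. The following is an illustrative Python model, not the driver's C# code: the `HsbyRole` values follow the counter encoding in this document, while `resolve_active` and the `warn` callback (a stand-in for `AbCipDriverOptions.OnWarning`) are hypothetical names chosen for the sketch.

```python
from enum import IntEnum
from typing import Callable, Optional


class HsbyRole(IntEnum):
    # Encoding as documented for the AbCip.HsbyPrimaryRole counter.
    UNKNOWN = 0
    ACTIVE = 1
    STANDBY = 2
    DISQUALIFIED = 3


def resolve_active(
    primary_role: HsbyRole,
    partner_role: HsbyRole,
    primary_addr: str,
    partner_addr: str,
    warn: Callable[[str], None] = lambda msg: None,  # hypothetical warning sink
) -> Optional[str]:
    """Return the address to treat as Active, or None when no chassis is Active."""
    primary_active = primary_role == HsbyRole.ACTIVE
    partner_active = partner_role == HsbyRole.ACTIVE
    if primary_active and partner_active:
        # Split-brain: pick the primary deterministically and tell the operator.
        warn("HSBY split-brain: both chassis report Active; using primary")
        return primary_addr
    if primary_active:
        return primary_addr
    if partner_active:
        return partner_addr
    # Neither chassis is Active; the caller falls back to the configured primary.
    return None
```

Returning `None` rather than the primary address keeps "no chassis Active" distinguishable from "primary Active" for diagnostics, and leaves the fallback decision to the host resolver.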
## CLI flags
The `otopcua-abcip-cli` tool exposes the HSBY plumbing through two surfaces
(see [Driver.AbCip.Cli.md](../Driver.AbCip.Cli.md) for the full CLI guide):
- `--partner <gateway>` — global flag on every command. Sets
`PartnerHostAddress` + auto-enables `Hsby.Enabled = true` so the role
probe runs alongside any read / write / subscribe.
- `hsby-status` — dedicated command that prints which chassis is
currently Active. Reads the role tag on both gateways for a few ticks +
prints the `(primary, partner, active)` tuple.
```powershell
# Print which chassis is Active right now
otopcua-abcip-cli hsby-status -g ab://10.0.0.5/1,0 --partner ab://10.0.0.6/1,0
# Subscribe through the active chassis (PR 5.2 follow-up — today the
# subscribe stays pointed at the primary; the role probe runs alongside).
otopcua-abcip-cli subscribe -g ab://10.0.0.5/1,0 --partner ab://10.0.0.6/1,0 `
-t Motor01_Speed --type Real -i 500
```
## Test coverage
- **Unit** (`tests/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Tests/AbCipHsbyTests.cs`):
- Pure `MapValueToRole` matrix (WallClockTime.SyncStatus + S:34 bit
mask + Unknown values).
- End-to-end driver loop: primary Active / partner Standby resolves to
primary; both Active resolves to primary with a warning; both
Standby clears `ActiveAddress`; primary read failure routes to
partner.
- Diagnostics surface (`AbCip.HsbyActive` / `HsbyPrimaryRole` /
`HsbyPartnerRole`).
- DTO JSON round-trip (`PartnerHostAddress` + `Hsby.{Enabled,
RoleTagAddress, ProbeIntervalMs}` survive deserialise → driver →
`DeviceState`).
- `Hsby.Enabled = false` → no role probing.
- **Integration** (`tests/ZB.MOM.WW.OtOpcUa.Driver.AbCip.IntegrationTests/`):
- `AbCipHsbyRoleProberTests.cs` (PR 5.1) and
`AbCipHsbyFailoverTests.cs` (PR 5.2) — both **skipped by default**
(`Assert.Skip`). `ab_server` cannot emulate a ControlLogix HSBY
pair (no `WallClockTime.SyncStatus`, no second chassis concept).
The Docker `paired` profile (PR 5.1) brings up two `ab_server`
instances + a stub `hsby-mux` sidecar so the topology is
documented, but a patched `ab_server` image that actually serves
the role tag is still on the follow-up list.
- Trait `Category=Hsby` so `dotnet test --filter Category=Hsby`
finds them once they're promoted.
- **End-to-end** (`scripts/e2e/test-abcip-hsby.ps1`, PR 5.2):
- Paired-fixture variant of `test-abcip.ps1`. Subscribes to a tag
through the OPC UA server, flips the active chassis mid-stream
via the `hsby-mux` sidecar's `POST /flip` endpoint, asserts the
stream survives + `AbCip.HsbyFailoverCount` increments. Gated
on operator-supplied `BridgeNodeId` + a running paired fixture;
ships unwired into `test-all.ps1` until the patched `ab_server`
lands.
## Failover behaviour (PR 5.2)
PR 5.2 wires `DeviceState.ActiveAddress` into the read / write hot path
through `AbCipDriver.ResolveHost` and the runtime-cache lifecycle. After
the role-probe loop (PR 5.1) detects an active-address transition the
driver re-points every wire-level operation at the now-Active chassis
without operator intervention.
### What flips on a failover
| Aspect | Pre-flip | Post-flip |
|---|---|---|
| `ResolveHost(tag)` return | primary `HostAddress` | the partner address (when partner is now Active) |
| Per-tag libplctag handles in `DeviceState.Runtimes` | created against primary gateway | dropped on flip; lazily re-created against the partner gateway on next read / write |
| Parent-DINT RMW handles in `DeviceState.ParentRuntimes` | primary gateway | dropped on flip; same re-create-on-demand path |
| `AbCipWriteCoalescer` per-device cache | last-known-written values from the primary | reset; the first write of any value to the partner pays the full round-trip |
| `LogicalInstanceMap` (Logical-mode `@tags` walk) | populated for primary | cleared; the next read on a Logical-mode device re-walks `@tags` against the partner |
| Per-host bulkhead key (Polly bulkhead + breaker, plan decision #144) | keyed on primary `HostAddress` | keyed on the new active address — the partner gets its own fresh breaker state instead of inheriting a tripped breaker from the now-standby |
| `AbCip.HsbyFailoverCount` diagnostic | 0 | incremented by 1 on every transition observed by the probe loop |
### How the invalidation runs
PR 5.2 introduces an internal `OnActiveAddressChanged` event raised by
`HsbyProbeLoopAsync` on every `DeviceState.ActiveAddress` transition. The
driver subscribes to it from its own constructor; the handler
(`HandleActiveAddressChanged`) does the cache invalidation in one place:
1. Disposes every entry in `DeviceState.Runtimes` and
`DeviceState.ParentRuntimes`, then clears both dicts. Disposed
`IAbCipTagRuntime` instances release their underlying libplctag
handles so the native heap doesn't leak.
2. Clears `DeviceState.LogicalInstanceMap` and resets
`LogicalWalkComplete = false` so the next read on a Logical-mode
device re-fires the `@tags` symbol walk against the new chassis.
3. Calls `AbCipWriteCoalescer.Reset(deviceHostAddress)` so cached
"we already wrote 42" decisions don't stale-suppress the first
partner-side write.
4. Resets `DeviceState.RuntimesAddress = null` so subsequent
diagnostics observers see a fresh stamp on the next runtime
creation.
5. `Interlocked.Increment` on the driver-wide
`AbCip.HsbyFailoverCount` counter.
The handler is idempotent — a second event for the same address change
is harmless because the dicts are already empty + the coalescer reset
is itself idempotent.
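The five invalidation steps above can be modelled in a few lines. This is a simplified Python sketch under stated assumptions, not the driver's implementation: `FakeRuntime`, `DeviceState`, and `Driver` are hypothetical stand-ins, and the coalescer cache is reduced to a dict keyed on `(device, tag)`.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


class FakeRuntime:
    """Hypothetical stand-in for a tag runtime that owns a native handle."""

    def __init__(self) -> None:
        self.disposed = False

    def dispose(self) -> None:
        self.disposed = True


@dataclass
class DeviceState:
    runtimes: Dict[str, FakeRuntime] = field(default_factory=dict)
    parent_runtimes: Dict[str, FakeRuntime] = field(default_factory=dict)
    logical_instance_map: Dict[str, int] = field(default_factory=dict)
    logical_walk_complete: bool = False
    runtimes_address: Optional[str] = None


class Driver:
    def __init__(self) -> None:
        self.failover_count = 0
        # Simplified coalescer cache: (device host, tag) -> last written value.
        self.coalescer_cache: Dict[Tuple[str, str], object] = {}

    def handle_active_address_changed(self, device_host: str, state: DeviceState) -> None:
        # 1. Dispose every runtime so native handles are released, then clear.
        for runtime in list(state.runtimes.values()) + list(state.parent_runtimes.values()):
            runtime.dispose()
        state.runtimes.clear()
        state.parent_runtimes.clear()
        # 2. Force the next Logical-mode read to re-walk the symbol table.
        state.logical_instance_map.clear()
        state.logical_walk_complete = False
        # 3. Drop last-written values for this device only.
        for key in [k for k in self.coalescer_cache if k[0] == device_host]:
            del self.coalescer_cache[key]
        # 4. Clear the stamp so the next runtime creation writes a fresh one.
        state.runtimes_address = None
        # 5. Count the transition.
        self.failover_count += 1
```

In this sketch a repeated call leaves the cache state unchanged because the dicts are already empty, although the counter still advances once per call.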
### Bulkhead key semantics
The per-host resilience pipeline (Polly bulkhead + circuit breaker, plan
decision #144) keys on whatever `IPerCallHostResolver.ResolveHost`
returns. PR 5.2 changes that resolver so an HSBY-failed-over device
returns the partner's address, which means:
- The **device-state lookup** (`_devices.TryGetValue`) keeps using the
configured primary `HostAddress` as the dictionary key — that key
never changes for the lifetime of a device, so multi-device
configurations stay routable.
- The **resilience pipeline** (Polly bulkhead, breaker, retry policies)
receives the active address as the host-name dimension. A breaker that
tripped against the now-standby chassis (for example, because that
chassis went away) doesn't bleed over to the partner; the partner gets
fresh limits + a closed breaker.
When HSBY is disabled (`Hsby.Enabled = false`) `ResolveHost` returns the
configured primary `HostAddress` exactly as it always has — pre-5.2
behaviour, no double-key risk.
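The split between the stable device key and the moving resilience key can be sketched as follows. This is an illustrative Python model only; `Device` and `resolve_host` are hypothetical names, and the real resolver sits behind `IPerCallHostResolver.ResolveHost`.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Device:
    host_address: str                      # configured primary; never changes
    hsby_enabled: bool = False
    active_address: Optional[str] = None   # maintained by the role-probe loop


def resolve_host(devices: Dict[str, Device], device_host: str) -> str:
    """Return the address used as the bulkhead / breaker key for one call."""
    # The device lookup always uses the configured primary as the key.
    device = devices[device_host]
    if not device.hsby_enabled:
        return device.host_address
    # A null Active address falls back to the configured primary.
    return device.active_address or device.host_address
```

Because the dictionary key never moves, a failover changes only the value handed to the resilience pipeline, not how the device is found.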
## Failure-mode walkthrough
PR 5.2 adds three failover surface areas to reason about. The tables
below summarise the behaviour the driver reports + how an operator
can inspect it.
### Primary-stuck (primary unreachable, partner Active)
The primary chassis goes away (network partition, power loss, a stuck
Forward Open). The role-probe loop reads `HsbyRole.Unknown` for the
primary and `HsbyRole.Active` for the partner.
| Surface | Behaviour |
|---|---|
| `DeviceState.ActiveAddress` | partner address |
| `DeviceState.PrimaryRole` | `Unknown` |
| `DeviceState.PartnerRole` | `Active` |
| `ResolveHost(tag)` | partner address |
| Reads / writes | route through partner gateway transparently |
| `AbCip.HsbyFailoverCount` | incremented when the address transitioned away from the primary |
| `AbCip.HsbyActive` | `2` (partner is the active chassis) |
| Operator action | none required for routing; investigate why the primary is unreachable through the connectivity-probe loop's `_System/_ConnectionStatus` for the device |
### Secondary-stuck (partner unreachable, primary Active)
The partner chassis goes away (its OPC UA server is down, its IP is
unreachable, the redundancy module unhitched it). The probe loop reads
`HsbyRole.Active` for the primary and `HsbyRole.Unknown` for the partner.
| Surface | Behaviour |
|---|---|
| `DeviceState.ActiveAddress` | primary address (no transition; this is the steady state) |
| `DeviceState.PrimaryRole` | `Active` |
| `DeviceState.PartnerRole` | `Unknown` |
| `ResolveHost(tag)` | primary address |
| Reads / writes | route through primary gateway exactly as in a non-HSBY deployment |
| `AbCip.HsbyFailoverCount` | unchanged — no flip happened |
| `AbCip.HsbyActive` | `1` (primary is the active chassis) |
| Operator action | investigate why the partner is unreachable; the operational risk is that a future primary-side outage has no fall-back |
### Both-stuck (no chassis Active)
Both chassis report `Standby` / `Disqualified` / `Unknown` (a
redundancy-module misconfiguration, both controllers in Program mode,
or both unreachable).
| Surface | Behaviour |
|---|---|
| `DeviceState.ActiveAddress` | `null` |
| `ResolveHost(tag)` | falls back to the configured primary `HostAddress` |
| Reads / writes | dispatched to the configured primary; a stuck primary surfaces `BadCommunicationError` per the existing dial flow |
| `AbCip.HsbyActive` | `0` (no chassis Active) |
| `AbCip.HsbyFailoverCount` | incremented when the transition `Active → null` happened |
| Operator action | investigate the redundancy module / mode keys; the SCADA layer sees stuck-or-bad-quality reads, not incorrect routing |
The "fall back to primary on null Active" choice is deliberate. Routing
all reads to a deterministic chassis (the configured primary) keeps the
breaker key + bulkhead state stable while the operator diagnoses the
double outage; the alternative (round-robin / partner) would just
trip both breakers in turn and obscure which chassis is the real
problem.
## Follow-ups (beyond PR 5.2)
- **Patched `ab_server` image** — add a writable `WallClockTime.SyncStatus`
tag (or a separate Python shim) so the Docker `paired` profile can
exercise the wire-level role probe + the
`tests/.../IntegrationTests/AbCipHsbyFailoverTests.cs` scaffold can
flip its `Assert.Skip` for a real integration assertion.
- **`hsby-mux` REST endpoint** — `POST /flip {"active": "primary"}` writes
`1` to the chosen chassis + `0` to the other so integration tests +
`scripts/e2e/test-abcip-hsby.ps1` can drive switch-overs
deterministically.
- **GuardLogix HSBY** — same role-tag plumbing applies; verify against a
real 1756-L8xS pair when one is on-site.
## See also
- [`docs/Driver.AbCip.Cli.md`](../Driver.AbCip.Cli.md) — `--partner` flag +
`hsby-status` command reference
- [`docs/drivers/AbServer-Test-Fixture.md`](AbServer-Test-Fixture.md) §"What
it does NOT cover" — HSBY entry
- [`docs/Redundancy.md`](../Redundancy.md) — server-level (OPC UA-stack)
redundancy; HSBY is the **driver-level** companion
@@ -1,406 +0,0 @@
# AB CIP — Operability knobs
Phase 4 of the AB CIP driver plan introduces operator-tunable behaviour that
changes how the driver schedules per-tag traffic, deduplicates updates, and
surfaces health — knobs that an operator typically reaches for *after* the
address space is in place and the deployment is past the green-field phase.
The Phase 3 doc (`AbCip-Performance.md`) covers connection-shape and
read-strategy knobs; this doc is the home for the per-tag scheduling and
operability levers as PRs land.
PR abcip-4.1 ships the first knob: per-tag **Scan Rate** (Kepware-parity scan
classes).
## Per-tag scan rate
### What it is
A per-tag override of the OPC UA subscription's `publishingInterval`. The AB
CIP driver mirrors the Galaxy hierarchy as a single OPC UA address space, so
every tag served from one driver normally ticks at the publishing interval the
client requested when it created the Subscription. This knob lets specific
tags publish at a different cadence (fast HMI tags at 100 ms, batch /
historian tags at 1-10 s) without forcing the operator to split tags into
separate subscriptions or driver instances.
It is the Kepware "scan classes" model expressed per-tag. The same shape is
already shipped in the S7 driver (`S7TagDefinition.ScanGroup`) and the AB
Legacy / TwinCAT drivers; AB CIP adopts a leaner per-tag-only form because the
CIP single-connection model means the practical knob a deployment reaches for
is "this one tag, faster", not "every tag in this group".
### How it interacts with OPC UA publishingInterval
OPC UA semantics:
- The Subscription's `publishingInterval` is the *upper bound* on how often
the server publishes a NotificationMessage. Each MonitoredItem also has its
own `samplingInterval`; that's where this knob lands.
- A per-tag `samplingInterval` shorter than the Subscription's
`publishingInterval` means the server samples faster but only publishes at
the next Subscription tick — clients may receive multiple values for one
tag in a single Publish response.
- A per-tag `samplingInterval` longer than the Subscription's
`publishingInterval` is legal too — the server simply skips ticks for that
tag.
AB CIP-side: the driver's `SubscribeAsync` receives one `publishingInterval`
plus a list of tag references. With per-tag `ScanRateMs` it buckets the input
list by resolved interval and registers one `PollGroupEngine` subscription per
bucket. Each bucket runs an independent timer, so a 100 ms tag never waits
for a 1000 ms tag's `Task.Delay` to expire.
### Override knob
`AbCipTagDefinition.ScanRateMs` (`int?`, default `null`). `null` = use the
subscription's default `publishingInterval` (legacy behaviour). Bind via
driver config JSON:
```json
{
"Tags": [
{
"Name": "Motor1.Speed",
"DeviceHostAddress": "ab://10.0.0.5/1,0",
"TagPath": "Motor1.Speed",
"DataType": "DInt",
"ScanRateMs": 100
},
{
"Name": "Motor1.RunHours",
"DeviceHostAddress": "ab://10.0.0.5/1,0",
"TagPath": "Motor1.RunHours",
"DataType": "DInt",
"ScanRateMs": 5000
},
{
"Name": "Motor1.NamePlate",
"DeviceHostAddress": "ab://10.0.0.5/1,0",
"TagPath": "Motor1.NamePlate",
"DataType": "String"
}
]
}
```
Result: three buckets — 100 ms, 5000 ms, and the subscription default for
`NamePlate`. UDT members inherit the parent tag's `ScanRateMs` at fan-out
time, so a UDT declared at 100 ms publishes every member at 100 ms without
the operator having to repeat the override on each member.
### Floor and degenerate cases
- `PollGroupEngine` floors every bucket at **100 ms** — a `ScanRateMs: 25`
is clamped up. The floor matches the Modbus / S7 / TwinCAT floors and
protects the wire from sub-mailbox-scan polling.
- `ScanRateMs: 0` and negative values are treated as unset — the tag falls
back to the subscription default. Mis-typed config degrades, doesn't fault.
- A `ScanRateMs` equal to the subscription default collapses into the same
bucket as plain tags. The driver doesn't fragment poll loops when the
override is redundant.
- Tags whose names don't appear in the driver's tag map (typo / discovery
miss) fall through to the subscription default — same "config typo
degrades" stance as the rest of the driver.
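The bucketing and degenerate-case rules above reduce to a short function. The sketch below is a Python illustration, not the `PollGroupEngine` code; `bucket_by_scan_rate` is a hypothetical name, and applying the 100 ms floor to the subscription default as well is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional

FLOOR_MS = 100  # documented floor for every bucket


def bucket_by_scan_rate(
    refs: Iterable[str],
    scan_rates: Dict[str, Optional[int]],
    publishing_interval_ms: int,
) -> Dict[int, List[str]]:
    """Group tag references by resolved poll interval in milliseconds."""
    buckets: Dict[int, List[str]] = defaultdict(list)
    for ref in refs:
        rate = scan_rates.get(ref)  # names missing from the tag map yield None
        if rate is None or rate <= 0:
            # Unset, zero, or negative: use the subscription default.
            rate = publishing_interval_ms
        # Clamp up to the floor; equal intervals share one bucket.
        buckets[max(rate, FLOOR_MS)].append(ref)
    return dict(buckets)
```

With the JSON example above and a 1000 ms subscription, the function yields three buckets keyed 100, 5000, and 1000; changing `Motor1.RunHours` to `1000` would collapse it into the default bucket.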
### Wire impact
Per-bucket independent timers do **not** parallelise CIP traffic. The driver
serializes wire-side reads through its per-device libplctag handles, so a
fast bucket and a slow bucket trade off against each other on the wire — the
multi-rate split decouples *cadence* (the 100 ms bucket isn't queued behind
the 1000 ms bucket's `Task.Delay`), not *throughput*. The wire still moves
one CIP request at a time per device.
If you're reading a large tag set and the slow bucket starves the fast
bucket, the lever is `AbCipDeviceOptions.ConnectionSize` (Phase 3) — pack
more tags into one CIP RTT so the slow bucket finishes faster. Per-tag scan
rate is a *scheduling* knob, not a *throughput* knob.
### Comparison to Kepware scan classes
| Kepware concept | AB CIP equivalent |
|---|---|
| Scan class table (named groups → rate) | implicit: each distinct `ScanRateMs` value is its own bucket |
| Default scan class | OPC UA Subscription's `publishingInterval` |
| Per-tag scan class assignment | `AbCipTagDefinition.ScanRateMs` |
| "Scan mode: Respect client" | always — the OPC UA `publishingInterval` is the default |
| "Force write" / "Write through cache" | not exposed — AB CIP writes always go to the wire |
The leaner shape (per-tag rate, not named groups) keeps the JSON config flat
and reflects how operators tend to use the knob in practice — a handful of
"this specific tag needs to be fast" overrides on top of a sensible default,
rather than a separate tier of scan-class definitions.
### Verification
- **Unit**: `AbCipPerTagScanRateTests` (`tests/.../AbCip.Tests`). Asserts
bucketing math, default-rate collapse, UDT member inheritance, JSON DTO
round-trip, and end-to-end cadence against the in-process fake.
- **Integration**: `AbCipPerTagScanRateTests`
(`tests/.../AbCip.IntegrationTests`). Drives two tags at 100 ms / 1000 ms
against a live `ab_server` and asserts the bucket count + each tag receives
the initial-data push.
- **E2E**: `scripts/e2e/test-abcip.ps1` — see the *PerTagScanRate* assertion.
### Cross-references
- `docs/Driver.AbCip.Cli.md` — there is no CLI surface change for this knob;
scan rate is a config-time concern.
- `docs/drivers/AbCip-Performance.md` — Phase 3 throughput knobs that pair
with per-tag scan rate when a slow bucket starves a fast one.
- S7 driver `ScanGroup` model in `src/.../S7DriverOptions.cs` — the
named-group form of the same idea.
## Write deadband / write-on-change
PR abcip-4.2 ships the second operability knob: per-tag write coalescing,
the *write-side* companion to the read-side deadband already shipped at the
OPC UA monitored-item layer. The driver remembers the value last
successfully written for a tag and can suppress redundant or below-threshold
follow-up writes — they return `Good` to the OPC UA client without ever
hitting the wire.
### What it is
- **`AbCipTagDefinition.WriteDeadband`** (`double?`, default `null`) —
numeric absolute-difference threshold. When set, a write whose
`|new - last|` is below the deadband is suppressed.
- **`AbCipTagDefinition.WriteOnChange`** (`bool`, default `false`) —
equality gate. When set, a write whose value equals the last successfully
written value is suppressed.
Both knobs combine on the same tag. For numerics, the deadband path takes
priority; the equality fallback covers the cases the deadband doesn't (BOOL
setpoints, STRING constants, `WriteDeadband=0`, etc).
### Worked setpoint-jitter example
A motor speed setpoint published from an HMI tends to wobble by a few
ticks even when the operator hasn't touched it — UI rounding, Modbus
gateway re-encoding, RPN script noise. With `WriteDeadband: 0.5`:
```json
{
"Tags": [
{
"Name": "Motor1.Speed.SP",
"DeviceHostAddress": "ab://10.0.0.5/1,0",
"TagPath": "Motor1.Speed.SP",
"DataType": "Real",
"WriteDeadband": 0.5
}
]
}
```
Sequence of writes from the HMI (one every 100 ms, no operator input):
| Time | Value | `\|new - last\|` | Wire? |
|---|---|---|---|
| 0 ms | 50.0 | n/a (first) | yes |
| 100 ms | 50.2 | 0.2 < 0.5 | suppressed |
| 200 ms | 50.3 | 0.3 < 0.5 | suppressed |
| 300 ms | 50.6 | 0.6 ≥ 0.5 | yes |
| 400 ms | 50.6 | 0.0 < 0.5 | suppressed |
| 500 ms | 51.5 | 0.9 ≥ 0.5 | yes |
Three writes hit the wire; three are suppressed. The OPC UA client sees
`Good` on every call. The PLC sees only the values that actually crossed
the deadband.
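The sequence in the table can be replayed with a few lines of Python. This is a minimal sketch of the deadband rule only, assuming every wire write succeeds; `replay_writes` is a hypothetical helper, not part of the driver.

```python
from typing import List, Optional, Tuple


def replay_writes(values: List[float], deadband: float) -> List[Tuple[float, bool]]:
    """Return (value, hit_wire) pairs for a stream of writes to one tag."""
    last_written: Optional[float] = None
    results: List[Tuple[float, bool]] = []
    for value in values:
        # The first write always passes; later writes compare against the
        # last value that actually reached the wire, not the last request.
        suppress = last_written is not None and abs(value - last_written) < deadband
        if not suppress:
            last_written = value
        results.append((value, not suppress))
    return results


if __name__ == "__main__":
    for value, hit_wire in replay_writes([50.0, 50.2, 50.3, 50.6, 50.6, 51.5], 0.5):
        print(value, "wire" if hit_wire else "suppressed")
```

Comparing against the last written value rather than the last requested value is what stops slow drift from being suppressed indefinitely: 50.2 and 50.3 are each within 0.5 of 50.0, but 50.6 is not.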
### Combining with WriteOnChange
A digital reset bit driven by a UI that pulses it at every cycle:
```json
{
"Name": "Conveyor.Reset",
"DeviceHostAddress": "ab://10.0.0.5/1,0",
"TagPath": "Conveyor.Reset",
"DataType": "Bool",
"WriteOnChange": true
}
```
Three consecutive `false → false → false` writes from the UI collapse to
one wire write (`false`, the first). When the operator clicks the reset
button (`true`), that write passes; subsequent `true → true` repeats
suppress until the UI clears it back to `false`.
Numeric tags can also opt into both: `WriteDeadband: 0.5` plus
`WriteOnChange: true` is well-defined — the deadband suppresses jitter, the
equality gate suppresses exact repeats (which the deadband path also catches
because `|0| < 0.5`, but having both set documents the operator's intent).
### Special cases
- **First write** always passes through. The coalescer has no prior value
to compare against, so the first write of any tag pays the full
round-trip and seeds the cache.
- **NaN / Infinity** bypass deadband suppression. IEEE-754 comparisons
against NaN are undefined and a stale `+Inf` shouldn't silently swallow
a real reset; the wire decides. `WriteOnChange` equality on NaN still
follows .NET semantics (`Equals(NaN, NaN) == true` for `double` boxed in
`object`), so a `WriteOnChange` tag stuck on NaN will suppress repeats
until something else writes a real value.
- **Failed writes** do *not* seed the cache. If the wire write fails, the
next attempt with the same value still hits the wire because the
coalescer never recorded a "last successful value" for it.
- **Reconnect drops the cache**. The driver's host-state probe transitions
`Running → Stopped` on disconnect and `Stopped → Running` after a
reconnect; both transitions reset the per-device coalescer cache, so the
first post-reconnect write of any
value pays the full round-trip. The PLC may have been restarted while
the driver was offline and our cached "we already wrote 42" is stale.
- **Two devices, same tag address**. The cache is keyed on
`(deviceHostAddress, tagAddress)` so two PLCs running the same Logix
program keep independent caches — writing 42 to A doesn't suppress
writing 42 to B.
- **Bit-in-DINT writes** consult the coalescer too, so a UI that pulses
`Flags.3` at every cycle benefits from the same `WriteOnChange`
suppression as a plain BOOL tag.
- **Plain back-compat tags** (no `WriteDeadband`, no `WriteOnChange`)
take a fast-path through the coalescer that increments only the
`WritesPassedThrough` counter — no dictionary lookup, no allocation. The
knobs are zero-overhead opt-in.
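Several of the special cases above (first write, NaN bypass, failed writes, per-device keys, reset) can be seen together in one small model. The class below is a hedged Python sketch, not `AbCipWriteCoalescer` itself: the method names are hypothetical, and how a non-finite value interacts with `WriteOnChange` on the same tag is an assumption (here the wire decides).

```python
import math
from typing import Dict, Optional, Tuple


def _is_number(value: object) -> bool:
    # bool is an int subclass in Python; treat it as non-numeric here.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


class WriteCoalescer:
    def __init__(self) -> None:
        self._last: Dict[Tuple[str, str], object] = {}
        self.writes_suppressed = 0
        self.writes_passed_through = 0

    def should_suppress(self, device: str, tag: str, value: object,
                        deadband: Optional[float] = None,
                        on_change: bool = False) -> bool:
        key = (device, tag)
        suppress = False
        # Tags with neither knob set, and first writes, always pass through.
        if (deadband is not None or on_change) and key in self._last:
            last = self._last[key]
            if deadband and _is_number(value) and _is_number(last):
                # Deadband path takes priority for numerics; NaN / Infinity
                # bypass suppression so the wire decides.
                if math.isfinite(value) and math.isfinite(last):
                    suppress = abs(value - last) < deadband
            elif on_change:
                # Equality fallback (BOOL, STRING, deadband of 0). Python's ==
                # treats NaN as unequal to itself, unlike the boxed .NET Equals.
                suppress = value == last
        if suppress:
            self.writes_suppressed += 1
        else:
            self.writes_passed_through += 1
        return suppress

    def record_success(self, device: str, tag: str, value: object) -> None:
        # Call only after the wire write succeeded; failures never seed the cache.
        self._last[(device, tag)] = value

    def reset(self, device: str) -> None:
        for key in [k for k in self._last if k[0] == device]:
            del self._last[key]
```

Keeping `record_success` separate from `should_suppress` is what makes a failed write retry on the wire: the cache only ever holds values the PLC is known to have accepted.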
### Diagnostics
The driver surfaces two counters through `DriverHealth.Diagnostics` (the
same path the `driver-diagnostics` RPC + Admin UI render for Modbus / S7 /
OPC UA Client):
- `AbCip.WritesSuppressed` — total writes the coalescer skipped.
- `AbCip.WritesPassedThrough` — total writes that hit the wire after
consulting the coalescer.
Their ratio is the "wire savings" headline. A deployment with `0`
suppressions either has no tags opted in or has the deadband too tight /
the equality threshold too loose; revisit the per-tag config.
### Verification
- **Unit**: `AbCipWriteDeadbandTests` (`tests/.../AbCip.Tests`). Asserts
the deadband math, the equality fallback, the first-write pass-through,
reset-on-reconnect, two-device cache independence, suppressed-Good
status, NaN bypass, the back-compat fast path, and DTO round-trip.
- **Integration**: `AbCipWriteDeadbandTests`
(`tests/.../AbCip.IntegrationTests`). Drives a 5-write jittery sequence
with `WriteDeadband: 1.0` against a live `ab_server` and asserts the
driver's diagnostics counter matches the expected suppression count.
- **E2E**: `scripts/e2e/test-abcip.ps1` — see the *WriteCoalesce*
assertion.
### Cross-references
- `docs/drivers/AbServer-Test-Fixture.md` §7 — capability surfaces beyond
read; mentions write-coalesce coverage.
- Modbus driver — read-side deadband in `ModbusDriver` predates this
write-side equivalent; the config shape is intentionally similar.
- Kepware "Deadband (write)" knob — this is the AB CIP equivalent.
## System tags / `_System` folder
PR abcip-4.3 surfaces five read-only diagnostic variables under
`AbCip/<device>/_System/` so SCADA / Admin clients can pivot from "is the
wire up?" to "what's our scan rate / tag count?" without leaving the OPC UA
address space. The values come straight from the live
`IHostConnectivityProbe` + `DriverHealth` surfaces — reads bypass libplctag
and are served from the in-memory snapshot the probe loop / read loop
updates. PR abcip-4.4 added `_RefreshTagDb` as a sixth, writeable entry —
the Kepware-style refresh trigger.
### What it ships
| Variable | Type | Access | Source | Notes |
|---|---|---|---|---|
| `_ConnectionStatus` | String | ViewOnly | `HostState` | `Running` / `Stopped` / `Unknown` / `Faulted`. Mirrors what the connectivity probe sees. |
| `_ScanRate` | Float64 | ViewOnly | `AbCipProbeOptions.Interval` | Configured probe interval in milliseconds — compare against `_LastScanTimeMs` to spot wire stretch. |
| `_TagCount` | Int32 | ViewOnly | `_tagsByName` | Discovered tag count for this device, excluding `_System/*`. |
| `_DeviceError` | String | ViewOnly | `DriverHealth.LastError` | Most recent error message; empty when the device is healthy. |
| `_LastScanTimeMs` | Float64 | ViewOnly | `ReadAsync` wall-clock | Duration of the most-recent `ReadAsync` iteration on this device. |
| `_RefreshTagDb` | Boolean | **Operate** | n/a (write-only trigger) | PR abcip-4.4 — Kepware-style refresh trigger. Reads always return `false`. Writing any truthy value (`true`, non-zero number, `"true"` / `"1"` strings, case-insensitive) dispatches to `RebrowseAsync` against the device's cached `IAddressSpaceBuilder`. Falsy / unparseable writes are no-ops that report `Good` so a UI that resets the trigger flag doesn't see a phantom error. The `AbCip.RefreshTriggers` diagnostic counter increments per truthy write. |
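The truthy / falsy write semantics of `_RefreshTagDb` can be sketched as below. This is an illustrative Python model, not the driver's parser: `is_truthy_trigger` and `write_refresh_tag_db` are hypothetical names, `rebrowse` stands in for the `RebrowseAsync` dispatch, and whitespace handling in strings is left unspecified.

```python
from typing import Callable, Dict


def is_truthy_trigger(value: object) -> bool:
    """True for `true`, non-zero numbers, and "true" / "1" (case-insensitive)."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0
    if isinstance(value, str):
        return value.lower() in ("true", "1")
    # None and any other type are unparseable, so they count as falsy.
    return False


def write_refresh_tag_db(value: object,
                         rebrowse: Callable[[], None],
                         counters: Dict[str, int]) -> str:
    """Handle one write to the trigger and return the status reported to the client."""
    if is_truthy_trigger(value):
        counters["AbCip.RefreshTriggers"] = counters.get("AbCip.RefreshTriggers", 0) + 1
        rebrowse()
    # Falsy and unparseable writes are no-ops that still report Good.
    return "Good"
```

Reporting `Good` for falsy writes means a UI that writes `false` to reset its own trigger flag never sees a spurious error.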
### When the snapshot updates
- **Probe transitions** — every `Running ↔ Stopped` flip refreshes the
device's snapshot inline, so a client subscribed to
`_System/_ConnectionStatus` sees the new state on the next OPC UA
publish tick.
- **Read iterations** — `ReadAsync` recomputes `_LastScanTimeMs` per
device that owned at least one reference in the batch + writes a fresh
snapshot before returning.
- **Driver init** — every device gets a seeded snapshot
(`Unknown` / `0` / `""`) before the probe loop spins up so a read that
arrives before the first probe iteration returns a stable shape rather
than null.
### Browse + read example
```powershell
# Browse the synthetic folder
otopcua-client-cli browse -u opc.tcp://localhost:4840 `
-n "ns=2;s=AbCip/ab://10.0.0.5/1,0/_System"
# Read the connection status
otopcua-client-cli read -u opc.tcp://localhost:4840 `
-n "ns=2;s=AbCip/ab://10.0.0.5/1,0/_System/_ConnectionStatus"
```
The driver-side reference embeds the device host address (the
`_System/<device>/<name>` form) so the dispatcher can route by device
without an additional registry. PR abcip-4.4 turned `_RefreshTagDb` into
a writeable refresh trigger; the rest of the surface remains `ViewOnly`.
### Refreshing the tag DB via OPC UA write
PR abcip-4.4 wires `_RefreshTagDb` to the same `RebrowseAsync` entry point
the CLI's `rebrowse` command exercises (issue #233). Operators have two
roughly-equivalent ways to force a controller-side `@tags` re-walk after a
program download:
```powershell
# Path A — OPC UA write to the system tag (production / Admin UI path)
otopcua-client-cli write -u opc.tcp://localhost:4840 `
-n "ns=2;s=AbCip/ab://10.0.0.5/1,0/_System/_RefreshTagDb" `
-v true --type Boolean
# Path B — direct CLI rebrowse against a transient driver (admin / debug path)
otopcua-abcip-cli rebrowse -g ab://10.0.0.5/1,0
```
Both paths drop the UDT template cache + re-run the enumerator walk. Path A
is the operator-facing surface (the same `IDriverControl.RebrowseAsync`
contract, just dispatched from the OPC UA write surface instead of an
in-process call). Path B spins up its own driver instance so it doesn't
share the live server's cache, which makes it useful for one-off
controller-side validation.
The `AbCip.RefreshTriggers` driver-diagnostics counter increments per
successful truthy write, so the Admin UI / driver-diagnostics RPC can show
a "Refreshes since boot" tile that pairs naturally with the existing
`WritesSuppressed` / `WritesPassedThrough` write-coalescer counters.
### Verification
- **Unit**: `AbCipSystemTagSourceTests`
(`tests/.../AbCip.Tests`) — covers snapshot round-trip, two-device
isolation, recognised-name lookup, default-shape on unseeded devices,
discovery emits the six canonical nodes, and `ReadAsync` dispatches
through the source instead of libplctag.
- **Unit**: `AbCipRefreshTagDbTests`
(`tests/.../AbCip.Tests`) — PR abcip-4.4 — covers discovery emits the
trigger as Operate, reads always return `false`, truthy/falsy/null write
semantics, the `AbCip.RefreshTriggers` counter, two-device counter
independence, defence-in-depth `BadNotWritable` for read-only system
variables, no-op-Good when no builder is cached yet, and mixed-batch
routing alongside ordinary tag writes.
- **Integration**: `AbCipSystemTagDiscoveryTests`
(`tests/.../AbCip.IntegrationTests`) — `[AbServerFact]` connects to a
real `ab_server`, browses `_System/`, reads each variable, asserts
every one returns Good with a non-null value.
- **Integration**: `AbCipRefreshTagDbTests`
(`tests/.../AbCip.IntegrationTests`) — PR abcip-4.4 — `[AbServerFact]`
drives a `_RefreshTagDb` write, asserts the template cache drops + the
per-device counter advances against a live `ab_server`.
- **E2E**: `scripts/e2e/test-abcip.ps1` — see the *SystemTagBrowse* +
*RefreshTagDbWrite* assertions.
@@ -1,405 +0,0 @@
# AB CIP — Performance knobs
Phase 3 of the AB CIP driver plan introduces a small set of operator-tunable
performance knobs that change how the driver talks to the controller without
altering the address space or per-tag semantics. They consolidate decisions
that Kepware exposes as a slider / advanced page so deployments running into
high-latency PLCs, narrow-CPU CompactLogix parts, or legacy ControlLogix
firmware have an explicit lever to pull.
This document is the home for those knobs as PRs land. PR abcip-3.1 ships the
first knob: per-device **CIP Connection Size**.
## Connection Size
### What it is
CIP Connection Size — the byte ceiling on a single Forward Open response
fragment, set during the EtherNet/IP Forward Open handshake. Larger
connection sizes pack more tags into a single CIP RTT (higher request-packing
density, fewer round-trips for the same scan list); smaller connection sizes
stay compatible with legacy or narrow-buffer firmware that rejects oversized
Forward Open requests.
### Family defaults
The driver picks a Connection Size from the per-family profile when the
device-level override is unset:
| Family | Default | Rationale |
|---|---:|---|
| `ControlLogix` | `4002` | Large Forward Open — FW20+ |
| `GuardLogix` | `4002` | Same wire protocol as ControlLogix |
| `CompactLogix` | `504` | 5069-L1/L2/L3 narrow-buffer parts (5370 family) |
| `Micro800` | `488` | Hard cap on Micro800 firmware |
These map straight to libplctag's `connection_size` attribute and match the
defaults Kepware uses out of the box for the same families.
### Override knob
`AbCipDeviceOptions.ConnectionSize` (`int?`, default `null`) overrides the
family default for one device. Bind it through driver config JSON:
```json
{
"Devices": [
{
"HostAddress": "ab://10.0.0.5/1,0",
"PlcFamily": "ControlLogix",
"ConnectionSize": 504
}
]
}
```
The override threads through every libplctag handle the driver creates for
that device — read tags, write tags, probe tags, UDT-template reads, the
`@tags` walker, and BOOL-in-DINT parent runtimes. There is no per-tag
override; one Connection Size applies to the whole controller (matches CIP
session semantics).
### Valid range
`[500..4002]` bytes. This matches the slider Kepware exposes for the same
family. Values outside the range fail driver `InitializeAsync` with an
`InvalidOperationException` — there's no silent clamp; misconfigured devices
fail loudly so operators see the problem at deploy time.
| Value | Behaviour |
|---|---|
| `null` | Use family default (4002 / 504 / 488) |
| `499` or below | Driver init fault — out-of-range |
| `500..4002` | Threaded through to libplctag |
| `4003` or above | Driver init fault — out-of-range |
### Legacy-firmware caveat
ControlLogix firmware **v19 and earlier** caps the CIP buffer at **504
bytes** — Connection Sizes above that cause the controller to reject the
Forward Open with CIP error 0x01/0x113. The 5069-L1/L2/L3 CompactLogix narrow
parts are subject to the same cap.
The driver emits a warning via `AbCipDriverOptions.OnWarning` when the
configured Connection Size **exceeds 511** *and* the device's family profile
default is also at-or-below the legacy cap (i.e. CompactLogix with default
504, or Micro800 with default 488). Production hosting should wire
`OnWarning` to the application logger; the unit tests (`AbCipConnectionSizeTests`)
collect into a list to assert which warnings fired.
The warning fires once per device at `InitializeAsync`. It does not block
initialisation — operators may need the override anyway when running newer
CompactLogix firmware that does support the larger Forward Open. The
controller will reject the connection at runtime if it can't honour the size,
and that surfaces through the standard `IHostConnectivityProbe` channel.
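
As a rough sketch of the warning rule described above (Python for illustration only; `legacy_warning` and the constant names are assumptions, while the `511` and `504` figures are taken from the text):

```python
from typing import Optional

LEGACY_CAP = 504       # FW19-and-earlier CIP buffer cap, in bytes
WARN_THRESHOLD = 511   # configured sizes above this are candidates for a warning


def legacy_warning(family_default: int, configured: Optional[int]) -> Optional[str]:
    """Return a warning message, or None when no warning applies."""
    if configured is None:
        # Family default in use; nothing was overridden.
        return None
    if configured > WARN_THRESHOLD and family_default <= LEGACY_CAP:
        return (
            f"ConnectionSize {configured} exceeds the legacy cap of "
            f"{LEGACY_CAP} bytes for a narrow-buffer family"
        )
    return None
```

The function only reports; it never blocks initialisation, matching the behaviour described for `OnWarning`.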
### Performance trade-off
| Larger Connection Size | Smaller Connection Size |
|---|---|
| More tags per CIP RTT — higher throughput | Compatible with legacy / narrow firmware |
| Bigger buffers held by libplctag native (RSS impact) | Lower memory footprint |
| Forward Open rejected on FW19- ControlLogix | Always works (assuming ≥500) |
| Required for high-density scan lists | Forces more round-trips — higher latency |
For most FW20+ ControlLogix shops, the default `4002` is correct and the
override is unnecessary. The override is mainly useful when:
1. **Migrating off Kepware** with a controller-specific slider value already
tuned in production — set Connection Size to match.
2. **Mixed-firmware fleets** where some controllers are still on FW19 — set
the legacy controllers explicitly to `504`.
3. **CompactLogix L1/L2/L3** running newer firmware that supports a larger
Forward Open than the family-default 504 — bump the override up.
4. **Micro800** never goes above `488`; the override is for documentation /
discoverability rather than capability change.
### libplctag wrapper limitation
The libplctag .NET wrapper (1.5.x) does not expose `connection_size` as a
public `Tag` property. The driver propagates the value via reflection on the
wrapper's internal `NativeTagWrapper.SetIntAttribute("connection_size", N)`
after `InitializeAsync` — equivalent to libplctag's
`plc_tag_set_int_attribute`. Because libplctag native parses
`connection_size` only at create time, this is **best-effort** until either:
- the libplctag .NET wrapper exposes `ConnectionSize` directly (planned in
the upstream backlog), in which case the reflection no-ops cleanly, or
- libplctag native gains post-create hot-update for `connection_size`, in
which case the call lands as intended.
In the meantime the value is correctly stored on `DeviceState.ConnectionSize`
+ surfaces in every `AbCipTagCreateParams` the driver builds, so the override
is observable end-to-end through the public driver surface and unit tests
even if the underlying wrapper isn't yet honouring it on the wire.
Operators who need *guaranteed* Connection Size enforcement against FW19
controllers today can pin `libplctag` to a wrapper version that exposes
`ConnectionSize` once one is available, or run a libplctag native build
patched for runtime updates. Both paths are tracked in the AB CIP plan.
### See also
- [`docs/Driver.AbCip.Cli.md`](../Driver.AbCip.Cli.md) — AB CIP CLI uses the
family default ConnectionSize on each invocation; per-device overrides only
apply through the driver's device-config JSON, not the CLI's command-line.
- [`docs/drivers/AbServer-Test-Fixture.md`](AbServer-Test-Fixture.md) §5 —
ab_server simulator does not enforce the narrow CompactLogix cap, so
Connection Size correctness is verified by unit tests + Emulate-rig live
smokes only.
- [`PlcFamilies/AbCipPlcFamilyProfile.cs`](../../src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/PlcFamilies/AbCipPlcFamilyProfile.cs) —
per-family default values.
- [`AbCipConnectionSize`](../../src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/AbCipConnectionSize.cs) —
range bounds + legacy-firmware threshold constants.
## Addressing mode
### What it is
CIP exposes two equivalent ways to address a Logix tag on the wire:
1. **Symbolic** — the request carries the tag's ASCII name and the controller
parses + resolves the path on every read. This is the libplctag default
and what every previous driver build has used.
2. **Logical** — the request carries a CIP Symbol Object instance ID (a small
integer assigned by the controller when the project was downloaded). The
controller skips ASCII parsing entirely; the lookup is a single
instance-table dereference.
Logical addressing is faster on the controller side and produces smaller
request frames. The trade-off is that the driver has to learn the
name → instance-id mapping once, by reading the `@tags` pseudo-tag at
startup, and the resolution step has to repeat after a controller program
download (instance IDs are re-assigned).
### Enum values
`AbCipDeviceOptions.AddressingMode` (`AddressingMode` enum, default
`Auto`) takes one of three values:
| Value | Behaviour |
|---|---|
| `Auto` | Driver picks. **Currently resolves to `Symbolic`** — a future PR will plumb a real auto-detection heuristic (firmware version + symbol-table size). |
| `Symbolic` | Force ASCII symbolic addressing on the wire. The historical default. |
| `Logical` | Use CIP logical-segment / instance-ID addressing. Triggers a one-time `@tags` walk at the first read; subsequent reads consult the cached map. |
`Auto` is documented as "Symbolic-for-now" so deployments setting `Auto`
explicitly today will silently flip to a real heuristic when one ships,
matching the spirit of the toggle. Operators who want to pin the wire
behaviour should set `Symbolic` or `Logical` directly.
### Family compatibility
Logical addressing depends on the controller implementing CIP Symbol Object
class 0x6B with stable instance IDs. Older AB families don't:
| Family | Logical addressing supported? | Why |
|---|---|---|
| `ControlLogix` | yes | Native class 0x6B support, FW10+ |
| `CompactLogix` | yes | Same wire protocol as ControlLogix |
| `GuardLogix` | yes | Same wire protocol; safety partition is tag-level, not addressing-level |
| `Micro800` | **no** | Firmware does not implement class 0x6B; instance-ID reads trip CIP "Path Segment Error" 0x04 |
| `SLC500` / `PLC5` | **no** | Pre-CIP families; PCCC bridging only — no Symbol Object at all |
When `AddressingMode = Logical` is set on an unsupported family, the driver
**falls back to Symbolic with a warning** (via `OnWarning`) instead of
faulting. This keeps mixed-firmware deployments working — operators can ship
a uniform "Logical" config across the fleet and let the driver downgrade
the families that can't honour it.
The driver-level decision is exposed via
`PlcFamilies.AbCipPlcFamilyProfile.SupportsLogicalAddressing` and resolved at
`AbCipDriver.InitializeAsync` time; the resolved mode is stored on
`DeviceState.AddressingMode` and threaded through every
`AbCipTagCreateParams` from then on.
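
The resolution rules in this section can be summarised in a short sketch. It is written in Python purely for illustration; `resolve_addressing_mode` and `SUPPORTS_LOGICAL` are hypothetical names, and the capability values mirror the family table above.

```python
from typing import Callable

# Mirrors the "Family compatibility" table; families not listed are
# treated as unsupported in this sketch.
SUPPORTS_LOGICAL = {
    "ControlLogix": True,
    "CompactLogix": True,
    "GuardLogix": True,
    "Micro800": False,
}


def resolve_addressing_mode(
    requested: str, family: str, on_warning: Callable[[str], None]
) -> str:
    """Resolve the configured mode to the one actually used for a device."""
    if requested == "Auto":
        # Auto currently resolves to Symbolic.
        return "Symbolic"
    if requested == "Logical" and not SUPPORTS_LOGICAL.get(family, False):
        # Unsupported family: downgrade with a warning instead of faulting.
        on_warning(f"{family} does not support Logical addressing; using Symbolic")
        return "Symbolic"
    return requested
```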
### One-time symbol-table walk
The first read on a Logical-mode device triggers a one-time `@tags` walk via
`LibplctagTagEnumerator` (the same component used for opt-in controller
browse). The driver caches the resulting name → instance-id map on
`DeviceState.LogicalInstanceMap`; subsequent reads consult the cache without
issuing another walk. The walk is gated by a per-device `SemaphoreSlim` so
parallel first-reads serialise on a single dispatch.
The walk happens in `AbCipDriver.EnsureLogicalMappingsAsync` and runs only
for devices that have actually resolved to `Logical`. Symbolic-mode devices
skip the walk entirely. Walk failures are non-fatal: the
`LogicalWalkComplete` flag still flips to `true` so the driver does not
re-attempt indefinitely, and per-tag handles fall back to Symbolic addressing
on the wire (libplctag's default).
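
A minimal sketch of the gating described above, using Python's `asyncio.Lock` where the driver uses a `SemaphoreSlim`. The class and function names are hypothetical stand-ins for `DeviceState` and `EnsureLogicalMappingsAsync`, and `walk` is any coroutine function that returns a name-to-instance-ID dictionary.

```python
import asyncio


class DeviceState:
    """Hypothetical per-device state holding the cached symbol map."""

    def __init__(self):
        self.logical_instance_map = {}
        self.logical_walk_complete = False
        self.walk_lock = asyncio.Lock()


async def ensure_logical_mappings(state, walk):
    """Run the symbol-table walk at most once per device."""
    if state.logical_walk_complete:
        return  # fast path: the walk was already attempted
    async with state.walk_lock:
        if state.logical_walk_complete:
            return  # another first-read finished the walk while we waited
        try:
            state.logical_instance_map = await walk()
        except Exception:
            # Non-fatal: leave the map empty so reads stay Symbolic.
            state.logical_instance_map = {}
        finally:
            # Flip the flag even on failure so the walk is not retried forever.
            state.logical_walk_complete = True
```

The second check inside the lock is what makes parallel first-reads collapse into a single walk.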
A controller program download invalidates the instance IDs. There is no
auto-invalidation today. `RebrowseAsync` (the same surface that clears the
UDT template cache) is slated to trigger a fresh walk once its Logical-mode
plumbing is extended in a future PR; until then, restarting the driver after
a download is the recommended workflow.
### libplctag wrapper limitation
The libplctag .NET wrapper (1.5.x) does **not** expose a public knob for
instance-ID addressing. The driver translates Logical-mode params into
libplctag attributes via reflection on
`NativeTagWrapper.SetAttributeString("use_connected_msg", "1")` +
`SetAttributeString("cip_addr", "0x6B,N")` — same best-effort fallback
pattern as the Connection Size knob.
This means **Logical mode is observable end-to-end through the public
driver surface and unit tests today**, but the actual wire behaviour
remains Symbolic until either:
- the upstream libplctag .NET wrapper exposes the
`UseConnectedMessaging` + `CipAddr` properties on `Tag` directly
(planned in the upstream backlog), in which case the reflection no-ops
cleanly, or
- libplctag native gains post-create hot-update for `cip_addr`, in which
case the call lands as intended.
The driver-level bookkeeping (resolved mode, instance-id map, family
compatibility, fall-back warning) is fully wired so the upgrade path is
purely a wrapper-version bump.
### Performance trade-off
| Symbolic addressing | Logical addressing |
|---|---|
| Works everywhere | Requires Symbol Object class 0x6B |
| ASCII parse on every read (controller-side cost) | One-time walk; instance-id lookup thereafter |
| No first-read latency | First read on a device pays the `@tags` walk |
| Smaller code surface | Stale on program download — restart driver to re-walk |
| Best for small / sparse tag sets | Best for >500-tag scans with stable controller |
For scan lists in the **single-digit-tag** range, the per-poll ASCII parse
cost is invisible. For **medium** scan lists (~100 tags) the gain is real
but small, typically 5–10% per CIP RTT depending on tag-name length. The
break-even point is where the ASCII-parse overhead starts dominating,
roughly **>500 tags** in a tight scan loop, which is also where libplctag's
own request-packing benefits compound. Large MES / batch projects with
many UDT instances are the canonical case.
### Driver config JSON
Bind the toggle through the driver-config JSON:
```json
{
  "Devices": [
    {
      "HostAddress": "ab://10.0.0.5/1,0",
      "PlcFamily": "ControlLogix",
      "AddressingMode": "Logical"
    }
  ]
}
```
`"Auto"`, `"Symbolic"`, and `"Logical"` parse case-insensitively. Omitting
the field defaults to `"Auto"`.
### See also
- [`AbCipDriverOptions.AddressingMode`](../../src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/AbCipDriverOptions.cs) —
enum definition + per-value docstrings.
- [`AbCipPlcFamilyProfile.SupportsLogicalAddressing`](../../src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/PlcFamilies/AbCipPlcFamilyProfile.cs) —
family compatibility table source-of-truth.
- [`docs/drivers/AbServer-Test-Fixture.md`](AbServer-Test-Fixture.md) §
"What it actually covers" — Logical-mode fixture coverage status.
- [`AbCipAddressingModeBenchTests`](../../tests/ZB.MOM.WW.OtOpcUa.Driver.AbCip.IntegrationTests/AbCipAddressingModeBenchTests.cs) —
scaffold for the wall-clock comparison; gated on `[AbServerFact]`.
## Read strategy (PR abcip-3.3)
A per-device toggle that controls how multi-member UDT batches are read.
The default `Auto` value matches every previous build's behaviour for dense
reads but switches to per-member bundling when only a handful of members of
a large UDT are subscribed — the canonical "5 of 50" sparse-subscription
case where reading the whole UDT buffer just to extract a few fields wastes
wire bandwidth.
### Three modes
| Mode | When to use |
|---|---|
| `WholeUdt` | Most members of every subscribed UDT are read together. One libplctag read per parent UDT, members decoded in-memory at their byte offsets. The task #194 default. |
| `MultiPacket` | A few members of a large UDT are subscribed at a time. One read per subscribed member, bundled per parent into one CIP Multi-Service Packet. |
| `Auto` (default) | Planner picks per-batch from the subscribed-member fraction (see *Sparsity threshold*). |
### Sparsity threshold
Auto mode divides `subscribedMembers / totalMembers` for each parent UDT and
picks `MultiPacket` when the fraction is **strictly less than** the
threshold, else `WholeUdt`. Default threshold `0.25` — a 1/4 subscription is
the rough break-even where the wire-cost of one whole-UDT read still beats
N member reads on a ControlLogix 4002-byte connection-size buffer; above
1/4, the per-member overhead dominates.
Tune via `AbCipDeviceOptions.MultiPacketSparsityThreshold` (clamped to
`[0..1]`). Threshold `0.0` = "never MultiPacket"; `1.0` = "MultiPacket
whenever fewer than all members are subscribed" (the comparison is strict, so
a fully subscribed UDT still reads as `WholeUdt`).
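
The Auto-mode rule can be written down in a few lines. This is an illustrative Python sketch, not the planner's actual code; `pick_read_strategy` is a hypothetical name, and treating a UDT with zero members as `WholeUdt` is an assumption made here to avoid a division by zero.

```python
def pick_read_strategy(subscribed: int, total: int, threshold: float = 0.25) -> str:
    """Pick a per-batch strategy from the subscribed-member fraction."""
    # Clamp the threshold to [0..1], as the option is documented to be.
    threshold = min(1.0, max(0.0, threshold))
    if total <= 0:
        return "WholeUdt"  # assumption: nothing to divide by, keep the default
    fraction = subscribed / total
    # Strictly-less-than comparison: a fraction equal to the threshold
    # stays on WholeUdt.
    return "MultiPacket" if fraction < threshold else "WholeUdt"
```

For the "5 of 50" case the fraction is 0.1, which is below 0.25 and selects `MultiPacket`; 10 of 40 is exactly 0.25 and stays on `WholeUdt`.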
### Family compatibility
`MultiPacket` requires CIP service `0x0A` (Multi-Service Packet) on the
controller. Source of truth is
[`AbCipPlcFamilyProfile.SupportsRequestPacking`](../../src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/PlcFamilies/AbCipPlcFamilyProfile.cs):
| Family | `SupportsRequestPacking` |
|---|---|
| ControlLogix | yes |
| CompactLogix | yes |
| GuardLogix | yes (wire identical to ControlLogix) |
| Micro800 | **no** |
| SLC500 / PLC5 (when those profiles ship) | **no** |
User-forced `MultiPacket` against a non-packing family logs a warning at
device init and falls back to `WholeUdt`. `Auto` against a non-packing
family stays `Auto` at the device level — the per-batch heuristic caps the
strategy to `WholeUdt` so the wire never sees a Multi-Service-Packet against
a controller that can't decode it.
### libplctag wrapper limitation
The libplctag .NET wrapper (1.5.x) does not expose the `0x0A` service as a
public knob — same wrapper-version constraint that gates PR abcip-3.1's
`connection_size` and PR abcip-3.2's instance-ID addressing. Today's
MultiPacket runtime therefore issues N libplctag reads sequentially when
the planner picks the strategy; the wire-level bundling lands cleanly when
an upstream wrapper release exposes the primitive.
The driver-level bookkeeping (resolved strategy, per-batch heuristic,
family-compat fall-back, per-device dispatch counters) is fully wired so
the upgrade path is a wrapper-version bump only — the planner already
produces the right plan, and `AbCipMultiPacketReadPlanner.Build` is
covered by unit tests that pin the plan shape rather than wire bytes.
### Driver config JSON
```json
{
  "Devices": [
    {
      "HostAddress": "ab://10.0.0.5/1,0",
      "PlcFamily": "ControlLogix",
      "ReadStrategy": "Auto",
      "MultiPacketSparsityThreshold": 0.25
    }
  ]
}
```
`"Auto"`, `"WholeUdt"`, and `"MultiPacket"` parse case-insensitively.
Omitting the field defaults to `"Auto"`. Omitting
`MultiPacketSparsityThreshold` defaults to `0.25`.
### See also
- [`AbCipDriverOptions.ReadStrategy`](../../src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/AbCipDriverOptions.cs) —
enum definition + per-value docstrings.
- [`AbCipMultiPacketReadPlanner`](../../src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/AbCipMultiPacketReadPlanner.cs) —
plan shape + Auto-mode heuristic.
- [`AbCipPlcFamilyProfile.SupportsRequestPacking`](../../src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/PlcFamilies/AbCipPlcFamilyProfile.cs) —
family compatibility table source-of-truth.
- [`AbCipReadStrategyTests`](../../tests/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Tests/AbCipReadStrategyTests.cs) —
device-init resolution, heuristic edges, dispatch counters, DTO round-trip.
- [`AbCipEmulateMultiPacketReadTests`](../../tests/ZB.MOM.WW.OtOpcUa.Driver.AbCip.IntegrationTests/Emulate/AbCipEmulateMultiPacketReadTests.cs) —
golden-box-tier wire-level coverage scaffold; gated on `AB_SERVER_PROFILE=emulate`.
-141
@@ -1,141 +0,0 @@
# AB Legacy — DH+ via 1756-DHRIO bridging
PR ablegacy-13 / [#256](https://github.com/dohertj2/lmxopcua/issues/256). The AB
Legacy driver can address a PLC-5 sitting on a DH+ link by routing CIP requests
through a 1756-DHRIO module installed in a ControlLogix chassis. This is the
canonical way to keep an installed-base PLC-5 fleet alive after the chassis-
level migration to ControlLogix; the DHRIO module exposes a DH+ "side" that
talks to the legacy PLC-5 / SLC-DH+ peers and a backplane "side" that the
ControlLogix CPU + Ethernet bridge can route through.
## Wire layout
```
OtOpcUa server ──EtherNet/IP──► 1756-EN2T (slot 0) ──backplane──► 1756-DHRIO (slot N) ──DH+──► PLC-5
```
Two CIP hops:
1. **Backplane** — port `1`, slot `<N>` (the slot the DHRIO module lives in).
2. **DH+** — port `2`, station `<S>` (the DH+ node address of the target PLC-5,
in **octal**).
Resulting CIP path: `1,<N>,2,<S>`.
> The first port `1` is always the backplane; port `2` is the DH+ side of the
> 1756-DHRIO module. This mirrors the convention Rockwell uses in RSLinx + RSLogix
> 5.
## Octal station number
The DH+ network was specified with **octal** node addresses. Rockwell tooling
displays them in octal too (RSLogix 5 → "DH+ Node Address" field on the
controller properties dialog). The driver follows suit — the station segment
of the CIP path **must be parsed as octal** (digits 0..7 only; `8`, `9`, and
multi-byte garbage are rejected).
DH+ addresses run `0..77` octal == `0..63` decimal. Quick reference:
| Octal | Decimal | Octal | Decimal | Octal | Decimal | Octal | Decimal |
|------:|--------:|------:|--------:|------:|--------:|------:|--------:|
| 00 | 0 | 20 | 16 | 40 | 32 | 60 | 48 |
| 01 | 1 | 21 | 17 | 41 | 33 | 61 | 49 |
| 02 | 2 | 22 | 18 | 42 | 34 | 62 | 50 |
| 03 | 3 | 23 | 19 | 43 | 35 | 63 | 51 |
| 04 | 4 | 24 | 20 | 44 | 36 | 64 | 52 |
| 05 | 5 | 25 | 21 | 45 | 37 | 65 | 53 |
| 06 | 6 | 26 | 22 | 46 | 38 | 66 | 54 |
| 07 | 7 | 27 | 23 | 47 | 39 | 67 | 55 |
| 10 | 8 | 30 | 24 | 50 | 40 | 70 | 56 |
| 11 | 9 | 31 | 25 | 51 | 41 | 71 | 57 |
| 12 | 10 | 32 | 26 | 52 | 42 | 72 | 58 |
| 13 | 11 | 33 | 27 | 53 | 43 | 73 | 59 |
| 14 | 12 | 34 | 28 | 54 | 44 | 74 | 60 |
| 15 | 13 | 35 | 29 | 55 | 45 | 75 | 61 |
| 16 | 14 | 36 | 30 | 56 | 46 | 76 | 62 |
| 17 | 15 | 37 | 31 | 57 | 47 | 77 | 63 |
Anything past `77` octal (i.e. decimal > 63) is invalid on a real DH+ network
and rejected by the parser.
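
A small sketch of the octal rule, in Python for illustration; `parse_dh_plus_station` is a hypothetical name and not the driver's parser.

```python
def parse_dh_plus_station(text: str) -> int:
    """Parse a DH+ station written in octal and return its decimal value."""
    if not text or any(ch not in "01234567" for ch in text):
        # Empty input, the digits 8 and 9, and any non-digit are rejected.
        raise ValueError(f"DH+ station {text!r} is not an octal number")
    value = int(text, 8)
    if value > 0o77:
        # Valid stations run 0..77 octal, i.e. 0..63 decimal.
        raise ValueError(f"DH+ station {text!r} is past 77 octal")
    return value
```

For example, `parse_dh_plus_station("10")` yields decimal `8`, matching the quick-reference table, while `"100"` (decimal 64) is rejected.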
## PLC-5 only
DHRIO bridging is **PLC-5-only**. The driver enforces this at
`AbLegacyDriver.InitializeAsync` time: a DH+ bridge path combined with
`PlcFamily=Slc500 / MicroLogix / LogixPccc` throws
`InvalidOperationException("DHRIO bridging is PLC-5-only")` immediately rather
than letting reads silently fail with `BadCommunicationError` on the wire.
Background: the 1756-DHRIO module only speaks DH+ to PLC-5 / SLC-DH+ peers, and
libplctag's PCCC stack only exposes the PLC-5 side. SLC 5/04 boxes on DH+
**can** be physically reached through a DHRIO module, but the protocol stack
needed to drive them isn't exposed by libplctag — out of scope for this driver.
## CLI worked example
PLC-5 at DH+ node `07` (octal == 7 decimal), DHRIO module in slot 3, gateway
`192.168.1.10`:
```powershell
otopcua-ablegacy-cli probe `
-g ab://192.168.1.10/1,3,2,07 `
-P Plc5 `
-a N7:0
```
```powershell
# Read N7:10 from the PLC-5 across the DHRIO bridge
otopcua-ablegacy-cli read `
-g ab://192.168.1.10/1,3,2,07 `
-P Plc5 `
-a N7:10 `
-t Int
```
The driver surfaces the parsed bridge form on the host-address record:
`BackplaneSlot=3`, `DhPlusPort=2`, `DhPlusStation=7` (decimal-translated). Use
those values when reading driver-diagnostics output to confirm the bridge was
recognised — a non-bridge CIP path leaves all three fields null.
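
To make the mapping from CIP path to those three fields concrete, here is a hedged Python sketch. `BridgePath` and `parse_dhrio_bridge` are hypothetical names; the only facts taken from this document are the `1,<N>,2,<S>` shape, the octal station, and the null result for non-bridge paths.

```python
from typing import NamedTuple, Optional


class BridgePath(NamedTuple):
    backplane_slot: int
    dh_plus_port: int
    dh_plus_station: int  # decimal-translated


def parse_dhrio_bridge(cip_path: str) -> Optional[BridgePath]:
    """Return the bridge fields for a 1,<N>,2,<S> path, or None otherwise."""
    parts = cip_path.split(",")
    if len(parts) != 4 or parts[0] != "1" or parts[2] != "2":
        return None  # not a DHRIO bridge path, e.g. the plain "1,0"
    slot, station = parts[1], parts[3]
    if not slot or any(ch not in "0123456789" for ch in slot):
        raise ValueError(f"bad backplane slot {slot!r}")
    if not station or any(ch not in "01234567" for ch in station):
        raise ValueError(f"DH+ station {station!r} is not octal")
    station_value = int(station, 8)
    if station_value > 0o77:
        raise ValueError(f"DH+ station {station!r} is past 77 octal")
    return BridgePath(int(slot), 2, station_value)
```

With the worked example above, `parse_dhrio_bridge("1,3,2,07")` gives slot 3, port 2, station 7.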
## Manual smoke procedure
There is no automated end-to-end coverage for DH+ bridging because the only
path to wire-level validation is real hardware (libplctag's `ab_server` Docker
image doesn't simulate the DHRIO + DH+ + PLC-5 stack). The unit-test layer
covers parser positive / negative cases.
Hardware smoke checklist:
1. Confirm the 1756-DHRIO module is present in the target ControlLogix chassis.
RSLinx Classic should show `DH+, 1` under the chassis tree with the PLC-5
nodes enumerated underneath.
2. Note the DHRIO module's slot number (the `<N>` in `1,<N>,2,<S>`).
3. Note the target PLC-5's DH+ node address — read it off the front-panel switch
bank, or the controller properties in RSLogix 5. **Read it as octal**.
4. From an OtOpcUa box that can reach the EtherNet/IP gateway:
```powershell
otopcua-ablegacy-cli probe -g ab://<gateway>/1,<slot>,2,<station-octal> -P Plc5 -a S:0
```
`S:0` (status file word 0) is non-destructive and present on every PLC-5.
5. If the probe succeeds, exercise an N file read against a known
non-zero address. Compare against the value displayed in RSLogix 5 →
Online → Data → N7.
If the probe fails with `BadCommunicationError`:
- Wrong slot number — re-check via RSLinx.
- Wrong octal node — convert from RSLogix 5's display value (already octal); a
decimal-thinking conversion mistake is the most common smoke failure.
- DHRIO module's DH+ baud rate doesn't match the PLC-5's switch setting (57.6k
/ 115.2k / 230.4k); this is a hardware-configuration problem the driver can't paper over.
- A scanner on the DHRIO is in scheduled-mode and starving unscheduled
PCCC traffic — bump the DHRIO's unscheduled-message slice in RSLogix 5000.
## See also
- [`Driver.AbLegacy.Cli.md`](../Driver.AbLegacy.Cli.md) — the family / CIP-path
cheat sheet now carries a DHRIO row.
- [`drivers/AbLegacy-Test-Fixture.md`](AbLegacy-Test-Fixture.md) — DH+ bridging
is unit-only; no Docker fixture supports it.
-188
@@ -1,188 +0,0 @@
# AB Legacy diagnostic counters
Per-device diagnostic counters surface as auto-generated read-only OPC UA
variables under each device's synthetic `_Diagnostics/` folder. HMIs can bind
directly without going through a separate diagnostics RPC. Mirrors the AB CIP
`_System/` pattern from PR abcip-4.3.
Closes #253 (PR ablegacy-10).
## The nine counters
Each device managed by the `AbLegacyDriver` exposes nine read-only nodes under
`AbLegacy/<host>/_Diagnostics/<name>`. The first seven shipped in PR ablegacy-10;
`DemoteCount` + `LastDemotedUtc` arrived with PR ablegacy-12 / #255 (auto-demote
on comm failure).
| Name | Type | Semantics |
|---|---|---|
| `RequestCount` | Int64 | Total `ReadAsync` requests issued against this device. One increment per non-diagnostic reference per call, success or failure. |
| `ResponseCount` | Int64 | Successful read responses. |
| `ErrorCount` | Int64 | Failed read responses (any non-Good status). |
| `RetryCount` | Int64 | Retry attempts beyond the first per the PR 9 retry loop. A single read with two retries adds two. |
| `LastErrorCode` | Int32 | Most recent libplctag status code on a failed read; `0` when no error has been seen since the last reset. |
| `LastErrorMessage` | String | Most recent libplctag error message on a failed read; empty when no error has been seen since the last reset. |
| `CommFailures` | Int64 | Count of read failures mapped to `BadCommunicationError`. Spans transient libplctag throws + retried-out chains so operators see a single "wire fell off" counter. |
| `DemoteCount` | Int64 | **PR ablegacy-12** — cumulative auto-demote events for this device. Bumps every time the driver crosses the consecutive-failure threshold and arms a fresh cool-down window. Cumulative across `ReinitializeAsync` (preserved through redeploys) so a flapping link surfaces as a steadily climbing counter. |
| `LastDemotedUtc` | String | **PR ablegacy-12** — ISO-8601 UTC timestamp of the most recent auto-demotion. Empty string when this device has never been demoted. |
**Address shape**: `_Diagnostics/<deviceHostAddress>/<name>`
e.g. `_Diagnostics/ab://10.0.0.5/1,0/RequestCount`.
The `<deviceHostAddress>` segment is the canonical `ab://host[:port]/cip-path`
string from `AbLegacyDeviceOptions.HostAddress`. The browse path looks like
`AbLegacy/<deviceHostAddress>/_Diagnostics/<name>` — the same shape as a
user-config tag node, just under a reserved sibling folder.
## Reset behaviour
| Trigger | Effect |
|---|---|
| `ReinitializeAsync` | Every counter for every device resets to zero, plus `LastErrorMessage` clears to empty. **PR ablegacy-12 exception:** `DemoteCount` + `LastDemotedUtc` survive the reinit so an operator redeploying mid-incident doesn't lose the flapping-link history. |
| `ShutdownAsync` | All counters drop with the device map (including `DemoteCount`). |
| Driver process restart | Counters start at zero. |
| Probe transition Stopped→Running | **No automatic reset** — counters are cumulative across reconnect events so operators can spot intermittent links by watching `CommFailures` keep climbing. |
| Probe transition Demoted→Running | **PR ablegacy-12** — early-clear of the active demote window, but the cumulative `DemoteCount` stays put. |
There is no in-process "reset" RPC at the time of writing. If you need to
clear counters without a redeploy, kick a `ReinitializeAsync` from the Admin
RPC surface. The driver re-runs `EnsureDevice` for each host, so the freshly
registered counters start at zero.
## What does *not* increment counters
Reads against `_Diagnostics/<host>/<name>` are **driver-local observability**,
not field traffic — they short-circuit before the libplctag dispatch and do
NOT increment `RequestCount` or any other counter. Otherwise a 1 Hz HMI poll
of `RequestCount` would make the counter chase its own tail.
Writes against `_Diagnostics/*` are rejected with `BadNotWritable` because
every diagnostic node is `SecurityClassification.ViewOnly` — a misbehaving
SCADA template can't accidentally clobber the diagnostic surface.
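
The short-circuit can be pictured with a small sketch. It is Python for illustration only: the `read` function, the dictionary-backed counter store, and the use of `IOError` for a failed field read are all assumptions, and only three of the counters are modelled.

```python
DIAGNOSTICS_PREFIX = "_Diagnostics/"


def read(address, counters, field_read):
    """Read one reference, answering diagnostic addresses locally."""
    if address.startswith(DIAGNOSTICS_PREFIX):
        # Driver-local observability: serve the value from the counter
        # store and leave every counter untouched.
        name = address.rsplit("/", 1)[-1]
        return counters[name]
    # Field traffic: one request per reference, success or failure.
    counters["RequestCount"] += 1
    try:
        value = field_read(address)
    except IOError:
        counters["ErrorCount"] += 1
        return None
    counters["ResponseCount"] += 1
    return value
```

Polling `_Diagnostics/RequestCount` repeatedly returns the same number until a real field read happens, which is the "no chasing its own tail" property described above.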
## Collision with user tags
User-config tags must not shadow the seven reserved diagnostic names and
must not live under the synthetic `_Diagnostics/` folder. Both shapes are
rejected at `InitializeAsync` time with a clear `InvalidOperationException`:
- A tag named `RequestCount` (or any of the other six reserved names) is
rejected because it would silently never resolve at read time — the
diagnostics short-circuit wins.
- A tag whose `Address` starts with `_Diagnostics/` is rejected because the
whole prefix is owned by the auto-emitted counters.
Pick a different name (`SiteRequestCount`, `MachineRequestCount`) or a
different address path (real PCCC files like `N7:0`).
## HMI binding examples
### OPC UA Client CLI
```powershell
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- read `
-u opc.tcp://localhost:4840 `
-n "ns=2;s=AbLegacy/ab://10.0.0.5/1,0/_Diagnostics/RequestCount"
```
### AB Legacy CLI (driver-direct, no OPC UA layer)
```powershell
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Cli -- read `
-g "ab://10.0.0.5/1,0" -P Slc500 `
--address "_Diagnostics/RequestCount"
```
The driver-direct path lets you sanity-check the counter without standing up
an OPC UA server — useful when triaging a wire-level issue on the bench.
### Subscription pattern
Subscribe to all nine counters at a slow rate (e.g. 5–10 s) on a long-lived
overview dashboard, plus a faster rate (1 s) on `LastErrorMessage` /
`LastErrorCode` when actively debugging a flapping link. The diagnostics
short-circuit makes every read O(1) — there's no penalty for fast polling
of the counter itself, only the OPC UA subscription bookkeeping.
## Auto-demote on comm failure (PR ablegacy-12 / #255)
When a device fails N consecutive reads or probes the driver marks it
**Demoted** for a configurable cool-down window. Reads against a demoted
device short-circuit with `BadCommunicationError` *without invoking
libplctag* — that's the whole point of the feature: one slow PLC sharing
the driver thread can't starve faster peers reading from healthy hosts on
the same `AbLegacyDriver` instance.
### Configuration
Per-device, optional. `null` keeps the documented defaults (auto-demote
**enabled** with 3 failures / 30 s).
```jsonc
{
  "Devices": [
    {
      "HostAddress": "ab://10.0.0.5/1,0",
      "PlcFamily": "Slc500",
      "Demote": {
        "FailureThreshold": 3,   // default 3
        "DemoteForMs": 30000,    // default 30s
        "Enabled": true          // default true
      }
    }
  ]
}
```
| Knob | Default | Notes |
|---|---|---|
| `FailureThreshold` | `3` | Consecutive comm failures before the device is demoted. A successful read or probe resets the tally. Terminal failures (`BadNodeIdUnknown`, `BadTypeMismatch`, …) **do not count** — they're config / decoder mismatches, not field outages. |
| `DemoteForMs` | `30000` (30s) | Cool-down window. Reads while this is active short-circuit; a successful probe clears it early. |
| `Enabled` | `true` | Set to `false` to keep the diagnostic counters but skip the auto-throttle. The failure tally still ticks but never arms the cool-down. |
### Recovery
Three ways out of Demoted, in order of likelihood:
1. **Probe success** — the per-device probe loop (`Probe.Enabled = true`,
default address `S:0`) is the fast path. The next probe iteration after
demotion will exercise the wire; on success it clears
`DemotedUntilUtc` immediately and transitions the host to `Running`.
2. **Window expiry** — once `DemoteForMs` elapses the demote marker
clears on the next read attempt. The read goes through; if it fails,
the failure tally keeps counting from where it left off (so a
permanently-down device re-arms the window after one more consecutive
failure rather than having to repeat the full threshold).
3. **`ReinitializeAsync`** — clears `ConsecutiveFailures` +
`DemotedUntilUtc` outright. Cumulative `DemoteCount` survives.
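
The demote and recovery rules can be sketched as a small state holder. This is a Python illustration under stated assumptions, not the driver's implementation: `DemoteTracker` is a hypothetical name, time is passed in as milliseconds so the example stays deterministic, and failures observed while a window is already active are assumed not to re-arm it.

```python
class DemoteTracker:
    """Hypothetical per-device auto-demote bookkeeping."""

    def __init__(self, failure_threshold=3, demote_for_ms=30000, enabled=True):
        self.failure_threshold = failure_threshold
        self.demote_for_ms = demote_for_ms
        self.enabled = enabled
        self.consecutive_failures = 0
        self.demoted_until = None  # absolute time in ms, or None
        self.demote_count = 0      # cumulative, one bump per demotion event

    def is_demoted(self, now_ms):
        if self.demoted_until is None:
            return False
        if now_ms >= self.demoted_until:
            # Window expiry: clear the marker but keep the failure tally.
            self.demoted_until = None
            return False
        return True

    def record_failure(self, now_ms):
        self.consecutive_failures += 1
        if not self.enabled or self.is_demoted(now_ms):
            return  # disabled, or a window is already active (assumption)
        if self.consecutive_failures >= self.failure_threshold:
            self.demoted_until = now_ms + self.demote_for_ms
            self.demote_count += 1

    def record_success(self):
        # A successful read or probe resets the tally and clears the window.
        self.consecutive_failures = 0
        self.demoted_until = None

    def reinitialize(self):
        # Clears the live state; the cumulative demote_count survives.
        self.consecutive_failures = 0
        self.demoted_until = None
```

Because the tally is kept across window expiry, a device that is still down re-arms the window on its very next failure rather than after another full threshold.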
### Observability
`DemoteCount` is the headline counter — it bumps once per demotion event,
not per short-circuited read. A device that flaps every hour for a week
shows `DemoteCount = ~168` on Friday afternoon, which is the operator
signal you actually want.
`LastDemotedUtc` is the ISO-8601 UTC timestamp of the most recent
demotion. Bind it on a per-device tile alongside `DemoteCount` for
"flapping link" alerting.
### Host-state surface
A demoted device reports `HostState.Demoted` (new in PR ablegacy-12
on `Core.Abstractions/IHostConnectivityProbe.cs`). Consumers that
predate the new value (the central `HostStatusPublisher`) safely treat
it as `Stopped` — no schema migration needed.
## Cross-references
- [`AbLegacyDiagnosticTags.cs`](../../src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDiagnosticTags.cs)
— counter store + read short-circuit
- [`AbLegacyDriver.cs`](../../src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriver.cs)
— increment sites in `ReadAsync`, discovery emission in `DiscoverAsync`,
auto-demote bookkeeping in `RecordFailureAndMaybeDemote` + `ProbeLoopAsync`
- [`AbLegacy-Test-Fixture.md`](AbLegacy-Test-Fixture.md) — `AbLegacyDiagnosticsTests`
+ `AbLegacyAutoDemoteTests` + collision-rejection contract
- [AB CIP `_System/` parallel](../../src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/AbCipSystemTagSource.cs)
— same pattern with the CIP-specific six entries (incl. writeable
`_RefreshTagDb` trigger)
-163
@@ -1,163 +0,0 @@
# AB Legacy — RSLogix symbol & data-table import
ablegacy-11 / [#254](https://github.com/dohertj2/lmxopcua/issues/254) — bulk-import
RSLogix 500 / 5 symbol exports into the AB Legacy driver. Saves operators from
hand-typing every `N7:0` / `F8:12` / `B3:0/5` row of a several-hundred-tag PLC
into `appsettings.json`.
## Supported formats — v1
| Format | Status | Notes |
|---|---|---|
| `.CSV` "Database Export" | **supported** | Header columns `Symbol,Address,Description,DataType,Scope`; quoted fields, doubled-quote escapes, comment lines (`;` / `#`) all honoured |
| `.SLC` text export | **supported** | RSLogix 500's "Save As Text" emits the same column shape — point the importer at the file directly |
| `.RSS` (RSLogix 500 binary project) | **out of scope** | Proprietary; no parser ships in libplctag or any community project. Export to CSV first |
| `.RSP` (RSLogix 5 binary project) | **out of scope** | Same as `.RSS` |
The binary `.RSS` / `.RSP` non-goal isn't a "we don't have time" decision —
Rockwell's binary format is undocumented + tied to RSLogix's internal page
layout, and the only known parsers are commercial IDE plugins. v1 ships with
text/CSV only and a clean abstraction (`IRsLogixImporter`) so a binary parser
can slot in later without reshaping the call sites.
## CSV column reference
| Column | Required | Notes |
|---|---|---|
| `Symbol` | yes | OPC UA tag name. RSLogix symbols are already stable; the importer uses them verbatim |
| `Address` | yes | PCCC address. File letter implies `DataType` (see below); the importer's resolution wins over the CSV's `DataType` column |
| `Description` | no | Parsed but currently unused — `AbLegacyTagDefinition` has no `Description` field at the v2 schema layer (see [#248](https://github.com/dohertj2/lmxopcua/issues/248)). Held in the column contract for future schema bumps |
| `DataType` | no | RSLogix-supplied (`INT` / `REAL` / `BOOL` / `TIMER` / …). Ignored at import time; the importer derives the type from the file letter |
| `Scope` | no | `Global` (default when blank) or `Local:N` for ladder-file-N-scoped tags. Acts as a filter when `--scope` is set on the CLI |
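
The column contract above can be sketched roughly as follows. This is an illustrative Python sketch, not the importer's C# code: the function name `read_symbol_rows` is hypothetical, and it assumes comment lines begin with `;` or `#` and that a missing required header column is fatal. It filters lines before CSV parsing, so quoted fields that span several lines are not handled.

```python
import csv
import io

REQUIRED = ("Symbol", "Address")


def read_symbol_rows(text):
    """Yield one dict per data row, skipping blank and comment lines."""
    # Drop a UTF-8 BOM if present, then remove comment lines before CSV parsing.
    lines = [
        line
        for line in text.lstrip("\ufeff").splitlines()
        if line.strip() and not line.lstrip().startswith((";", "#"))
    ]
    reader = csv.DictReader(io.StringIO("\n".join(lines)))
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required column(s): {missing}")
    for row in reader:
        yield {
            "Symbol": (row["Symbol"] or "").strip(),
            "Address": (row["Address"] or "").strip(),
            # Parsed but unused downstream, as the table notes.
            "Description": (row.get("Description") or "").strip(),
            # A blank Scope counts as Global; DataType is deliberately ignored.
            "Scope": (row.get("Scope") or "").strip() or "Global",
        }
```

Quoted fields and doubled-quote escapes come for free from the standard `csv` module in this sketch.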
### File-letter → `AbLegacyDataType` mapping
| Letter | Example | Maps to | Notes |
|---|---|---|---|
| `N` | `N7:0` | `Int` (signed 16-bit) | |
| `F` | `F8:0` | `Float` (32-bit IEEE-754) | |
| `B` | `B3:0/0` | `Bit` | Bit-within-word also forces Bit when `BitIndex` is set |
| `L` | `L9:0` | `Long` (signed 32-bit) | SLC 5/05+ only |
| `ST` | `ST10:0` | `String` | 82-byte fixed-length + length word |
| `T` | `T4:0.ACC` | `TimerElement` | Sub-element implied by `.ACC` / `.PRE` / `.EN` / `.DN` |
| `C` | `C5:0.ACC` | `CounterElement` | |
| `R` | `R6:0.LEN` | `ControlElement` | |
| `A` | `A14:0` | `AnalogInt` | Older hardware |
| `I` / `O` / `S` | `I:0/0` | `Int` (or `Bit` with bit suffix) | I/O + status files |
| `PD` / `MG` / `PLS` / `BT` | `PD9:0` | `PidElement` etc. | Family-gated; PD/MG common on SLC500 + PLC-5; PLS/BT PLC-5 only |
| `RTC` / `HSC` / `DLS` / … | `RTC:0.YR` | `MicroLogixFunctionFile` | MicroLogix 1100 / 1400 only |
A bit suffix (`/N`) on any file letter forces `Bit`, regardless of the file
letter's normal classification — `N7:0/3` parses as Bit, not Int.
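
A minimal Python sketch of the resolution rule described above, covering only a subset of the table. The regular expression and the name `resolve_data_type` are assumptions for illustration; the real parser also handles family gating and the element types not listed here.

```python
import re

# Subset of the mapping table; values are the type names as plain strings.
FILE_LETTER_TYPES = {
    "N": "Int", "F": "Float", "B": "Bit", "L": "Long", "ST": "String",
    "T": "TimerElement", "C": "CounterElement", "R": "ControlElement",
    "A": "AnalogInt", "I": "Int", "O": "Int", "S": "Int",
}

# letters, optional file number, ':', element, optional '.SUB', optional '/bit'
_ADDRESS = re.compile(r"^([A-Z]+)(\d*):(\d+)(?:\.([A-Z]+))?(?:/(\d+))?$")


def resolve_data_type(address):
    """Return the type name for a PCCC address, or None if it does not parse."""
    match = _ADDRESS.match(address.strip().upper())
    if match is None:
        return None
    letters, bit = match.group(1), match.group(5)
    if letters not in FILE_LETTER_TYPES:
        return None
    # A /N bit suffix wins over the file letter's normal classification.
    return "Bit" if bit is not None else FILE_LETTER_TYPES[letters]
```

For example, `resolve_data_type("N7:0")` yields `"Int"` while `resolve_data_type("N7:0/3")` yields `"Bit"`.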
## Scope filter
The `Scope` column distinguishes program-scoped tags (`Local:1`, `Local:2`, …)
from globals. RSLogix exports usually mix both. The CLI's `--scope` flag (and
`ImportOptions.ScopeFilter` at the API level) keeps only the rows whose
`Scope` value matches case-insensitively; rows with no `Scope` column count as
`Global`.
```powershell
# Import only the Global symbols
otopcua-ablegacy-cli import-rslogix `
--file plc-export.csv `
--device ab://192.168.1.20/1,0 `
--scope Global
# Import only the file-2 program-scope tags
otopcua-ablegacy-cli import-rslogix `
--file plc-export.csv `
--device ab://192.168.1.20/1,0 `
--scope Local:2
```
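
The matching rule behind `--scope` can be illustrated with a short Python sketch. `scope_matches` is a hypothetical helper, not part of the CLI or of `ImportOptions`; it assumes an unset filter keeps every row.

```python
def scope_matches(row_scope, scope_filter):
    """True when the row passes the filter; no filter keeps every row."""
    if not scope_filter:
        return True
    # Rows with a blank or missing Scope count as Global.
    effective = (row_scope or "").strip() or "Global"
    return effective.casefold() == scope_filter.strip().casefold()
```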
## CLI subcommand — `import-rslogix`
```powershell
otopcua-ablegacy-cli import-rslogix --help
```
| Flag | Default | Purpose |
|---|---|---|
| `-f` / `--file` | **required** | Path to the CSV export |
| `-d` / `--device` | **required** | Canonical AB Legacy gateway URI every imported tag binds to |
| `--emit` | `appsettings-fragment` | `appsettings-fragment` (JSON) or `summary` (one-line counter) |
| `-o` / `--output` | stdout | Optional path; when set the JSON fragment is written there + summary line goes to stdout |
| `--scope` | none | Optional Scope filter (case-insensitive) |
| `--max-rows` | unlimited | Defensive cap on rows imported |
| `--strict` | off | Fail-fast on the first malformed row (default permissive: skip + log) |
### `appsettings-fragment` output shape
The default `--emit appsettings-fragment` mode writes a JSON object whose
`Tags` array is shaped like the `AbLegacyDriverConfigDto.Tags` array — paste
straight into the driver-instance config under
`Drivers/<instance>/Config/Tags`.
```json
{
"Tags": [
{
"Name": "MotorSpeed",
"DeviceHostAddress": "ab://192.168.1.20/1,0",
"Address": "N7:0",
"DataType": "Int",
"Writable": true
}
]
}
```
### Summary line
`--emit summary` writes a single line:
```
Imported 142 tag(s), skipped 3, errors 0.
```
`Skipped` covers Scope-filter rejections + missing-required-field rows; `errors`
covers rows whose `Address` failed to parse as a PCCC address.
## API surface — `IRsLogixImporter` + `AddRsLogixImport`
For server-side / bootstrap use-cases the importer is also reachable via:
```csharp
using ZB.MOM.WW.OtOpcUa.Driver.AbLegacy;
using ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Import;
var options = new AbLegacyDriverOptions
{
Devices = [new AbLegacyDeviceOptions("ab://192.168.1.20/1,0")],
};
// Append imported tags onto an existing options object.
var updated = options.AddRsLogixImport(
path: @"C:\plc\plc-export.csv",
deviceHostAddress: "ab://192.168.1.20/1,0",
out var result);
// result.ParsedCount / SkippedCount / ErrorCount surface the import telemetry.
Console.WriteLine($"Imported {result.ParsedCount} tags");
```
For a hand-managed importer instance (e.g. supplying a custom `ILogger`) call
`new RsLogixSymbolImport(logger).Parse(stream, deviceHostAddress, opts)`
directly.
## Operational notes
- The importer is **additive**: `AddRsLogixImport` concatenates onto the
existing `Tags` list rather than replacing it. Hand-rolled tags (system-status
variables, computed fields the operator added by hand) survive a re-import.
- Re-imports are not idempotent today — calling `AddRsLogixImport` twice will
produce duplicate tag rows. Operators are expected to either start from a
clean options object or de-duplicate themselves; a future schema rev may add
a `replace=true` switch.
- Description metadata is dropped on the floor — see the column reference
above. When [#248](https://github.com/dohertj2/lmxopcua/issues/248) lands a
`Description` field on `AbLegacyTagDefinition` the importer will start
populating it without further changes to the CSV contract.
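
Until a replace switch exists, de-duplication after a re-import might look like the Python sketch below. It is only an illustration: keying on the tag `Name` and keeping the first occurrence (so hand-rolled tags listed first survive) are assumptions, and `dedupe_tags` is a hypothetical helper.

```python
def dedupe_tags(tags):
    """Keep the first tag seen for each Name, preserving list order."""
    seen = set()
    unique = []
    for tag in tags:
        if tag["Name"] not in seen:
            seen.add(tag["Name"])
            unique.append(tag)
    return unique
```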
-106
@@ -36,12 +36,6 @@ supplies a `FakeAbLegacyTag`.
- `AbLegacyAddressTests` — PCCC address parsing for SLC / MicroLogix / PLC-5
/ LogixPccc-mode (`N7:0`, `F8:12`, `B3:0/5`, etc.)
- `AbLegacyArrayTests` — PR 7 array contiguous-block addressing: parser
positives + rejects for `,N` / `[N]` suffixes, options-override
(`ArrayLength`), driver `IsArray` discovery, and array decoding for N / F /
L / B files (Rockwell convention: one BOOL per word for `B3:0,10`). Latency
benchmark against the Docker fixture is a perf-flagged integration case in
`AbLegacyArrayReadTests` — runs only when ab_server is reachable.
- `AbLegacyCapabilityTests` — data type mapping, read-only enforcement
- `AbLegacyReadWriteTests` — read + write happy + error paths against the fake
- `AbLegacyBitRmwTests` — bit-within-DINT read-modify-write serialization via
@@ -49,63 +43,6 @@ supplies a `FakeAbLegacyTag`.
- `AbLegacyHostAndStatusTests` — probe + host-status transitions driven by
fake-returned statuses
- `AbLegacyDriverTests` — `IDriver` lifecycle
- `AbLegacyDiagnosticsTests` — PR ablegacy-10 / #253 per-device diagnostic
counters: 5 reads (3 ok / 2 fail) → `RequestCount=5`, `ResponseCount=3`,
`ErrorCount=2`; `LastErrorCode` reflects the most recent libplctag status;
`RetryCount` increments per retry attempt beyond the first; counters reset
on `ReinitializeAsync`; discovery emits the canonical diagnostic variables
per device under `_Diagnostics/` (now 9 with PR ablegacy-12); collision
rejection at `InitializeAsync` for user tags shadowing reserved names or
`_Diagnostics/` addresses; the `_Diagnostics/<host>/<name>` short-circuit
returns the live snapshot through `ReadAsync` without bumping
`RequestCount`; two devices keep counters independent.
- `AbLegacyAutoDemoteTests` — **PR ablegacy-12 / #255** auto-demote on comm
failure: 3 consecutive failures arm the demote window and surface
`HostState.Demoted`; subsequent reads short-circuit with
`BadCommunicationError` *without invoking libplctag* (verified via
`factory.Tags["N7:0"].ReadCount` not advancing); successful read resets
the consecutive-failure counter; failure-success-failure pattern doesn't
cross the threshold; `DemoteCount` + `LastDemotedUtc` surface via
`_Diagnostics/`; `Enabled=false` opts out (failures still count, demotion
never fires); `ReinitializeAsync` clears the active window but preserves
cumulative `DemoteCount`; cool-down expiry allows the next read through;
two devices in one driver — one faulty, one healthy — proves the faulty
side's demotion doesn't starve the healthy side; `BadNodeIdUnknown`
(terminal) does not count toward the comm-failure tally; DTO JSON
round-trip preserves `FailureThreshold` / `DemoteForMs` / `Enabled` at
the per-device level; `HostState.Demoted` enum value is wired through
`Core.Abstractions`. Companion integration test in
`tests/.../IntegrationTests/AbLegacyAutoDemoteTests.cs` runs the
two-device-one-unreachable scenario against a live ab_server fixture
using `127.0.0.1:1` as the unreachable peer.
- `RsLogixSymbolImportTests` — ablegacy-11 / #254 RSLogix CSV symbol-import parser:
canonical 8-row CSV (one row per N/F/B/L/ST/T/C/R) → 8 typed
`AbLegacyTagDefinition`s with the right `DataType`; header + comment-line
(`;` / `#`) skipping; malformed-row → log warning + skip (`IgnoreInvalid=true`
default) vs. `InvalidDataException` (`IgnoreInvalid=false`); empty stream →
empty result; UTF-8 BOM survival; embedded comma in quoted Description;
doubled-quote escape; `--scope` filter (Global vs. Local:N); `MaxRowsToImport`
cap; missing required header column → `InvalidDataException` regardless of
`IgnoreInvalid`; `TryResolveDataType` rejects garbage + bit-suffix overrides
the file letter (`N7:0/3` → Bit).
- `RsLogixSymbolImportGoldenTests` — golden-snapshot integration: loads
`Fixtures/rslogix-canonical.csv` (8-row canonical export covering every v1
file letter), serialises the resulting tag list, and compares to
`Fixtures/rslogix-canonical-expected.json`. On mismatch the actual JSON is
dumped to `%TEMP%/rslogix-canonical-actual.json` and the path printed in the
failure message so the dev can `cp` the golden after reviewing the diff.
- `AbLegacyDriverFactoryAddRsLogixImportTests` — covers the
`AbLegacyDriverFactoryExtensions.AddRsLogixImport` extension method:
appends imported tags onto an existing options object without dropping the
hand-rolled tags or the device list; mutates by-copy (immutability
guarantee); `AddRsLogixImportWithResult` tuple overload returns both the
modified options and the import counters.
- `AbLegacyDeadbandTests` — PR 8 per-tag deadband / change filter:
absolute-only suppression sequence `[10.0, 10.5, 11.5, 11.6] -> [10.0, 11.5]`,
percent-only suppression with a zero-prev short-circuit, both-set logical-OR
semantics (Kepware), Boolean edge-only publish, string change-only publish,
status-change always-publish, first-seen always-publish, ReinitializeAsync
cache wipe, JSON DTO round-trip.
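
The absolute-only suppression sequence named in the `AbLegacyDeadbandTests` entry can be reproduced with a small filter. This is a Python sketch under stated assumptions: a deadband of `1.0`, comparison against the last published value, and always publishing the first sample. The driver's actual comparison rule is not shown in this document.

```python
def filter_absolute_deadband(samples, deadband):
    """Return the samples that would be published under an absolute deadband."""
    published = []
    last = None
    for value in samples:
        # First-seen always publishes; later values must move by >= deadband.
        if last is None or abs(value - last) >= deadband:
            published.append(value)
            last = value
    return published
```

With the assumed deadband of `1.0`, the input `[10.0, 10.5, 11.5, 11.6]` reduces to `[10.0, 11.5]`, matching the sequence in the test description.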
Capability surfaces whose contract is verified: `IDriver`, `IReadable`,
`IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`,
@@ -132,17 +69,6 @@ driver-side correctness depends on libplctag being correct.
`IPerCallHostResolver` contract is verified; real PCCC wire routing across
multiple gateways is not.
### 3a. DH+ via 1756-DHRIO bridging (PR ablegacy-13 / #256)
Unit-only — coverage lives in `AbLegacyDhPlusBridgingTests`. The CIP-path
parser positive / negative cases (octal-station validation, slot bounds, port
shape) and the PLC-5-only family guard at `InitializeAsync` are exercised
against fakes. There is no Docker fixture for DH+ because libplctag's
`ab_server` doesn't simulate the DHRIO + DH+ + PLC-5 stack — wire-level
validation requires real hardware. See
[`AbLegacy-DH-Bridging.md`](AbLegacy-DH-Bridging.md) for the manual smoke
procedure.
### 4. Alarms / history
PCCC has no alarm object + no history object. Driver doesn't implement
@@ -185,33 +111,6 @@ cover the common ones but uncommon ones (`R` counters, `S` status files,
network; parts are end-of-life but still available. PLC-5 +
LogixPccc-mode behaviour + DF1 serial need specific controllers.
## Per-device options (`AbLegacyDeviceOptions`)
Each entry in `AbLegacyDriverOptions.Devices` carries:
| Field | Type | Default | Notes |
|---|---|---|---|
| `HostAddress` | string | required | `ab://host[:port]/cip-path` |
| `PlcFamily` | enum | `Slc500` | Slc500 / MicroLogix / Plc5 / LogixPccc |
| `DeviceName` | string | null | Friendly label used in browse + diagnostics |
| `Timeout` | TimeSpan? | null → driver-wide default | **PR 9 / #252** — wins over the driver-wide `Timeout`. Mix-and-match: SLC 5/01 ≈ 5 s, SLC 5/05 ≈ 2 s, MicroLogix 1100 ≈ 3 s |
| `Retries` | int? | null → driver-wide default → 0 | **PR 9 / #252** — retries on transient `BadCommunicationError`; terminal errors surface on the first attempt |
JSON shape (mirrored on `AbLegacyDeviceDto`):
```json
{
"HostAddress": "ab://192.168.1.10/1,0",
"PlcFamily": "Slc500",
"DeviceName": "slc-5-01-line-A",
"TimeoutMs": 5000,
"Retries": 1
}
```
Per-device overrides also flow into the probe loop — slow chassis won't be
falsely marked Stopped just because the driver-wide probe timeout is tight.
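
The fallback order in the table (per-device value, then driver-wide default, then `0` for retries) can be sketched in Python as follows. The helper names are hypothetical and the driver's own resolution code is not shown here.

```python
def effective_timeout_ms(device_timeout_ms, driver_timeout_ms):
    """Per-device timeout wins; otherwise fall back to the driver-wide value."""
    return driver_timeout_ms if device_timeout_ms is None else device_timeout_ms


def effective_retries(device_retries, driver_retries):
    """Per-device retries, then driver-wide retries, then 0."""
    for candidate in (device_retries, driver_retries):
        if candidate is not None:
            return candidate
    return 0
```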
## Key fixture / config files
- `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.IntegrationTests/AbLegacyServerFixture.cs`
@@ -225,10 +124,5 @@ falsely marked Stopped just because the driver-wide probe timeout is tight.
— known-limitations write-up + resolution paths
- `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Tests/FakeAbLegacyTag.cs`
— in-process fake + factory
- `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Tests/Fixtures/rslogix-canonical.csv`
— ablegacy-11 / #254 8-row canonical RSLogix CSV symbol export, one row per
v1 file letter (N/F/B/L/ST/T/C/R)
- `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Tests/Fixtures/rslogix-canonical-expected.json`
— golden snapshot the import tests compare against
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriver.cs` — scope remarks
at the top of the file
+3 -69
@@ -38,16 +38,6 @@ quirk. UDT / alarm / quirk behavior is verified only by unit tests with
- `--plc controllogix` and `--plc compactlogix` mode dispatch.
- The skip-on-missing-binary behavior (`AbServerFactAttribute`) so a fresh
clone without the simulator stays green.
- **Symbolic vs Logical addressing wall-clock** (PR abcip-3.2,
`AbCipAddressingModeBenchTests`) — both modes complete + emit timing.
**Emulate-tier only**: `ab_server` does not currently honour the CIP Symbol
Object class 0x6B `cip_addr` attribute that Logical mode sets, so on the
fixture the two modes measure the same wire path. The bench scaffold
asserts both complete + records timing for human inspection; the actual
Symbolic-vs-Logical perf comparison requires a real ControlLogix /
CompactLogix on the network. See
[`docs/drivers/AbCip-Performance.md`](AbCip-Performance.md) §"Addressing
mode" for the full caveat.
## What it does NOT cover
@@ -70,19 +60,6 @@ Unit coverage: `AbCipFetchUdtShapeTests`, `CipTemplateObjectDecoderTests`,
`AbCipDriverWholeUdtReadTests` — all with golden Template-Object byte buffers
+ offset-keyed `FakeAbCipTag` values.
PR abcip-3.3 layers a per-device **`ReadStrategy`** selector on top
(`WholeUdt` / `MultiPacket` / `Auto`, see
[`AbCip-Performance.md`](AbCip-Performance.md) §"Read strategy"). Strategy
switching is planner-side: the dispatcher picks between
`AbCipUdtReadPlanner` (whole-UDT) and `AbCipMultiPacketReadPlanner`
(per-member, bundled per parent) per batch. The selector + per-batch Auto
heuristic + family-compat fall-back + per-device dispatch counters are
**unit-tested only** in `AbCipReadStrategyTests` — `ab_server` cannot host
a 50-member UDT to exercise the sparse case the strategy is designed for,
and the libplctag .NET wrapper (1.5.x) does not expose explicit
Multi-Service-Packet bundling, so wire-level coverage stays Emulate-tier
in `AbCipEmulateMultiPacketReadTests` (gated on `AB_SERVER_PROFILE=emulate`).
### 2. ALMD / ALMA alarm projection (#177)
Depends on the ALMD UDT shape, which `ab_server` cannot emulate. The
@@ -119,15 +96,6 @@ value per PR 10, but `ab_server` accepts whatever the client asks for — the
cap's correctness is trusted from its unit test, never stressed against a
simulator that rejects oversized requests.
PR abcip-3.1 layers the **per-device `ConnectionSize` override** on top
(`AbCipDeviceOptions.ConnectionSize`, range `[500..4002]`, see
[`AbCip-Performance.md`](AbCip-Performance.md)). Same gap — `ab_server`
happily honours an oversized override against the CompactLogix profile, so
the legacy-firmware warning + Forward Open rejection that real 5069-L1/L2/L3
parts emit are unit-tested only. Live coverage stays Emulate / rig-only
(connect against a real CompactLogix L2 with `ConnectionSize=1500` to
confirm the Forward Open fails with CIP error 0x01/0x113).
### 6. BOOL-within-DINT read-modify-write (#181)
The `AbCipDriver.WriteBitInDIntAsync` RMW path + its per-parent `SemaphoreSlim`
@@ -139,48 +107,14 @@ the RMW path is not exercised end-to-end.
No smoke test for:
- `IWritable.WriteAsync` — atomic write coverage; PR abcip-4.2 added a
multi-write *suppression* smoke (jittery 5-write sequence with
`WriteDeadband: 1.0` against `ab_server`, asserting the driver's
diagnostics counter matches the expected suppression count) but pure
atomic-write coverage end-to-end is still unit-only.
- `IWritable.WriteAsync`
- `ITagDiscovery.DiscoverAsync` (`@tags` walker)
- `ISubscribable.SubscribeAsync` (poll-group engine)
- ~~`IHostConnectivityProbe` state transitions under wire failure~~ —
covered as of PR abcip-4.3. `AbCipSystemTagDiscoveryTests` connects to
`ab_server`, drives the discovery + read path against the synthetic
`_System/_ConnectionStatus` variable, and asserts the live snapshot
reflects the probe-driven `HostState`. Wire-failure transitions still
rely on unit-level `ThrowOnRead` injection rather than a real wire pull,
but the end-to-end probe → snapshot → OPC UA address-space link is
exercised against `ab_server`.
- `IHostConnectivityProbe` state transitions under wire failure
- `IPerCallHostResolver` multi-device routing
The driver implements all of these + they have unit coverage, but the only
end-to-end paths `ab_server` validates today are atomic `ReadAsync` and
write-deadband / write-on-change suppression.
### 8. ControlLogix HSBY paired-IP role probing (PR abcip-5.1)
`ab_server` has no second-chassis concept and no `WallClockTime.SyncStatus`
tag. The HSBY paired-IP role-prober (PR abcip-5.1) is unit-tested only —
`AbCipHsbyTests` drives two fake runtimes (primary + partner), pins each
chassis's role-tag value, and asserts the active-resolution rules + DTO
round-trip + diagnostics surface.
The `paired` Docker compose profile spins up two `ab_server` instances +
a stub `hsby-mux` sidecar so the topology is documented, but PR 5.2 follow-
up needs a patched `ab_server` image (or a Python shim) that actually
serves the role tag before the integration test
(`AbCipHsbyRoleProberTests`) can flip its `Assert.Skip` into a real wire
assertion. Until then the test is gated on `Category=Hsby` + skipped by
default.
Lab-rig coverage is the authoritative path — a real 1756-RM redundant
chassis pair is the only place the live `WallClockTime.SyncStatus` matrix
+ split-brain handling can be exercised end-to-end. See
[`AbCip-HSBY.md`](AbCip-HSBY.md) for the full configuration + role-tag
detection matrix.
end-to-end path `ab_server` validates today is atomic `ReadAsync`.
## Logix Emulate golden-box tier
+112 -131
@@ -2,168 +2,149 @@
Coverage map + gap inventory for the FANUC FOCAS2 CNC driver.
**TL;DR: there is no integration fixture.** Every test uses a
`FakeFocasClient` injected via `IFocasClientFactory`. Fanuc's FOCAS library
(`Fwlib32.dll`) is closed-source proprietary with no public simulator;
CNC-side behavior is trusted from field deployments.
**Status:** as of 2026-04-24, OtOpcUa speaks FOCAS2 directly over TCP
via the pure-managed [`Focas.Wire`](https://github.com/Ladder99/focas-mock/tree/main/dotnet/Focas.Wire)
client. Integration tests run the managed driver end-to-end against the
vendored `focas-mock` Python server (at
[`tests/.../Docker/focas-mock/`](../../tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/focas-mock/VENDORED.md))
whose native FOCAS Ethernet responder is verified PDU-by-PDU against the
real `fwlibe64.dll`.
## What the fixture is
No shim DLL, no P/Invoke, no licensed binary — any dev box or CI runner
with Docker can run the full fixture end-to-end.
Nothing at the integration layer.
`tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests/` is unit-only. The driver ships
as Tier C (process-isolated) per `docs/v2/driver-stability.md` because the
FANUC DLL has known crash modes; tests can't replicate those in-process.
Hardware validation against a real CNC is still useful to catch
series-specific firmware quirks (see [§ Hardware-only gaps](#hardware-only-gaps))
but the mock's wire responder covers every FOCAS call OtOpcUa issues.
## What it actually covers (unit only)
## What the fixture covers
- `FocasCapabilityTests` — data-type mapping (PMC bit / word / float,
macro variable types, parameter types)
- `FocasCapabilityMatrixTests` — per-CNC-series range validation (macro
/ parameter / PMC letter + number) across 16i / 0i-D / 0i-F /
30i / PowerMotion. See [`docs/v2/focas-version-matrix.md`](../v2/focas-version-matrix.md)
for the authoritative matrix. 46 theory cases lock every documented
range boundary — widening a range without updating the doc fails a
test.
- `FocasReadWriteTests` — read + write against the fake, FOCAS native status
→ OPC UA StatusCode mapping
### Unit layer (no container required)
`tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests/` uses `FakeFocasClient`
injected via `IFocasClientFactory`:
- `FocasCapabilityTests` — data-type mapping (PMC bit / byte / word /
long / float / double, macro variable types, parameter types)
- `FocasCapabilityMatrixTests` — per-CNC-series range validation across
16i / 0i-D / 0i-F / 30i / Power Motion, 46 theory cases locking every
documented range boundary. See
[`docs/v2/focas-version-matrix.md`](../v2/focas-version-matrix.md).
- `FocasReadWriteTests` — read / write contract against the fake, FOCAS
native status → OPC UA `StatusCode` mapping
- `FocasScaffoldingTests` — `IDriver` lifecycle + multi-device routing
- `FocasPmcBitRmwTests` — PMC bit read-modify-write synchronization (per-byte
`SemaphoreSlim`, mirrors the AB / Modbus pattern from #181)
- `FwlibNativeHelperTests` — `Focas32.dll` → `Fwlib32.dll` bridge validation
+ P/Invoke signature validation
- `FocasPmcBitRmwTests` — PMC bit read-modify-write synchronisation
- `FocasAlarmProjectionTests` — raise / clear diffing, severity mapping
- `FocasHandleRecycleTests` — proactive session recycle cadence
Capability surfaces whose contract is verified: `IDriver`, `IReadable`,
`IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`,
`IPerCallHostResolver`.
`ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`,
`IPerCallHostResolver`, `IAlarmSource`. `IWritable` intentionally
returns `BadNotWritable` — OtOpcUa is read-only against FOCAS.
Pre-flight validation runs in `FocasDriver.InitializeAsync` — configs
referencing out-of-range addresses fail at load time with a diagnostic
message naming the CNC series + documented limit. This closes the
cheap half of the hardware-free stability gap; Tier-C process
isolation (task #220) closes the expensive half — see
[`docs/v2/implementation/focas-isolation-plan.md`](../v2/implementation/focas-isolation-plan.md).
message naming the CNC series + documented limit.
## What it does NOT cover
### Integration layer (mock only, no CNC, no shim)
### 1. FOCAS wire traffic
`tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/` drives the
managed `FocasDriver` end-to-end. A single gate:
No FOCAS TCP frame is sent. `Fwlib32.dll`'s TCP-to-FANUC-gateway exchange is
closed-source; the driver trusts the P/Invoke layer per #193. Real CNC
correctness is trusted from field deployments.
**Docker compose up** — tests skip when the TCP probe to
`localhost:8193` fails with a pointer to the compose command.
### 2. Alarm / parameter-change callbacks
When the mock is up, `WireFocasClient` dials it over TCP exactly like a
real CNC, and the mock's native FOCAS Ethernet responder replies with
binary PDUs against the documented command IDs. Covered assertions:
FOCAS has no push model — the driver polls via the shared `PollGroupEngine`.
There are no CNC-initiated callbacks to test; the absence is by design.
- Session open / close (`cnc_allclibhndl3` + `cnc_freelibhndl`)
- Parameter read-back after `mock_patch` seed → `cnc_rdparam`
- Macro read-back after seed → `cnc_rdmacro` (scaled-decimal
translation verified)
- PMC range read after seed → `pmc_rdpmcrng`
- `IAlarmSource` raise + clear transitions after `mock_patch`
alarm-list changes → `cnc_rdalmmsg2`
- Fixed-tree bootstrap: identity / axes / spindle / program / timers /
servo meters populate via `cnc_sysinfo`, `cnc_rdaxisname`,
`cnc_rdspdlname`, `cnc_rddynamic2`, `cnc_exeprgname2`,
`cnc_rdblkcount`, `cnc_rdopmode`, `cnc_rdsvmeter`, `cnc_rdspload`,
`cnc_rdspmaxrpm`, `cnc_rdtimer`
- Per-series profile selection via `mock_load_profile` — tests can
pin one profile and assert series-gated capability suppression
### 3. Macro / ladder variable types
### E2E script (CLI)
FANUC has CNC-specific extensions (macro variables `#100-#999`, system
variables `#1000-#5000`, PMC timers / counters / keep-relays) whose
per-address semantics differ across 0i-F / 30i / 31i / 32i Series. Driver
covers the common address shapes; per-model quirks are not stressed.
`scripts/e2e/test-focas.ps1` drives the Client.CLI against a running
OtOpcUa server. Accepts:
### 4. Model-specific behavior
- `-CncHost` / `-CncPort` for real hardware
- `-ProfileName <compose-profile>` for the Docker mock
- `-Series <csv>` for per-series matrix mode
- `-HandleLeakCycles <N>` for handle-leak stress
- Alarm retention across power cycles (model-specific CNC behavior)
- Parameter range enforcement (CNC rejects out-of-range writes)
- MTB (machine tool builder) custom screens that expose non-standard data
## Hardware-only gaps
### 5. Tier-C process isolation — architecture shipped, Fwlib32 integration hardware-gated
The mock has parity with the real `fwlibe64.dll` for the calls OtOpcUa
issues, but a real CNC can still surface things a reference
implementation can't:
The Tier-C architecture is now in place as of PRs #169–#173 (FOCAS
PR AE, task #220):
1. **Series-specific firmware quirks** — alarm retention across power
cycles, parameter range enforcement by the CNC (not the driver),
MTB custom screens, series-specific option bits. Each series has
documented behaviours that only a bench CNC exercises.
2. **Wire-level stress** — burst reads, concurrent device writes,
network-partition recovery under load. The mock handles these
correctly but production behaviour is the source of truth.
3. **Transient operational states** — alarm floods, emergency-stop
transitions, power-on resync. These are easy to stub but hard to
cover comprehensively in the mock.
- `Driver.FOCAS.Shared` carries MessagePack IPC contracts
- `Driver.FOCAS.Host` (.NET 4.8 x86 Windows service via NSSM) accepts
a connection on a strictly-ACL'd named pipe + dispatches frames to
an `IFocasBackend`
- `Driver.FOCAS.Ipc.IpcFocasClient` implements the `IFocasClient` DI
seam by forwarding over IPC — swap the DI registration and the
driver runs Tier-C with zero other changes
- `Driver.FOCAS.Supervisor.FocasHostSupervisor` owns the spawn +
heartbeat + respawn + 3-in-5min crash-loop breaker + sticky alert
- `Driver.FOCAS.Host.Stability.PostMortemMmf` →
`Driver.FOCAS.Supervisor.PostMortemReader` — ring-buffer of the
last ~1000 IPC operations survives a Host crash
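
The "3-in-5min crash-loop breaker + sticky alert" behaviour can be illustrated with a sliding-window counter. The Python sketch below is not the `FocasHostSupervisor` implementation: the class name, the inclusive threshold and the manual `reset` are assumptions made for illustration.

```python
from collections import deque


class CrashLoopBreaker:
    """Trips when max_crashes occur within window_seconds; stays tripped until reset."""

    def __init__(self, max_crashes=3, window_seconds=300.0):
        self.max_crashes = max_crashes
        self.window_seconds = window_seconds
        self._crashes = deque()
        self.tripped = False

    def record_crash(self, now):
        """Record a crash at time `now` (seconds); return True if a respawn is allowed."""
        self._crashes.append(now)
        # Forget crashes that have fallen out of the sliding window.
        while self._crashes and now - self._crashes[0] > self.window_seconds:
            self._crashes.popleft()
        if len(self._crashes) >= self.max_crashes:
            self.tripped = True  # sticky: later quiet periods do not clear it
        return not self.tripped

    def reset(self):
        """Operator acknowledgement clears the alert and the crash history."""
        self._crashes.clear()
        self.tripped = False
```

Passing the clock in as `now` keeps the sketch deterministic, which makes the window logic easy to unit-test.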
Track the close-out under task #54 (live-CNC smoke). When the rig
lands, the hardware path runs alongside the mock path; the mock
stays as the CI quality gate.
The one remaining gap is the production `FwlibHostedBackend`: an
`IFocasBackend` implementation that wraps the licensed
`Fwlib32.dll` P/Invoke. That's hardware-gated on task #222 — we
need a CNC on the bench (or the licensed FANUC developer kit DLL
with a test harness) to validate it. Until then, the Host ships
`FakeFocasBackend` + `UnconfiguredFocasBackend`. Setting
`OTOPCUA_FOCAS_BACKEND=fake` lets operators smoke-test the whole
Tier-C pipeline end-to-end without any CNC.
## When to trust each layer
## When to trust FOCAS tests, when to reach for a rig
| Question | Unit | Integration (mock) | Real CNC |
| --- | :---: | :---: | :---: |
| "Does PMC address `R100.3` route to the right bit?" | ✅ | ✅ | ✅ |
| "Does the Fanuc status → OPC UA StatusCode map cover every documented code?" | ✅ (contract) | ✅ | ✅ |
| "Does `FocasDriver.ReadAsync` correctly decode a seeded parameter?" | no | ✅ | ✅ |
| "Does `IAlarmSource` fire raise + clear events?" | ✅ (Fake) | ✅ (wire) | ✅ |
| "Does a real read against a 30i Series return correct bytes?" | no | ✅ (via profile) | ✅ (required) |
| "Do series-specific firmware quirks behave as documented?" | no | no | ✅ (required) |
| "Does the driver survive real network partitions?" | no | partial (socket kill) | ✅ (required) |
| Question | Unit tests | Real CNC |
| --- | --- | --- |
| "Does PMC address `R100.3` route to the right bit?" | yes | yes |
| "Does the FANUC status → OPC UA StatusCode map cover every documented code?" | yes (contract) | yes |
| "Does a real read against a 30i Series return correct bytes?" | no | yes (required) |
| "Does `Fwlib32.dll` crash on concurrent reads?" | no | yes (stress) |
| "Do macro variables round-trip across power cycles?" | no | yes (required) |
## Running the integration fixture
## Alarm history (`cnc_rdalmhistry`) — issue #267, plan PR F3-a
```powershell
# 1) Start the mock on a chosen profile.
docker compose -f tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/docker-compose.yml up -d
`FocasAlarmProjection` ships two modes:
# 2) Run the tests. No shim build, no DLL copy — the driver dials the mock directly.
dotnet test tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/
```
- **`ActiveOnly`** (default) — surfaces only currently-active alarms.
No history poll. Same back-compat shape every prior FOCAS deployment used.
- **`ActivePlusHistory`** — additionally polls `cnc_rdalmhistry` on connect
+ on the configured cadence (`HistoryPollInterval`, default 5 min). Each
unseen entry fires an `OnAlarmEvent` with `SourceTimestampUtc` set from
the CNC's reported timestamp, not Now.
Unit-test coverage in `FocasAlarmProjectionTests`:
- mode `ActiveOnly` — no `ReadAlarmHistoryAsync` call ever issued
- mode `ActivePlusHistory` — first poll fires on subscribe (== "on connect")
- dedup — same `(OccurrenceTime, AlarmNumber, AlarmType)` triple across two
polls only emits once
- distinct entries with different timestamps each emit separately
- same alarm number / different type still emits both (type is part of the
dedup key)
- `OccurrenceTime` is the wire timestamp (round-trips a year-old stamp
without bleeding into Now)
- `HistoryDepth` clamp — user-supplied 500 collapses to 250 on the wire;
zero / negative falls back to the 100 default
- `FocasAlarmHistoryDecoder` — round-trips through `Encode` / `Decode` and
pins the simulator command id at `0x0F1A`
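
The dedup key and the `HistoryDepth` clamp listed above can be sketched in Python as follows. The constants come from the bullets (cap of 250, default of 100); the class and function names are hypothetical, and entries are modelled as plain dicts rather than the driver's own types.

```python
DEFAULT_DEPTH = 100
MAX_DEPTH = 250


def clamp_history_depth(requested):
    """Zero, negative or missing falls back to the default; large values are capped."""
    if requested is None or requested <= 0:
        return DEFAULT_DEPTH
    return min(requested, MAX_DEPTH)


class AlarmHistoryDeduper:
    """Remembers (OccurrenceTime, AlarmNumber, AlarmType) triples across polls."""

    def __init__(self):
        self._seen = set()

    def unseen(self, entries):
        """Return only the entries whose triple has not been emitted before."""
        fresh = []
        for entry in entries:
            key = (entry["OccurrenceTime"], entry["AlarmNumber"], entry["AlarmType"])
            if key not in self._seen:
                self._seen.add(key)
                fresh.append(entry)
        return fresh
```

Because the alarm type is part of the key, two entries with the same number and timestamp but different types both pass through.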
Future integration coverage (not yet shipped — no FOCAS integration test
project exists):
- a focas-mock with a per-profile ring buffer and `mock_patch_alarmhistory`
admin endpoint will let `cnc_rdalmhistry` round-trip end-to-end through
the wire protocol
- `FocasSimFixture.SeedAlarmHistoryAsync` will let series tests prime canned
history without per-test JSON
## Follow-up candidates
1. **Nothing public** — Fanuc's FOCAS Developer Kit ships an emulator DLL
but it's under NDA + tied to licensed dev-kit installations; can't
redistribute for CI.
2. **Lab rig** — used FANUC 0i-F simulator controller (or a retired machine
tool) on a dedicated network; only path that covers real CNC behavior.
3. **Process isolation first** — before trusting FOCAS in production at
scale, shipping the Tier-C out-of-process Host architecture (similar to
Galaxy) is higher value than a CI simulator.
Or use `scripts/integration/run-focas.ps1` which wraps compose up / test
/ compose down and accepts `-Profile <name>` to pin a per-series run.
## Key fixture / config files
- `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/focas-mock/`
— vendored `focas-mock` Python source + Dockerfile
- `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/docker-compose.yml`
— per-series compose profiles
- `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/FocasSimFixture.cs`
— collection fixture + mock admin API client
- `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Series/FixedTreePopulatesTests.cs`
— fixed-tree end-to-end tests
- `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Series/WireBackendTests.cs`
— pure-wire-backend end-to-end tests
- `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests/FakeFocasClient.cs`
in-process fake implementing `IFocasClient`
- `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests/FocasCapabilityMatrixTests.cs`
— parameterized theories locking the per-series matrix
- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FocasDriver.cs` — ctor takes
`IFocasClientFactory`
in-process unit fake
- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/Wire/WireFocasClient.cs` — the
managed wire client backing production deployments
- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FocasCapabilityMatrix.cs`
per-CNC-series range validator (the matrix the doc describes)
per-series range validator
- `docs/v2/focas-version-matrix.md` — authoritative range reference
- `docs/v2/implementation/focas-isolation-plan.md` — Tier-C isolation
plan (task #220)
- `docs/v2/driver-stability.md` — Tier C scope + process-isolation rationale
+214 -257
@@ -1,281 +1,238 @@
# FOCAS driver
# FOCAS Driver
Fanuc CNC driver for the FS 0i / 16i / 18i / 21i / 30i / 31i / 32i / 35i /
Power Mate i families. Talks to the controller via the licensed
`Fwlib32.dll` (Tier C, process-isolated per
[`docs/v2/driver-stability.md`](../v2/driver-stability.md)).
Getting-started guide for the FANUC FOCAS2 driver. This is the short path — for
the exhaustive per-node mapping read [`docs/v2/driver-specs.md §7`](../v2/driver-specs.md),
for deployment details read [`docs/v2/focas-deployment.md`](../v2/focas-deployment.md),
for the test-harness map read [FOCAS-Test-Fixture.md](FOCAS-Test-Fixture.md).
For range-validation and per-series capability surface see
[`docs/v2/focas-version-matrix.md`](../v2/focas-version-matrix.md).
## What it talks to
## Fixed-tree `Production/` projection — issue #258 (F1-b) + issue #272 (F5-a)
FANUC CNCs (0i-D / 0i-F / 0i-MF / 0i-TF / 16i / 30i / 31i / 32i / Power Motion i)
over the proprietary FOCAS2 protocol on TCP port 8193. The wire is spoken
directly by the pure-managed [`Focas.Wire`](https://github.com/Ladder99/focas-mock)
client — no Fwlib64.dll, no P/Invoke, no out-of-process isolation needed.
Per-device read-only nodes refreshed from the same `cnc_rdparam` /
cycle-timer poll the probe loop already runs. No additional wire calls
are issued for any of these — they are all cache-or-derive reads.
OtOpcUa is **read-only** against FOCAS; all reads go over the native wire
protocol using the documented command IDs. Writes return
`BadNotWritable` by design.
| Node | DataType | Source | Notes |
| --- | --- | --- | --- |
| `Production/PartsProduced` | `Int32` | `cnc_rdparam(6711)` | Active parts-count counter. Wraps to 0 on operator reset. |
| `Production/PartsRequired` | `Int32` | `cnc_rdparam(6712)` | Operator-set target. |
| `Production/PartsTotal` | `Int32` | `cnc_rdparam(6713)` | Lifetime parts counter. |
| `Production/CycleTimeSeconds` | `Int32` | `cnc_rdtimer` (channel 0) | Live cycle-time accumulator. Resets to 0 on next cycle start (CNC-side behaviour). |
| **`Production/LastCycleSeconds`** | **`Float64`** | **derived** | **Plan PR F5-a — seconds for the most recently completed cycle, computed as `CycleTimeSeconds(now) - CycleTimeSeconds(at previous parts-count increment)`. `null` until the second observed parts-count increment establishes a delta. Pure derivation, no new wire calls. See edge-case rules below.** |
| **`Production/LastCycleStartUtc`** | **`DateTime`** *(UTC)* | **derived** | **Plan PR F5-a — UTC wall-clock of the most-recent cycle's start, computed as `nowUtc - LastCycleSeconds`. `null` alongside `LastCycleSeconds` until the second observed increment.** |
## Project split
### F5-a derivation edge-case rules
| Project | Target | Role |
|---------|--------|------|
| `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/` | net10.0 | In-process driver — hosts `WireFocasClient` which speaks FOCAS2 over TCP directly |
- **First observation** establishes the baseline; `LastCycleSeconds` /
`LastCycleStartUtc` stay `null` until the second observed parts-count
increment produces the first delta.
- **Parts-count counter reset** (current value goes backwards, e.g.
shift-change zero) **preserves the last published values** so an
operator reading the tag mid-shift-change sees the last known cycle
duration rather than `null` / Bad. The next positive transition
produces a fresh delta from the new baseline.
- **Cycle-timer rollover** (delta would be negative — e.g. CNC zeroes
the cycle timer at part completion) **leaves the previously-published
values unchanged for one tick** and re-baselines so the next
increment produces a clean delta. The driver does NOT publish a
negative `LastCycleSeconds`.
- **Parts-count jumps `> 1`** (backfill — e.g. counter increments by
3 at once) publish the **timer delta over the window** as
`LastCycleSeconds`. The plan's "delta over the window between
successive parts-count increments" definition does not divide by the
count delta; the value reflects the actual elapsed timer between the
two observations.
- **Reconnect / reinit** clears the derivation state — the prior CNC
session's cycle-timer + parts-count snapshots may be invalidated by
the FWLIB session boundary, so the next post-reconnect probe tick
re-establishes the baseline before the next delta publishes.
Previous `Driver.FOCAS.Host` / `Driver.FOCAS.Shared` Tier-C split has been
retired — the managed wire client removes the native-crash blast radius
that justified the out-of-process service.
## Alarm history (`cnc_rdalmhistry`) — issue #267, plan PR F3-a
## Minimum deployment
`FocasAlarmProjection` exposes two modes via `FocasDriverOptions.AlarmProjection`:
| Mode | Behaviour |
| --- | --- |
| `ActiveOnly` *(default)* | Subscribe / unsubscribe / acknowledge wire up so capability negotiation works, but no history poll runs. Back-compat with every pre-F3-a deployment. |
| `ActivePlusHistory` | On subscribe (== "on connect") and on every `HistoryPollInterval` tick, the projection issues `cnc_rdalmhistry` for the most recent `HistoryDepth` entries. Each previously-unseen entry fires an `OnAlarmEvent` with `SourceTimestampUtc` set from the CNC's reported timestamp — OPC UA dashboards see the real occurrence time, not the moment the projection polled. |
### Config knobs
Register the driver instance in the main server's `appsettings.json`. No
separate service, no DLL deployment, no shared-secret handshake:
```jsonc
{
"AlarmProjection": {
"Mode": "ActivePlusHistory", // "ActiveOnly" (default) | "ActivePlusHistory"
"HistoryPollInterval": "00:05:00", // default 5 min
"HistoryDepth": 100 // default 100, capped at 250
}
}
```
### Dedup key
`(OccurrenceTime, AlarmNumber, AlarmType)`. The same triple across two
polls only emits once. The dedup set is in-memory and **resets on
reconnect** — first poll after reconnect re-emits everything in the ring
buffer. OPC UA clients that need exactly-once semantics dedupe client-side
on the same triple (the timestamp + type + number tuple is stable across
the boundary).
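The dedup rule above can be sketched in a few lines. This is a minimal Python illustration of the technique, not the driver's C# implementation; `AlarmHistoryEntry`, `HistoryDeduper`, and their member names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AlarmHistoryEntry:
    """Hypothetical stand-in for one decoded alarm-history record."""
    occurrence_time: datetime
    alarm_number: int
    alarm_type: int
    message: str = ""


class HistoryDeduper:
    """Remembers which (OccurrenceTime, AlarmNumber, AlarmType) triples were emitted."""

    def __init__(self):
        self._seen = set()  # in-memory only, as the text describes

    def filter_new(self, entries):
        """Return only the entries whose triple has not been seen before."""
        fresh = []
        for entry in entries:
            key = (entry.occurrence_time, entry.alarm_number, entry.alarm_type)
            if key not in self._seen:
                self._seen.add(key)
                fresh.append(entry)
        return fresh

    def reset(self):
        """Call on reconnect: the next poll re-emits the whole ring buffer."""
        self._seen.clear()
```

A client that needs exactly-once delivery across reconnects could keep the same kind of set on its own side, keyed on the same triple.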
### `HistoryDepth` cap
Capped at `FocasAlarmProjectionOptions.MaxHistoryDepth = 250` so an
operator who types `10000` by accident can't blast the wire session with a
giant request. Typical FANUC ring buffers cap at ~100 entries; the default
`HistoryDepth = 100` matches the most common ring-buffer size.
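The cap can be expressed as a small clamp. The sketch below is illustrative Python rather than the driver's code; the function name is hypothetical, and the fallback to the default for zero or negative input follows the unit-test notes in this changeset.

```python
DEFAULT_HISTORY_DEPTH = 100  # default named in the text
MAX_HISTORY_DEPTH = 250      # mirrors FocasAlarmProjectionOptions.MaxHistoryDepth


def effective_history_depth(requested: int) -> int:
    """Depth actually requested on the wire for a user-supplied value."""
    if requested <= 0:
        # Zero / negative input falls back to the default.
        return DEFAULT_HISTORY_DEPTH
    # Anything above the cap collapses to the cap.
    return min(requested, MAX_HISTORY_DEPTH)
```

With these constants, a configured value of `500` is sent as `250`, and `10000` is sent as `250` as well.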
### Wire surface
- Wire-protocol command id: `0x0F1A` (see
[`docs/v2/implementation/focas-wire-protocol.md`](../v2/implementation/focas-wire-protocol.md)).
- ODBALMHIS struct decoder: `Wire/FocasAlarmHistoryDecoder.cs`.
- Tier-C Fwlib32 backend short-circuits the packed-buffer decoder by
surfacing the FWLIB struct fields directly into
`FocasAlarmHistoryEntry`.
## Writes (opt-in, off by default) — issue #268 (F4-a) + #269 (F4-b) + #270 (F4-c)
Writes ship behind multiple independent opt-ins. All default off so a freshly
deployed FOCAS driver is read-only until the deployment makes a deliberate
choice. Decision record: [`docs/v2/decisions.md`](../v2/decisions.md) →
"FOCAS write-path opt-in".
| Knob | Default | Effect when off |
| --- | --- | --- |
| `FocasDriverOptions.Writes.Enabled` *(driver-level master switch)* | `false` | Every entry in a `WriteAsync` batch short-circuits to `BadNotWritable` with status text `writes disabled at driver level`. Wire client never gets touched. |
| **`FocasDriverOptions.Writes.AllowParameter`** *(F4-b granular kill switch)* | **`false`** | **`PARAM:` writes return `BadNotWritable` with no wire client constructed. Defense in depth — even if `Enabled = true` an operator must explicitly opt into parameter writes per kind because a misdirected `cnc_wrparam` can put the CNC in a bad state.** |
| **`FocasDriverOptions.Writes.AllowMacro`** *(F4-b granular kill switch)* | **`false`** | **`MACRO:` writes return `BadNotWritable` with no wire client constructed. Macro writes are the normal HMI-driven recipe / setpoint surface; gating them separately from `AllowParameter` lets a deployment open MACRO without exposing the heavier PARAM write surface.** |
| **`FocasDriverOptions.Writes.AllowPmc`** *(F4-c granular kill switch)* | **`false`** | **PMC writes (R/G/F/D/X/Y/K/A/E/T/C letters, both Bit and Byte) return `BadNotWritable` with no wire client constructed. PMC is ladder working memory — a mistargeted bit can move motion, latch a feedhold, or flip a safety interlock, so PMC writes are gated separately from PARAM/MACRO so an operator team can open PARAM (commissioning) without exposing the much higher-blast-radius PMC surface.** |
| `FocasTagDefinition.Writable` *(per-tag opt-in)* | `false` | The per-tag check returns `BadNotWritable` for that tag even when the driver-level flags are on. |
> **PMC SAFETY CALLOUT** — PMC is the FANUC ladder's working memory. A
> mistargeted bit can move motion (a Y-coil writing to a servo enable),
> latch a feedhold (an internal R-relay the ladder ANDs with cycle-start),
> or flip a safety interlock (an X-input shadow). **Treat PMC writes the
> same way you'd treat editing a live ladder:** verify e-stop is live and
> the machine is in jog mode before issuing the first write of a session.
> The driver gates these writes behind THREE independent opt-ins
> (`Writes.Enabled` + `Writes.AllowPmc` + per-tag `Writable`) precisely
> because the blast radius is higher than parameter writes.
### PMC bit-write read-modify-write semantics — F4-c
The FOCAS wire call `pmc_wrpmcrng` is **byte-addressed** — there is no
sub-byte write primitive. When the driver receives a write request on a
`Bit` tag (e.g. `R100.3`), it:
1. Reads the parent byte via `pmc_rdpmcrng` (1 byte at `R100`).
2. Masks the target bit (set: `current | (1 << bit)`; clear: `current & ~(1 << bit)`).
3. Writes the modified byte back via `pmc_wrpmcrng` (1 byte at `R100`).
A **per-byte semaphore** serialises concurrent bit writes against the same
byte so two updates that race never lose one another's bit. RMW means **a
PMC bit write reads first, then writes back the whole byte** — if the ladder
is also writing to that byte at the same instant, there is a small window
where the driver's value can clobber the ladder's. Operators who care about
this race must coordinate the write through a ladder-side handshake (e.g.
the operator sets a request bit, the ladder reads + clears it).
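The three read-modify-write steps, plus the per-byte serialisation, can be sketched as follows. This is a hedged Python illustration: `FakePmc` is an in-memory stand-in for the controller, `PmcBitWriter` is a hypothetical name, and a `threading.Lock` per byte stands in for the driver's per-byte semaphore.

```python
import threading
from collections import defaultdict


class FakePmc:
    """In-memory stand-in for PMC memory; not a real FOCAS client."""

    def __init__(self):
        self._bytes = defaultdict(int)

    def read_byte(self, letter, address):
        return self._bytes[(letter, address)]

    def write_byte(self, letter, address, value):
        self._bytes[(letter, address)] = value & 0xFF


class PmcBitWriter:
    """Serialises bit writes per parent byte so racing updates keep each other's bits."""

    def __init__(self, pmc):
        self._pmc = pmc
        self._locks = {}
        self._locks_guard = threading.Lock()

    def _lock_for(self, letter, address):
        # Create the per-byte lock lazily, under a guard, so two threads
        # never end up with different lock objects for the same byte.
        with self._locks_guard:
            return self._locks.setdefault((letter, address), threading.Lock())

    def write_bit(self, letter, address, bit, value):
        if not 0 <= bit <= 7:
            raise ValueError("bit index must be 0..7")
        with self._lock_for(letter, address):
            current = self._pmc.read_byte(letter, address)   # step 1: read parent byte
            if value:
                updated = current | (1 << bit)               # step 2: set the bit
            else:
                updated = current & ~(1 << bit) & 0xFF       # step 2: clear the bit
            self._pmc.write_byte(letter, address, updated)   # step 3: write byte back
```

The lock only protects against other writers inside the same process. It does nothing about the ladder writing the same byte on the CNC side, which is why the handshake advice above still applies.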
### Config shape — F4-c
```jsonc
{
"Writes": {
"Enabled": true,
"AllowParameter": true, // F4-b — opt into cnc_wrparam
"AllowMacro": true, // F4-b — opt into cnc_wrmacro
"AllowPmc": true // F4-c — opt into pmc_wrpmcrng (incl. RMW bit writes)
},
"Tags": [
{ "Name": "RPM", "Address": "PARAM:1815", "DataType": "Int32",
"Writable": true, "WriteIdempotent": false },
{ "Name": "Recipe", "Address": "MACRO:500", "DataType": "Int32",
"Writable": true, "WriteIdempotent": false },
{ "Name": "StartFlag", "Address": "R100.3", "DataType": "Bit",
"Writable": true, "WriteIdempotent": true }
]
}
```
### Server-layer ACL (LDAP groups)
Per the [`docs/v2/acl-design.md`](../v2/acl-design.md) tier model, the FOCAS
driver only declares per-tag `SecurityClassification`; `DriverNodeManager`
applies the gate. The classification post-F4-b is:
| Tag kind | Classification | LDAP group required (default mapping) |
| --- | --- | --- |
| `PARAM:N` writable | `Configure` | **`WriteConfigure`** |
| `MACRO:N` writable | `Operate` | `WriteOperate` |
| Other writable (PMC R/G/F/...) | `Operate` | `WriteOperate` |
| Non-writable | `ViewOnly` | (no write permission) |
Parameter writes need the heavier `WriteConfigure` group because they're
mostly emergency commissioning territory; macro writes use `WriteOperate`
because they're the normal HMI recipe surface. The driver-level
`AllowParameter` / `AllowMacro` kill switches sit independently of ACL — an
operator-team kill switch the deployment can flip without redeploying ACL
group memberships. See [`docs/security.md`](../security.md) for the full
group/permission map.
`WriteIdempotent` is plumbed through Polly retry by the server-layer
`CapabilityInvoker.ExecuteWriteAsync`. When `false` (default), failed writes
are NOT auto-retried per plan decisions #44/#45 — a timeout that fires after
the CNC already accepted the write would otherwise risk a duplicate
non-idempotent action (alarm acks, M-code pulses, recipe steps). Flip
`WriteIdempotent` on per tag for genuinely-idempotent writes (a parameter
value that the operator simply wants forced to a target).
### FOCAS password — issue #271 (F4-d)
Some controllers — notably 16i and certain 30i firmwares with the
parameter-protect switch on — gate `cnc_wrparam` and a handful of reads
behind a connection-level password. Without unlocking the session, every
gated wire call returns `EW_PASSWD`, which the F4-b mapping surfaces as
`BadUserAccessDenied`.
`FocasDeviceOptions.Password` plumbs the password through the device config:
```jsonc
{
"Drivers": {
"focas-cnc-1": {
"Type": "FOCAS",
"Config": {
"Backend": "wire",
"Devices": [
{
"HostAddress": "focas://10.0.0.5:8193",
"Password": "1234" // F4-d — optional CNC password
}
{ "HostAddress": "focas://10.20.30.40:8193", "Series": "ThirtyOne_i" }
],
"Tags": [
{ "Name": "Mode", "DeviceHostAddress": "focas://10.20.30.40:8193",
"Address": "PARAM:3402", "DataType": "Int32", "Writable": false },
{ "Name": "SpndLoad", "DeviceHostAddress": "focas://10.20.30.40:8193",
"Address": "MACRO:500", "DataType": "Float64", "Writable": false }
]
}
}
}
```
When set, the driver:
The main server opens two TCP sockets per configured device and speaks the
FOCAS2 binary protocol directly. No local privileged components, no
platform bitness constraint — the driver runs on every host OtOpcUa runs
on.
1. **On connect**, calls `IFocasClient.UnlockAsync(password, ct)` after
the FWLIB handle opens but before any read/write fires. The FWLIB-backed
client emits `cnc_wrunlockparam` with the password ASCII-encoded into
the 4-byte FOCAS password slot (right-padded with `0x00`, truncated at
4 bytes — that's the shape the public Fanuc samples document).
2. **On `BadUserAccessDenied` from any gated read or write**, re-issues
`UnlockAsync` and retries the call **exactly once**. A second
`EW_PASSWD` propagates unchanged so a wrong password doesn't loop
forever on the wire.
3. **Reset on reconnect** — FWLIB unlock state lives on the handle, so
any reconnect path (planned or unplanned) re-runs unlock automatically
via `EnsureConnectedAsync`.
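The password slot encoding and the retry-once rule from the list above can be sketched as follows. This is illustrative Python under stated assumptions: `AccessDenied`, `encode_password_slot`, and `call_with_unlock_retry` are hypothetical names, and the exception stands in for a call that surfaced `EW_PASSWD`.

```python
class AccessDenied(Exception):
    """Hypothetical stand-in for a gated call that returned EW_PASSWD."""


def encode_password_slot(password: str) -> bytes:
    """ASCII-encode into a 4-byte slot: truncate at 4 bytes, right-pad with 0x00."""
    raw = password.encode("ascii")  # raises UnicodeEncodeError on non-ASCII input
    return raw[:4].ljust(4, b"\x00")


def call_with_unlock_retry(gated_call, unlock):
    """Run gated_call; on AccessDenied, unlock and retry exactly once.

    A second AccessDenied propagates unchanged, so a wrong password
    cannot loop forever.
    """
    try:
        return gated_call()
    except AccessDenied:
        unlock()
        return gated_call()
```

Neither helper logs or formats the password, which keeps the sketch in line with the no-log invariant.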
## Address forms
**No-log invariant.** The password is a secret. The driver MUST NOT log
it. Specifically:
| Form | Example | Meaning |
|------|---------|---------|
| `X0.0` / `R100` / `R100.3` | PMC bit or byte | Letter + number; optional `.bit` for bit access |
| `PARAM:1815` / `PARAM:1815/0` | CNC parameter | Number + optional axis index |
| `MACRO:500` | Custom macro variable | System / user macro variable number |
- `FocasDeviceOptions` overrides the record's auto-generated `ToString`
to print `Password = ***` when the field is non-null. Any Serilog
destructure that flows the device options through `{Device}` gets the
redaction for free.
- `FwlibFocasClient.UnlockAsync` does not include the password in any
exception message — only the FWLIB return code (`EW_PASSWD`,
`EW_HANDLE`, etc.) makes it into the surface.
- `FocasDriver` logs only `"FOCAS unlock applied for {host}"` when the
unlock succeeds — no password.
- The Driver.FOCAS.Cli `--cnc-password` flag is also redacted at the
same `FocasDeviceOptions` choke point.
- See [`docs/v2/focas-deployment.md`](../v2/focas-deployment.md)
§ "FOCAS password handling" for the storage/rotation runbook + the
cross-link to [`docs/Security.md`](../Security.md).
Addresses are validated against the per-device `Series` at `InitializeAsync`:
a config referencing a number outside the documented range for that series
fails at load time with an error message naming the limit. See
[`docs/v2/focas-version-matrix.md`](../v2/focas-version-matrix.md) for the
authoritative range table.
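As a rough illustration of the three address forms, the parser below splits an address string into its parts. It is a Python sketch, not the driver's parser: `FocasAddress` and `parse_address` are hypothetical names, the PMC letter set and the 0..7 bit range are assumptions, and the per-series range check is left out because the limits live in the version matrix.

```python
import re
from typing import NamedTuple, Optional


class FocasAddress(NamedTuple):
    kind: str                     # "PMC", "PARAM" or "MACRO"
    number: int                   # byte address, parameter number or macro number
    letter: Optional[str] = None  # PMC area letter, e.g. "R"
    bit: Optional[int] = None     # 0..7 when the ".bit" suffix is present
    axis: Optional[int] = None    # optional axis index for PARAM


# Assumed PMC area letters; the real set depends on the controller.
_PMC = re.compile(r"^([RGFDXYKAETC])(\d+)(?:\.([0-7]))?$")
_PARAM = re.compile(r"^PARAM:(\d+)(?:/(\d+))?$")
_MACRO = re.compile(r"^MACRO:(\d+)$")


def parse_address(text: str) -> FocasAddress:
    """Parse `R100.3`, `PARAM:1815/0`, `MACRO:500` style addresses."""
    m = _PARAM.match(text)
    if m:
        axis = int(m.group(2)) if m.group(2) is not None else None
        return FocasAddress("PARAM", int(m.group(1)), axis=axis)
    m = _MACRO.match(text)
    if m:
        return FocasAddress("MACRO", int(m.group(1)))
    m = _PMC.match(text)
    if m:
        bit = int(m.group(3)) if m.group(3) is not None else None
        return FocasAddress("PMC", int(m.group(2)), letter=m.group(1), bit=bit)
    raise ValueError(f"unrecognised FOCAS address: {text!r}")
```

A range validator would take the parsed `number` and compare it against the limits for the configured `Series`.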
When the controller does **not** need a password, leave `Password`
unset (`null`) and the driver short-circuits the unlock call entirely —
no wire-level cost.
## Backend selection
### Status-code semantics post-F4-b
The driver picks its client from `Config.Backend`:
- `BadNotWritable` — one of: driver-level `Writes.Enabled = false`; per-tag
`Writable = false`; **`Writes.AllowParameter = false` for a `PARAM:` tag
(F4-b)**; **`Writes.AllowMacro = false` for a `MACRO:` tag (F4-b)**;
**`Writes.AllowPmc = false` for a PMC tag (F4-c)**. Same status code,
five distinct paths — operators distinguish by checking the knobs.
- `BadUserAccessDenied` (**F4-b**) — the CNC reported `EW_PASSWD`
(parameter-write switch off / unlock required). **F4-d** wires the
`cnc_wrunlockparam` retry path on top: when `Password` is configured
the driver re-issues unlock + retries the gated call once before
surfacing this status. A persistent `BadUserAccessDenied` after F4-d
means either (a) the password doesn't match the controller, or (b)
the parameter-write switch on the pendant is still off and the
controller wants both the switch + the password.
- `BadNotSupported` — both opt-ins flipped on, but the wire client doesn't
implement the kind being written (e.g. older transport variant). F4-a
wired the generic dispatch; F4-b adds typed `WriteParameterAsync` /
`WriteMacroAsync` entry points whose default impls return
`BadNotSupported` so transports compiled against a stale `IFocasClient`
surface still build.
- `BadNodeIdUnknown` — full-reference doesn't match any configured
`FocasTagDefinition.Name`.
- `BadCommunicationError` — wire failure (DLL not loaded, IPC peer dead,
etc.).
| Value | Client | Use it for |
|-------|--------|------------|
| `wire` (default) | `WireFocasClient` | Production — pure-managed FOCAS2 over TCP |
| `unimplemented` / `none` / `stub` | `UnimplementedFocasClientFactory` | Scaffolding a DriverInstance row before the CNC endpoint is reachable |
### CLI bypass
Previous backends (`fwlib`, `fwlib32`, `ipc`) have been retired along
with `Driver.FOCAS.Host` and the Fwlib P/Invoke path. Configs that still
reference them will throw at startup with a message pointing here.
`otopcua-focas-cli write` ([`docs/Driver.FOCAS.Cli.md`](../Driver.FOCAS.Cli.md))
sets `Writes.Enabled=true` locally for the lifetime of one invocation
because the CLI is a per-operator tool — not a long-lived process bound to
the central config DB. The server-side flag is untouched; configure-the-
server code paths remain safer-by-default.
## Capability surface
| Capability | Wire path | Notes |
|------------|-----------|-------|
| `IReadable` | `ReadAsync` → `pmc_rdpmcrng` / `cnc_rdparam` / `cnc_rdmacro` | One TCP request/response per read; `Focas.Wire` serializes requests on socket 2 internally |
| `IWritable` | returns `BadNotWritable` | OtOpcUa is read-only against FOCAS by design — no `cnc_wrparam` / `pmc_wrpmcrng` / `cnc_wrmacro` path is implemented |
| `ITagDiscovery` | `DiscoverAsync` | Emits `FOCAS/{device}/{tag}` folders per configured device |
| `ISubscribable` | polled via shared `PollGroupEngine` | FOCAS has no push model — subscriptions turn into per-tag polling groups |
| `IHostConnectivityProbe` | periodic `cnc_rdcncstat` | Probe cadence is `Probe.Interval`; transitions fire `OnHostStatusChanged` |
| `IPerCallHostResolver` | lookup in `_tagsByName` | Each call routes to the device of the referenced tag |
| `IAlarmSource` | polled `cnc_rdalmmsg2` via `FocasAlarmProjection` | Opt-in — set `AlarmProjection.Enabled=true`; diffs `(AlarmNumber, Type)` between ticks |
Ack is a no-op — FANUC clears alarms on its own once the underlying condition
resolves, so `AcknowledgeAsync` swallows the batch rather than surfacing
`BadNotSupported`.
## Fixed node tree
Enable a pre-defined hierarchy of CNC nodes populated automatically from
`cnc_sysinfo` + `cnc_rdaxisname` + `cnc_rddynamic2` + related FWLIB calls,
so operators get an out-of-the-box view of identity / axes / program /
timers without declaring per-address tags.
```jsonc
"Config": {
"Devices": [ ... ],
"Tags": [ ... ],
"FixedTree": {
"Enabled": true,
"PollInterval": "00:00:00.250", // fast — per-axis dynamic reads
"ProgramPollInterval": "00:00:01", // medium — program + mode changes
"TimerPollInterval": "00:00:30" // slow — cumulative counters
}
}
```
What gets populated (all under `FOCAS/{deviceHostAddress}/`):
| Subtree | Nodes | Source call |
|---------|-------|-------------|
| `Identity/` | `SeriesNumber`, `Version`, `MaxAxes`, `CncType`, `MtType`, `AxisCount` | `cnc_sysinfo` once at bootstrap |
| `Axes/{name}/` | `AbsolutePosition`, `MachinePosition`, `RelativePosition`, `DistanceToGo`, `ServoLoad` — one folder per discovered axis | `cnc_rdaxisname` once + `cnc_rddynamic2` + `cnc_rdsvmeter` per tick |
| `Axes/FeedRate/Actual`, `Axes/SpindleSpeed/Actual` | Current feed + spindle RPM | `cnc_rddynamic2` |
| `Spindle/{name}/` | `Load` (percentage), `MaxRpm` — one folder per discovered spindle | `cnc_rdspdlname` once + `cnc_rdspload` + `cnc_rdspmaxrpm` |
| `Program/` | `Name` (filename), `ONumber`, `Number`, `MainNumber`, `Sequence`, `BlockCount` | `cnc_exeprgname2` + `cnc_rdblkcount` + cached `cnc_rddynamic2` |
| `OperationMode/` | `Mode` (int), `ModeText` ("AUTO", "MDI", "EDIT", …) | `cnc_rdopmode` |
| `Timers/` | `PowerOnSeconds`, `OperatingSeconds`, `CuttingSeconds`, `CycleSeconds` | `cnc_rdtimer` × 4 |
### Per-series node suppression
The driver probes each optional call once at bootstrap. If the target CNC
returns `EW_FUNC` / `EW_NOOPT` / `EW_VERSION` on the wire, the
corresponding subtree is **not emitted** — the operator doesn't see nodes
that will only ever return `BadDeviceFailure`. Capability suppression
covers `Spindle/`, `Program/` + `OperationMode/`, `Timers/`, and
per-axis `ServoLoad` independently. Identity + `Axes/*` position reads
(which every Fanuc CNC supports) are always emitted.
Position values are scaled integers (matching FOCAS's convention). The
managed side exposes them as `Float64` OPC UA nodes; a future
`cnc_getfigure` integration will add per-axis decimal scaling. Until
then, treat the raw integer as the value the CNC reports and scale on
the client side if decimal precision matters.
**Still user-authored**: `PARAM:6711`, `MACRO:500`, `R100` etc. — specific
numbers whose meaning is MTB-specific. Those go under the device folder
alongside the fixed subtree.
## Alarm projection
Alarm surfacing is **disabled by default** because the polling cost is wasted
on sites that don't consume CNC alarms. Opt in per driver instance:
```jsonc
"Config": {
"Devices": [ ... ],
"Tags": [ ... ],
"AlarmProjection": {
"Enabled": true,
"PollInterval": "00:00:02"
}
}
```
Every alarm transition fires `OnAlarmEvent` with:
- `SourceNodeId` = the device host address (FOCAS has no per-node alarm model;
the CNC exposes a single flat active-alarm list per session)
- `ConditionId` = `"{host}#{Type}:{AlarmNumber}"`
- `AlarmType` = projected from FANUC's `ALM_TYPE_*` (e.g. `Overtravel`, `Servo`,
`Parameter`, `MacroAlarm`)
- `Severity` = Overtravel / Servo / PulseCode → `Critical`; Parameter / Macro
→ `Medium`; everything else → `High`
Cleared alarms fire a second event with `" (cleared)"` appended to the message
so downstream consumers can ignore the clear if they only care about raises.
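The event fields and the raise / clear diff can be sketched as below. This is an illustrative Python sketch, not the projection's actual code: the function names are hypothetical, and treating `Macro` and `MacroAlarm` as the same type name is an assumption.

```python
_CRITICAL = {"Overtravel", "Servo", "PulseCode"}
_MEDIUM = {"Parameter", "Macro", "MacroAlarm"}  # assumption: both spellings map to Medium


def severity_for(alarm_type: str) -> str:
    """Map a projected alarm type name to the severity described above."""
    if alarm_type in _CRITICAL:
        return "Critical"
    if alarm_type in _MEDIUM:
        return "Medium"
    return "High"


def condition_id(host: str, alarm_type: str, alarm_number: int) -> str:
    """Build the "{host}#{Type}:{AlarmNumber}" condition id."""
    return f"{host}#{alarm_type}:{alarm_number}"


def diff_alarms(previous: set, current: set):
    """Diff two polls of (alarm_number, alarm_type) pairs.

    Returns (raised, cleared): pairs that are new in this poll, and pairs
    that were active last poll but are gone now.
    """
    return current - previous, previous - current
```

Each pair in `cleared` would produce the second event with `" (cleared)"` appended to the message.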
## Handle recycling
FANUC CNCs have a finite FWLIB handle pool (~5–10 concurrent connections) and
certain series have documented handle-leak bugs that manifest after long uptime.
The driver can proactively close + reopen each device's session on a cadence to
release its slot back to the pool:
```jsonc
"Config": {
"Devices": [ ... ],
"HandleRecycle": {
"Enabled": true,
"Interval": "01:00:00"
}
}
```
Disabled by default — a healthy CNC + driver doesn't need it. Enable when field
experience shows handle exhaustion. Typical tuning: 30 min for sites running
multiple OtOpcUa instances against the same CNC (they share the pool); 6 h for a
single-client deployment. Reads / writes during recycle simply wait for the
reconnect rather than failing — worst case, an operator sees a brief read
latency spike once per cadence.
## Testing
- **Unit tests** — `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests/` cover the
driver surface via `FakeFocasClient`. Includes the alarm-projection raise /
clear diffing tests.
- **Integration tests** — `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/`
hold the Docker simulator scaffold (Stream B / C of the simulator plan —
`docs/v2/implementation/focas-simulator-plan.md`).
- **E2E script** — `scripts/e2e/test-focas.ps1` stages Host + Proxy + a real
CNC (or the simulator) and exercises connect → read → write → subscribe
round-trips. See [`docs/drivers/FOCAS-Test-Fixture.md`](FOCAS-Test-Fixture.md)
for the coverage map.
## Troubleshooting
| Symptom | Likely cause | Fix |
|---------|--------------|-----|
| `BadCommunicationError` on every read | CNC unreachable on TCP:8193 | Check firewall / LAN reachability; FOCAS Ethernet option must be licensed on the CNC side |
| Every write returns `BadNotWritable` | Expected — OtOpcUa is read-only against FOCAS | If you actually need writes, open a feature request — the driver's managed wire client doesn't expose the write commands |
| `BadOutOfRange` on reads for a macro/parameter | Config address outside the declared `Series` range | Check `docs/v2/focas-version-matrix.md` — either fix the address or widen the `Series` |
| Alarm events never fire | `AlarmProjection.Enabled` left at default (false) | Set it to `true` in the driver config |
## Further reading
- [`docs/v2/driver-specs.md §7`](../v2/driver-specs.md) — full OPC UA node
mapping, pre-defined tag set, per-API notes
- [`docs/v2/focas-version-matrix.md`](../v2/focas-version-matrix.md) —
per-series macro / parameter / PMC range table
- [`docs/v2/implementation/focas-wire-protocol.md`](../v2/implementation/focas-wire-protocol.md)
— captured FOCAS2 wire semantics (magic prefix, handshake, command-id table)
- [upstream `Focas.Wire`](https://github.com/Ladder99/focas-mock/tree/main/dotnet/Focas.Wire)
— the managed client implementation OtOpcUa consumes as a NuGet dependency
+6 -5
@@ -35,13 +35,14 @@ Multi-project test topology:
## How tests skip
- **E2E parity**: `ParityFixture.SkipIfUnavailable()` runs at class init and
checks Windows-only, non-admin user, ZB SQL reachable on
`localhost:1433`, Host EXE built in the expected `bin/` folder. Any miss
→ tests skip.
checks Windows-only, ZB SQL reachable on `localhost:1433`, Host EXE built
in the expected `bin/` folder. Any miss → tests skip.
- **Live-smoke** (`GalaxyRepositoryLiveSmokeTests`): `Assert.Skip` when ZB
unreachable. A `per project_galaxy_host_installed` memory on this repo's
dev box notes the MXAccess runtime is installed + pipe ACL denies Admins,
so live tests must run from a non-elevated shell.
dev box notes the MXAccess runtime is installed. The pipe ACL allows the
configured SID outright; elevation of the caller doesn't matter because
the per-connection SID check in `PipeServer.VerifyCaller` only compares
user SIDs (not group membership or integrity level).
- **Unit** tests (Shared, Proxy contract, most Host.Tests) have no skip —
they run anywhere.
+1 -1
@@ -31,7 +31,7 @@ The same Tier-C isolation story applies to FOCAS (decision record in `docs/v2/pl
- Pipe name: `otopcua-galaxy-{DriverInstanceId}` (localhost-only, no TCP surface)
- Wire format: MessagePack-CSharp, length-prefixed frames
- ACL: pipe is created with a DACL that grants only the Server's service identity; the Admins group is explicitly denied so a live-smoke test running from an elevated shell fails fast rather than silently bypassing the handshake
- ACL: pipe is created with a DACL that grants `ReadWrite | Synchronize` only to the configured Server service-principal SID + denies `LocalSystem`. The per-connection SID check in `PipeServer.VerifyCaller` is the real authorization boundary — any caller whose impersonated token SID doesn't match the allowed SID is dropped before the first frame is read.
- Handshake: Proxy presents a shared secret at `OpenSessionRequest`; Host rejects anything else with `MessageKind.OpenSessionResponse{Success=false}`
- Heartbeat: Proxy sends a periodic ping; missed heartbeats trigger the Proxy-side crash-loop supervisor to restart the Host
-40
@@ -47,13 +47,6 @@ the tests mock.
- `OpcUaClientSmokeTests.Client_subscribe_receives_StepUp_data_changes_from_live_server`:
real `MonitoredItem` subscription against `ns=3;s=FastUInt1` (ticks every
100 ms); asserts `OnDataChange` fires within 3 s of subscribe
- `OpcUaClientReverseConnectSmokeTests.Driver_accepts_reverse_connect_from_opc_plc_rc_simulator`:
reverse-connect (server-initiated) coverage. Driver binds
`opc.tcp://0.0.0.0:4844`, the `opc-plc-rc` docker service dials in via
`--rc opc.tcp://host.docker.internal:4844`, and a Read round-trips over
the inbound socket. Gated on `OPCUA_RC_SIM=1` because the simulator
requires `host.docker.internal` resolution which not every CI runner
exposes.
Wire-level surfaces verified: `IDriver` + `IReadable` + `ISubscribable` +
`IHostConnectivityProbe` (via the Secure Channel exchange).
@@ -166,35 +159,6 @@ Beyond that:
3. **Dedicated historian integration lab** — only path for
historian-specific coverage.
## HistoryRead aggregate coverage
PR-13 (issue #285) extended `HistoryAggregateType` from 5 to ~30 values
matching the OPC UA Part 13 §5 catalog. The mapping itself
(`OpcUaClientDriver.MapAggregateToNodeId`) is unit-tested via
`OpcUaClientAggregateMappingTests`:
- The full enum is swept with `Enum.GetValues<HistoryAggregateType>()`;
every value must resolve to a non-null namespace-0 numeric `NodeId`.
- The 25 new aggregates each assert against a reflection-resolved
`Opc.Ua.ObjectIds.AggregateFunction_*` field by name, so a future SDK
upgrade that renames a constant trips the test loudly.
- The original 5 ordinals stay pinned to their pre-PR-13 NodeIds so existing
config files / persisted enums keep working.
This is **the well-known-NodeId test path** — the standard Part 13 NodeIds
are stable across SDK versions; round-tripping each one against a live
upstream is the integration suite's job and doesn't add coverage to the
mapping table itself.
`OpcUaClientAggregateSweepTests` is the integration counterpart. It loops
every enum value against a real opc-plc upstream and asserts the wire path
doesn't crash even when the simulator returns
`BadAggregateNotSupported` for an aggregate it doesn't honour. opc-plc's
default profile doesn't enable HistoryRead on the well-known nodes, so the
test currently `Assert.Skip`s — re-enables when the fixture image is
upgraded to a history-sim profile (`--useslowtypes --ut=10` or similar) and
a known-good historized NodeId is wired into `OpcPlcProfile`.
## Key fixture / config files
- `tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.Tests/` — unit tests with
@@ -204,7 +168,3 @@ a known-good historized NodeId is wired into `OpcPlcProfile`.
- `tests/ZB.MOM.WW.OtOpcUa.Server.Tests/OpcUaServerIntegrationTests.cs`:
the server-side integration harness a future loopback client test could
piggyback on
- `tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.Tests/OpcUaClientAggregateMappingTests.cs`
— Part 13 aggregate enum-to-NodeId mapping coverage (PR-13)
- `tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/OpcUaClientAggregateSweepTests.cs`
— wire-side aggregate sweep against opc-plc (build-only scaffold; PR-13)
-350
@@ -1,350 +0,0 @@
# OPC UA Client driver
Tier-A in-process driver that opens a `Session` against a remote OPC UA server
and re-exposes its address space through the local OtOpcUa server. The
"gateway / aggregation" direction — opposite to the usual "server exposes PLC
data" flow.
For the test fixture (opc-plc) see [`OpcUaClient-Test-Fixture.md`](OpcUaClient-Test-Fixture.md).
For the configuration surface see `OpcUaClientDriverOptions` in
[`src/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient/OpcUaClientDriverOptions.cs`](../../src/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient/OpcUaClientDriverOptions.cs).
## Auto re-import on `ModelChangeEvent`
The driver subscribes to `BaseModelChangeEventType` (and its subtype
`GeneralModelChangeEventType`) on the upstream `Server` node (`i=2253`) at
the end of `InitializeAsync`. When the upstream server advertises a
topology change, the driver coalesces events over a debounce window and
runs a single re-import (equivalent to calling `ReinitializeAsync`, which
internally runs `ShutdownAsync` + `InitializeAsync`).
### Configuration
| Option | Default | Notes |
| --- | --- | --- |
| `WatchModelChanges` | `true` | Disable to skip the watch entirely (no extra subscription, no re-import on topology change). |
| `ModelChangeDebounce` | `5s` | Coalescing window. The first event starts the timer; further events extend it; when it elapses with no new events, the driver fires one re-import. |
### Behaviour
- One model-change subscription per driver instance, separate from the
data + alarm subscriptions. Created best-effort: a server that doesn't
advertise the event types or rejects the `EventFilter` falls through to
no-watch — `InitializeAsync` still succeeds.
- The `EventFilter` selects only the `EventType` field (a `WhereClause`
constrains by `OfType BaseModelChangeEventType`). Payload fields like
`Changes[]` are intentionally ignored: the driver always re-imports the
full upstream root, so per-event delta tracking would just add wire
overhead.
- Debounce is implemented via a single-shot `Timer`; every event calls
`Timer.Change(window, Infinite)` so a burst of N events triggers exactly
one re-import after the window elapses with no further events.
- The re-import path acquires the same `_gate` semaphore that `ReadAsync`
/ `WriteAsync` / `BrowseAsync` / `SubscribeAsync` use. Downstream callers
see a brief browse-gap (≈ the upstream `DiscoverAsync` duration) while
the gate is held — but no torn reads or split-batch writes.
- Failure during the re-import is best-effort: the next `ModelChangeEvent`
triggers another attempt, and the keep-alive watchdog covers permanent
upstream loss. Operators see failures through `DriverHealth.LastError`
+ the diagnostics counters.
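The debounce bullet above can be sketched as follows. This is a minimal illustration, not the driver's code: `Debouncer`, `Signal`, and `onQuiet` are hypothetical names, and only `System.Threading.Timer` and its `Change` method come from the text.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not the driver's actual type): coalesces a burst of
// events into one callback once the window elapses with no further events.
public sealed class Debouncer : IDisposable
{
    private readonly Timer _timer;
    private readonly TimeSpan _window;

    public Debouncer(TimeSpan window, Action onQuiet)
    {
        _window = window;
        // Created disarmed: an infinite due time means nothing fires until Signal().
        _timer = new Timer(_ => onQuiet(), null,
            Timeout.InfiniteTimeSpan, Timeout.InfiniteTimeSpan);
    }

    // Every event re-arms the single-shot timer, pushing the deadline out.
    public void Signal() => _timer.Change(_window, Timeout.InfiniteTimeSpan);

    public void Dispose() => _timer.Dispose();
}
```

Calling `Signal()` five times in quick succession with a 100 ms window should run `onQuiet` once, roughly 100 ms after the last call, because each `Change` replaces the pending due time rather than queueing another callback.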
### When to disable
Flip `WatchModelChanges` to `false` when:
- The upstream topology is known-static (e.g. firmware-pinned PLC) and
the driver should never run a re-import unprompted.
- The brief browse-gap during re-import is unacceptable and a manual
`ReinitializeAsync` call from the operator is preferred.
- The upstream server fires spurious `ModelChangeEvent`s that don't
reflect real topology changes, causing wasted re-imports. Tighten or
disable rather than chasing the noise downstream.
## Reverse Connect (server-initiated)
OPC UA's reverse-connect mode flips the transport direction: instead of the
client dialling the server, the **server** dials the client's listener. The
upstream sends a `ReverseHello` and the client continues the OPC UA
handshake on the inbound socket. Required for OT-DMZ deployments where the
plant firewall only permits outbound traffic from the upstream — the
gateway opens a listener, the upstream reaches out.
### Configuration
| Option | Default | Notes |
| --- | --- | --- |
| `ReverseConnect.Enabled` | `false` | Opt-in. When `true`, replaces the failover dial-sweep with a `WaitForConnection` call. |
| `ReverseConnect.ListenerUrl` | `null` | Local listener URL the SDK binds. Typically `opc.tcp://0.0.0.0:4844` (any interface) or a specific NIC for multi-homed gateways. **Required when `Enabled` is `true`.** |
| `ReverseConnect.ExpectedServerUri` | `null` | Upstream's `ApplicationUri` to filter inbound dials. `null` accepts the first connection (only safe with one upstream targeting the listener). |
### Shared listener (singleton)
The process holds one underlying `Opc.Ua.Client.ReverseConnectManager` per
`ListenerUrl`. Two driver instances that share a listener URL multiplex
onto one TCP socket; the SDK demuxes inbound dials by the upstream's
reported `ServerUri`. The wrapper (`ReverseConnectListener`) is
reference-counted: the first `Acquire` binds the port and the last `Release`
tears it down, so drivers can come and go independently without races on
port-bind / port-unbind.
When two drivers share a listener:
- They MUST set `ExpectedServerUri` to disambiguate; otherwise the first
upstream to dial in wins regardless of which driver is waiting.
- They CAN come and go independently; the listener stays alive while at
least one driver references it.
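The reference-counting scheme described above can be sketched like this. It is an illustration under stated assumptions: `SharedListenerRegistry` and the `bind` / `unbind` delegates are hypothetical stand-ins for whatever `ReverseConnectListener` does internally, and no SDK types are used.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a reference-counted listener registry keyed on URL.
// bind/unbind stand in for starting and stopping the real listener.
public sealed class SharedListenerRegistry
{
    private readonly object _lock = new object();
    private readonly Dictionary<string, int> _refs =
        new Dictionary<string, int>(StringComparer.Ordinal);
    private readonly Action<string> _bind;
    private readonly Action<string> _unbind;

    public SharedListenerRegistry(Action<string> bind, Action<string> unbind)
    {
        _bind = bind;
        _unbind = unbind;
    }

    public void Acquire(string listenerUrl)
    {
        lock (_lock)
        {
            _refs.TryGetValue(listenerUrl, out var count); // 0 when absent
            if (count == 0) _bind(listenerUrl);            // first reference binds
            _refs[listenerUrl] = count + 1;
        }
    }

    public void Release(string listenerUrl)
    {
        lock (_lock)
        {
            if (!_refs.TryGetValue(listenerUrl, out var count)) return;
            if (count == 1)
            {
                _refs.Remove(listenerUrl);
                _unbind(listenerUrl);                      // last reference unbinds
            }
            else
            {
                _refs[listenerUrl] = count - 1;
            }
        }
    }
}
```

Holding the lock across both the count update and the bind / unbind call is what keeps a concurrent `Acquire` from racing a tear-down on the same URL.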
### Behaviour
- The dial path is bypassed entirely when `Enabled` is `true`. Failover
across multiple `EndpointUrls` doesn't apply — there's no client-side
dial to fail over.
- `ExpectedServerUri` is the SDK's filter parameter to `WaitForConnectionAsync`.
Inbound `ReverseHello`s from a different upstream are ignored and the
caller keeps waiting.
- The same `EndpointDescription` derivation runs as the dial path — the
first `EndpointUrl` in the candidate list seeds `SecurityPolicy` /
`SecurityMode` / `EndpointUrl` for the session-create call. The actual
endpoint lives on the upstream and the SDK reconciles after the
`ReverseHello`.
- Cancellation: `Timeout` bounds the wait. A stuck listener with no inbound
dial throws after `Timeout` rather than hanging init forever.
- Shutdown releases the listener reference. The last release stops the
listener so the port can be re-bound by a future driver lifecycle.
### Wiring it up on the upstream
The upstream OPC UA server has to be configured to dial out. The `opc-plc`
simulator does this with `--rc=opc.tcp://<gateway-host>:4844`; for a real
upstream see your server's reverse-connect docs (most major implementations
expose a "ReverseConnect.Endpoint" config knob).
### When NOT to use
- Standard plant networks where the gateway can dial the upstream — the
conventional dial path is simpler and supports failover natively.
- Public-internet OPC UA: reverse-connect is a network-policy workaround,
not a security primitive. Always pair with `Sign` or `SignAndEncrypt`
+ a vetted user-token policy.
## HistoryRead Events
The driver passes through OPC UA `HistoryReadEvents` to the upstream server.
HistoryRead Raw / Processed / AtTime ship in the same code path
(`ExecuteHistoryReadAsync`); event history takes a slightly different shape
because the client sends an `EventFilter` (SelectClauses + WhereClause) rather
than a plain numeric / time-based detail block.
### Wire path
`IHistoryProvider.ReadEventsAsync(fullReference, EventHistoryRequest, ct)`
translates to:
```
new ReadEventDetails {
    StartTime,
    EndTime,
    NumValuesPerNode,
    Filter = EventFilter { SelectClauses, WhereClause }
}
```
…and is sent through `Session.HistoryReadAsync` to the upstream server. The
returned `HistoryEvent.Events` collection (one `HistoryEventFieldList` per
historical event) is unwrapped into `HistoricalEventBatch.Events`, where each
`HistoricalEventRow.Fields` dictionary is keyed by the
`SimpleAttributeSpec.FieldName` the caller supplied. The server-side history
dispatcher uses those keys to align fields with the wire-side SelectClause
order — drivers don't have to honour the entire OPC UA `EventFilter` shape
verbatim.
### SelectClauses
When `EventHistoryRequest.SelectClauses` is `null` the driver falls back to a
default set that matches `BuildHistoryEvent` on the server side:
| Field | Browse path | Notes |
| --- | --- | --- |
| `EventId` | `EventId` | BaseEventType — stable unique id. |
| `SourceName` | `SourceName` | Source-object name. |
| `Time` | `Time` | Process-side event timestamp. Used for `OccurrenceTime`. |
| `Message` | `Message` | LocalizedText payload. |
| `Severity` | `Severity` | OPC UA 1-1000 scale. |
| `ReceiveTime` | `ReceiveTime` | Server-side ingest timestamp. |
Custom SelectClauses are supported — pass any
`IReadOnlyList<SimpleAttributeSpec>`. Each entry's `TypeDefinitionId`
defaults to `BaseEventType` when `null`; pass an explicit NodeId text (e.g.
`"i=2782"` for `ConditionType`) to reach typed-condition fields.
### WhereClause
`ContentFilterSpec.EncodedOperands` carries the binary-encoded
`ContentFilter` from the wire. The driver decodes it into the SDK
`ContentFilter` and attaches it to the outgoing `EventFilter` verbatim — the
OPC UA Client driver is a passthrough for filter semantics, it does not
evaluate them. A malformed filter is dropped silently; the SelectClause
projection still goes out.
### Continuation points
Returned in `HistoricalEventBatch.ContinuationPoint`. The server-side
HistoryRead facade is responsible for round-tripping these so a paged event
read against a chatty upstream completes incrementally. The driver itself
doesn't track them — every `ReadEventsAsync` call issues a fresh
`HistoryReadAsync`.
## HistoryRead Aggregates (Part 13 catalog)
`IHistoryProvider.ReadProcessedAsync` takes a `HistoryAggregateType` and the
driver maps it to the standard `Opc.Ua.ObjectIds.AggregateFunction_*` NodeId
in `MapAggregateToNodeId`. PR-13 (issue #285) extended the enum from the
original 5 values (Average / Minimum / Maximum / Total / Count) to the full
OPC UA Part 13 §5 catalog — ~30 aggregates.
The mapping is best-effort: not every upstream OPC UA server implements every
aggregate. Aggregates the upstream rejects come back with
`StatusCode=BadAggregateNotSupported` on the per-row HistoryRead result; the
driver passes that through verbatim (cascading-quality rule, Part 11 §8) — it
does not throw. Servers advertise the aggregates they support via the
`AggregateConfiguration` object on the `Server` node; clients can probe it at
runtime.
### Catalog
| Enum value | SDK NodeId field | Part 13 § | Server-side support | Typical use |
| --- | --- | --- | --- | --- |
| `Average` | `AggregateFunction_Average` | §5.4 | almost always | smoothing |
| `Minimum` | `AggregateFunction_Minimum` | §5.5 | almost always | low watermark |
| `Maximum` | `AggregateFunction_Maximum` | §5.6 | almost always | high watermark |
| `Total` | `AggregateFunction_Total` | §5.10 | usually | totalisation |
| `Count` | `AggregateFunction_Count` | §5.18 | almost always | sample count |
| `TimeAverage` | `AggregateFunction_TimeAverage` | §5.4.2 | usually | time-weighted mean |
| `TimeAverage2` | `AggregateFunction_TimeAverage2` | §5.4.3 | sometimes | bounded time-weighted mean |
| `Interpolative` | `AggregateFunction_Interpolative` | §5.3 | usually | trend snapshot |
| `MinimumActualTime` | `AggregateFunction_MinimumActualTime` | §5.5.4 | sometimes | when low occurred |
| `MaximumActualTime` | `AggregateFunction_MaximumActualTime` | §5.6.4 | sometimes | when high occurred |
| `Range` | `AggregateFunction_Range` | §5.7 | usually | spread |
| `Range2` | `AggregateFunction_Range2` | §5.7 | sometimes | bounded spread |
| `AnnotationCount` | `AggregateFunction_AnnotationCount` | §5.21 | rarely | operator notes |
| `DurationGood` | `AggregateFunction_DurationGood` | §5.16 | sometimes | quality coverage |
| `DurationBad` | `AggregateFunction_DurationBad` | §5.16 | sometimes | gap accounting |
| `PercentGood` | `AggregateFunction_PercentGood` | §5.17 | sometimes | quality % |
| `PercentBad` | `AggregateFunction_PercentBad` | §5.17 | sometimes | gap % |
| `WorstQuality` | `AggregateFunction_WorstQuality` | §5.20 | sometimes | worst seen |
| `WorstQuality2` | `AggregateFunction_WorstQuality2` | §5.20 | rarely | bounded worst |
| `StandardDeviationSample` | `AggregateFunction_StandardDeviationSample` | §5.13 | sometimes | n-1 stddev |
| `StandardDeviationPopulation` | `AggregateFunction_StandardDeviationPopulation` | §5.13 | sometimes | n stddev |
| `VarianceSample` | `AggregateFunction_VarianceSample` | §5.13 | sometimes | n-1 variance |
| `VariancePopulation` | `AggregateFunction_VariancePopulation` | §5.13 | sometimes | n variance |
| `NumberOfTransitions` | `AggregateFunction_NumberOfTransitions` | §5.12 | sometimes | event count |
| `DurationInStateZero` | `AggregateFunction_DurationInStateZero` | §5.19 | sometimes | OFF time |
| `DurationInStateNonZero` | `AggregateFunction_DurationInStateNonZero` | §5.19 | sometimes | ON time |
| `Start` | `AggregateFunction_Start` | §5.8 | usually | first sample |
| `End` | `AggregateFunction_End` | §5.9 | usually | last sample |
| `Delta` | `AggregateFunction_Delta` | §5.11 | usually | end-start |
| `StartBound` | `AggregateFunction_StartBound` | §5.8 | sometimes | extrapolated start |
| `EndBound` | `AggregateFunction_EndBound` | §5.9 | sometimes | extrapolated end |
"Server-side support" is heuristic — see your upstream's `AggregateConfiguration`
node for the authoritative list. AVEVA Historian, KEPServerEX, Prosys, and
opc-plc each implement different subsets.
### Driver-side validation
The mapping itself is unit-tested over the full enum
(`OpcUaClientAggregateMappingTests`) — every value resolves to a non-null
namespace-0 NodeId, and the original 5 ordinals stay pinned. Wire-side
behaviour against a live server is exercised by
`OpcUaClientAggregateSweepTests` (build-only scaffold pending an opc-plc
history-sim profile).
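The reflection-by-name check that the unit tests are described as performing can be sketched as below. To stay self-contained the sketch uses stand-ins: `SampleAggregate`, `SampleObjectIds`, and `AggregateLookup` are hypothetical, and the numeric values are placeholders rather than real Part 13 NodeIds. In the real tests the lookup target would be `Opc.Ua.ObjectIds`.

```csharp
#nullable enable
using System;
using System.Reflection;

// Stand-in for HistoryAggregateType (hypothetical subset).
public enum SampleAggregate { Average, Minimum, Maximum }

// Stand-in for Opc.Ua.ObjectIds; the values are placeholders, not real NodeIds.
public static class SampleObjectIds
{
    public static readonly uint AggregateFunction_Average = 1;
    public static readonly uint AggregateFunction_Minimum = 2;
    public static readonly uint AggregateFunction_Maximum = 3;
}

public static class AggregateLookup
{
    // Resolves "AggregateFunction_<EnumName>" by reflection and throws when the
    // field is missing, which is how a renamed constant would surface loudly.
    public static uint Resolve(Type idsType, Enum aggregate)
    {
        var fieldName = "AggregateFunction_" + aggregate;
        var field = idsType.GetField(fieldName, BindingFlags.Public | BindingFlags.Static)
            ?? throw new MissingFieldException(idsType.Name, fieldName);
        return (uint)field.GetValue(null)!;
    }

    // Sweeps every enum value, mirroring the full-enum test described above.
    public static void SweepAll()
    {
        foreach (var a in Enum.GetValues<SampleAggregate>())
            Resolve(typeof(SampleObjectIds), a);
    }
}
```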
## Upstream redundancy (`ServerArray`)
When the upstream OPC UA server is itself a redundant pair (warm or hot per
OPC UA Part 4 §6.6.2), the driver supports **mid-session failover** driven by
the upstream's own `Server.ServerRedundancy.RedundancySupport` +
`ServerUriArray` + `Server.ServiceLevel` nodes. Distinct from the static
boot-time failover sweep on `EndpointUrls`: that path picks a single survivor
at session-create time; this path swaps the active session live when the
upstream signals degradation, transferring subscriptions onto the secondary so
monitored-item handles stay valid.
### Configuration
| Option | Default | Notes |
| --- | --- | --- |
| `Redundancy.Enabled` | `false` | Opt-in. When `false`, the driver doesn't read `RedundancySupport` / `ServerUriArray` and doesn't subscribe to `ServiceLevel`. |
| `Redundancy.ServiceLevelThreshold` | `200` | Byte value below which the driver triggers failover. OPC UA Part 4 ServiceLevel sub-ranges: 200..255 = healthy, 2..199 = degraded, 1 = no data, 0 = maintenance. |
| `Redundancy.RecheckInterval` | `5s` | Lower bound between two consecutive failovers — suppresses oscillation when ServiceLevel flaps around the threshold. |
### Behaviour
- At session activation the driver reads
`Server.ServerRedundancy.RedundancySupport`. When `None`, the driver records
an empty peer list and the failover path becomes a no-op (`ServiceLevel`
drops are still observable via diagnostics but trigger nothing).
- When the upstream advertises `Cold` / `Warm` / `WarmActive` / `Hot`, the
driver pulls `Server.ServerRedundancy.ServerUriArray` for the peer list,
falling back to the top-level `Server.ServerArray` for legacy upstreams that
don't expose the redundancy node.
- A dedicated subscription on `Server.ServiceLevel` (publish interval 1s,
separate from the alarm + data subscriptions) drives every failover decision
via the SDK's notification path — no polling loop.
- On a drop below `ServiceLevelThreshold` the driver picks the next URI in the
peer list that isn't the active one, opens a parallel session against it,
and calls `Session.TransferSubscriptionsAsync(other, sendInitialValues:true)`
to migrate every live subscription (data + alarm + model-change +
service-level itself). On success the driver swaps `Session`, closes the
old one, and bumps `RedundancyFailoverCount`.
- On any failure (`BadSecureChannelClosed`, `BadCertificateUntrusted`,
`TransferSubscriptions` returning `false`, secondary unreachable) the driver
leaves the existing session untouched, increments
`RedundancyFailoverFailures`, and waits for the next ServiceLevel
notification. The keep-alive watchdog continues to cover full
upstream-loss scenarios.
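The decision logic in the bullets above (threshold check, `RecheckInterval` suppression, next-peer selection) can be sketched as two pure functions. `FailoverPolicy` and its method names are hypothetical; the sketch leaves out the session open and subscription transfer entirely.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical sketch of the failover decision; not the driver's actual code.
public static class FailoverPolicy
{
    // True when ServiceLevel is below the threshold and the last failover
    // (if any) is at least recheckInterval in the past.
    public static bool ShouldFailover(
        byte serviceLevel, byte threshold,
        DateTime nowUtc, DateTime? lastFailoverUtc, TimeSpan recheckInterval)
    {
        if (serviceLevel >= threshold) return false;
        return lastFailoverUtc is null
            || nowUtc - lastFailoverUtc.Value >= recheckInterval;
    }

    // Picks the next URI after the active one, wrapping around the list;
    // returns null when there is no peer other than the active server.
    public static string? PickNextPeer(IReadOnlyList<string> peers, string activeUri)
    {
        if (peers.Count == 0) return null;
        var start = -1;
        for (var i = 0; i < peers.Count; i++)
            if (string.Equals(peers[i], activeUri, StringComparison.Ordinal)) start = i;

        for (var step = 1; step <= peers.Count; step++)
        {
            var candidate = peers[(start + step) % peers.Count];
            if (!string.Equals(candidate, activeUri, StringComparison.Ordinal))
                return candidate;
        }
        return null;
    }
}
```

With the default threshold of `200`, a notification of `150` arriving two seconds after a previous failover would be ignored under a `5s` `RecheckInterval`, which is the oscillation guard the configuration table describes.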
### Shared client-cert prerequisite
`TransferSubscriptionsAsync` requires the secondary's secure channel to accept
the same client certificate the primary did. Operators running heterogeneous
secondaries (different cert trust stores) will see `BadCertificateUntrusted`
on every failover attempt and the failures counter climbing. The fix is to
push the gateway driver's application-instance certificate into both
upstreams' `TrustedPeerCertificates` store before enabling redundancy. A
follow-up adds a fallback path that re-creates subscriptions instead of
transferring when the secondary rejects the channel.
### Diagnostics
The `driver-diagnostics` RPC surfaces three new entries (two counters plus
the active server URI) via `DriverHealth.Diagnostics`:
| Key | Type | Notes |
| --- | --- | --- |
| `RedundancyFailoverCount` | `double` (long-counted) | Successful mid-session swaps since driver start. |
| `RedundancyFailoverFailures` | `double` (long-counted) | Swap attempts that bailed (TransferSubscriptions false, secondary unreachable, etc.). |
| `ActiveServerUri` | string (in `OpcUaClientDiagnostics.ActiveServerUri`) | URI of the upstream the driver is currently bound to. Updates on every successful failover. |
### Forced-failover runbook
To validate the wiring against a real redundant upstream pair:
1. Confirm the upstream advertises `RedundancySupport != None` and a
non-empty `ServerUriArray`. Use the Client CLI:
`dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- redundancy -u <primary>`.
2. Set `Redundancy.Enabled = true` on the gateway's `OpcUaClient` driver
instance and restart.
3. Tail driver diagnostics:
`driver-diagnostics --instance <id>` — note `RedundancyFailoverCount = 0`
pre-test.
4. Drive a `ServiceLevel` drop on the primary. On AVEVA / KEPServer this is
typically a "force standby" Admin action; on a custom server it's a write
to the simulated ServiceLevel node.
5. Observe `RedundancyFailoverCount = 1` within `RecheckInterval` of the
drop, the gateway's `HostName` swapping to the secondary URI, and downstream
reads/subscriptions continuing without interruption.
For non-redundant upstreams (single-server deployments) the recommended
configuration is to leave `Redundancy.Enabled = false` and rely on
`EndpointUrls` for boot-time failover only.
+4 -1
@@ -26,7 +26,7 @@ Driver type metadata is registered at startup in `DriverTypeRegistry` (`src/ZB.M
| AB CIP | `Driver.AbCip` | A | libplctag CIP | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver, IAlarmSource | ControlLogix / CompactLogix. Tag discovery uses the `@tags` walker to enumerate controller-scoped + program-scoped symbols; UDT member resolution via the UDT template reader |
| AB Legacy | `Driver.AbLegacy` | A | libplctag PCCC | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver | SLC 500 / MicroLogix. File-based addressing (`N7:0`, `F8:0`) — no symbol table, tag list is user-authored in the config DB |
| TwinCAT | `Driver.TwinCAT` | B | Beckhoff `TwinCAT.Ads` (`TcAdsClient`) | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver | The only native-notification driver outside Galaxy — ADS delivers `ValueChangedCallback` events the driver forwards straight to `ISubscribable.OnDataChange` without polling. Symbol tree uploaded via `SymbolLoaderFactory` |
| FOCAS | `Driver.FOCAS` | C | FANUC FOCAS2 (`Fwlib32.dll` P/Invoke) | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver | Tier C — FOCAS DLL has crash modes that warrant process isolation. CNC-shaped data model (axes, spindle, PMC, macros, alarms) not a flat tag map |
| [FOCAS](FOCAS.md) | `Driver.FOCAS` | A | Pure-managed `FocasWireClient` FOCAS/2 Ethernet binary protocol on TCP:8193, inlined into the driver assembly | IDriver, ITagDiscovery, IReadable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver, IAlarmSource | Read-only by design (WriteAsync returns `BadNotWritable`). CNC-shaped data model (axes, spindle, PMC, macros, alarms) not a flat tag map. Previously Tier-C (Host + P/Invoke + shim DLL); retired in the 2026-04-24 migration when the managed wire client landed |
| OPC UA Client | `Driver.OpcUaClient` | B | OPCFoundation `Opc.Ua.Client` | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IAlarmSource, IHistoryProvider, IHostConnectivityProbe | Gateway/aggregation driver. Opens a single `Session` against a remote OPC UA server and re-exposes its address space. Owns its own `ApplicationConfiguration` (distinct from `Client.Shared`) because it's always-on with keep-alive + `TransferSubscriptions` across SDK reconnect, not an interactive CLI |
## Per-driver documentation
@@ -35,6 +35,9 @@ Driver type metadata is registered at startup in `DriverTypeRegistry` (`src/ZB.M
- [Galaxy.md](Galaxy.md) — COM bridge, STA pump, IPC, runtime probes
- [Galaxy-Repository.md](Galaxy-Repository.md) — ZB SQL reader, `LocalPlatform` scope filter, change detection
- **FOCAS** has a short getting-started doc because the Tier-C two-project deployment + backend-selection env var + alarm projection opt-in all need explaining up front:
- [FOCAS.md](FOCAS.md) — deployment, config, capability surface, alarm projection, troubleshooting
- **All other drivers** share a single per-driver specification in [docs/v2/driver-specs.md](../v2/driver-specs.md) — addressing, data-type maps, connection settings, and quirks live there. That file is the authoritative per-driver reference; this index points at it rather than duplicating.
## Test-fixture coverage maps
-338
@@ -1,338 +0,0 @@
# S7 — TIA Portal CSV & STEP 7 Classic AWL symbol import
PR-S7-D1 / [#299](https://github.com/dohertj2/lmxopcua/issues/299) — bulk-import
TIA Portal "Show all tags" CSV exports and STEP 7 Classic AWL declaration files
into the S7 driver. Saves operators from hand-typing every `%MW0` /
`%DB1.DBW0` row of a several-hundred-tag PLC into `appsettings.json`.
## Supported formats — v1
| Format | Status | Notes |
|---|---|---|
| TIA Portal `.CSV` ("Show all tags" export) | **supported** | Header columns `Name,Path,Data type,Logical address,Comment,Hmi accessible,…`; en-US (`,`) and DE-locale (`;` separator + `,` decimal) auto-detected |
| STEP 7 Classic `.AWL` (`VAR_GLOBAL` + `DATA_BLOCK`) | **supported, best-effort** | Position-based offset assignment (no exact byte offsets in hand-exported AWL — see below) |
| STEP 7 / TIA Portal native binary (`.s7p`, `.zap`) | **out of scope** | Proprietary; no community parser. Use TIA's "Show all tags" CSV export |
| TIA Portal Openness API | **out of scope** | Requires a licensed TIA install + OpenAPI license; future PR |
## TIA Portal CSV column reference
| Column | Required | Notes |
|---|---|---|
| `Name` | yes | OPC UA tag name. TIA symbols are stable across deployments; the importer uses them verbatim |
| `Logical address` (or `Address`) | yes | TIA-style address with leading `%` (e.g. `%MW0`, `%DB1.DBW10`, `%DB1.DBX2.3`). Stripped on import |
| `Data type` | recommended | TIA primitive type (`Int`, `Real`, `Bool`, `String`, …) — drives the imported `S7DataType` |
| `Comment` | no | Parsed but currently unused — `S7TagDefinition` has no `Description` field at the v2 schema layer (see [#248](https://github.com/dohertj2/lmxopcua/issues/248)). Held in the column contract for future schema bumps |
| `Hmi accessible` | no | Filter — rows with `False` / `FALSCH` / `nein` are skipped (internal symbols TIA shows in the editor but doesn't expose to client interfaces). Missing column defaults to `True` |
| `Hmi visible` / `Hmi writeable` | no | Currently unused — held for future Admin-UI-side metadata |
| `Length` | no | For `String` rows: max length. Default 254. Drives `StringLength` on the imported tag |
| `Path` | no | TIA tag-table path (`Default tag table`, custom names). Currently unused; held in the contract |
### TIA `Data type` → `S7DataType` mapping
| TIA type | Maps to | Notes |
|---|---|---|
| `Bool` | `Bool` | Bit access; address must include a `.bit` suffix |
| `Byte`, `SInt`, `USInt` | `Byte` | 1-byte unsigned/signed |
| `Int` | `Int16` | Signed 16-bit |
| `Word`, `UInt` | `UInt16` | Unsigned 16-bit |
| `DInt` | `Int32` | Signed 32-bit |
| `DWord`, `UDInt` | `UInt32` | Unsigned 32-bit |
| `LInt` | `Int64` | 64-bit signed (S7-1500 only) |
| `LWord`, `ULInt` | `UInt64` | 64-bit unsigned (S7-1500 only) |
| `Real` | `Float32` | IEEE-754 32-bit |
| `LReal` | `Float64` | IEEE-754 64-bit (S7-1500 only) |
| `String` | `String` | S7 STRING with 2-byte header; `Length` column drives `StringLength` |
| `WString` | `WString` | S7 WSTRING (UTF-16BE) |
| `Char` / `WChar` | `Char` / `WChar` | Single-character |
| `Date` | `Date` | UInt16 days since 1990-01-01 |
| `Time` | `Time` | Int32 ms |
| `TOD` / `Time_Of_Day` | `TimeOfDay` | UInt32 ms since midnight |
| `DT` / `Date_And_Time` | `DateAndTime` | 8-byte BCD |
| `DTL` | `Dtl` | 12-byte structured (S7-1200 / S7-1500) |
| `S5Time` | `S5Time` | 16-bit BCD duration |
| `Struct` / quoted UDT name | UDT placeholder | See below |
### UDT placeholders
UDT-typed symbols (TIA `Data type` = `"MyUdt"` quoted, or the literal `Struct`)
import as a **placeholder** — the resulting tag lands in the driver options so
it shows up in the Admin UI tag list, but its data type is forced to `Byte`
and the row is marked `Writable = false`.
`S7ImportResult.UdtPlaceholderCount` tracks how many of the imported tags
landed in this bucket.
#### Cooperation with `Udts` declarations (PR-S7-D2 / #300)
PR-S7-D2 ships UDT fan-out via `S7DriverOptions.Udts` + `S7TagDefinition.UdtName`.
The importer and the `Udts` declaration cooperate as follows:
1. The importer emits a placeholder row for each UDT-typed symbol — same as
today (data type forced to `Byte`, `Writable = false`).
2. The operator hand-edits the placeholder row in the resulting JSON / options
object and:
- Sets `UdtName` to the UDT type name from the TIA "Data type" column
- Removes the `Writable: false` marker (UDT leaves inherit the parent's
writability)
3. The operator declares the matching `S7UdtDefinition` in
`S7DriverOptions.Udts` (member offsets come from the TIA UDT definition
in the project file — TIA's "Show all tags" CSV does not export struct
field offsets, hence the manual layout step).
4. At driver init, the fan-out replaces the placeholder with one scalar leaf
per UDT member.
The importer does NOT auto-populate `Udts` — UDT layouts live in the project
file, not the symbol-table CSV. A future enhancement may parse the SCL UDT
declaration alongside the CSV; for now the cooperation is "importer flags it,
operator declares the layout, driver fans out at init".
See [`docs/v2/s7.md` "UDT / STRUCT support"](../v2/s7.md#udt--struct-support)
for the full fan-out semantics, the 4-level nesting cap, and the
Optimized-block-access prerequisite.
## Instance DBs / FB parameters
PR-S7-D3 / [#301](https://github.com/dohertj2/lmxopcua/issues/301) — multi-instance
Function-Block (FB) instances are addressed symbolically inside the PLC program
(`MyFB_Instance.MyParam`) but the runtime wire access still needs the absolute
`DBn.DBW_offset`. TIA Portal's "Show all tags" CSV export distinguishes these
rows from regular global DBs via the **`DB type`** column.
### `DB type` column convention
| `DB type` value | Meaning | Path |
|---|---|---|
| (empty) | Legacy export — no column at all (TIA pre-v15 / partial export). Treated as Global. | D1 (existing) |
| `Global DB` / `Global` / `Global Data Block` | Standalone DB declared in the project tree. | D1 (existing) |
| `Globaler Datenbaustein` | Same as above, DE locale. | D1 (existing) |
| `Instance DB` / `Instance` / `Instance Data Block` | Multi-instance FB instance. Member tags are the FB's `IN` / `OUT` / `IN_OUT` / `STAT` parameters. | **D3 (new)** |
| `Instance-DB` / `Instanz-DB` / `Instanz-Datenbaustein` | Same as above (locale + dashing variants). | **D3 (new)** |
The `DB type` column is matched case-insensitively; quoting and surrounding
whitespace are tolerated.
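The column convention above can be sketched as a small classifier. `DbKind` and `DbTypeColumn` are hypothetical names, and returning `null` for an unrecognised value is an assumption of this sketch; the text does not say how the importer treats such rows.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public enum DbKind { Global, Instance }

// Hypothetical classifier for the `DB type` column; not the importer's code.
public static class DbTypeColumn
{
    private static readonly HashSet<string> GlobalValues =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "Global DB", "Global", "Global Data Block", "Globaler Datenbaustein",
        };

    private static readonly HashSet<string> InstanceValues =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "Instance DB", "Instance", "Instance Data Block",
            "Instance-DB", "Instanz-DB", "Instanz-Datenbaustein",
        };

    public static DbKind? Classify(string? raw)
    {
        // Tolerate surrounding whitespace and quoting, as the text describes.
        var value = (raw ?? string.Empty).Trim().Trim('"').Trim();

        // A missing or empty column is treated as Global (legacy export).
        if (value.Length == 0 || GlobalValues.Contains(value)) return DbKind.Global;
        if (InstanceValues.Contains(value)) return DbKind.Instance;
        return null; // unrecognised value: handling is an assumption of this sketch
    }
}
```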
### `MyFB_Instance.MyParam` → `DBn.DBW_offset`
The TIA Portal export ships the **resolved absolute address** in the
`Logical address` column for every instance-DB member — TIA itself walks the FB
interface declaration at export time and writes out the byte-offset-anchored
address verbatim. The importer accepts these rows the same way as a Global-DB
row, with two differences:
1. The row counts under `S7ImportResult.InstanceDbCount` (a sub-counter of
`ParsedCount`) so the operator can see how much of the import depends on the
FB-interface layout.
2. The row is excluded from the UDT placeholder path even if the data type
column happens to match a UDT name pattern; instance-DB members always
import as fully-functional scalar tags.
Example fixture row:
```csv
Name,Path,Data type,Logical address,Comment,Hmi accessible,DB type
MotorFB_1.Speed,FB instances,Int,%DB7.DBW0,Speed setpoint,True,Instance DB
```
The imported `S7TagDefinition` ends up with:
```csharp
new S7TagDefinition(
    Name: "MotorFB_1.Speed",
    Address: "DB7.DBW0",
    DataType: S7DataType.Int16,
    Writable: true);
```
### Empty-`Logical address` fallback
When TIA exports an instance-DB row with an empty `Logical address` column
(rare in practice — happens when the export was generated against a
not-yet-compiled project), `InstanceDbResolver` can compute the absolute
address from explicit parent-DB / parent-base-offset / member-offset inputs.
This fallback is exposed at the resolver-class level for advanced bootstrap
scenarios; the CSV path itself does not currently parse interface declarations
out of the file (TIA's CSV doesn't carry them).
For now the operator workflow is: re-export from TIA after compiling the
project so every instance-DB row carries a resolved `Logical address`.
### Re-import on FB-interface edit — caveat
When the FB interface changes — a member is added, removed, or reordered in
TIA — the instance-DB layout shifts on the PLC side. Member byte offsets that
worked yesterday point at the wrong word today; absolute-offset addressing has
no in-band schema check.
**The driver does not auto-detect this.** Operators must:
1. Recompile the FB in TIA Portal.
2. Download the updated program to the PLC.
3. **Re-export "Show all tags" CSV** from the updated project.
4. Re-import the CSV via `AddTiaCsvImport` or the `import-symbols` CLI.
5. Restart the driver instance (Admin UI → Drivers → Reload).
A stale import will silently read / write the wrong byte offsets — the values
will look like valid PLC data but reference whichever member used to live at
that offset before the interface edit. There is no runtime guard; this is the
same caveat that applies to all absolute-offset DB addressing on S7-1200 /
1500 (see [`docs/v2/s7.md` "UDT / STRUCT support"](../v2/s7.md#udt--struct-support)
for the parallel UDT-edit story).
A future enhancement may add a project-fingerprint compare at driver init —
hashing the interface offsets at import time and re-checking against a known
PLC system function. Tracked as a follow-up; not in PR-S7-D3.
## DE locale handling
TIA Portal honours the Windows display locale when writing CSV. A DE-locale
install emits:
- Field separator `;` (because `,` is the decimal separator)
- Decimal-comma in addresses: `%MW0,5` rather than `%MW0.5` for bit addresses
- Boolean column values `WAHR` / `FALSCH` rather than `True` / `False`
The importer **auto-detects** the locale from the first non-blank line:
- Field-separator detection: counts `;` vs `,` occurrences in the header
- Decimal-comma detection: scans the first data row's address column for a
digit-comma-digit pattern
- Boolean column values: recognises both languages (`true/false/wahr/falsch/yes/no/ja/nein`,
case-insensitive) plus bare `0`/`1`
The address column is rewritten to en-US shape (`%MW0,5` → `MW0.5`) before the
strict `S7AddressParser` runs, so the rest of the driver pipeline sees a
single canonical address shape.
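The detection and rewrite steps above can be sketched roughly as follows. This is a minimal illustration, not the importer's actual code: the helper names (`DetectSeparator`, `UsesDecimalComma`, `NormalizeAddress`, `ParseBool`) and the sample header are hypothetical, and the real importer may apply additional rules.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helpers; the importer's real method names are not shown in this document.

// Pick the field separator by counting ';' vs ',' in the header line.
static char DetectSeparator(string headerLine)
{
    int semicolons = headerLine.Count(c => c == ';');
    int commas = headerLine.Count(c => c == ',');
    return semicolons > commas ? ';' : ',';
}

// A digit-comma-digit run in the address column signals a decimal-comma locale.
static bool UsesDecimalComma(string address) => Regex.IsMatch(address, @"\d,\d");

// Rewrite a DE-locale address to the canonical shape: "%MW0,5" becomes "MW0.5".
static string NormalizeAddress(string address)
{
    string trimmed = address.Trim().TrimStart('%');
    return Regex.Replace(trimmed, @"(?<=\d),(?=\d)", ".");
}

// Accept the boolean spellings listed above, case-insensitively, plus bare 0/1.
// Returns null for anything unrecognised.
static bool? ParseBool(string value) => value.Trim().ToLowerInvariant() switch
{
    "true" or "wahr" or "yes" or "ja" or "1" => (bool?)true,
    "false" or "falsch" or "no" or "nein" or "0" => false,
    _ => null,
};

// Illustrative header only; the real TIA column set is described elsewhere in this doc.
Console.WriteLine(DetectSeparator("Name;Path;Data Type;Logical Address"));
Console.WriteLine(UsesDecimalComma("%MW0,5"));
Console.WriteLine(NormalizeAddress("%MW0,5"));
Console.WriteLine(ParseBool("FALSCH"));
```

Normalising once, before the strict parser runs, keeps locale knowledge out of `S7AddressParser` itself.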
## STEP 7 Classic AWL — `VAR_GLOBAL` + `DATA_BLOCK`
Best-effort parser for legacy STEP 7 Classic projects:
- `VAR_GLOBAL … END_VAR` — global memory area declarations. Each entry maps to
a sequential `M{B|W|D}{offset}` address based on declaration order.
- `DATA_BLOCK DBn … END_DATA_BLOCK` — DB declarations. Each field maps to a
`DB{n}.DB{B|W|D}{offset}` address based on declaration order; the DB number
is parsed from the `DATA_BLOCK` line's `DBn` keyword.
### Position-based addressing — heuristic
Real STEP 7 Classic projects carry exact byte offsets in the symbol table /
.gr8 deployment artefact, but a hand-exported AWL file omits them. The
importer assumes:
| Type | Bytes |
|---|---|
| `BOOL` | 1 (rounded up to byte alignment) |
| `BYTE` / `SINT` / `USINT` / `CHAR` | 1 |
| `INT` / `WORD` / `UINT` | 2 |
| `DINT` / `DWORD` / `UDINT` / `REAL` | 4 |
| `LREAL` / `LINT` / `ULINT` / `LWORD` | 8 |
| `STRING[N]` | N + 2 (2-byte header) |
| `STRING` (no length) | 256 |
| `STRUCT` / `Array[…] of …` / quoted UDT name | UDT placeholder (8-bit Byte at next aligned offset) |
S7 alignment rule: offsets round up to a 2-byte boundary for any 16-bit-or-larger
type. Sites needing exact offsets should drive their symbol import from the
TIA Portal CSV path instead — the CSV carries the offsets verbatim.
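As a rough illustration of the heuristic, the sketch below assigns sequential offsets from declaration order using the size table and the 2-byte alignment rule above. The function names are hypothetical and the importer's real implementation may differ; in particular, treating `STRING` as subject to the 2-byte alignment is an assumption, and STRUCT / array / UDT rows are simply rejected here rather than emitted as placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Byte size per the table above; -1 for types this sketch does not model
// (STRUCT, arrays, UDT names), which the importer turns into placeholders.
static int SizeOf(string type)
{
    string t = type.Trim().ToUpperInvariant();
    Match sized = Regex.Match(t, @"^STRING\[(\d+)\]$");
    if (sized.Success) return int.Parse(sized.Groups[1].Value) + 2; // 2-byte header
    return t switch
    {
        "BOOL" or "BYTE" or "SINT" or "USINT" or "CHAR" => 1,
        "INT" or "WORD" or "UINT" => 2,
        "DINT" or "DWORD" or "UDINT" or "REAL" => 4,
        "LREAL" or "LINT" or "ULINT" or "LWORD" => 8,
        "STRING" => 256,
        _ => -1,
    };
}

// Walk the declarations in order, rounding up to an even offset before any
// field that is 16 bits or larger.
static List<(string Name, int Offset)> AssignOffsets(IEnumerable<(string Name, string Type)> fields)
{
    var result = new List<(string Name, int Offset)>();
    int offset = 0;
    foreach (var (name, type) in fields)
    {
        int size = SizeOf(type);
        if (size < 0) throw new NotSupportedException($"Type '{type}' is not modelled in this sketch.");
        if (size >= 2 && offset % 2 != 0) offset++;
        result.Add((name, offset));
        offset += size;
    }
    return result;
}

var layout = AssignOffsets(new[]
{
    ("Run", "BOOL"),    // offset 0
    ("Speed", "INT"),   // offset 1 is odd, so aligned up to 2
    ("Mode", "BYTE"),   // offset 4
    ("Total", "DINT"),  // offset 5 is odd, so aligned up to 6
});
foreach (var (fieldName, byteOffset) in layout)
    Console.WriteLine($"{fieldName} @ {byteOffset}");
```

Because the result depends entirely on declaration order, any reordering in the AWL source shifts every later offset, which is why the CSV path is preferred when exact offsets matter.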
Comments (`(* ... *)` block, `// ...` line) are stripped before declaration
parsing. Initial-value clauses (`:= 0`) are recognised and discarded.
## CLI subcommand — `import-symbols`
```powershell
otopcua-s7-cli import-symbols --help
```
| Flag | Default | Purpose |
|---|---|---|
| `-f` / `--file` | **required** | Path to the TIA CSV or `.AWL` file |
| `--format` | `tia` | `tia` (CSV) or `awl` (STEP 7 Classic) |
| `-d` / `--device` | none | Optional documentation tag (reserved for symmetry with `import-rslogix`) |
| `--emit` | `appsettings-fragment` | `appsettings-fragment` (JSON) or `summary` (one-line counter) |
| `-o` / `--output` | stdout | Optional path; when set the JSON fragment is written there + summary line goes to stdout |
| `--max-rows` | unlimited | Defensive cap on rows imported |
| `--strict` | off | Fail-fast on the first malformed row (default permissive: skip + log) |
### `appsettings-fragment` output shape
The default `--emit appsettings-fragment` mode writes a JSON object whose
`Tags` array is shaped like the `S7DriverConfigDto.Tags` array — paste
straight into the driver-instance config under
`Drivers/<instance>/Config/Tags`.
```json
{
"Tags": [
{
"Name": "MotorSpeed",
"Address": "MW0",
"DataType": "Int16",
"Writable": true,
"StringLength": 254
},
]
}
```
### Summary line
`--emit summary` writes a single line:
```
Imported 142 tag(s), skipped 3, errors 0, udt-placeholders 5, instance-db 9.
```
`skipped` covers HMI-accessible-false rows + missing-required-field rows;
`errors` covers rows whose `Address` failed to parse as an S7 address;
`udt-placeholders` covers UDT-typed rows that imported as placeholders;
`instance-db` (PR-S7-D3) covers rows whose `DB type` column tagged them as
multi-instance FB-instance members.
## API surface — `IS7SymbolImporter` + `AddTiaCsvImport` / `AddAwlImport`
For server-side / bootstrap use-cases the importer is reachable via:
```csharp
using ZB.MOM.WW.OtOpcUa.Driver.S7;
using ZB.MOM.WW.OtOpcUa.Driver.S7.SymbolImport;
var options = new S7DriverOptions { Host = "192.168.1.30", CpuType = CpuType.S71500 };
// Append imported tags onto an existing options object.
var updated = options.AddTiaCsvImport(
path: @"C:\plc\tia-export.csv",
out var result);
Console.WriteLine($"Imported {result.ParsedCount} tags ({result.UdtPlaceholderCount} placeholders)");
// AWL variant — same shape.
var withAwl = updated.AddAwlImport(
path: @"C:\plc\classic.awl",
out var awlResult);
```
For a hand-managed importer instance (e.g. supplying a custom `ILogger`) call
`new TiaCsvImporter(logger).Parse(stream, opts)` or
`new AwlImporter(logger).Parse(stream, opts)` directly.
## Operational notes
- The importers are **additive**: `AddTiaCsvImport` / `AddAwlImport` concatenate
onto the existing `Tags` list rather than replacing it. Hand-rolled tags
(system-status variables, computed fields the operator added by hand) survive
a re-import.
- Re-imports are not idempotent — calling `AddTiaCsvImport` twice will produce
duplicate tag rows. Operators are expected to start from a clean options
object or de-duplicate themselves; a future schema rev may add a
`replace=true` switch.
- UDT placeholders surface in the Admin UI as non-writable Byte tags. PR-S7-D2
added the runtime UDT fan-out (`S7DriverOptions.Udts` + `S7TagDefinition.UdtName`)
— operators upgrade a placeholder row by setting `UdtName` and declaring the
matching `S7UdtDefinition`; see "Cooperation with `Udts` declarations" above.
Placeholder-only rows still work as a Byte view of the first byte but
can't browse / read their members until the layout is declared.
- Description metadata is dropped on the floor today — see the column
reference above. When [#248](https://github.com/dohertj2/lmxopcua/issues/248)
lands a `Description` field on `S7TagDefinition` the importer will start
populating it without further changes to the CSV contract.
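Because re-imports are not idempotent, a caller that imports twice needs some de-duplication pass of its own. The sketch below shows one possible policy: keep the most recent entry per tag name. The `(Name, Address)` tuples are stand-ins for the real `S7TagDefinition`, `DeduplicateByName` is a hypothetical helper, and "last one wins" is an assumption, not documented driver behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Keep one entry per tag name; when a name repeats, the latest entry wins.
// Output order follows the first appearance of each name.
static List<(string Name, string Address)> DeduplicateByName(
    IEnumerable<(string Name, string Address)> tags) =>
    tags.GroupBy(t => t.Name, StringComparer.Ordinal)
        .Select(g => g.Last())
        .ToList();

var merged = DeduplicateByName(new[]
{
    ("MotorSpeed", "MW0"),   // from the first import
    ("Status", "MB10"),      // hand-rolled tag
    ("MotorSpeed", "MW2"),   // same name from a re-import after an address change
});
foreach (var (tagName, tagAddress) in merged)
    Console.WriteLine($"{tagName} -> {tagAddress}");
```

A site that wants hand-rolled tags to take precedence over imported ones would pick `g.First()` instead, assuming hand-rolled tags are listed first.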
+3 -106
@@ -44,10 +44,6 @@ The driver ctor change that made this possible:
bool-with-bit in one batch call; proves typed decode per S7DataType
- `S7_1500SmokeTests.Driver_write_then_read_round_trip_on_scratch_word`:
`DB1.DBW100` write → read-back; proves write path + buffer visibility
- `S7_1500DiagnosticsTests.Driver_exposes_negotiated_pdu_size_post_init`
asserts `DriverHealth.Diagnostics["S7.NegotiatedPduSize"]` is non-zero
after `InitializeAsync`; proves the negotiated PDU size surfaces in
driver health (Snap7 fixture pins this at 240 bytes — see fixture README)
### Unit
@@ -88,59 +84,10 @@ real PLC latency is not exercised.
S7-1200 vs S7-1500 vs S7-300/400 connection semantics (PG vs OP vs S7-Basic)
not differentiated at test time.
**Optimized DB / S7Plus** is the variant-shaped gap with the biggest field
impact. snap7 happens to behave like a classic-S7comm-only PLC, so the
integration suite cannot reproduce the shape that an S7-1500 with default
"Optimized block access" checked would return (`BadDeviceFailure` on every
absolute-offset read). The decision is documented at
[`docs/v2/s7.md` § Optimized DB constraint (S7Plus)](../v2/s7.md#optimized-db-constraint-s7plus)
and tracked in [`docs/featuregaps.md`](../featuregaps.md) row #1; the
project ships **Track 1** (operator unchecks Optimized block access in TIA
Portal) and **Track 3** (bridge via the `OpcUaClient` driver against the
CPU's onboard OPC UA server). A custom S7Plus implementation is out of
scope.
### 5. Data types beyond the scalars
`STRING` with length-prefix quirks, `DTL` / `DATE_AND_TIME`, arrays of
structs — not covered. UDT fan-out IS covered (PR-S7-D2 / #300) via the
`udt_layout` meta-seed in `Docker/profiles/s7_1500.json` and the
`Driver_fans_out_udt_into_member_tags` integration test.
### 6. SZL (System Status List) — `@System.*` virtual addresses
PR-S7-E1 / [#302](https://github.com/dohertj2/lmxopcua/issues/302)
adds a virtual `@System.*` address surface (CPU type, firmware, scan-cycle
stats, diagnostic-buffer ring) backed by SZL reads. **snap7 does not
implement SZL** — the simulator answers every SZL request with a function-
not-supported error, so the integration profile exercises only the
not-supported semantics (`@System.CpuType` against snap7 returns
`BadNotSupported`). Live-firmware SZL coverage is parked behind a
`[Fact(Skip = ...)]` until either S7netplus exposes a public `ReadSzlAsync`
or we ship a raw S7comm PDU helper. See
[`docs/v2/s7.md` "CPU diagnostics (SZL)"](../v2/s7.md#cpu-diagnostics-szl)
for the wire-status detail.
### 7. Password / protection levels — not modelled by snap7
PR-S7-E2 / [#303](https://github.com/dohertj2/lmxopcua/issues/303) adds
`Password` + `ProtectionLevel` options that emit a connection-level password
right after `OpenAsync`. **snap7 does not model S7 protection levels** — the
simulator accepts every connection regardless of the password set on the
client, so the integration profile cannot distinguish "password sent
correctly" from "password ignored". Coverage stays at the unit-test seam:
`S7PasswordOptionsTests` injects a fake `IS7PlcAuthGate` to assert the
dispatch contract (Password=null skips the call; Password+SupportsSendPassword
calls the gate; auth-failed wraps to a clean `InvalidOperationException`),
plus the no-log invariant on `S7DriverOptions.ToString()`.
The wire path is also fundamentally limited until S7netplus 0.20 exposes a
public `SendPassword` — the driver currently logs a warning and continues
when the API is missing. See
[`docs/v2/s7.md` "PLC password / protection levels"](../v2/s7.md#plc-password--protection-levels)
for the library-limitation note. Live-firmware coverage of the unlock path
requires a hardened S7-1500 lab rig with TIA Portal "Protection & Security"
configured, which is parked as a follow-up.
UDT fan-out, `STRING` with length-prefix quirks, `DTL` / `DATE_AND_TIME`,
arrays of structs — not covered.
## When to trust the S7 tests, when to reach for a rig
@@ -150,7 +97,7 @@ configured, which is parked as a follow-up.
| "Does the driver lifecycle hang / crash?" | yes | yes |
| "Does a real read against an S7-1500 return correct bytes?" | no | yes (required) |
| "Does mailbox serialization actually prevent PG timeouts?" | no | yes (required) |
| "Does a UDT fan-out produce usable member variables?" | yes (Snap7 + `udt_layout` meta-seed) | yes |
| "Does a UDT fan-out produce usable member variables?" | no | yes (required) |
## Follow-up candidates
@@ -162,56 +109,6 @@ configured, which is parked as a follow-up.
lab rig but not CI.
3. **Real S7 lab rig** — cheapest physical PLC (CPU 1212C) on a dedicated
network port, wired via self-hosted runner.
4. **PR-S7-C5 — PUT/GET-disabled pre-flight rejection.** Snap7 does *not*
model the hardened-CPU PUT/GET response (it accepts every read once the
COTP handshake completes), so the **failure** path of the pre-flight
probe — `S7PutGetDisabledException` thrown from `InitializeAsync` when
the PLC rejects the probe read with `ErrorCode.WrongCPU_Type` /
`ErrorCode.ReadData` — needs a real S7-1500 with PUT/GET disabled in TIA
Portal. The integration suite covers the *happy* path
(`Driver_preflight_passes_when_probe_address_seeded`); the failure path
should be added as a `--with-real-plc` opt-in test that the self-hosted
runner with the lab rig executes. The classifier branch
(`S7PreflightClassifier.IsPutGetDisabled`) is unit-tested without a
network in `S7PreflightTests.Classifier_matches_only_PUT_GET_disabled_error_codes`.
5. **Live-firmware Optimized-block-access toggle (PR-S7-F / [#304](https://github.com/dohertj2/lmxopcua/issues/304)).**
snap7 happens to behave like a classic-S7comm CPU, so the integration
profile cannot reproduce the failure that a default new TIA Portal V14+
project produces (`BadDeviceFailure` on `DB1.DBW0` against an Optimized
DB). A manual smoke test on the lab rig, gated behind `--with-real-plc`,
would close that loop. Suggested checklist on a real S7-1500 V2.5+:
1. Create `DB1` in TIA Portal with three INT members at offsets 0, 2, 4.
Leave **Optimized block access checked** (the default).
2. Compile + download to the PLC.
3. Drive the OtOpcUa S7 driver against `DB1.DBW0` — assert that the read
returns `BadDeviceFailure` (the Track-1-not-applied symptom). This is
the failure shape the docs warn about.
4. Open `DB1`'s properties → **uncheck Optimized block access** →
compile → download. Re-run the read; assert it returns the seeded
INT value at offset 0. (Track 1 verified end-to-end.)
5. **Track 3 verification (separate run on the same rig):** with
Optimized access re-enabled on `DB1`, activate the CPU's onboard
OPC UA server in TIA Portal, expose `DB1.<MemberName>` through a
Server interface, register an `OpcUaClient` driver against
`opc.tcp://<plc-ip>:4840`, and assert the symbolic read returns the
same seeded value. This proves the bridge path against a real
Optimized DB without the operator having to disable Optimized
access.
The test must stay manual: TIA Portal compile + download cannot be
automated from CI without a Siemens engineering toolchain license, and
download-with-CPU-stop is destructive on a shared lab rig. Document
results inline in PR descriptions when the rig is available.
6. **PR-S7-E1 — live SZL test against a real S7-1500.** snap7 doesn't
implement SZL at all, and S7netplus 0.20 doesn't expose a public
`ReadSzlAsync`, so the `@System.*` virtual address surface currently
answers `BadNotSupported` against every backend. The parser
(`S7SzlParser`) is unit-tested against golden bytes; flipping the wire
path on requires either an S7netplus PR or a raw-PDU helper. Once that's
in, [`S7_1500SzlTests.System_CpuType_against_live_S7_1500_returns_non_empty_string`](../../tests/ZB.MOM.WW.OtOpcUa.Driver.S7.IntegrationTests/S7_1500/S7_1500SzlTests.cs)
should be flipped from `[Fact(Skip = ...)]` to env-var-gated against the
self-hosted runner with the lab rig.
Without any of these, S7 driver correctness against real hardware is trusted
from field deployments, not from the test suite.
+119 -230
@@ -2,68 +2,85 @@
Coverage map + gap inventory for the Beckhoff TwinCAT ADS driver.
**TL;DR:** Integration-test scaffolding lives at
`tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/` (task #221).
`TwinCATXarFixture` probes TCP 48898 on an operator-supplied VM; three
smoke tests (read / write / native notification) run end-to-end through
the real ADS stack when the VM is reachable, skip cleanly otherwise.
**Remaining operational work**: stand up a TwinCAT 3 XAR runtime in a
Hyper-V VM, author the `.tsproj` project documented at
`TwinCatProject/README.md`, rotate the 7-day trial license (or buy a
paid runtime). Unit tests via `FakeTwinCATClient` still carry the
exhaustive contract coverage.
**TL;DR:** Integration-test suite lives at
`tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/`. `TwinCATXarFixture`
probes TCP 48898 on an operator-supplied runtime; the suite runs **14
`[TwinCATFact]` methods + one 16-case `[TwinCATTheory]` = 30 test cases** end-to-end
through the real ADS stack when the runtime is reachable, skips cleanly
otherwise. The runtime can be a Hyper-V XAR VM or a TCBSD VM
(`TwinCatProject/README.md` covers both). Unit tests via `FakeTwinCATClient`
still carry the exhaustive contract coverage alongside.
TwinCAT is the only driver outside Galaxy that uses **native
notifications** (no polling) for `ISubscribable`, and the fake exposes a
fire-event harness so notification routing is contract-tested rigorously
at the unit layer.
TwinCAT is the only driver outside Galaxy that uses **native notifications**
(no polling) for `ISubscribable`. The integration suite verifies that path on
the wire; the fake exposes a fire-event harness so notification routing is
also contract-tested rigorously at the unit layer.
## What the fixture is
**Integration layer** (task #221, scaffolded):
`tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/`
`TwinCATXarFixture` TCP-probes ADS port 48898 on the host specified by
`TWINCAT_TARGET_HOST` + requires `TWINCAT_TARGET_NETID` (AmsNetId of the
VM). No fixture-owned lifecycle — XAR can't run in Docker because it
bypasses the Windows kernel scheduler, so the VM stays
operator-managed. `TwinCatProject/README.md` documents the required
`.tsproj` project state; the file itself lands once the XAR VM is up +
the project is authored. Three smoke tests:
`Driver_reads_seeded_DINT_through_real_ADS`,
`Driver_write_then_read_round_trip_on_scratch_REAL`, and
`Driver_subscribe_receives_native_ADS_notifications_on_counter_changes`
— all skip cleanly via `[TwinCATFact]` when the runtime isn't
reachable.
**Integration layer**: `tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/`
`TwinCATXarFixture` TCP-probes ADS port 48898 on the host supplied by
`TWINCAT_TARGET_HOST` (defaults to `localhost`) + requires
`TWINCAT_TARGET_NETID` (AmsNetId of the runtime). Optionally takes
`TWINCAT_TARGET_PORT` (default `851` = TC3 PLC runtime 1). No fixture-owned
lifecycle — XAR / TCBSD can't run in Docker because they bypass the host
kernel scheduler, so the runtime stays operator-managed.
`TwinCatProject/README.md` documents the required project state; the tests
gate on `[TwinCATFact]` / `[TwinCATTheory]` and skip cleanly when
`TWINCAT_TARGET_NETID` is unset or the probe fails.
**Unit layer**: `tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Tests/` is
still the primary coverage. `FakeTwinCATClient` also fakes the
`AddDeviceNotification` flow so tests can trigger callbacks without a
running runtime.
**Unit layer**: `tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Tests/` remains the
primary contract coverage. `FakeTwinCATClient` fakes the
`AddDeviceNotification` flow so tests can trigger callbacks without a running
runtime.
## What it actually covers
### Integration (XAR VM, task #221 — code scaffolded, needs VM + project)
### Integration (live runtime)
- `TwinCAT3SmokeTests.Driver_reads_seeded_DINT_through_real_ADS` — real AMS
handshake + ADS read of `GVL_Fixture.nCounter` (seeded at 1234, MAIN
increments each cycle)
- `TwinCAT3SmokeTests.Driver_write_then_read_round_trip_on_scratch_REAL`:
real ADS write + read on `GVL_Fixture.rSetpoint`
- `TwinCAT3SmokeTests.Driver_subscribe_receives_native_ADS_notifications_on_counter_changes`
— real `AddDeviceNotification` against the cycle-incrementing counter;
observes `OnDataChange` firing within 3 s of subscribe
Every capability the driver implements is exercised on the wire:
All three gated on `TWINCAT_TARGET_HOST` + `TWINCAT_TARGET_NETID` env
vars; skip cleanly via `[TwinCATFact]` when the VM isn't reachable or
vars are unset.
- **Read**: `Driver_reads_seeded_DINT_through_real_ADS` (AMS handshake +
symbolic read of `GVL_Fixture.nCounter`)
- **Write + read round-trip**: `Driver_write_then_read_round_trip_on_scratch_REAL`
on `GVL_Fixture.rSetpoint`
- **Array element round-trip**: `Driver_round_trips_array_element_write_and_read`
on `GVL_Arrays.aReal1D[5]` (exercises `TwinCATSymbolPath` subscript
rendering)
- **Subscribe (native ADS notifications)**:
`Driver_subscribe_receives_native_ADS_notifications_on_counter_changes`;
observes `OnDataChange` firing within 10 s of subscribe
- **Symbol browse (direct client path)**:
`Driver_browses_committed_symbol_hierarchy_via_real_ADS` via
`ITwinCATClient.BrowseSymbolsAsync`
- **Symbol browse (through DiscoverAsync + `IAddressSpaceBuilder` pipeline)**:
`DiscoverAsync_renders_declared_tags_and_controller_browse_hits_address_space_builder`
verifies the real `TwinCAT/ → device/ → Discovered/` folder tree
- **Auto-reconnect**: `Driver_auto_reconnects_after_underlying_client_is_disposed`
disposes the `AdsClient` mid-flight; next read must re-establish
- **Primitive type coverage**: `Driver_reads_every_primitive_type_with_correct_mapping`
runs as a `[Theory]` against the 16 primitives in `GVL_Primitives`
(Bool, SInt, USInt, Int, UInt, DInt, UDInt, LInt, ULInt, Real, LReal,
String, Time, TimeOfDay, Date, DateTime) — asserts status + CLR type +
seed value where ergonomic
- **Bit-indexed BOOL**: `Driver_reads_bit_indexed_BOOL_from_word` against
`GVL_Primitives.vWord.3` + `.4` (bits of `0xBEEF`)
- **Nested UDT navigation**: `Driver_reads_deeply_nested_UDT_path` reads
`GVL_Plant.Line1.Stations[1].Axes[1].Motor.Temperature` (LREAL) + `.Running` (BOOL)
- **Multi-device routing + isolation**:
`Driver_routes_reads_per_device_and_isolates_unreachable_peers` pairs the
real runtime with a bogus AmsNetId; healthy device reads still succeed
- **Probe loop + `IHostConnectivityProbe`**:
`Probe_loop_raises_host_status_transition_to_Running_on_reachable_target`
asserts `OnHostStatusChanged → Running` and snapshot parity
- **Negative error mappings**:
`Driver_reports_errors_for_unknown_tag_and_nonexistent_symbol_and_readonly_write`
covers `BadNodeIdUnknown`, ghost-symbol communication errors, and the
`BadNotWritable` short-circuit
PR 4.1 / #315 adds `TwinCATUdtBrowseTests.Driver_browses_UDT_tree_and_flattens_to_atomic_leaves`
which exercises `TwinCATDriver.DiscoverAsync` end-to-end against the
`GVL_Plant` UDT fixture. Asserts the discovery surface emits one OPC UA
variable per atomic leaf and folds `aAlarmRecords[1..2000]` into a
single `IsArrayRoot` placeholder when the element count exceeds the
default 1024-element cap (UDT per-member coverage; see
`TwinCatProject/README.md §Complex hierarchy` for the supporting DUTs).
All tests gate on `TWINCAT_TARGET_NETID` (required) via `[TwinCATFact]` /
`[TwinCATTheory]`; `TWINCAT_TARGET_HOST` (default `localhost`) and
`TWINCAT_TARGET_PORT` (default `851`) are optional overrides.
### Unit
@@ -73,87 +90,69 @@ default 1024-element cap (UDT per-member coverage; see
- `TwinCATReadWriteTests` — read + write through the fake, status mapping
- `TwinCATSymbolPathTests` — symbol-path routing for nested struct members
- `TwinCATSymbolBrowserTests`: `ITagDiscovery.DiscoverAsync` via
`ReadSymbolsAsync` (#188) + system-symbol filtering
- `TwinCATTypeWalkerTests` — PR 4.1 / #315 nested-UDT decomposition:
atomic / single-level struct / nested struct / array-of-atomic
(in / over `MaxArrayExpansion`) / array-of-struct / alias chain /
pointer skip / self-referencing struct depth-cap / per-leaf
`MaxArrayExpansion` honored / ReadOnly propagation. Stub `IDataType`
/ `IStructType` / `IArrayType` / `IMember` / `IDimensionCollection`
trees built in-test so the walker is exercised without
`Beckhoff.TwinCAT.Ads`-internal ctors.
- `TwinCATNativeNotificationTests``AddDeviceNotification` (#189)
registration, callback-delivery-to-`OnDataChange` wiring, unregister on
unsubscribe
`BrowseSymbolsAsync` + system-symbol filtering
- `TwinCATNativeNotificationTests`: `AddDeviceNotification` registration,
callback-delivery-to-`OnDataChange` wiring, unregister on unsubscribe
- `TwinCATDriverTests``IDriver` lifecycle
Capability surfaces whose contract is verified: `IDriver`, `IReadable`,
`IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`,
`IPerCallHostResolver`, `IAlarmSource` (PR 5.1 / #316, gated behind
`EnableAlarms=true` — see capability matrix below).
Capability surfaces whose contract is verified at the unit layer: `IDriver`,
`IReadable`, `IWritable`, `ITagDiscovery`, `ISubscribable`,
`IHostConnectivityProbe`, `IPerCallHostResolver`. The integration suite now
verifies `ITagDiscovery` + `IHostConnectivityProbe` on the wire as well.
## Capability matrix
## Bugs caught by live runs
| Capability | Status | Notes |
| --- | --- | --- |
| `IDriver` | yes | Lifecycle + health |
| `IReadable` | yes | Sum-read for scalars; per-tag for bit / array |
| `IWritable` | yes | Sum-write for scalars; per-tag for bit-RMW / array |
| `ITagDiscovery` | yes | Pre-declared + opt-in symbol-table walk |
| `ISubscribable` | yes | Native ADS notifications by default; poll fallback |
| `IHostConnectivityProbe` | yes | `ReadStateAsync` + system-symbol diagnostics |
| `IPerCallHostResolver` | yes | Tag → device hostAddress |
| `IAlarmSource` (PR 5.1 / #316) | partial | Scaffold + unit-tested; live wire decode is best-effort against AMS port 110, see `docs/v3/twincat-eventlogger-spike.md` |
| `IHistoryProvider` | no | Not in scope for this driver family |
The integration suite surfaced three driver defects that `FakeTwinCATClient`
couldn't, since each lived below the abstraction boundary:
1. **Notification cycle time unit**`NotificationSettings(cycleTime, maxDelay)`
takes **milliseconds** per Beckhoff InfoSys
(`tcadsnetref/7313319051`), but the driver was multiplying by `10_000`
under a "100 ns units" assumption. A requested 250 ms cycle was being set
to ~41 minutes — subscribe never fired. Fix in `AdsTwinCATClient.AddNotificationAsync`.
2. **`STRING(N)` / `WSTRING(N)` type mapper** — `MapSymbolTypeName` only
matched bare `"STRING"` / `"WSTRING"`, so sized strings (the common case)
fell off `BrowseSymbolsAsync` entirely. Fix: strip the `(…)` bound before
the switch.
3. **Bit-indexed BOOL path** — driver was sending `"GVL.vWord.3"` to ADS as
a BOOL read. TwinCAT's symbol table doesn't expose bit-access paths; the
read returned `DeviceSymbolNotFound`. Fix: strip the `.N` suffix, read
the parent word as `uint`, extract the bit locally via `ExtractBit`.
All three paths are now pinned by live-wire tests.
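The three fixes are small enough to sketch in isolation. The code below is an illustration under assumptions, not the driver's source: every helper name is hypothetical (including `GetBit`, which only approximates what the driver's `ExtractBit` is described as doing), and the 0..31 bit range is an assumption based on reading the parent word as `uint`.

```csharp
using System;
using System.Text.RegularExpressions;

// Fix 1: the cycle time is passed through in milliseconds, with no 10_000 scaling.
static int ToNotificationCycleMs(TimeSpan requested) => (int)requested.TotalMilliseconds;

// Fix 2: drop a trailing "(N)" length bound so "STRING(80)" matches the bare "STRING" case.
static string StripLengthBound(string typeName) =>
    Regex.Replace(typeName.Trim(), @"\(\d+\)$", "");

// Fix 3a: split "GVL.vWord.3" into the parent symbol and the bit index.
// Returns false when the last segment is not a bit index in 0..31.
static bool TrySplitBitIndex(string path, out string parent, out int bit)
{
    parent = path;
    bit = -1;
    int dot = path.LastIndexOf('.');
    if (dot <= 0) return false;
    if (!int.TryParse(path.Substring(dot + 1), out int parsed)) return false;
    if (parsed < 0 || parsed > 31) return false;
    parent = path.Substring(0, dot);
    bit = parsed;
    return true;
}

// Fix 3b: extract the bit locally from the parent word.
static bool GetBit(uint word, int bit) => ((word >> bit) & 1u) != 0;

// The old scaling turned a 250 ms request into roughly 41.7 minutes.
double buggyMinutes = TimeSpan.FromMilliseconds(250 * 10_000).TotalMinutes;
Console.WriteLine($"{ToNotificationCycleMs(TimeSpan.FromMilliseconds(250))} ms vs {buggyMinutes:F1} min");

Console.WriteLine(StripLengthBound("STRING(80)"));

TrySplitBitIndex("GVL_Primitives.vWord.3", out string parentPath, out int bitIndex);
Console.WriteLine($"{parentPath} bit {bitIndex} = {GetBit(0xBEEF, bitIndex)}");
```

The `0xBEEF` seed is a convenient fixture value because bits 3 and 4 differ (set and clear respectively), so an off-by-one in the bit index shows up immediately.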
## What it does NOT cover
### 1. AMS / ADS wire traffic
### 1. AMS / ADS wire framing
No real AMS router frame is exchanged. Beckhoff's `TwinCAT.Ads` NuGet (their
own .NET SDK, not libplctag-style OSS) has no in-process fake; tests stub
the `ITwinCATClient` abstraction above it.
No raw AMS packet is inspected. Beckhoff's `TwinCAT.Ads` NuGet (their own
.NET SDK, not libplctag-style OSS) has no in-process fake at the frame
level; tests run against a real router.
### 2. Multi-route AMS
ADS supports chained routes (`<localNetId> → <routerNetId> → <targetNetId>`)
for PLCs behind an EC master / IPC gateway. Parse coverage exists; wire-path
coverage doesn't.
coverage is single-hop only.
### 3. Notification reliability under jitter
### 3. Notification coalescing under jitter
`AddDeviceNotification` delivers at the runtime's cycle boundary; under high
CPU load or network jitter real notifications can coalesce. The fake fires
one callback per test invocation — real callback-coalescing behavior is
untested.
PR 3.1 (#313) makes the per-tag `MaxDelay` configurable via
`TwinCATTagDefinition.MaxDelayMs` — the runtime can buffer changes for up to
that many milliseconds before dispatch, deliberately coalescing bursty
high-frequency signals so the OPC UA queue downstream doesn't flood. Default
`null` / `0` preserves the pre-PR-3.1 "fire ASAP" behaviour.
`TwinCATMaxDelayTests.Driver_coalesces_notifications_at_max_delay` exercises
the wire-side coalescer end-to-end against `GVL_Fixture.nCounter`; the unit
suite (`TwinCATNativeNotificationTests`) covers the plumbing contract via
the `FakeTwinCATClient.FakeNotification.MaxDelayMs` capture.
`AddDeviceNotification` delivers at the runtime's cycle boundary; under
sustained CPU load or network jitter real notifications can coalesce. The
live test only asserts at-least-one delivery within a generous window —
coalescing behavior under stress isn't verified.
### 4. TC2 vs TC3 variant handling
TwinCAT 2 (ADS v1) and TwinCAT 3 (ADS v2) have subtly different
`GetSymbolInfoByName` semantics + symbol-table layouts. Driver targets TC3;
TC2 compatibility is not exercised.
`GetSymbolInfoByName` semantics + symbol-table layouts. Driver + tests target
TC3; TC2 compatibility is not exercised.
### 5. Cycle-time alignment for `ISubscribable`
### 5. Alarms / history
Native ADS notifications fire on the PLC cycle boundary. The fake test
harness assumes notifications fire on a timer the test controls;
cycle-aligned firing under real PLC control is not verified.
### 6. History
Driver doesn't implement `IHistoryProvider` — not in scope for this
driver family. (Alarms now have a dedicated `IAlarmSource` bridge — see
the capability matrix below + `docs/drivers/TwinCAT.md`.)
Driver doesn't implement `IAlarmSource` or `IHistoryProvider` — not in scope
for this driver family. TwinCAT 3's TcEventLogger could theoretically back
an `IAlarmSource`, but shipping that is a separate feature.
## When to trust TwinCAT tests, when to reach for a rig
@@ -163,135 +162,25 @@ the capability matrix below + `docs/drivers/TwinCAT.md`.)
| "Does notification → `OnDataChange` wire correctly?" | yes (contract) | yes |
| "Does symbol browsing filter TwinCAT internals?" | yes | yes |
| "Does a real ADS read return correct bytes?" | no | yes (required) |
| "Do notifications coalesce under load?" | no | yes (required) |
| "Does auto-reconnect work on router restart?" | no (contract only) | yes (required) |
| "Do notifications coalesce under sustained load?" | no | yes (required) |
| "Does a TC2 PLC work the same as TC3?" | no | yes (required) |
## Performance
PR 2.1 (Sum-read / Sum-write, IndexGroup `0xF080..0xF084`) replaced the per-tag
`ReadValueAsync` loop in `TwinCATDriver.ReadAsync` / `WriteAsync` with a
bucketed bulk dispatch — N tags addressed against the same device flow through a
single ADS sum-command round-trip via `SumInstancePathAnyTypeRead` (read) and
`SumWriteBySymbolPath` (write). Whole-array tags + bit-extracted BOOL tags
remain on the per-tag fallback path because the sum surface only marshals
scalars and bit-RMW writes need the per-parent serialisation lock.
**Baseline → Sum-command delta** (dev box, 1000 × DINT, XAR VM over LAN):
| Path | Round-trips | Wall-clock |
| --- | --- | --- |
| Per-tag loop (pre-PR 2.1) | 1000 | ~5–8 s |
| Sum-command bulk (PR 2.1) | 1 | ~250–600 ms |
| Ratio | — | ≥ 10× typical, ≥ 5× CI floor |
The perf-tier test
`TwinCATSumCommandPerfTests.Driver_sum_read_1000_tags_beats_loop_baseline_by_5x`
asserts the ratio with a conservative 5× lower bound that survives noisy CI /
VM scheduling. It is gated behind both `TWINCAT_TARGET_NETID` (XAR reachable)
and `TWINCAT_PERF=1` (operator opt-in) — perf runs aren't part of the default
integration pass because they hit the wire heavily.
The required fixture state (1000-DINT GVL + churn POU) is documented in
`TwinCatProject/README.md §Performance scenarios`; XAE-form sources land at
`TwinCatProject/PLC/GVLs/GVL_Perf.TcGVL` + `TwinCatProject/PLC/POUs/FB_PerfChurn.TcPOU`.
### Handle caching (PR 2.2)
Per-tag reads / writes route through an in-process ADS variable-handle cache.
The first read of a symbol resolves a handle via `CreateVariableHandleAsync`;
subsequent reads / writes of the same symbol issue against the cached handle.
On the wire this trades a multi-byte symbolic path (`GVL_Perf.aTags[742]` =
20+ bytes) for a 4-byte handle, and the device server skips name resolution
on every subsequent op. Cache lifetime is process-scoped; entries are evicted
on `AdsErrorCode.DeviceSymbolVersionInvalid` (with one retry against a fresh
handle), wiped on reconnect (handles are per-AMS-session), and deleted
best-effort on driver disposal.
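The cache behaviour described above (resolve once, reuse, evict and retry once on a stale handle, wipe on reconnect) can be sketched with an in-memory stand-in. Nothing here calls the Beckhoff SDK: `CreateHandle` fakes the handle-resolution round-trip, `InvalidOperationException` stands in for the `DeviceSymbolVersionInvalid` error, and all names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

var handles = new Dictionary<string, uint>(StringComparer.Ordinal);
int createCalls = 0;

// Stand-in for the ADS create-handle round-trip; counts how often it is hit.
uint CreateHandle(string symbolPath)
{
    createCalls++;
    return (uint)createCalls;
}

// Resolve on first use, then serve from the cache.
uint GetHandle(string symbolPath)
{
    if (!handles.TryGetValue(symbolPath, out uint handle))
    {
        handle = CreateHandle(symbolPath);
        handles[symbolPath] = handle;
    }
    return handle;
}

// Run an operation against the cached handle; on a stale-handle failure,
// evict the entry and retry exactly once against a fresh handle.
T WithHandle<T>(string symbolPath, Func<uint, T> operation)
{
    try
    {
        return operation(GetHandle(symbolPath));
    }
    catch (InvalidOperationException) // stand-in for the stale-handle ADS error
    {
        handles.Remove(symbolPath);
        return operation(GetHandle(symbolPath));
    }
}

// Handles are per-session, so a reconnect drops everything.
void OnReconnect() => handles.Clear();

// Read 50 symbols twice; only the first pass should resolve handles.
for (int pass = 0; pass < 2; pass++)
    for (int i = 0; i < 50; i++)
        GetHandle($"GVL_Perf.aTags[{i}]");

Console.WriteLine(createCalls); // 50: the second pass is served from the cache
```

Limiting the retry to a single attempt means a persistently failing symbol surfaces as an error instead of looping.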
`TwinCATHandleCachePerfTests.Driver_handle_cache_avoids_repeat_symbol_resolution`
asserts the contract on real XAR by reading 50 symbols twice and verifying
the second pass issues zero new `CreateVariableHandleAsync` calls. It runs
under the standard `[TwinCATFact]` gate (XAR reachable; no `TWINCAT_PERF`
opt-in needed because 50 symbols is cheap).
**Self-invalidation (PR 2.3)**: handle cache is now self-invalidating on
TwinCAT online changes. `AdsTwinCATClient` registers an
`AdsSymbolVersionChanged` event listener (Beckhoff's high-level wrapper
around the SymbolVersion ADS notification, IndexGroup `0xF008`) on connect;
when the PLC's symbol-version counter increments — full re-init after a
download / activate-config — the listener fires and wipes the handle cache
proactively. Three-layered defence in depth: (1) proactive listener
preempts the next read entirely on full re-inits, (2) the
`DeviceSymbolVersionInvalid` evict-and-retry path from PR 2.2 catches the
narrower "symbol survives but its descriptor moved" race, and (3)
operators can still call `ITwinCATClient.FlushOptionalCachesAsync` manually
for the truly-paranoid case. The bulk Sum-read / Sum-write path remains
on symbolic paths in PR 2.2 (the bulk path's per-call symbol resolution
is already amortised across N tags; the perf delta vs. handle-batched
bulk is marginal — tracked as a follow-up for the Phase-2 perf sweep).
## Diagnostics
PR 3.2 (#314) augments the probe loop. On every successful tick (post `ReadStateAsync`)
the driver also reads four well-known system symbols off the AMS target and stashes
them on `DeviceState.LastDiagnostics` as a `TwinCATDeviceDiagnostics` record. The same
snapshot is folded into `DriverHealth.Diagnostics` so the cross-driver
`driver-diagnostics` RPC (added for Modbus, task #154) renders TwinCAT cycle-time /
jitter / online-change counters next to its peers without a per-driver special-case.
| Symbol | Type | Diagnostic key | Notes |
| --- | --- | --- | --- |
| `TwinCAT_SystemInfoVarList._AppInfo.AppName` | `STRING(80)` | (record only) | Running PLC project name, e.g. `"Plc1"` |
| `TwinCAT_SystemInfoVarList._AppInfo.OnlineChangeCnt` | `UDINT` | `TwinCAT.OnlineChangeCnt` | Increments on every accepted online change; informational |
| `TwinCAT_SystemInfoVarList._TaskInfo[1].CycleTime` | `UDINT` (100 ns ticks) | `TwinCAT.CycleTimeMs` | Configured task period after `÷10000` ms conversion |
| `TwinCAT_SystemInfoVarList._TaskInfo[1].LastExecTime` | `UDINT` (100 ns ticks) | `TwinCAT.LastExecTimeMs` | Wall-clock duration of the last task tick |
| (computed) | `double` | `TwinCAT.JitterMs` | `LastExecTimeMs - CycleTimeMs`; positive = overrun |
| (computed) | `long` | `TwinCAT.OnlineChangeIncrements` | Cumulative deltas observed since the driver started; only emitted once non-zero |
Each individual read is wrapped in best-effort try/catch. A runtime that doesn't
expose `_TaskInfo[1]` (older TwinCAT 2 builds, some soft-PLC implementations) still
produces a partial snapshot; the missing fields fall back to the previous tick's value
or the type default for the first probe tick. Wholesale failure of all four reads
leaves the previous snapshot in place and the next tick retries.
Single-device deployments produce flat keys (`TwinCAT.CycleTimeMs`); multi-device
deployments prefix with the AMS host address (`TwinCAT.<hostAddress>.CycleTimeMs`)
so the readout is unambiguous when one driver instance owns multiple AMS targets.
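As a rough illustration of the tick conversion, the jitter computation, and the key prefixing described above, consider the sketch below. The class and method names are hypothetical; only the key names and the divide-by-10000 conversion come from the table.

```csharp
using System.Collections.Generic;

public static class TwinCatDiagnosticsSketch
{
    // Task times arrive as 100 ns ticks; 10,000 ticks make one millisecond.
    public static double TicksToMs(uint ticks100ns) => ticks100ns / 10_000.0;

    // hostAddress is empty for single-device deployments (flat keys) and
    // set to the AMS host address when one driver owns several targets.
    public static Dictionary<string, double> BuildKeys(
        uint cycleTicks, uint lastExecTicks, string hostAddress = "")
    {
        var prefix = string.IsNullOrEmpty(hostAddress)
            ? "TwinCAT."
            : "TwinCAT." + hostAddress + ".";
        var cycleMs = TicksToMs(cycleTicks);
        var lastExecMs = TicksToMs(lastExecTicks);
        return new Dictionary<string, double>
        {
            [prefix + "CycleTimeMs"] = cycleMs,
            [prefix + "LastExecTimeMs"] = lastExecMs,
            // Positive jitter means the last tick overran the configured period.
            [prefix + "JitterMs"] = lastExecMs - cycleMs,
        };
    }
}
```

For example, a configured period of 100,000 ticks and a last execution time of 120,000 ticks would give `CycleTimeMs = 10`, `LastExecTimeMs = 12`, and `JitterMs = 2`.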
Wire-level coverage lives in
`TwinCATDiagnosticsIntegrationTests.Probe_loop_surfaces_cycle_time_and_online_change_count`
(asserts `CycleTimeMs > 0` + `OnlineChangeCnt >= 0` within one probe interval against a
reachable XAR runtime). Unit-level coverage of the dictionary shape, the per-symbol
try/catch, and the multi-device prefixing lives in `TwinCATDeviceDiagnosticsTests`;
the `FakeTwinCATClient.SetSystemSymbolValue` helper drives the surface deterministically.
## Follow-up candidates
1. **XAR VM live-population** — scaffolding is in place (this PR); the
remaining work is operational: stand up the Hyper-V VM, install XAR,
author the `.tsproj` per `TwinCatProject/README.md`, configure the
bilateral ADS route, set `TWINCAT_TARGET_HOST` + `TWINCAT_TARGET_NETID`
on the dev box. Then the three smoke tests transition skip → pass.
Tracked as #221.
2. **License-rotation automation** — XAR's 7-day trial expires on
schedule. Either automate `TcActivate.exe /reactivate` via a
scheduled task on the VM (not officially supported; reportedly works
for some TC3 builds), or buy a paid runtime license (~$1k one-time
per runtime per CPU) to kill the rotation. The doc at
`TwinCatProject/README.md` §License rotation walks through both.
3. **Lab rig** — cheapest IPC (CX7000 / CX9020) on a dedicated network;
the only route that covers TC2 + real EtherCAT I/O timing + cycle
jitter under CPU load.
Deferred to v3 — see [`docs/v3/twincat-backlog.md`](../v3/twincat-backlog.md).
Covers TC2 coverage, notification-coalescing-under-load, multi-hop AMS,
license-rotation automation, and a dedicated lab IPC.
## Key fixture / config files
- `tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCATXarFixture.cs`
— TCP probe + skip-attributes + env-var parsing
- `tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCAT3SmokeTests.cs`
three wire-level smoke tests
— wire-level test suite (14 `[TwinCATFact]` + 16-case `[TwinCATTheory]`)
- `tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/README.md`
— project spec + VM setup + license-rotation notes
- `tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Tests/FakeTwinCATClient.cs`
in-process fake with the notification-fire harness used by
`TwinCATNativeNotificationTests`
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs` — ctor takes
`ITwinCATClientFactory`
in-process fake with the notification-fire harness
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs` — ctor is
`(TwinCATDriverOptions, string driverInstanceId, ITwinCATClientFactory? = null)`
-115
@@ -1,115 +0,0 @@
# TwinCAT driver — operator guide
Beckhoff TwinCAT 2 / TwinCAT 3 ADS driver. Talks to the runtime via
`Beckhoff.TwinCAT.Ads` v6 (managed); requires a reachable AMS router on
the host (local TwinCAT XAR, the standalone `Beckhoff.TwinCAT.Ads.TcpRouter`
NuGet, or any Windows box with TwinCAT installed and an authorised AMS
route).
## Configuration surface
`TwinCATDriverOptions` (one instance supports N AMS targets, each a
`TwinCATDeviceOptions`). Wire format mirrors the C# class on the JSON
side — every `init`-only property round-trips through
`System.Text.Json` with the default options.
| Option | Type | Default | Notes |
| --- | --- | --- | --- |
| `Devices` | `TwinCATDeviceOptions[]` | `[]` | One entry per AMS target. |
| `Tags` | `TwinCATTagDefinition[]` | `[]` | Pre-declared symbol set. |
| `Probe.Enabled` | `bool` | `true` | Per-tick `ReadStateAsync` against the runtime. |
| `Probe.Interval` | `TimeSpan` | `5 s` | |
| `Timeout` | `TimeSpan` | `2 s` | Per-operation timeout. |
| `UseNativeNotifications` | `bool` | `true` | False = fall through to PollGroupEngine. |
| `EnableControllerBrowse` | `bool` | `false` | Walk symbol table on `DiscoverAsync`. |
| `MaxArrayExpansion` | `int` | `1024` | Per-element cutoff during nested-UDT browse. |
| `EnableAlarms` (PR 5.1) | `bool` | `false` | Opt-in TC3 EventLogger bridge — see "Alarms" below. |
## Alarms (TC3 EventLogger bridge, PR 5.1 / #316)
When `EnableAlarms=true`, the driver implements `IAlarmSource` by
opening a second `AdsClient` against AMS port **110**
(`AMSPORT_EVENTLOG`) and adding a device notification on
`ADSIGRP_TCEVENTLOG_ALARMS`. Subscribers receive `OnAlarmEvent`
notifications for every transition the EventLogger surfaces (raise /
clear / acknowledge).
### Decode caveat
Beckhoff doesn't ship a managed wrapper for `TcEventLogger` in the
regular `Beckhoff.TwinCAT.Ads` v6 NuGet — only the C++ TcCOM headers
exist. The driver therefore decodes the AMS-port-110 binary payload
manually. The current implementation is best-effort: event class GUIDs
and source names usually decode cleanly; some less-common fields may
surface as `"Unknown"` until a follow-up PR lands a complete decoder.
Spike output captured at
[`docs/v3/twincat-eventlogger-spike.md`](../v3/twincat-eventlogger-spike.md).
### Wire path
| Layer | What it does |
| --- | --- |
| Primary `AdsClient` | The existing per-device session against the PLC runtime port (default `851`) — handles reads / writes / native subscriptions. |
| Secondary `AdsClient` (alarms) | Opens against AMS port `110` on the same target NetId. Adds one device notification on `ADSIGRP_TCEVENTLOG_ALARMS` with a `length=...` payload covering the full alarm-list shape. |
| `ITwinCATAlarmGate` (driver-internal) | Decodes incoming notifications into `TwinCATAlarmEvent` records (`EventClass`, `Source`, `Severity`, `Message`, `OccurrenceUtc`, `Acked`). |
| `TwinCATAlarmSource` | Projects `TwinCATAlarmEvent` onto the driver-agnostic `IAlarmSource.OnAlarmEvent`. |
### Severity mapping (TC3 → OPC UA AC)
TC3 EventLogger severity is a 0–255 `USINT`. The driver maps it onto
the four-bucket `AlarmSeverity` enum the OPC UA AC layer consumes:
| TC3 severity | `AlarmSeverity` |
| --- | --- |
| 0–64 | `Low` |
| 65–128 | `Medium` |
| 129–192 | `High` |
| 193–255 | `Critical` |
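The bucket boundaries (0 to 64, 65 to 128, 129 to 192, 193 to 255) could be expressed as a single switch, sketched below. The enum here is a local stand-in declared for illustration; the real `AlarmSeverity` type lives in the OPC UA AC layer, and `SeverityMapSketch` is a hypothetical name.

```csharp
// Local stand-in for the four-bucket enum the OPC UA AC layer consumes.
public enum AlarmSeverity { Low, Medium, High, Critical }

public static class SeverityMapSketch
{
    // Maps the 0..255 TC3 EventLogger severity onto the four buckets.
    public static AlarmSeverity FromTc3(byte severity) => severity switch
    {
        <= 64 => AlarmSeverity.Low,
        <= 128 => AlarmSeverity.Medium,
        <= 192 => AlarmSeverity.High,
        _ => AlarmSeverity.Critical,
    };
}
```

Because the input is a `byte`, every possible value falls into exactly one bucket and no range check is needed.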
### Acknowledge
`AcknowledgeAsync` round-trips through `ITwinCATAlarmGate.AcknowledgeAsync`,
which writes to the EventLogger ack index group. Best-effort — the wire
format isn't documented in managed code, so individual ack failures don't
poison the batch and the gate returns silently when the EventLogger isn't
configured.
### Disabling
`EnableAlarms=false` (default) returns a sentinel handle from
`SubscribeAlarmsAsync` and never opens the secondary `AdsClient`.
`OnAlarmEvent` simply never fires. Capability negotiation still works,
which is why the driver advertises `IAlarmSource` unconditionally.
## CLI
The `otopcua-twincat-cli` test client exposes an `alarms` subcommand
that wraps the bridge end-to-end:
```powershell
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Cli -- alarms `
--ams-net-id 5.23.91.23.1.1 --ams-port 851 `
--source Conveyor1.MotorOverload
```
See [`docs/Driver.TwinCAT.Cli.md`](../Driver.TwinCAT.Cli.md) for the
full CLI surface.
## Test coverage
- **Unit**: `TwinCATAlarmSourceTests` covers (a) feature-gating off vs.
on, (b) gate-event projection shape, (c) multi-event ordering, (d)
source-filter matching, (e) acknowledge round-trip, (f) JSON DTO
round-trip.
- **Integration**: `TwinCATAlarmIntegrationTests.Driver_raises_alarm_event_when_PLC_logs_event`
ships build-only in PR 5.1; the GVL + FB_AlarmHarness ship as XAE
stubs at `tests/.../TwinCatProject/PLC/`. Once the XAR project
imports them the test transitions skip → pass.
## See also
- [`docs/v3/twincat-eventlogger-spike.md`](../v3/twincat-eventlogger-spike.md)
— spike output for the managed-wrapper question
- [`docs/drivers/TwinCAT-Test-Fixture.md`](TwinCAT-Test-Fixture.md)
— coverage map + capability matrix
- [`docs/Driver.TwinCAT.Cli.md`](../Driver.TwinCAT.Cli.md) — CLI guide
-20
@@ -1,20 +0,0 @@
# Feature gaps — driver-side limitations and decisions
Cross-driver registry of known capability gaps, the workaround we ship, and
whether the gap is on the roadmap. Each row links to the driver-specific
deep-dive document. Closed entries stay in the table for traceability — they
are not deleted, only marked.
| # | Driver | Gap | Status | Workaround / decision | Roadmap | Reference |
|---|--------|-----|--------|-----------------------|---------|-----------|
| 1 | S7 | **Optimized DB / S7Plus** — S7netplus speaks classic S7comm only and cannot read S7-1200 / S7-1500 DBs that have "Optimized block access" checked (the TIA Portal V14+ default). Absolute-offset reads against an Optimized DB return `BadDeviceFailure`. | **Decided — Track 1 + Track 3 (closed by [#304](https://github.com/dohertj2/lmxopcua/issues/304))** | **Track 1 (docs):** operators uncheck "Optimized block access" in TIA Portal on every DB the driver reads, recompile, and download. **Track 3 (bridge):** for shops that won't or can't disable Optimized access, run an `OpcUaClient` driver instance against the S7-1500 V2.5+ CPU's onboard OPC UA server (Siemens runtime OPC UA license required). | **Track 2 (custom S7Plus library) is out of scope** unless a customer funds the ≥4-week initial implementation plus ongoing protocol-revision maintenance. Sharp7 / Snap7Net don't help — they are also classic-S7comm-only. | [`docs/v2/s7.md` § Optimized DB constraint](v2/s7.md#optimized-db-constraint-s7plus) · [`docs/drivers/OpcUaClient.md`](drivers/OpcUaClient.md) |
## How to read this table
- **Status** is one of `Open` (work pending), `Decided` (architectural
decision made; docs reflect it; no code change planned), or
`Closed` (delivered).
- **Roadmap** captures whether the gap is funded for the next phase. A blank
cell means "no roadmap; doc-only outcome."
- The numeric **#** is stable — new rows append at the bottom and keep their
number across deletions/edits so cross-references survive.
-73
@@ -1,73 +0,0 @@
# Decisions
Architecture-level decisions taken during the v2 implementation, captured
once and referenced from feature docs / PR descriptions / ADR-style
follow-ups. Each entry lists the decision, the alternatives we considered,
and the rationale that tipped the call.
## FOCAS write-path opt-in
**Issue:** [#268](https://github.com/dohertj2/lmxopcua/issues/268). **Plan PR:** F4-a.
### Decision
The FOCAS driver ships writes behind two independent opt-ins, both default
off:
1. **Driver-level master switch**: `FocasDriverOptions.Writes.Enabled`,
default `false`. When off, every entry in a `WriteAsync` batch short-
circuits to `BadNotWritable` with status text `writes disabled at
driver level`. The wire client is never touched.
2. **Per-tag opt-in**: `FocasTagDefinition.Writable`, default `false`
(flipped from `true` in F4-a). A `Writable = false` tag returns
`BadNotWritable` even when the driver-level flag is on.
`BadNotSupported` is reserved for kinds the wire client hasn't yet
implemented; F4-b/c land actual macro / parameter / PMC writes that
currently dispatch to `BadNotSupported` (or to `Good` against the F4-a
fake) for unimplemented branches.
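A minimal sketch of the two-gate check follows, assuming the gates are evaluated in the order listed. `WriteGateDecision` and `FocasWriteGateSketch` are hypothetical names, and the per-tag status text is invented for illustration; only `BadNotWritable` and the driver-level status text come from the decision above.

```csharp
// Hypothetical result shape; the real driver returns OPC UA status codes.
public readonly record struct WriteGateDecision(bool Allowed, string StatusCode, string StatusText);

public static class FocasWriteGateSketch
{
    public static WriteGateDecision Evaluate(bool writesEnabled, bool tagWritable)
    {
        // Gate 1: driver-level master switch. The wire client is never touched.
        if (!writesEnabled)
            return new WriteGateDecision(false, "BadNotWritable", "writes disabled at driver level");

        // Gate 2: per-tag opt-in. The status text here is an assumption.
        if (!tagWritable)
            return new WriteGateDecision(false, "BadNotWritable", "tag not writable");

        return new WriteGateDecision(true, "Good", "");
    }
}
```

Either gate alone is enough to block a write, so an accidental flip of one flag does not by itself open the write path.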
### Alternatives considered
- **Always-on writes (the pre-F4-a default).** Rejected: a single
misconfigured tag flipping `Writable = true` by accident would let an
operator overwrite a CNC parameter from any OPC UA client. The two-
opt-in posture means an accidental tag flip alone isn't enough.
- **Driver-level switch only.** Rejected: doesn't protect against an
operator with admin rights flipping the master switch to do bulk diag
reads but inheriting write capability for tags that were intended
read-only.
- **Per-tag opt-in only.** Rejected: doesn't give the deployment an "all
writes off" emergency lever — useful during a CNC commissioning where
writes are unsafe across the board for a period.
### Rationale
CNC writes are non-idempotent in the field's worst-case shape: feed
overrides, M-code pulses, alarm acks, recipe-step advances. Two opt-ins
is the cheapest defence-in-depth posture that still lets writes ship.
Both default off so a fresh deployment is read-only — the explicit choice
to enable writes lands at config time where it's reviewable, not at
runtime where it's invisible.
`WriteIdempotent` plumbs through `CapabilityInvoker.ExecuteWriteAsync`
into the Polly retry pipeline; default `false` means failed writes are
not auto-retried (plan decisions #44 / #45). Per-tag flip required for
genuinely-idempotent writes.
### CLI carve-out
`otopcua-focas-cli write` sets `Writes.Enabled = true` locally for the
lifetime of one process and synthesises a `Writable = true` tag. The CLI
is a per-operator direct-to-CNC tool — not a long-lived process bound to
the central config DB. Configuring the server still requires both opt-ins
to be set explicitly in the DriverInstance JSON. The bypass is documented
in `docs/Driver.FOCAS.Cli.md` so operators understand the asymmetry.
### Migration
Pre-F4-a deployments that relied on the `Writable = true` default need to
add `"Writable": true` to every tag they intend to write + an enclosing
`"Writes": { "Enabled": true }` block in their DriverInstance JSON.
Bootstrap rows seeded before F4-a get `Writable = false` after upgrade —
this is intentional; review-then-flip is the safer migration path.
+1 -1
@@ -174,7 +174,7 @@ Common contract for the proxy in the main server:
Named pipes default to allowing connections from any local user. Without explicit ACLs, any process on the host machine that knows the pipe name could connect, bypass the OPC UA server's authentication and authorization layers, and issue reads, writes, or alarm acknowledgments directly against the driver host. **This is a real privilege-escalation surface** — a service account with no OPC UA permissions could write field values it should never have access to. Every Tier C driver enforces the following:
1. **Pipe ACL**: the host creates the pipe with a `PipeSecurity` ACL that grants `ReadWrite | Synchronize` only to the OtOpcUa server's service principal SID. All other local users — including LocalSystem and Administrators — are explicitly denied. The ACL is set at pipe-creation time so it's atomic with the pipe being listenable.
1. **Pipe ACL**: the host creates the pipe with a `PipeSecurity` ACL that grants `ReadWrite | Synchronize` only to the OtOpcUa server's service principal SID. `LocalSystem` is explicitly denied. The ACL is set at pipe-creation time so it's atomic with the pipe being listenable. Administrators are **not** added to the deny list — UAC's filtered token carries the Admins group SID as deny-only, so a deny ACE on Administrators would fire even for non-elevated callers whose user account happens to be a member (common on dev boxes). The per-connection SID check in §2 remains the authorization boundary.
2. **Caller identity verification**: on each new pipe connection, the host calls `NamedPipeServerStream.GetImpersonationUserName()` (or impersonates and inspects the token) and verifies the connected client's SID matches the configured server service SID. Mismatches are logged and the connection is dropped before any RPC frame is read.
3. **Per-message authorization context**: every RPC frame includes the operation's authenticated OPC UA principal (forwarded by the Core after it has done its own authn/authz). The host treats this as input only — the driver-level authorization (e.g. "is this principal allowed to write Tune attributes?") is performed by the Core, but the host's own audit log records the principal so post-incident attribution is possible.
4. **No anonymous endpoints**: the heartbeat pipe has the same ACL as the data-plane pipe. There are no "open" pipes a generic client can probe.
-321
@@ -1,321 +0,0 @@
# FOCAS deployment guide
Per-driver runbook for deploying the FANUC FOCAS driver. See
[`docs/drivers/FOCAS.md`](../drivers/FOCAS.md) for the per-feature
reference and [`focas-version-matrix.md`](./focas-version-matrix.md) for
the per-CNC-series capability surface.
## Operator config-knob cheat sheet
| Knob | Where | Default | Notes |
| --- | --- | --- | --- |
| `Devices[].HostAddress` | `FocasDriverOptions.Devices` | — | `focas://{ip}[:{port}]` |
| `Devices[].Series` | `FocasDriverOptions.Devices` | `Unknown` | Drives per-series range validation in `FocasCapabilityMatrix`. |
| `Devices[].OverrideParameters` | `FocasDriverOptions.Devices` | `null` | MTB-specific parameter numbers for Feed/Rapid/Spindle/Jog overrides. `null` suppresses the `Override/` subtree. |
| `Probe.Enabled` | `FocasDriverOptions.Probe` | `true` | Background reachability probe. |
| `Probe.Interval` | `FocasDriverOptions.Probe` | `00:00:05` | Probe cadence. |
| `FixedTree.ApplyFigureScaling` | `FocasDriverOptions.FixedTree` | `true` | Divide position values by 10^decimal-places (issue #262). |
| **`AlarmProjection.Mode`** | **`FocasDriverOptions.AlarmProjection`** | **`ActiveOnly`** | **`ActiveOnly` keeps today's behaviour. `ActivePlusHistory` polls `cnc_rdalmhistry` on connect + on `HistoryPollInterval` ticks (issue #267, plan PR F3-a).** |
| **`AlarmProjection.HistoryPollInterval`** | **`FocasDriverOptions.AlarmProjection`** | **`00:05:00`** | **Cadence of the history poll. Operator dashboards run the default; high-frequency rigs can drop to 30 s.** |
| **`AlarmProjection.HistoryDepth`** | **`FocasDriverOptions.AlarmProjection`** | **`100`** | **Most-recent-N ring-buffer entries pulled per poll. Hard-capped at `250` so misconfigured values can't blast the wire session.** |
## Sample `appsettings.json` snippet for `ActivePlusHistory`
```jsonc
{
"Drivers": {
"FOCAS": {
"Devices": [
{ "HostAddress": "focas://10.0.0.5:8193", "Series": "Series30i" }
],
"AlarmProjection": {
"Mode": "ActivePlusHistory",
"HistoryPollInterval": "00:05:00",
"HistoryDepth": 100
}
}
}
}
```
The history projection emits each unseen entry through
`IAlarmSource.OnAlarmEvent` with `SourceTimestampUtc` set from the CNC's
reported wall-clock — keep CNC clocks on UTC so the dedup key
`(OccurrenceTime, AlarmNumber, AlarmType)` stays stable across DST
transitions.
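The dedup step could look something like the sketch below. It assumes an in-memory set keyed on `(OccurrenceTime, AlarmNumber, AlarmType)`; the type names are hypothetical, and the lower bound of 1 in `ClampDepth` is an assumption (the source only states the hard cap of `250`).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape for one entry returned by the history poll.
public readonly record struct AlarmHistoryEntry(
    DateTime OccurrenceTime, int AlarmNumber, int AlarmType, string Message);

public sealed class AlarmHistoryDedupSketch
{
    public const int MaxDepth = 250; // hard cap on HistoryDepth

    private readonly HashSet<(DateTime, int, int)> _seen = new();

    // Keeps a misconfigured depth from requesting more than the cap.
    public static int ClampDepth(int configured) => Math.Clamp(configured, 1, MaxDepth);

    // Returns only entries whose dedup key has not been emitted before.
    public List<AlarmHistoryEntry> FilterUnseen(IEnumerable<AlarmHistoryEntry> polled)
    {
        var fresh = new List<AlarmHistoryEntry>();
        foreach (var entry in polled)
        {
            if (_seen.Add((entry.OccurrenceTime, entry.AlarmNumber, entry.AlarmType)))
                fresh.Add(entry);
        }
        return fresh;
    }
}
```

If the CNC clock shifted by an hour at a DST transition, the same alarm would produce a different `OccurrenceTime` and slip past the set, which is why the key is only stable with the CNC on UTC.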
## Derived telemetry — issue #272 (plan PR F5-a)
The `Production/` subtree gains two **derived** nodes alongside the four
F1-b wire-sourced fields:
- `Production/LastCycleSeconds` (`Float64`)
- `Production/LastCycleStartUtc` (`DateTime` UTC)
**No new wire calls.** Both nodes are computed client-visible from the
same `cnc_rdparam(6711)` + `cnc_rdtimer` poll the F1-b projection
already runs on every probe tick. There is no per-device knob — the
nodes are present for every CNC the driver connects to and surface
`null` until the second observed parts-count increment produces the
first delta.
This means:
- **No additional CNC load.** Probe-tick wire traffic is unchanged.
- **No new opt-in.** The nodes ship enabled by default and are
read-only (`SecurityClassification.ViewOnly`); no LDAP group needs
the new permission.
- **Reconnect re-baselines.** Per the FWLIB session boundary the
derivation state resets on reconnect / reinit, so the first cycle
observed after a reconnect re-establishes the baseline before
publishing the first post-reconnect delta.
See [`docs/drivers/FOCAS.md`](../drivers/FOCAS.md) § "Fixed-tree
`Production/` projection" for the full edge-case behaviour matrix
(parts-count counter reset, cycle-timer rollover, parts-count jumps
> 1).
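The "null until the second increment" behaviour can be sketched as below. This is a simplified illustration under stated assumptions: it times cycles from the poll timestamp rather than from `cnc_rdtimer`, it ignores the counter-reset, rollover, and jump edge cases, and `CycleTrackerSketch` is a hypothetical name. Whether the published values are cleared on reconnect is not stated in the source; the sketch leaves them in place.

```csharp
using System;

public sealed class CycleTrackerSketch
{
    private long? _lastCount;
    private DateTime? _lastIncrementUtc;

    public double? LastCycleSeconds { get; private set; }
    public DateTime? LastCycleStartUtc { get; private set; }

    // Called once per probe tick with the parts count read off the CNC.
    public void Observe(long partsCount, DateTime nowUtc)
    {
        if (_lastCount is long previous && partsCount > previous)
        {
            // The first increment only sets the baseline; the second and
            // later increments publish the delta since the previous one.
            if (_lastIncrementUtc is DateTime start)
            {
                LastCycleSeconds = (nowUtc - start).TotalSeconds;
                LastCycleStartUtc = start;
            }
            _lastIncrementUtc = nowUtc;
        }
        _lastCount = partsCount;
    }

    // Reconnect / reinit: drop the derivation state so the next cycle re-baselines.
    public void Reset()
    {
        _lastCount = null;
        _lastIncrementUtc = null;
    }
}
```

No wire call appears in the sketch because the inputs are values the probe tick has already read.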
## Write safety — issue #269 (PARAM/MACRO, F4-b) + issue #270 (PMC, F4-c)
The FOCAS driver supports `cnc_wrparam`, `cnc_wrmacro`, and `pmc_wrpmcrng`
writes behind multiple independent opt-ins. A misdirected parameter write
can put the CNC in a bad state; a misdirected PMC write can move motion or
latch a feedhold. The runbook below MUST be followed before flipping any
of the granular kill switches on.
### Operator pre-checks (every deployment, every change)
1. **CNC must be in MDI mode.** Most parameter writes fail with `EW_PASSWD`
(surfaces as `BadUserAccessDenied`) unless the CNC is in MDI. The
server-side write returns immediately with the access-denied status; no
value reaches the wire.
2. **Parameter-write switch enabled on the CNC pendant.** Even in MDI mode
protected parameters require the operator to physically enable the
parameter-write switch. Without it `cnc_wrparam` returns `EW_PASSWD`.
Plan PR F4-d will land an OPC UA-side unlock workflow; today the only
path is the pendant.
3. **Verify each tag's address against the FANUC manual.** Ranges vary per
CNC series; the
[`focas-version-matrix`](./focas-version-matrix.md) capability matrix
rejects out-of-range numbers at startup, but address-vs-meaning is the
operator's job.
4. **Dry run with `Writable = true` but `Writes.AllowParameter = false`.**
Staged opt-in catches mis-mapped tags: every PARAM write returns
`BadNotWritable` until you flip the granular flag, so you can confirm
the tag list before any wire write fires.
### PMC pre-checks (in addition to the above) — F4-c
PMC writes have a higher blast radius than PARAM/MACRO writes because PMC
is the ladder's working memory — bits in R/G/F/D directly drive servo
enables, feedhold latches, and safety interlocks. Before flipping
`Writes.AllowPmc` on:
1. **E-stop verified live + reachable.** The first PMC write of a session
should be issued with the operator's hand on the e-stop. PMC writes
bypass the ladder's normal MDI-mode protections; a misdirected bit can
move motion the moment it lands on the wire.
2. **Machine in JOG mode (or equivalent low-energy mode).** Auto / MEM
modes interpret PMC state immediately; JOG / MDI surface symptoms
slowly enough that the e-stop is the recovery path. **Never issue the
first PMC write of a deployment in Auto.**
3. **Audit the PMC tag list against the ladder print-out.** `R100.3` on
one machine is "homing complete"; on another it's "feedhold released".
The driver has no way to distinguish — the ladder source is the only
ground truth.
4. **Bit writes are read-modify-write — see
[`docs/drivers/FOCAS.md`](../drivers/FOCAS.md) "PMC bit-write read-modify-write semantics".**
`pmc_wrpmcrng` is byte-addressed; the driver reads the parent byte
first, masks the target bit, and writes the byte back. Concurrent
ladder writes to the same byte create a small race window. Coordinate
through a ladder-side handshake when this matters.
5. **Dry run with `Writable = true` but `Writes.AllowPmc = false`.** Same
staged-opt-in pattern as PARAM/MACRO — confirm tag mapping before any
PMC byte hits the wire.
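The masking step of the bit-write read-modify-write in item 4 can be sketched as a pure function. `PmcBitSketch` is a hypothetical name; the read of the parent byte and the write back through `pmc_wrpmcrng` are left out.

```csharp
using System;

public static class PmcBitSketch
{
    // Given the parent byte just read from the PMC, returns the byte to
    // write back with only the target bit changed (e.g. R100.3 is bit 3).
    public static byte ApplyBit(byte parent, int bitIndex, bool value)
    {
        if (bitIndex < 0 || bitIndex > 7)
            throw new ArgumentOutOfRangeException(nameof(bitIndex));

        var mask = (byte)(1 << bitIndex);
        return value
            ? (byte)(parent | mask)   // set the bit, keep the other seven
            : (byte)(parent & ~mask); // clear the bit, keep the other seven
    }
}
```

The race window mentioned above sits between the read that produces `parent` and the write of the returned byte: a ladder write to any of the other seven bits in that interval would be overwritten.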
### LDAP group requirements
Per [`docs/security.md`](../security.md) the server-layer ACL maps
`SecurityClassification` to LDAP groups. Post-F4-b:
| Tag kind | LDAP group required |
| --- | --- |
| `PARAM:N` (writable) | **`WriteConfigure`** — heaviest write tier; matches commissioning roles |
| `MACRO:N` (writable) | `WriteOperate` — standard HMI recipe / setpoint group |
| PMC R/G/F (writable) | `WriteOperate` |
| Read-only | `ReadOnly` |
Per the `feedback_acl_at_server_layer` design note, the FOCAS driver
declares the classification but does NOT enforce it; `DriverNodeManager`
applies the gate before the driver's `WriteAsync` ever runs. A user
without `WriteConfigure` who attempts a `PARAM:` write gets
`BadUserAccessDenied` from the server with no driver-level audit entry —
the OPC UA layer's audit log catches it.
### Audit-log expectations
Every successful write produces:
- An OPC UA AuditWriteEvent (server layer — see
[`docs/security.md`](../security.md) "Audit logging").
- A FOCAS driver-level Serilog entry tagged `Driver=FOCAS DriverInstanceId=...
TagName=... Address=... ResultStatus=...`.
- A `Writes/LastWriteAt` and `Writes/LastWriteStatus` diagnostic counter
refresh on the device's `Diagnostics/` fixed-tree node (planned;
populated as F4-c lands).
Failures to write (`BadUserAccessDenied`, `BadCommunicationError`, etc.)
produce the same audit entries with the failure status code so a
post-incident reviewer sees the same shape regardless of whether the write
succeeded.
**Audit PMC writes specifically.** Because PMC writes have the highest blast
radius of the three write kinds, ops should set up a saved-search /
dashboard query for `Driver=FOCAS` + `Address` matching the PMC letter
prefixes (`R*`, `G*`, `F*`, `D*`, `Y*`, etc.) and review on the same
cadence as ladder change reviews. A spike in PMC write rate or a write
to an address outside the audited tag list is the leading indicator of a
misconfigured client or compromised credential.
### Granular config example
```jsonc
{
"Drivers": {
"FOCAS": {
"Devices": [
{ "HostAddress": "focas://10.0.0.5:8193", "Series": "Series30i" }
],
"Writes": {
"Enabled": true,
"AllowMacro": true, // recipe / setpoint writes — operator role
"AllowParameter": false, // commissioning only — keep locked except during planned work
"AllowPmc": false // PMC writes — keep locked unless the deployment specifically needs them
},
"Tags": [
{ "Name": "Recipe.PartCount", "DeviceHostAddress": "focas://10.0.0.5:8193",
"Address": "MACRO:500", "DataType": "Int32",
"Writable": true, "WriteIdempotent": true },
{ "Name": "MaxFeedrate", "DeviceHostAddress": "focas://10.0.0.5:8193",
"Address": "PARAM:1815", "DataType": "Int32",
"Writable": false /* keep read-only until commissioning window */ },
{ "Name": "OperatorRequest", "DeviceHostAddress": "focas://10.0.0.5:8193",
"Address": "R100.3", "DataType": "Bit",
"Writable": false /* keep PMC read-only until ladder handshake reviewed */ }
]
}
}
}
```
Flipping `AllowParameter` / `AllowPmc` on for the commissioning window
(and back off afterward) is the recommended deployment cadence — the
granular kill switches are lightweight runtime toggles, not config-DB
redeploys. PMC in particular should default OFF in production and only
flip on for windows where the ladder team has signed off on the write
path.
## FOCAS password handling — issue #271 (F4-d)
Some controllers (16i + certain 30i firmwares with parameter-protect on)
gate `cnc_wrparam` and selected reads behind a connection-level password.
The driver supports this via the `Password` field on `FocasDeviceOptions`
which is emitted via `cnc_wrunlockparam` on connect and re-emitted on any
`EW_PASSWD` read/write retry path. See
[`docs/drivers/FOCAS.md`](../drivers/FOCAS.md) § "FOCAS password" for the
driver-side behaviour; this section covers the deployment side.
### Storage in `appsettings.json`
```jsonc
{
"Drivers": {
"Focas01": {
"DriverConfigJson": {
"Backend": "fwlib",
"Series": "Sixteen_i",
"Devices": [
{
"HostAddress": "focas://10.0.0.5:8193",
"Password": "1234"
}
]
}
}
}
}
```
For dev environments, the password is materialised under
`.local/focas-passwords.txt` (or whichever .local subkey the deployment
team prefers); production deployments use the same secrets-store /
KeyVault pattern the LDAP `Authentication.Ldap.Password` field follows.
**The `.local/` directory is .gitignore'd** — this is the same posture
as `.local/galaxy-host-secret.txt` and other dev secrets in this repo.
### No-log invariant
The driver guarantees the password is **never logged**:
1. **`FocasDeviceOptions` ToString redaction.** The record overrides
`PrintMembers` so any Serilog destructure of the device options renders
`Password = ***` when the field is non-null. This catches the most
common leak path — a structured-log statement that included
`{@Device}` for diagnostic context.
2. **No password in exception messages.** `FwlibFocasClient.UnlockAsync`
omits the password from its `InvalidOperationException` text — only
the FWLIB error code (`EW_PASSWD`, `EW_HANDLE`, etc.) makes it through.
3. **Driver log line uses host only.** When unlock succeeds the driver
updates `DriverHealth.StatusText` to `"FOCAS unlock applied for
{host}"` — no password.
4. **CLI flag covered by the same choke point.** The
`Driver.FOCAS.Cli --cnc-password` flag flows through
`FocasDeviceOptions.Password`, so its redaction is identical to the
server's. The PowerShell e2e harness (`scripts/e2e/test-focas.ps1
-CncPassword`) follows the same path.
Any new logging surface that touches `FocasDeviceOptions` MUST continue
to use the record's `ToString` (or otherwise omit `Password`). A code
review checklist item: "no log statement contains `device.Options.Password`
or `device.Password` directly."
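For reference, the `PrintMembers` redaction in item 1 relies on a standard C# record feature: a record may declare its own `PrintMembers(StringBuilder)`, and the compiler-generated `ToString` calls it. The sketch below uses a hypothetical two-property record, not the real `FocasDeviceOptions` shape.

```csharp
using System.Text;

// Hypothetical cut-down record; the real options type has more members.
public record FocasDeviceOptionsSketch(string HostAddress, string Password)
{
    // Replaces the compiler-generated member printer so that ToString,
    // and anything that formats the record through it, never sees the
    // password value.
    protected virtual bool PrintMembers(StringBuilder builder)
    {
        builder.Append("HostAddress = ").Append(HostAddress);
        builder.Append(", Password = ").Append(Password is null ? "" : "***");
        return true;
    }
}
```

Reading `device.Password` directly still returns the real value, which is why the review checklist item above remains necessary.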
### Password-rotation runbook
When the CNC password rotates (operator team flipped a parameter-protect
gate, or your security policy requires periodic rotation):
1. **Update the password on the controller** (CNC pendant or vendor's
admin tool). The exact path varies by series — Fanuc service manual
page reference depends on the MTB.
2. **Update `appsettings.json`** in place with the new value.
- Production: bump the secrets-store entry that backs the
`Devices[*].Password` config-DB column. Same workflow as rotating
the LDAP service-account password.
- Dev: update `.local/focas-passwords.txt` (or wherever the dev
deployment sources the secret).
3. **Restart the OtOpcUa server** (or trigger a config-DB bump that
forces driver reinitialise). The driver picks up the new password
on the next `EnsureConnectedAsync` call. **No need to manually
reconnect each device** — `cnc_wrunlockparam` emits on the next
wire-call boundary.
4. **Verify**. The first wire call after restart logs
`"FOCAS unlock applied for focas://{host}:{port}"` at info. A wrong
password surfaces as `BadUserAccessDenied` on the next gated read or
write.
5. **Audit.** OPC UA write-event entries (per
[`audit-log-rules.md`](audit-log-rules.md)) cover the
parameter/macro write paths. Password rotation itself is NOT logged
beyond "unlock applied" — same posture as LDAP service-account
rotation, where the password change is logged out-of-band by the IAM
system.
### Cross-references
- [`docs/Security.md`](../Security.md) — server-wide secrets handling +
the same `.local/` pattern used for LDAP and the Galaxy.Host pipe
secret. The FOCAS password follows the same posture.
- [`docs/drivers/FOCAS.md`](../drivers/FOCAS.md) § "FOCAS password" —
driver-side behaviour, EW_PASSWD retry semantics, status-code
surface.
- [`docs/v2/implementation/focas-wire-protocol.md`](implementation/focas-wire-protocol.md)
§ "cnc_wrunlockparam" — wire-frame layout for the password buffer.
+2 -2
@@ -6,7 +6,7 @@ enforces at driver init time. Every row cites the Fanuc FOCAS Developer
Kit function whose documented input range determines the ceiling.
**Why this exists** — we have no FOCAS hardware on the bench and no
working simulator. Fwlib32 returns `EW_NUMBER` / `EW_PARAM` when you
working simulator. FWLIB (Fwlib64, or Fwlib32 on legacy deployments) returns `EW_NUMBER` / `EW_PARAM` when you
hand it an address outside the controller's supported range; the
driver would map that to a per-read `BadOutOfRange` at steady state.
Catching at `InitializeAsync` with this matrix surfaces operator
@@ -140,6 +140,6 @@ matrix: Macro variable #50000 is outside the documented range
This validation closes the cheap half of the FOCAS hardware-free
stability gap — config errors now fail at load instead of per-read.
The expensive half is Tier-C process isolation so that a crashing
`Fwlib32.dll` doesn't take the main OPC UA server down with it. See
`Fwlib64.dll` doesn't take the main OPC UA server down with it. See
[`docs/v2/implementation/focas-isolation-plan.md`](implementation/focas-isolation-plan.md)
for that plan (task #220).
@@ -0,0 +1,38 @@
# Admin UI Phase 6 status audit (2026-04-24)
Audit pass that closes the Phase 6 Admin-UI tasks that were tracked as still-open (#128–#131) but already had their Blazor pages shipped. Every page listed below compiles against the current `OtOpcUaConfigDbContext` schema + the current Admin service surface, has substantive (non-stub) content, and is covered by `ZB.MOM.WW.OtOpcUa.Admin.Tests` (112/112 green).
## Task #128 — /hosts column refresh (Phase 6.1 Stream E.2/E.3)
`Components/Pages/Hosts.razor` — 233 LOC. Route `/hosts`. Ships:
- Per-driver circuit-breaker columns (`ConsecutiveFailures`, `LastCircuitBreakerOpenUtc`).
- Stale-row detection via `HostStatusService.IsStale` (publisher heartbeat ≥ 30 s stale threshold).
- Summary cards: Running / Stale / Faulted / total.
- Auto-refresh every `RefreshIntervalSeconds` driven by the `FleetStatusHub` SignalR feed.
- Health band via `DriverHostState` enum colour coding.
## Task #129 — RoleGrantsTab + AclsTab + Probe (Phase 6.2 Stream D)
- `Components/Pages/RoleGrants.razor` — 192 LOC. Route `/role-grants`. Edits LDAP-group → OPC-UA-role mappings with live reload over `AclChangeNotifier` SignalR.
- `Components/Pages/Clusters/AclsTab.razor` — 279 LOC. NodeAcl CRUD + the **"Probe this permission"** form (task #196 slice 1, embedded at line 38 onward). Binds `_probeGroup` / `_probeNamespaceId` / `_probeUnsAreaId` / `_probeUnsLineId` / `_probeEquipmentId` / `_probeTagId` / `_probePermission` through `PermissionProbeService`.
## Task #130 — RedundancyTab (Phase 6.3 Stream E)
`Components/Pages/Clusters/RedundancyTab.razor` — 175 LOC. Topology table, per-peer reachability (via `FleetStatusHub`), ServiceLevel band + `ApplyLeaseRegistry` / `RecoveryStateManager` state surfaces, failover action button. Live updates over the same SignalR hub `RedundancyPublisherHostedService` ticks.
## Task #131 — Draft / publish / diff / identification (Phase 6.4 Streams A–D)
- `Components/Pages/Clusters/DraftEditor.razor` — 105 LOC. Route `/clusters/{ClusterId}/draft/{GenerationId:long}`. Calls `DraftValidationService` + `GenerationService`.
- `Components/Pages/Clusters/Generations.razor` — 73 LOC. Publish flow (generation state transitions through `sp_PublishGeneration`).
- `Components/Pages/Clusters/DiffViewer.razor` — 87 LOC. Route `/clusters/{ClusterId}/draft/{GenerationId:long}/diff`. Renders `sp_ComputeGenerationDiff` output.
- `Components/Pages/Clusters/IdentificationFields.razor` — 49 LOC. OPC 40010 Identification folder editor bound to the `Equipment` entity.
## What's NOT in this audit
- `#124` — Phase 6.2 3-user interop matrix. Authz layer is now covered by `ThreeUserInteropMatrixTests` in `ZB.MOM.WW.OtOpcUa.Server.Tests` (drives the 5 GLAuth users + admin through `LdapUserAuthenticator` → `AuthorizationGate.IsAllowed` for the role × operation matrix). The wire-level OPC UA-client cross-vendor leg still needs a UserName-token endpoint policy + manual client drill — that part stays a manual deliverable.
- `#119` — Phase 6.3 client interop matrix. Manual Ignition/Kepware/Aveva drills.
- `#113` — OPC UA CTT conformance pass. Manual CTT run.
- `#114` / `#115` — Redundancy cutover + deployment checklist. Manual.
Those remain GA-gating but require a human at a console, not a code change.
+129
@@ -0,0 +1,129 @@
# Phase 3 Exit Gate — Driver Fleet (reconstructed retroactively)
> **Status**: **CLOSED (reconstructed 2026-04-23)**. The original plan split the
> driver work across Phases 3 / 4 / 5 (Modbus alone → four PLC drivers → two
> specialty drivers). In execution, all seven non-Galaxy drivers shipped under
> one umbrella against `Core.Abstractions` + `Core`'s generic driver-hosting
> machinery. This doc captures the closure retroactively; no forward work
> remains under these three original phase numbers.
>
> **Plan doc**: none — phases 3/4/5 were intentionally not split out into
> separate plan docs once it was clear the capability-interface contract
> introduced in Phase 1 (`Core.Abstractions` — plan decision #4) was stable
> enough that each driver could land as its own stream rather than as a
> gated mini-phase. See `docs/v2/plan.md` §6 for the now-consolidated
> migration strategy.
## Scope
The seven non-Galaxy drivers in the v2 target list (Decision #5). Galaxy closed
separately under Phase 2: the Galaxy Proxy+Host+Shared split exited under
`exit-gate-phase-2-final.md`; this gate does not re-cover it.
## What shipped
### Drivers
| Driver | Project | Capability surface | Test projects |
|---|---|---|---|
| Modbus TCP | `Driver.Modbus` + `Driver.Modbus.Cli` | `IDriver` + `ITagDiscovery` + `IReadable` + `IWritable` + `ISubscribable` + `IHostConnectivityProbe` | `Tests`, `IntegrationTests`, `Cli.Tests` |
| AB CIP | `Driver.AbCip` + `Driver.AbCip.Cli` | all of the above + `IPerCallHostResolver` + `IAlarmSource` | `Tests`, `IntegrationTests`, `Cli.Tests` |
| AB Legacy (PCCC / DF1) | `Driver.AbLegacy` + `Driver.AbLegacy.Cli` | `IDriver` + `IReadable` + `IWritable` + `ITagDiscovery` + `ISubscribable` + `IHostConnectivityProbe` + `IPerCallHostResolver` | `Tests`, `IntegrationTests`, `Cli.Tests` |
| Siemens S7 | `Driver.S7` + `Driver.S7.Cli` | `IDriver` + `ITagDiscovery` + `IReadable` + `IWritable` + `ISubscribable` + `IHostConnectivityProbe` | `Tests`, `IntegrationTests`, `Cli.Tests` |
| Beckhoff TwinCAT (ADS) | `Driver.TwinCAT` + `Driver.TwinCAT.Cli` | `IDriver` + `IReadable` + `IWritable` + `ITagDiscovery` + `ISubscribable` + `IHostConnectivityProbe` + `IPerCallHostResolver` | `Tests`, `IntegrationTests`, `Cli.Tests` |
| FANUC FOCAS | `Driver.FOCAS` + `Driver.FOCAS.Host` + `Driver.FOCAS.Shared` + `Driver.FOCAS.Cli` | `IDriver` + `IReadable` + `IWritable` + `ITagDiscovery` + `ISubscribable` + `IHostConnectivityProbe` + `IPerCallHostResolver`; Tier-C out-of-process backend mirrors the Galaxy Proxy/Host split. `Fwlib64FocasBackend` shipped 2026-04-23 as the production backend (P/Invoke against `Fwlib64.dll`); Host retargeted from net48 x86 to net10.0-windows x64 at the same time. | `Tests`, `Host.Tests`, `Shared.Tests`, `Cli.Tests` |
| OPC UA Client (gateway) | `Driver.OpcUaClient` | `IDriver` + `ITagDiscovery` + `IReadable` + `IWritable` + `ISubscribable` + `IHostConnectivityProbe` + `IAlarmSource` + `IHistoryProvider` (richest surface in the fleet — it's bridging another UA server) | `Tests`, `IntegrationTests` |
### Supporting infrastructure
| PR / Task | Summary |
|---|---|
| #248 | `DriverFactoryRegistry` + `DriverInstanceBootstrapper` — central DB `DriverInstance` rows materialise into live `IDriver` instances at server startup. |
| #210 | Modbus server-side factory + seed SQL (closed first child of umbrella #209). |
| #211 #212 #213 | AB CIP / S7 / AB Legacy server-side factories + seed SQL. |
| #220 (FOCAS) | FOCAS factory wired into the bootstrap pipeline; Tier-C split (`Driver.FOCAS.Host` process launcher, named-pipe IPC, NSSM install scripts, post-mortem MMF) shipped across the five-PR series. |
| (this session) | TwinCAT factory wired in + Server project reference added; all seven driver factories now register uniformly in `Server/Program.cs`. |
| #249 #250 #251 | Per-driver test-client CLI suite (`otopcua-<driver>-cli`) — shared lib + one CLI per driver for direct-to-PLC smoke testing independent of the server. |
| #253 + follow-ups | E2E CLI test scripts (`scripts/e2e/test-<driver>.ps1`) — five-stage bidirectional bridge + subscribe-sees-change assertions per driver, plus `test-all.ps1` matrix runner. |
| (this session) | OPC UA Client e2e script shipped (`test-opcuaclient.ps1`, 8 stages) — the only driver that was missing an e2e script. |
### Docs
Per-driver test-fixture documentation:
- `docs/drivers/Modbus-Test-Fixture.md`
- `docs/drivers/AbServer-Test-Fixture.md` (covers AB CIP fixture)
- `docs/drivers/AbLegacy-Test-Fixture.md`
- `docs/drivers/S7-Test-Fixture.md`
- `docs/drivers/TwinCAT-Test-Fixture.md`
- `docs/drivers/FOCAS-Test-Fixture.md`
- `docs/drivers/OpcUaClient-Test-Fixture.md`
Driver-level ops docs:
- `docs/Driver.Modbus.Cli.md`, `docs/Driver.AbCip.Cli.md`, `docs/Driver.AbLegacy.Cli.md`, `docs/Driver.S7.Cli.md`, `docs/Driver.TwinCAT.Cli.md`, `docs/Driver.FOCAS.Cli.md`
- `docs/v2/driver-specs.md` — unified capability-matrix spec for all eight drivers (Galaxy + seven).
## Compliance evidence
No dedicated `phase-3-compliance.ps1` exists — scope was too broad to fit the
single-script pattern that worked for Phases 6.x and 7. Verification instead
takes the form of the per-driver test suites + e2e scripts:
- [x] **Unit tests** — every driver has a `Tests` project with capability-interface contract tests; `dotnet test tests/ZB.MOM.WW.OtOpcUa.Driver.*.Tests` is green.
- [x] **Integration tests** — `Driver.*.IntegrationTests` stands up Docker-hosted simulators (pymodbus, ab_server, python-snap7, opc-plc) at collection init and exercises real wire-level read/write/subscribe/probe per driver.
- [x] **CLI tests** — `Driver.*.Cli.Tests` covers the per-driver test-client CLIs (#249–#251).
- [x] **E2E scripts** — `scripts/e2e/test-<driver>.ps1` covers the driver-CLI → PLC → OtOpcUa server → OPC UA client round-trip for all seven drivers + Galaxy; `test-all.ps1` aggregates; README status section (rewritten this session) summarises live-boot evidence.
- [x] **Factory registration** — all seven factories plus Galaxy register in `src/ZB.MOM.WW.OtOpcUa.Server/Program.cs` inside the `DriverFactoryRegistry` composition; the `DriverInstanceBootstrapper` can materialise any configured row.
- [x] **Seed SQL** — #210–#213 provide per-driver Config DB seed scripts so a fresh Config DB is populatable without Admin UI interaction.
### Live-boot verification
Recorded across the session-level tracking tasks:
| Driver | Fixture | Stages | Tracking |
|---|---|---|---|
| Modbus | pymodbus (dl205 profile) | 5/5 | #209 exit gate; bidirectional + subscribe-sees-change added in #253 follow-ups |
| AB CIP | `ab_server` ControlLogix | 5/5 | #220 |
| S7 | python-snap7 | 5/5 | #220 |
| AB Legacy | `ab_server` SLC500 / MicroLogix / PLC-5 (requires `/1,0` cip-path for Docker fixture) | 5/5 | #222 partial |
| OPC UA Client | opc-plc Docker fixture | 5/8 (probe, remote read, forward bridge, subscribe, browse) | (this session) |
| TwinCAT | TCBSD VM @ 10.100.0.128 (AmsNetId `41.169.163.43.1.1`) — real TwinCAT runtime under FreeBSD on ESXi; bypasses the Hyper-V/RTIME conflict that blocks XAR on this dev box | features validated | fixture is the TCBSD VM; `TWINCAT_TRUST_WIRE=1` still gates the e2e script by default so unintentional runs against cold fixtures don't false-pass |
| FOCAS | Lab-rig CNC + `Fwlib64.dll` | — | **deferred** — `Fwlib64FocasBackend` shipped 2026-04-23; wire-level live-boot gated on `FOCAS_TRUST_WIRE=1`, lab rig tracked under #222 follow-up |
| Galaxy | Live Galaxy + `OtOpcUaGalaxyHost` (this dev box) | 7/7 (read / write / subscribe / alarms / history) | closed under Phase 2 |
## Deferred to post-gate follow-ups
Items intentionally not blocking closure of this umbrella — each is
hardware-dependent and tracked separately:
- [ ] **FOCAS wire-level live-boot** — `test-focas.ps1` against a real CNC once `Fwlib64.dll` is on PATH and `FOCAS_TRUST_WIRE=1` (#222 follow-up). The `Fwlib64FocasBackend` shipped 2026-04-23 — code exists, unit-tests green; only the live-CNC smoke test remains.
- [x] **FOCAS `Fwlib64FocasBackend`** — **CLOSED 2026-04-23**. The production backend in `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host/Backend/Fwlib64FocasBackend.cs` wraps `FwlibFocasClient` to fulfil `IFocasBackend` against the licensed `Fwlib64.dll`. Host project retargeted to `net10.0-windows` x64. Default when `OTOPCUA_FOCAS_BACKEND` is unset. 6 new backend tests green. Only wire-level live-boot against real hardware remains — see item above.
- [ ] **OPC UA Client stages 5/7/8** — reverse-bridge, alarm, history stages are opt-in via sidecar NodeId params because opc-plc's default image has no writable nodes and doesn't historize. Against a richer upstream (Prosys, UA Expert sample server) all eight stages can run.
## Completion checklist
- [x] Modbus driver shipped + unit + integration + CLI tests green
- [x] AB CIP driver shipped + tests green + live-boot 5/5
- [x] AB Legacy driver shipped + tests green + live-boot 5/5
- [x] S7 driver shipped + tests green + live-boot 5/5
- [x] TwinCAT driver shipped + tests green + features validated against the TCBSD VM virtual-PLC fixture
- [x] FOCAS driver shipped (Tier-C split) + tests green (wire-live deferred)
- [x] OPC UA Client driver shipped + tests green + live-boot 5/8
- [x] `DriverFactoryRegistry` + `DriverInstanceBootstrapper` shipped
- [x] All seven factories registered in `Server/Program.cs`
- [x] Per-driver test-client CLI suite shipped
- [x] E2E test scripts shipped + `test-all.ps1` aggregator green
- [x] Per-driver test-fixture docs present
- [x] `docs/v2/driver-specs.md` unified capability spec present
- [x] `scripts/e2e/README.md` status section reflects current live-boot matrix
- [x] Exit gate doc checked in (this file)
- [x] TwinCAT validated against the TCBSD VM virtual-PLC fixture — `TWINCAT_TRUST_WIRE=1` + e2e script still gated by default to prevent false-pass against cold fixtures
- [ ] FOCAS lab-rig follow-up filed + tracked (#222)
## Why no compliance script
The Phases 6.1/6.2/6.3/6.4/7 pattern of a single `phase-N-compliance.ps1`
worked because each of those phases touched a narrow slice of server-side
runtime. A "phase-3-compliance.ps1" would have had to boot seven simulators,
configure seven DriverInstance rows, and run seven e2e scripts — which is
exactly what `scripts/e2e/test-all.ps1` already does. The aggregate runner
plus its README is the compliance artefact for this umbrella.
+9 -9
@@ -1,6 +1,6 @@
# Phase 7 Exit Gate — Scripting, Virtual Tags, Scripted Alarms, Historian Sink
> **Status**: Open. Closed when every compliance check passes + every deferred item either ships or is filed as a post-v2-release follow-up.
> **Status**: **FULLY CLOSED** 2026-04-23 audit — the three original follow-ups (#239 / #240 / #241) were all shipped under later branches but this exit-gate doc wasn't updated at the time. All three verified against the repo + tests green.
>
> **Compliance script**: `scripts/compliance/phase-7-compliance.ps1`
> **Plan doc**: `docs/v2/implementation/phase-7-scripting-and-alarming.md`
@@ -45,13 +45,13 @@ Covered by `scripts/compliance/phase-7-compliance.ps1`:
- [x] Walker emits `NodeSourceKind.Virtual` + `NodeSourceKind.ScriptedAlarm` variables
- [x] `DriverNodeManager` dispatch routes Reads by source; Writes to non-Driver rejected with `BadUserAccessDenied` (plan #6)
## Deferred to Post-Gate Follow-ups
## Deferred to Post-Gate Follow-ups (all closed as of 2026-04-23 audit)
Kept out of the capstone so the gate can close cleanly while the less-critical wiring lands in targeted PRs:
Originally kept out of the capstone so the gate could close cleanly. Each landed as a targeted follow-up PR; audit this session verified them against the repo:
- [ ] **SealedBootstrap composition root** (task #239) — instantiate `VirtualTagEngine` + `ScriptedAlarmEngine` + `SqliteStoreAndForwardSink` in `Program.cs`; pass `VirtualTagSource` + `ScriptedAlarmSource` as the new `IReadable` parameters on `DriverNodeManager`. Without this, the engines are dormant in production even though every piece is tested.
- [ ] **Live OPC UA end-to-end smoke** (task #240) — Client.CLI browse + read a virtual tag computed by Roslyn; Client.CLI acknowledge a scripted alarm via the Part 9 method node; historian-disabled deployment returns `BadNotFound` for virtual nodes rather than silent failure.
- [ ] **sp_ComputeGenerationDiff extension** (task #241) — emit Script / VirtualTag / ScriptedAlarm sections alongside the existing Namespace/DriverInstance/Equipment/Tag/NodeAcl rows so the Admin DiffViewer shows Phase 7 changes between generations.
- [x] **SealedBootstrap composition root** (task #239) — **CLOSED**. `src/ZB.MOM.WW.OtOpcUa.Server/Phase7/Phase7Composer.cs` instantiates `VirtualTagEngine` + `ScriptedAlarmEngine` via `Phase7EngineComposer.Compose`, and `SqliteStoreAndForwardSink` in `ResolveHistorianSink` when a registered driver provides `IAlarmHistorianWriter` (today: `GalaxyProxyDriver`). `OpcUaServerService.ExecuteAsync` calls `Phase7Composer.PrepareAsync` then `OpcUaApplicationHost.SetPhase7Sources` **before** `applicationHost.StartAsync` so `OtOpcUaServer` + `DriverNodeManager` capture the `VirtualReadable` / `ScriptedAlarmReadable` at construction. 38 tests green under `tests/ZB.MOM.WW.OtOpcUa.Server.Tests/Phase7/` + `SealedBootstrapIntegrationTests`. The work landed under the label "Phase 7 follow-up #246" and was never re-labelled against #239.
- [x] **Live OPC UA end-to-end smoke** (task #240) — **CLOSED**. `scripts/e2e/test-phase7-virtualtags.ps1` drives a full Client.CLI read of a driver-sourced input, reads the VirtualTag computed off it, triggers a scripted alarm by writing the trigger value, and subscribes to the alarm condition — all through a running OtOpcUa server. Covered in `scripts/e2e/test-all.ps1` + `scripts/e2e/README.md` matrix.
- [x] **sp_ComputeGenerationDiff extension** (task #241) — **CLOSED**. Migration `20260420232000_ExtendComputeGenerationDiffWithPhase7.cs` extends the stored proc to emit Script / VirtualTag / ScriptedAlarm sections alongside the existing NodeAcl / Tag / Equipment / DriverInstance / Namespace output. Admin DiffViewer picks them up through its existing section-plugin architecture (Phase 6.4 Stream C).
## Completion Checklist
@@ -66,9 +66,9 @@ Kept out of the capstone so the gate can close cleanly while the less-critical w
- [x] `phase-7-compliance.ps1` present and passes
- [x] Full solution `dotnet test` passes (no new failures beyond pre-existing tolerated CLI flake)
- [x] Exit-gate doc checked in
- [ ] `SealedBootstrap` composition follow-up filed + tracked
- [ ] Live end-to-end smoke follow-up filed + tracked
- [ ] `sp_ComputeGenerationDiff` extension follow-up filed + tracked
- [x] `SealedBootstrap` composition follow-up shipped (#239 / Phase 7 follow-up #246)
- [x] Live end-to-end smoke follow-up shipped (#240 — `scripts/e2e/test-phase7-virtualtags.ps1`)
- [x] `sp_ComputeGenerationDiff` extension follow-up shipped (#241 — migration `ExtendComputeGenerationDiffWithPhase7`)
## How to run
+16 -5
@@ -1,10 +1,21 @@
# FOCAS Tier-C isolation — plan for task #220
> **Status**: PRs A–E shipped. Architecture is in place; the only
> remaining FOCAS work is the hardware-dependent production
> integration of `Fwlib32.dll` into a real `IFocasBackend`
> (`FwlibHostedBackend`), which needs an actual CNC on the bench
> and is tracked as a follow-up on #220.
> **Status**: **FULLY SHIPPED** (code). PRs A–E shipped the architecture; the
> 2026-04-23 follow-up shipped the production `Fwlib64FocasBackend` wrapping
> the licensed `Fwlib64.dll`. Only the wire-level live-boot against real
> hardware remains (task #222 / requires a bench CNC).
>
> **Major update 2026-04-23 — Host retargeted to .NET 10 x64 + Fwlib64**:
> Both `Fwlib32.dll` and `Fwlib64.dll` are licensed for this project. The
> original plan put the Host on .NET 4.8 x86 because Fwlib32 was assumed.
> With Fwlib64 available, the Host moves to `net10.0-windows` x64 — same
> runtime as the rest of the fleet. **Tier-C isolation stays anyway** — the
> blast-radius argument against a closed-source vendor P/Invoke is independent
> of bitness. Galaxy (forced x86 by MXAccess COM) is a pure bitness forcing;
> FOCAS is a pure blast-radius choice. Body of this document still reflects
> the original x86 assumptions in a few places — read them as historical
> design context; the current shape is in `docs/drivers/FOCAS-Test-Fixture.md`
> and `exit-gate-phase-3.md`.
>
> **Pre-reqs shipped**: version matrix + pre-flight validation
> (PR #168 — the cheap half of the hardware-free stability gap).
@@ -1,394 +0,0 @@
# FOCAS simulator (focas-mock) plan
Notes on the focas-mock simulator that the FOCAS driver's integration
tests will eventually talk to. Today there is no FOCAS integration-test
project; this doc is the contract the future fixture will be built
against. Keeping the contract tracked in repo means the wire-protocol
command ids (and their request/response payloads) don't drift between the
.NET wire client and a future Python implementation.
## Ground rules
- Append-only command ids. Mirror
[`focas-wire-protocol.md`](./focas-wire-protocol.md) verbatim.
- Per-profile state. The simulator hosts N CNC profiles concurrently
(`Series0i`, `Series30i`, `PowerMotion`, ...). Each profile has its own
alarm-history ring buffer + its own override map.
- Admin endpoints under `POST /admin/...` mutate state without going
through the wire protocol; integration tests use these to seed canned
inputs.
## Protocol surface (current scope)
| Cmd | API | State impact |
| --- | --- | --- |
| `0x0001` | `cnc_rdcncstat` | reads cached ODBST per profile |
| `0x0002` | `cnc_rdparam` | reads parameter map per profile |
| `0x0003` | `cnc_rdmacro` | reads macro variables per profile |
| `0x0004` | `cnc_rddiag` | reads diagnostic map per profile |
| `0x0010` | `pmc_rdpmcrng` | reads PMC byte ranges |
| `0x0020` | `cnc_modal` | reads cached modal MSTB per profile |
| ... | ... | ... |
| **`0x0102`** | **`cnc_wrparam`** | **mutates per-profile parameter map; returns `EW_PASSWD` (`11`) when the profile's `unlock_state` is off (sets up F4-d's unlock workflow) — issue #269, plan PR F4-b** |
| **`0x0103`** | **`cnc_wrmacro`** | **mutates per-profile macro map; integer-only writes for now (decimalPointCount=0) — issue #269, plan PR F4-b** |
| **`0x0104`** | **`pmc_wrpmcrng`** | **mutates per-profile PMC byte tables; byte-aligned writes preserve untouched bytes; bit-level writes never reach the simulator (driver wraps with RMW) — issue #270, plan PR F4-c** |
| **`0x0105`** | **`cnc_wrunlockparam`** | **flips the per-profile `unlock_state` to true when the supplied 4-byte password buffer matches the profile's `unlock_password`; otherwise returns `EW_PASSWD`. State persists for the connection lifetime (per-session). — issue #271, plan PR F4-d** |
| **`0x0F1A`** | **`cnc_rdalmhistry`** | **dumps the per-profile alarm-history ring buffer (issue #267, plan PR F3-a)** |
## `cnc_rdalmhistry` mock behaviour
The simulator keeps a per-profile ring buffer of alarm-history entries.
Default fixture seeds 5 profiles with 10 canned entries each (per the F3-a
plan).
### Request decode
```
[int16 LE depth]
```
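As a minimal sketch of the request decode above, assuming the request body is exactly the two-byte little-endian `depth` field and nothing else, a Python handler might start like this. The function name `decode_rdalmhistry_request` is hypothetical and not part of the contract.

```python
import struct


def decode_rdalmhistry_request(payload: bytes) -> int:
    """Decode the [int16 LE depth] request body and return depth."""
    # Assumption: any body that is not exactly 2 bytes is malformed.
    if len(payload) != 2:
        raise ValueError(f"expected 2 bytes, got {len(payload)}")
    (depth,) = struct.unpack("<h", payload)  # "<h" = little-endian int16
    return depth
```

How the simulator reacts to a malformed body is not specified in this document; raising is only a placeholder.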
### Response encode
Use `FocasAlarmHistoryDecoder.Encode` semantics in reverse: emit the
count followed by `ALMHIS_data` blocks padded to 4-byte boundaries. The
.NET-side decoder consumes the same format verbatim, so a Python encoder
written against the table in
[`focas-wire-protocol.md`](./focas-wire-protocol.md) interoperates without
extra glue.
### Admin endpoint — `POST /admin/mock_patch_alarmhistory`
Replaces the alarm-history ring buffer for a profile.
```
POST /admin/mock_patch_alarmhistory
{
"profile": "Series30i",
"entries": [
{
"occurrenceTime": "2025-04-01T09:30:00Z",
"axisNo": 1,
"alarmType": 2,
"alarmNumber": 100,
"message": "Spindle overload"
},
...
]
}
```
`entries` order is interpreted as ring-buffer order (most-recent first to
match FANUC's natural surface).
### `FocasSimFixture.SeedAlarmHistoryAsync`
The future test-support helper wraps the admin endpoint:
```csharp
await fixture.SeedAlarmHistoryAsync(
profile: "Series30i",
entries: new []
{
new FocasAlarmHistoryEntry(
new DateTimeOffset(2025, 4, 1, 9, 30, 0, TimeSpan.Zero),
AxisNo: 1, AlarmType: 2, AlarmNumber: 100, Message: "Spindle overload"),
});
```
Integration test `Series/AlarmHistoryProjectionTests.cs` will assert:
- historic events fire once with the seeded timestamps
- second poll yields zero new events (dedup honoured end-to-end)
- active-alarm raise/clear still works alongside the history poll
These tests are blocked on the focas-mock + integration-test project
landing; the unit-test coverage in `FocasAlarmProjectionTests` already
exercises every same-process invariant.
## `cnc_wrparam` / `cnc_wrmacro` mock behaviour — issue #269, plan PR F4-b
When the focas-mock fixture lands, it MUST implement the contract below.
The .NET side already ships against this contract (`FwlibFocasClient.cs`
write helpers, `FakeFocasClient` round-trip support); writing the simulator
to the same shape lets the existing integration-test scaffolds at
`tests/.../IntegrationTests/Series/ParameterWriteTests.cs` and
`MacroWriteTests.cs` (when they materialise) light up without driver
changes.
### Per-profile state
Each profile owns:
- `parameters: Dict[int, int]` — map from parameter number to current value.
- `macros: Dict[int, int]` — map from macro number to current scaled-int
value (decimal-point count fixed at 0 for F4-b).
- `unlock_state: bool` — defaults `False`. When `False`, every
`cnc_wrparam` returns `EW_PASSWD` (numeric `11`) regardless of
parameter. Macro writes are NOT gated by `unlock_state`.
- `unlock_password: bytes` (4-byte buffer) — defaults to the profile's
fixture default (e.g. `b"1234"` for Series30i). Compared byte-for-byte
by the `cnc_wrunlockparam` handler; flips `unlock_state = True` on
match, leaves it untouched on mismatch (and returns `EW_PASSWD`).
Mutable via `POST /admin/mock_set_password` for tests that exercise
rotation. Issue #271, plan PR F4-d.
- `last_write: Optional[LastWrite]` — most-recent successful
`(kind, number, value, ts)` tuple, surfaced via the admin endpoint
below for audit-log assertions.
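The per-profile state listed above could be modelled roughly as follows. This is a sketch only: the class names `ProfileState` and `LastWrite` are hypothetical, and the `b"1234"` default simply reuses the Series30i example from the list.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, Optional


@dataclass
class LastWrite:
    """Most-recent successful write, surfaced for audit-log assertions."""
    kind: str        # "param" | "macro"
    number: int
    value: int
    ts: datetime


@dataclass
class ProfileState:
    # Parameter number -> current value.
    parameters: Dict[int, int] = field(default_factory=dict)
    # Macro number -> scaled-int value (decimal-point count fixed at 0).
    macros: Dict[int, int] = field(default_factory=dict)
    # Gates cnc_wrparam only; macro writes ignore it.
    unlock_state: bool = False
    # 4-byte buffer compared byte-for-byte by cnc_wrunlockparam.
    unlock_password: bytes = b"1234"
    last_write: Optional[LastWrite] = None
```

Using `default_factory` keeps each profile's maps independent, which matters because the simulator hosts several profiles concurrently.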
### `cnc_wrparam` request decode
```
[int16 LE datano][int16 LE axis][int8|int16|int32 LE value]
```
Width of the value field is determined by the request frame trailer
length per the table in
[`focas-wire-protocol.md`](./focas-wire-protocol.md). On
`unlock_state == False` short-circuit to `[int16 LE 11]` (`EW_PASSWD`).
Otherwise mutate `parameters[datano] = value`, set `last_write`, return
`[int16 LE 0]`.
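The decode-and-mutate flow above can be sketched in Python as shown here. The handler name, the plain-dict state, and the `ValueError` for an unsupported value width are assumptions made for illustration; the numeric `EW_PASSWD` value is the one given in this document.

```python
import struct
from datetime import datetime, timezone

EW_OK = 0
EW_PASSWD = 11  # numeric value as stated in this document

# Value width is inferred from the trailer length (1, 2 or 4 bytes).
_VALUE_FORMATS = {1: "<b", 2: "<h", 4: "<i"}


def handle_wrparam(state: dict, payload: bytes) -> bytes:
    """Handle one cnc_wrparam frame against a per-profile state dict."""
    # Locked profiles short-circuit before any decoding happens.
    if not state["unlock_state"]:
        return struct.pack("<h", EW_PASSWD)
    datano, _axis = struct.unpack_from("<hh", payload, 0)
    trailer = payload[4:]
    fmt = _VALUE_FORMATS.get(len(trailer))
    if fmt is None:
        # Assumption: the document does not define this error path.
        raise ValueError(f"unsupported value width: {len(trailer)}")
    (value,) = struct.unpack(fmt, trailer)
    state["parameters"][datano] = value
    state["last_write"] = ("param", datano, value, datetime.now(timezone.utc))
    return struct.pack("<h", EW_OK)
```

The axis field is decoded but unused because the state map above is keyed by parameter number alone.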
### `cnc_wrmacro` request decode
```
[int16 LE number][int16 LE length=8][int32 LE mcr_val][int16 LE dec_val]
```
Always accept (no `unlock_state` gate). Mutate
`macros[number] = mcr_val` (we ignore `dec_val` for F4-b — integer-only).
Return `[int16 LE 0]`. Round-trip: a subsequent `cnc_rdmacro(number)`
returns `(mcr_val, 0)`.
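A hedged sketch of the macro write and its round-trip read, assuming the four fields arrive packed back to back with no padding (10 bytes in total). The function names are hypothetical, and recording `last_write` is left out for brevity.

```python
import struct

# [int16 number][int16 length][int32 mcr_val][int16 dec_val], little-endian.
_WRMACRO = struct.Struct("<hhih")


def handle_wrmacro(macros: dict, payload: bytes) -> bytes:
    """Store an integer-only macro value; never gated by unlock_state."""
    number, _length, mcr_val, _dec_val = _WRMACRO.unpack(payload)
    macros[number] = mcr_val  # dec_val is ignored for F4-b
    return struct.pack("<h", 0)


def handle_rdmacro(macros: dict, number: int) -> tuple:
    """Round-trip read: returns (mcr_val, decimal-point count of 0)."""
    return (macros[number], 0)
```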
### Admin endpoint — `POST /admin/mock_set_unlock_state`
Toggles `unlock_state` for the F4-d unlock workflow tests. Without this,
F4-b parameter-write integration tests can't reproduce the
`EW_PASSWD` → `BadUserAccessDenied` mapping.

```
POST /admin/mock_set_unlock_state
{ "profile": "Series30i", "unlocked": true }
```
### `cnc_wrunlockparam` request decode — issue #271, plan PR F4-d
```
[byte[4] password]
```
Match `password == profile.unlock_password` byte-for-byte. On match:
flip `unlock_state = True`, return `[int16 LE 0]`. On mismatch: leave
`unlock_state` untouched, return `[int16 LE 11]` (`EW_PASSWD`).
The simulator deliberately keeps unlock state per-session (per OpenSession
handle) so a reconnect drops back to `unlock_state = False` — matching the
FWLIB lifetime semantics described in
[`focas-wire-protocol.md`](./focas-wire-protocol.md) § "cnc_wrunlockparam".
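The per-session behaviour can be illustrated with a small class that owns the unlock flag, so that a reconnect (a new session object) starts locked again. The `Session` class and its method name are hypothetical; a buffer of the wrong length is treated here as an ordinary mismatch, which the document does not spell out.

```python
import struct

EW_OK = 0
EW_PASSWD = 11  # numeric value as stated in this document


class Session:
    """One open session; unlock state lives and dies with it."""

    def __init__(self, unlock_password: bytes):
        self._unlock_password = unlock_password
        self.unlock_state = False  # every new session starts locked

    def handle_wrunlockparam(self, payload: bytes) -> bytes:
        # Byte-for-byte comparison; a mismatch leaves the state untouched.
        if payload == self._unlock_password:
            self.unlock_state = True
            return struct.pack("<h", EW_OK)
        return struct.pack("<h", EW_PASSWD)
```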
### Admin endpoint — `POST /admin/mock_set_password`
Rotates the per-profile `unlock_password` for tests that exercise the
F4-d password-rotation runbook (`docs/v2/focas-deployment.md`
§ "FOCAS password handling"). Idempotent — call again to revert.
```
POST /admin/mock_set_password
{ "profile": "Series30i", "password": "5678" }
```
The endpoint accepts the password as a UTF-8/ASCII string and applies
the same right-pad-to-4-bytes / truncate-to-4-bytes normalisation the
driver does, so simulator-side matching is byte-symmetric with the
production wire encoder.
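The pad-or-truncate normalisation might look like the sketch below. The pad byte is an assumption: this document says the value is right-padded to 4 bytes but does not say with what, so it is exposed as a parameter (NUL by default) and should be matched to whatever the driver actually emits.

```python
def normalise_password(password: str, pad: bytes = b"\x00") -> bytes:
    """Return exactly 4 bytes: truncate long input, right-pad short input."""
    raw = password.encode("utf-8")
    # Truncate first, then pad on the right up to 4 bytes.
    return raw[:4].ljust(4, pad)
```

Truncating after UTF-8 encoding can split a multi-byte character, so this sketch is only safe for the ASCII passwords the examples use.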
### Admin endpoint — `GET /admin/mock_get_last_write`
Returns the simulator's view of the most-recent successful write, used by
F4-b audit-log integration assertions ("did the write actually reach the
fixture, and is the audit log capturing the right kind/number/value?").
```
GET /admin/mock_get_last_write?profile=Series30i
->
{
"kind": "param", // "param" | "macro"
"number": 1815,
"value": 100,
"writtenAt": "2026-04-25T13:30:00Z"
}
```
When no write has happened the endpoint returns `null` rather than 404 so
the test helper can assert "no writes since fixture reset" without
exception handling.
## `pmc_wrpmcrng` mock behaviour — issue #270, plan PR F4-c
The simulator keeps a per-profile PMC byte table keyed by `(addr_type,
byte_address)` — the same map the existing `pmc_rdpmcrng` handler reads
from. The write handler mutates the same map so a subsequent read sees
the written bytes.
### Per-profile state
Each profile carries:
```python
pmc: Dict[int, bytearray] # addr_type -> bytearray (one per PMC letter, default 256 bytes each)
```
`addr_type` is the PMC area code (R=5, G=4, F=3, D=8, X=1, Y=2, K=10,
A=11, E=12, T=6, C=7); the existing `pmc_rdpmcrng` fixture seeds the
defaults (zeros + a few canned bits per the dl205-style profile fixtures).
### `pmc_wrpmcrng` request decode
| Offset | Width | Field |
| --- | --- | --- |
| 0 | int16 LE | `addr_type` |
| 2 | int16 LE | `data_type` (must be `0` = byte; the driver only emits byte writes) |
| 4 | uint16 LE | `datano_s` |
| 6 | uint16 LE | `datano_e` |
| 8 | bytes | `data[]` — `(datano_e - datano_s + 1)` bytes |
Handler steps:
1. Look up the per-profile bytearray for `addr_type` (allocate on first
write, default 256 zeros).
2. **Validate** `0 <= datano_s <= datano_e < len(bytearray)` — otherwise
return `EW_NUMBER` (`4`).
3. **Validate** `len(data) == datano_e - datano_s + 1` — otherwise
return `EW_LENGTH` (`14`).
4. **Validate** `data_type == 0` — otherwise return `EW_DATA` (`9`)
because the driver only ever emits byte writes (bit writes wrap with
driver-side RMW so they reach the simulator as 1-byte writes).
5. Copy `data[]` into `bytearray[datano_s:datano_e+1]`. Other bytes
in the array are untouched.
6. Update `last_write` admin-endpoint state (kind=`pmc`, address-type,
start byte, length, bytes).
7. Return `ew_status = 0`.
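The handler steps above can be sketched as follows, validating in the same order as the list. The function name and the shape of the `last_write` dict are illustrative assumptions; the status codes are the numeric values quoted in the steps.

```python
import struct

# Status codes as quoted in this document's handler steps.
EW_OK, EW_NUMBER, EW_DATA, EW_LENGTH = 0, 4, 9, 14

# [int16 addr_type][int16 data_type][uint16 datano_s][uint16 datano_e]
_HEADER = struct.Struct("<hhHH")


def handle_wrpmcrng(pmc: dict, last_write: dict, payload: bytes) -> int:
    """Apply one byte-range write to the per-profile PMC tables."""
    addr_type, data_type, start, end = _HEADER.unpack_from(payload, 0)
    data = payload[_HEADER.size:]
    # Step 1: allocate the table on first use (256 zero bytes).
    table = pmc.setdefault(addr_type, bytearray(256))
    # Step 2: address range must sit inside the table.
    if not (0 <= start <= end < len(table)):
        return EW_NUMBER
    # Step 3: payload length must match the addressed range.
    if len(data) != end - start + 1:
        return EW_LENGTH
    # Step 4: only byte writes are accepted.
    if data_type != 0:
        return EW_DATA
    # Step 5: overwrite just the addressed bytes.
    table[start:end + 1] = data
    # Step 6: record the write for the admin endpoint.
    last_write.update(kind="pmc", addr_type=addr_type,
                      datano_s=start, datano_e=end, bytes=bytes(data))
    return EW_OK
```

Because the length check runs before the slice assignment, the table never changes size, which is what keeps neighbouring bytes untouched.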
### Round-trip invariant
The simulator MUST satisfy:
```
write(R, [10..12], [0xAA, 0xBB, 0xCC]); read(R, [10..12]) == [0xAA, 0xBB, 0xCC]
```
and the **byte-isolation invariant**:
```
write(R, [11], [0xFF]); bytes[10] == prior bytes[10] && bytes[12] == prior bytes[12]
```
The integration tests `Series/PmcRangeWriteTests.cs` and
`Series/PmcBitRmwIntegrationTests.cs` assert both shapes.
### Admin endpoint — `GET /admin/mock_get_last_write` extension
The `last_write` payload gains a `kind: "pmc"` variant:
```
{
"kind": "pmc",
"addr_type": 5, // R
"datano_s": 100,
"datano_e": 100,
"bytes": "0x08", // hex-encoded
"writtenAt": "2026-04-25T13:30:00Z"
}
```
Bit-level writes never appear here as a separate kind — they reach the
simulator as 1-byte writes after the driver's RMW wrapper, so the audit
shape is identical to a byte write at the same address.
## Cycle-time per part / last cycle delta — F5-a (issue #272)
Plan PR F5-a derives `Production/LastCycleSeconds` +
`Production/LastCycleStartUtc` from the existing `cnc_rdparam(6711)` +
`cnc_rdtimer` snapshot stream — **pure derivation, no new wire calls**.
The simulator does NOT need new wire commands; the existing
`cnc_rdparam` + `cnc_rdtimer` handlers already cover the read surface.
What focas-mock DOES need is an admin endpoint + test-fixture helper
that lets integration tests atomically increment the parts-count
counter alongside the cycle-time timer so the driver sees a clean
"cycle completed" transition on the next probe tick.
### Per-profile state
Already covered by the existing F1-b state map:
- `parameters: Dict[int, int]` (entry `6711` is the parts-count counter).
- `timers: Dict[int, int]` (entry `0` is the live cycle-time counter,
in seconds).
### Admin endpoint — `POST /admin/mock_simulate_cycle_completion`
Atomically advances both values to model "the CNC just finished a
cycle". Atomicity matters: the F5-a derivation samples both fields on
every probe tick, so if the simulator updated parts-count and the
timer in two separate writes the test could observe an intermediate
state where parts-count incremented but the timer hasn't updated yet
(producing a misleading `LastCycleSeconds`).
```
POST /admin/mock_simulate_cycle_completion
{
"profile": "Series30i",
"partsDelta": 1, // default 1; tests asserting backfill use 3+
"newCycleTimerSeconds": 18 // absolute value, NOT a delta
}
```
Handler steps:
1. `parameters[6711] += partsDelta` (under the per-profile lock).
2. `timers[0] = newCycleTimerSeconds`.
3. Return `200 OK` with the new values for verification.
The endpoint MUST hold the profile's update lock for the full
read-modify-write so a concurrent `cnc_rdparam` + `cnc_rdtimer` poll
sees both fields in their pre-update OR post-update state — never
half-applied.
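One way to picture the locking requirement is the Python sketch below. It is illustrative only: `ProfileState`, `simulate_cycle_completion`, `read_snapshot`, and the returned key names are hypothetical, and the real endpoint would sit behind the FastAPI route rather than being called directly.
```python
import threading
from dataclasses import dataclass, field
from typing import Dict, Tuple

PARTS_COUNT_PARAM = 6711  # parts-count counter entry in `parameters`
CYCLE_TIMER_ID = 0        # live cycle-time counter entry in `timers`


@dataclass
class ProfileState:
    parameters: Dict[int, int] = field(default_factory=dict)
    timers: Dict[int, int] = field(default_factory=dict)
    lock: threading.Lock = field(default_factory=threading.Lock)


def simulate_cycle_completion(state: ProfileState, new_cycle_timer_seconds: int,
                              parts_delta: int = 1) -> dict:
    """Advance parts-count and the cycle timer as one atomic update."""
    with state.lock:
        parts = state.parameters.get(PARTS_COUNT_PARAM, 0) + parts_delta
        state.parameters[PARTS_COUNT_PARAM] = parts
        state.timers[CYCLE_TIMER_ID] = new_cycle_timer_seconds  # absolute, not a delta
        return {"partsCount": parts, "cycleTimerSeconds": new_cycle_timer_seconds}


def read_snapshot(state: ProfileState) -> Tuple[int, int]:
    """Read both fields under the same lock so a reader never sees a half-applied update."""
    with state.lock:
        return (state.parameters.get(PARTS_COUNT_PARAM, 0),
                state.timers.get(CYCLE_TIMER_ID, 0))
```
The guarantee only holds if the read handlers take the same lock as the writer, which is why the sketch routes both paths through `state.lock`.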
### `FocasSimFixture.SimulateCycleCompletionAsync`
The future test-support helper wraps the admin endpoint:
```csharp
await fixture.SimulateCycleCompletionAsync(
profile: "Series30i",
partsDelta: 1,
newCycleTimerSeconds: 18);
```
Integration test `Series/CycleDeltaTests.cs` will assert:
- After a 5 -> 6 transition with `newCycleTimerSeconds=18`, the
driver's `Production/LastCycleSeconds` settles to `currentTimer -
prevTimer`.
- `Production/LastCycleStartUtc` is within driver-tolerance of
`nowUtc - LastCycleSeconds` (allow a small window for probe-tick
jitter).
- Counter reset (parts -> 0) preserves the last published values.
- Cycle-timer rollover does not publish a negative delta.
These tests are blocked on the focas-mock + integration-test project
landing; the unit-test coverage in `FocasCycleDeltaTests` already
exercises every same-process invariant of the derivation.
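The invariants listed above can be modelled with a small Python sketch. This is not the driver's derivation (which lives on the .NET side); `CycleDeltaTracker` and its fields are hypothetical names, and the rule "publish only on a parts-count increase with a non-negative timer delta" is an assumption read off the bullet list.
```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class CycleDeltaTracker:
    prev_parts: Optional[int] = None
    prev_timer: Optional[int] = None
    last_cycle_seconds: Optional[int] = None
    last_cycle_start_utc: Optional[datetime] = None

    def observe(self, parts: int, timer: int, now_utc: datetime) -> None:
        """Feed one probe-tick sample of (parts-count, cycle timer)."""
        if self.prev_parts is not None and parts > self.prev_parts:
            delta = timer - self.prev_timer
            # A timer rollover would give a negative delta; keep the old values instead.
            if delta >= 0:
                self.last_cycle_seconds = delta
                self.last_cycle_start_utc = now_utc - timedelta(seconds=delta)
        # A counter reset (parts going down) falls through and preserves
        # whatever was last published.
        self.prev_parts = parts
        self.prev_timer = timer
```
Feeding the samples `(5, 0)` then `(6, 18)` publishes 18 seconds; a following `(7, 3)` or `(0, 3)` sample leaves the published values unchanged.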
### Status
focas-mock simulator has not landed yet (tracked separately from F4-b /
F4-c). F4-b + F4-c land the .NET-side wire encoders + dispatch + status
mapping unconditionally; the integration-test scaffolds at
`tests/.../IntegrationTests/Series/ParameterWriteTests.cs`,
`MacroWriteTests.cs`, `PmcRangeWriteTests.cs`, and
`PmcBitRmwIntegrationTests.cs` are deferred until the simulator +
integration-test project land. Until then unit-test coverage in
`FocasWriteParameterTests` / `FocasWriteMacroTests` /
`FocasWritePmcTests` exercises every same-process invariant against the
in-memory `FakeFocasClient`.
+230 -206
View File
@@ -1,267 +1,291 @@
# FOCAS wire protocol — packed-buffer surface
# FOCAS wire protocol — what's authoritative vs. what's guessed
Notes on the language-neutral packed-buffer encoding the FOCAS driver +
focas-mock simulator share. This format is **not** the FWLIB native struct
layout — Tier-C Fwlib32 backends marshal directly from the FANUC C struct.
The packed surface exists so the simulator (Python / FastAPI) and the .NET
wire client can speak a common format over IPC without piping a Win32 DLL
through both ends.
Companion to [`focas-simulator-plan.md`](focas-simulator-plan.md). Written during
Stream B on 2026-04-23 after a research pass through `strangesast/fwlib` +
public FOCAS documentation. Purpose: separate what we *know* about the FOCAS
wire protocol (can quote with confidence) from what we're *guessing* (will need
Wireshark traces to validate in Stream C).
## Command id table
This document directly informs `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/server/`.
Each FOCAS-equivalent call gets a stable wire-protocol command id. Ids are
**append-only** — never renumber, never reuse.
## Authoritative — from Fanuc's public `fwlib32.h`
| Id | FOCAS API | Surface |
| --- | --- | --- |
| `0x0001` | `cnc_rdcncstat` | ODBST 9-field status struct |
| `0x0002` | `cnc_rdparam` | parameter value (one number) |
| `0x0003` | `cnc_rdmacro` | macro variable value |
| `0x0004` | `cnc_rddiag` | diagnostic value |
| ... | ... | ... |
| **`0x0102`** | **`cnc_wrparam`** | **IODBPSD parameter-write packet (issue #269, plan PR F4-b)** |
| **`0x0103`** | **`cnc_wrmacro`** | **ODBM macro-write packet (issue #269, plan PR F4-b)** |
| **`0x0104`** | **`pmc_wrpmcrng`** | **IODBPMC PMC range-write packet (issue #270, plan PR F4-c)** |
| **`0x0105`** | **`cnc_wrunlockparam`** | **4-byte password buffer for the parameter-protect / read-protect unlock (issue #271, plan PR F4-d)** |
| `0x0F1A` | **`cnc_rdalmhistry`** | **ODBALMHIS alarm-history ring-buffer dump (issue #267, plan PR F3-a)** |
The header file is distributed with the FOCAS Developer Kit and mirrored in OSS
repos (notably `strangesast/fwlib`). The **struct layouts** documented there
are stable across FOCAS versions and authoritative for the payload shapes our
Python mock has to emit.
## ODBALMHIS — alarm history (`cnc_rdalmhistry`, command `0x0F1A`)
### ODBM — macro variable read buffer
Issued by `FocasAlarmProjection` when
`FocasDriverOptions.AlarmProjection.Mode == ActivePlusHistory`. Returns up
to `depth` most-recent ring-buffer entries.
```c
typedef struct odbm {
short datano; // macro variable number
short dummy; // reserved / alignment padding
long mcr_val; // 32-bit signed macro value
short dec_val; // decimal-point count (0-9)
} ODBM;
```
### Request
With `#pragma pack(push, 4)` (the FOCAS default), total size is **10 bytes** on
Windows: 2 + 2 + 4 + 2. Our `FwlibNative.cs` matches this exactly.
| Offset | Width | Field | Notes |
| --- | --- | --- | --- |
| 0 | `int16 LE` | `depth` | clamped client-side to `[1..250]` (`FocasAlarmProjectionOptions.MaxHistoryDepth`) |
Our mock's `_READ_RESP_STRUCT = struct.Struct(">iH")` is **only 6 bytes**,
missing `datano` + `dummy`. A real Fwlib decoding the scaffold response will
read garbage. Stream C fix: prepend two `short` fields.
### Response (packed buffer, little-endian)
### IODBPSD — CNC parameter read/write buffer
| Offset | Width | Field |
| --- | --- | --- |
| 0 | `int16 LE` | `num_alm` — number of entries that follow. `< 0` indicates CNC error. |
| 2 | repeated | `ALMHIS_data alm[num_alm]` (see below) |
```c
typedef struct iodbpsd {
short datano; // parameter number
short type; // axis index (0 for non-axis parameters)
union {
char cdata;
short idata;
long ldata;
char cdatas[MAX_AXIS]; // MAX_AXIS varies — 8 on 0i, 32 on 30i
short idatas[MAX_AXIS];
long ldatas[MAX_AXIS];
} u;
} IODBPSD;
```
Each entry block:
With `pack(4)` and `MAX_AXIS=8`, total size = 2 + 2 + 32 = **36 bytes**. Our
`FwlibNative.cs` matches this (`[SizeConst = 32]` data buffer).
| Offset (rel.) | Width | Field |
| --- | --- | --- |
| 0 | `int16 LE` | `year` |
| 2 | `int16 LE` | `month` |
| 4 | `int16 LE` | `day` |
| 6 | `int16 LE` | `hour` |
| 8 | `int16 LE` | `minute` |
| 10 | `int16 LE` | `second` |
| 12 | `int16 LE` | `axis_no` (1-based; 0 = whole-CNC) |
| 14 | `int16 LE` | `alm_type` (P/S/OT/SV/SR/MC/SP/PW/IO encoded numerically) |
| 16 | `int16 LE` | `alm_no` |
| 18 | `int16 LE` | `msg_len` (0..32 typical) |
| 20 | `msg_len` | ASCII message (no null terminator) |
| `20 + msg_len` | 0..3 | pad to 4-byte boundary so per-entry blocks stay self-delimiting |
Our mock's current param handler doesn't return bytes in IODBPSD shape —
response payload is just the raw value. Stream C fix: wrap in 4-byte header
+ union-padded data.
The CNC stamps `year..second` in **its own local time**. The deployment
guide instructs operators to keep CNC clocks on UTC so the projection's
dedup key `(OccurrenceTime, AlarmNumber, AlarmType)` stays stable across
DST transitions. The .NET decoder
(`Wire/FocasAlarmHistoryDecoder.Decode`) constructs each
`DateTimeOffset` with `TimeSpan.Zero` (UTC) on that assumption.
### ODBST — status info
### Error handling
```c
typedef struct odbst {
short dummy; // reserved
short tmmode; // Memory / Tape / MDI / EDIT / DNC
short aut; // automatic mode
short run; // running state
short motion; // motion state
short mstb; // M/S/T/B finish signal
short emergency; // emergency stop
short alarm; // alarm state
short edit; // edit mode sub-state
} ODBST;
```
- A negative `num_alm` short-circuits decode to an empty list — the
projection treats it as "no history this tick" and the next poll
retries.
- Malformed timestamps (e.g. month=0) are skipped per-entry instead of
faulting the whole decode; the dedup key for malformed entries would be
unstable anyway.
- `msg_len` overrunning the payload truncates the entry list at the
malformed entry rather than throwing.
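The alarm-history layout and its error-handling rules can be sketched as a Python decoder. This is an illustration under assumptions, not a port of `FocasAlarmHistoryDecoder`: the names `AlarmEntry` and `decode_alarm_history` are hypothetical, and padding is assumed to be measured relative to the start of each entry.
```python
import struct
from datetime import datetime, timezone
from typing import List, NamedTuple


class AlarmEntry(NamedTuple):
    occurred_utc: datetime
    axis_no: int
    alm_type: int
    alm_no: int
    message: str


# year, month, day, hour, minute, second, axis_no, alm_type, alm_no, msg_len
_ENTRY_HEADER = struct.Struct("<10h")


def decode_alarm_history(payload: bytes) -> List[AlarmEntry]:
    if len(payload) < 2:
        return []
    (num_alm,) = struct.unpack_from("<h", payload, 0)
    if num_alm < 0:
        return []  # CNC error: "no history this tick"
    entries: List[AlarmEntry] = []
    offset = 2
    for _ in range(num_alm):
        if offset + _ENTRY_HEADER.size > len(payload):
            break  # truncated header: stop at the malformed entry
        (year, month, day, hour, minute, second,
         axis_no, alm_type, alm_no, msg_len) = _ENTRY_HEADER.unpack_from(payload, offset)
        msg_start = offset + _ENTRY_HEADER.size
        if msg_len < 0 or msg_start + msg_len > len(payload):
            break  # msg_len overruns the payload: truncate the list here
        message = payload[msg_start:msg_start + msg_len].decode("ascii", errors="replace")
        # Advance past the message plus 0..3 pad bytes (4-byte boundary).
        offset = msg_start + msg_len + (-msg_len % 4)
        try:
            when = datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
        except ValueError:
            continue  # malformed timestamp (e.g. month=0): skip this entry only
        entries.append(AlarmEntry(when, axis_no, alm_type, alm_no, message))
    return entries
```
Advancing `offset` before validating the timestamp is what lets a single bad entry be skipped without losing the entries that follow it.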
9 × short = **18 bytes**. Our mock already emits 18 bytes via
`struct.Struct(">9h")`. ✓ correct.
## IODBPSD — parameter write (`cnc_wrparam`, command `0x0102`)
### IODBPMC — PMC range read/write buffer
Issue #269, plan PR F4-b. The write-side payload is the **byte-symmetric
inverse of the `cnc_rdparam` read** — the same `IODBPSD` struct shape, and
the .NET wire client uses the read-side decoder reversed (`EncodeParamValue`
in `FwlibFocasClient.cs`) so the encoder/decoder are guaranteed to stay in
lock-step.
```c
typedef struct iodbpmc {
short type_a; // PMC address letter encoded as ADR_* numeric code
short type_d; // data type: 0=byte, 1=word, 2=long, 4=float, 5=double
unsigned short datano_s; // start address number
unsigned short datano_e; // end address number
union {
char cdata[5];
short idata[5];
long ldata[5];
float fdata[5];
double dbdata[5];
} u; // 40-byte union (widest = dbdata = 5×8 bytes)
} IODBPMC;
```
### Request
With `pack(4)` the union is 40 bytes; struct total = 8 + 40 = **48 bytes**.
Our `FwlibNative.cs` matches this.
| Offset | Width | Field |
| --- | --- | --- |
| 0 | `int16 LE` | `datano` — parameter number (e.g. `1815`) |
| 2 | `int16 LE` | `type` — axis index (1-based; `0` = whole-CNC parameter) |
| 4 | `length` | `data` payload — width depends on parameter type |
Our mock's PMC handler takes a different layout (uint16 handle + uint8 letter
+ ...). Stream C fix: rewrite to IODBPMC shape.
`length` (request frame trailer, drives `data` width):
## Reference trace findings (2026-04-23 dev-box reversing)
| FocasDataType | `length` | Payload encoding |
| --- | --- | --- |
| `Byte` | `4 + 1` | one signed byte at offset 4 |
| `Int16` | `4 + 2` | int16 LE at offset 4 |
| `Int32` | `4 + 4` | int32 LE at offset 4 |
**Good news** — we don't need a bench CNC for first-pass reversing. Loading
`Fwlib64.dll` in `otopcua-focas-cli` + pointing it at our Python simulator on
`127.0.0.1:8193` + enabling `OTOPCUA_FOCAS_RAW_CAPTURE=1` on the sim lets us
observe Fwlib's outbound bytes + iterate on reply shapes. Each cycle is ~5s;
progress measure is "Fwlib sends more bytes before disconnecting".
Bit-addressed parameters (`PARAM:1815/0` form) are not supported by F4-b
and surface as `BadNotSupported`; F4-c will land the read-modify-write
helper alongside the PMC bit RMW path.
### Confirmed wire facts
### Response
**Magic prefix** — every frame Fwlib sends begins with `0xA0 0xA0 0xA0 0xA0`
(4 bytes). This is NOT a length prefix — our scaffold tried to decode it as
uint32-big-endian = 2.7 GB and died. It's a fixed protocol marker.
Single `int16 LE` return code per the standard FWLIB convention:
**Handshake request** — `cnc_allclibhndl3` produces this 8-byte frame:
- `0` → `Good`
- `11` (`EW_PASSWD`) → **`BadUserAccessDenied`** (was `BadNotWritable`
pre-F4-b — see `FocasStatusMapper`). Means the parameter-write switch is
off or the CNC isn't in MDI mode; the F4-d unlock workflow will close
the loop on this from the OPC UA side.
- Other `EW_*` codes map per
[`FocasStatusMapper.MapFocasReturn`](../../src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FocasStatusMapper.cs).
```
a0 a0 a0 a0 00 01 01 01
└─ magic ─┘ └── negotiation ──┘
```
## ODBM — macro write (`cnc_wrmacro`, command `0x0103`)
The 4-byte negotiation field is stable across our observations (always
`00 01 01 01`). Interpretation TBD — possibly `(version_major=0x0001,
version_minor=0x0101)` or `(protocol=0x01, subtype=0x010101)`.
Issue #269, plan PR F4-b. The write-side payload mirrors the
`cnc_rdmacro` read shape: the same `(mcr_val, dec_val)` (integer +
decimal-point count) split, but emitted from the .NET side rather than
decoded.
**Handshake reply that Fwlib accepts** (empirically confirmed — doesn't
disconnect):
### Request
```
a0 a0 a0 a0 00 01 01 01 00 XX 00 YY
└─ magic ─┘ └── echo ──┘ handle api_version
```
| Offset | Width | Field |
| --- | --- | --- |
| 0 | `int16 LE` | `number` — macro variable number (e.g. `500`) |
| 2 | `int16 LE` | `length` — fixed at `8` for ODBM |
| 4 | `int32 LE` | `mcr_val` — scaled integer value |
| 8 | `int16 LE` | `dec_val` — decimal-point count |
12 bytes: magic + echoed negotiation + 2-byte handle + 2-byte api_version code.
F4-b ships **integer-only writes** (`dec_val = 0`) to match the most
common HMI pattern; a future `WriteMacroScaled` overload will land if the
field calls for fractional macro setpoints. Read-side decoders apply
`mcr_val / 10^dec_val`, so a `dec_val = 0` write surfaces back as the
integer it was emitted as.
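As a rough Python illustration of the macro-write request table and the read-side scaling rule (the function names are hypothetical, and the packed layout is simply the four fields from the table laid end to end):
```python
import struct

# number (int16), length (int16), mcr_val (int32), dec_val (int16), little-endian
_MACRO_WRITE = struct.Struct("<hhih")
ODBM_LENGTH = 8  # the fixed `length` value from the request table


def encode_macro_write(number: int, value: int) -> bytes:
    """Integer-only write: dec_val is always 0."""
    return _MACRO_WRITE.pack(number, ODBM_LENGTH, value, 0)


def decode_macro_value(mcr_val: int, dec_val: int) -> float:
    """Read-side scaling: mcr_val / 10^dec_val."""
    return mcr_val / (10 ** dec_val)
```
With `dec_val = 0` the divisor is 1, so a written integer reads back unchanged; a hypothetical scaled read such as `(12345, 3)` would surface as `12.345`.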
### Post-handshake frame shape — decoded via drain mode
### Response
The simulator's `OTOPCUA_FOCAS_DRAIN_AFTER_HANDSHAKE=1` mode reads all inbound
bytes for 1000 ms after the handshake reply without attempting any decode.
Captured payload from `cnc_allclibhndl3`:
Same single-int16 envelope as `cnc_wrparam`. `EW_PASSWD` is rare on macro
writes (the gate-switch protection is parameter-specific) but the mapper
treats both kinds identically.
```
00 02 00 02 a0 a0 a0 a0 00 01 21 01 00 00
└── prefix ─┘ └── magic ─┘ └─── body ────┘
4 bytes 4 bytes 6 bytes (total = 14 bytes)
```
### Symmetry note
**Key discovery**: post-handshake frames have a **4-byte prefix BEFORE the
magic**, not magic-first. Frame shape:
The plan carries a "byte layout symmetry" requirement — the encoder for
each kind is the read-side decoder reversed. Adding a new parameter type
(e.g. `Int64` parameters, when they ship) means extending both sides in
the same PR; the unit test
`FocasWriteParameterTests.ParameterWrite_round_trip_stores_value_visible_to_subsequent_read`
exercises encode → store → decode with the fake wire client and is the
canary for symmetry regressions.
```
uint16 msg_counter // starts at 2; handshake was #1 implicitly
uint16 handle_echo // matches the handle our open reply returned
4 bytes FOCAS_MAGIC // 0xA0A0A0A0
N bytes body // function-specific
```
## IODBPMC — PMC range write (`pmc_wrpmcrng`, command `0x0104`)
Session 1's drain captured only the prefix (`00 02 00 01`) before timing
out — TCP multiplexed the two test sessions' bytes differently. Session 2
caught the full 14-byte frame.
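A small Python sketch of splitting a captured post-handshake frame into its parts. The big-endian reading of the two prefix fields is an assumption (it is what makes the captured `00 02` decode as counter 2), and `parse_post_handshake_frame` is a hypothetical helper, not a function in `server/frames.py`.
```python
import struct
from typing import NamedTuple

FOCAS_MAGIC = b"\xa0\xa0\xa0\xa0"
_PREFIX = struct.Struct(">HH")  # msg_counter, handle_echo (assumed big-endian)


class PostHandshakeFrame(NamedTuple):
    msg_counter: int
    handle_echo: int
    body: bytes


def parse_post_handshake_frame(raw: bytes) -> PostHandshakeFrame:
    """Split <prefix><magic><body>; the body is returned undecoded."""
    if len(raw) < _PREFIX.size + len(FOCAS_MAGIC):
        raise ValueError("frame shorter than prefix + magic")
    msg_counter, handle_echo = _PREFIX.unpack_from(raw, 0)
    if raw[4:8] != FOCAS_MAGIC:
        raise ValueError("magic marker not found after the 4-byte prefix")
    return PostHandshakeFrame(msg_counter, handle_echo, raw[8:])
```
Applied to the 14-byte capture, this yields counter 2, handle 2, and the 6-byte body `00 01 21 01 00 00`.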
Issue #270, plan PR F4-c. The write-side payload is the read-side
`pmc_rdpmcrng` IODBPMC packet with the data direction inverted: the
caller fills the `data[]` byte run and the simulator / Fwlib32 stores
it; the response is the small status envelope rather than the populated
data buffer the read side returns.
### Body bytes — first post-handshake request
### Request
Body on `cnc_allclibhndl3` first post-handshake frame:
| Offset | Width | Field |
| --- | --- | --- |
| 0 | `int16 LE` | `type_a` — PMC address-type code (R=5, G=4, F=3, D=8, X=1, Y=2, K=10, A=11, E=12, T=6, C=7) |
| 2 | `int16 LE` | `type_d` — data type (`0` = byte; only byte writes are issued — bit writes wrap the byte path with a read-modify-write helper) |
| 4 | `uint16 LE` | `datano_s` — first byte address (inclusive) |
| 6 | `uint16 LE` | `datano_e` — last byte address (inclusive) — `(datano_e - datano_s + 1)` is the byte count |
| 8 | `bytes` | `data[]` — payload, exactly `(datano_e - datano_s + 1)` bytes |
```
00 01 21 01 00 00
```
The header is 8 bytes; the FWLIB `IODBPMC.data` field caps at 32 bytes
(40-byte total per call), so larger ranges are chunked into 32-byte
sub-calls by the wire client. The simulator MUST honour the same chunk
ceiling so chunked-vs-single round-trips produce the same final bytes.
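The chunking rule can be sketched in Python as below. `chunk_pmc_range` is a hypothetical helper; the wire client's actual chunker is on the .NET side and may differ in detail.
```python
from typing import Iterator, Tuple

MAX_CHUNK_BYTES = 32  # data-field cap per call, as stated above


def chunk_pmc_range(datano_s: int, data: bytes,
                    max_chunk: int = MAX_CHUNK_BYTES) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (start, end_inclusive, chunk) sub-calls covering `data`."""
    for offset in range(0, len(data), max_chunk):
        chunk = data[offset:offset + max_chunk]
        start = datano_s + offset
        yield start, start + len(chunk) - 1, chunk
```
A 70-byte write starting at address 100 becomes three sub-calls covering 100..131, 132..163, and 164..169.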
Informed guesses (unvalidated):
### Response
- `00 01` = body length (1 useful byte?) or sub-request count
- `21 01` = function code / operation tag — `0x21` is seen in public FOCAS
reverse-engineering notes associated with "system info" / "controller
identification" queries
- `00 00` = padding / reserved
Same single-int16 envelope as `cnc_wrparam` / `cnc_wrmacro`:
Likely this is Fwlib's "tell me what CNC you are" query — part of
`cnc_allclibhndl3`'s internal handshake continuation before the handle is
fully established. Returning an empty or malformed response causes Fwlib
to declare the far end "not a CNC" and error with `EW_FUNC` (16).
| Offset | Width | Field |
| --- | --- | --- |
| 0 | `int16 LE` | `ew_status` — `0` = success, non-zero = FANUC `EW_*` |
### Iteration 3 — echo response, error-code advances
`EW_NOOPT` (option not installed), `EW_NUMBER` (out-of-range address),
`EW_LENGTH` (chunk size mismatch) are the typical failures the simulator
reproduces; the mapper translates them to OPC UA status codes the same
way the read-side does.
Sending back `<prefix><magic><echoed body>` (14 bytes matching request shape)
advances Fwlib's client-side error code from **`EW_-16` (socket-level)** to
**`EW_-17` (protocol-level rejection)**. Fwlib reads our response in full
before disconnecting with `peer closed mid-frame`.
### Bit-level RMW (driver-side, no extra wire op)
Meaning: our **frame structure is correct enough** that Fwlib parses it as a
valid FOCAS frame; the **body content** (the 6 bytes after magic) is where
the semantic mismatch now lives. Fwlib expects specific bytes back for the
`0x2101` system-info query and an echo doesn't match.
`pmc_wrpmcrng` is **byte-addressed** — there is no sub-byte write op on
the wire. Bit writes go through `IFocasClient.WritePmcBitAsync` which:
### Current iteration block
1. Issues a 1-byte `pmc_rdpmcrng` to fetch the parent byte.
2. Masks the target bit (set: OR; clear: AND-NOT).
3. Issues a 1-byte `pmc_wrpmcrng` with the modified byte.
Going deeper without reference requires either:
A per-byte semaphore in `FwlibFocasClient` serialises concurrent bit
writes against the same byte so two updates that race never lose one
another's bit. The simulator's handler implements the same byte-aligned
semantics — bit writes never reach it as a separate frame.
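The read-modify-write sequence and the per-byte serialisation can be sketched synchronously in Python. This is an analogue for illustration, not the driver's code: `PmcBitWriter` and its callback signatures are hypothetical, and a `threading.Lock` stands in for the semaphore.
```python
import threading
from collections import defaultdict
from typing import Callable, DefaultDict, Tuple


class PmcBitWriter:
    def __init__(self, read_byte: Callable[[int, int], int],
                 write_byte: Callable[[int, int, int], None]) -> None:
        self._read_byte = read_byte    # stands in for a 1-byte pmc_rdpmcrng
        self._write_byte = write_byte  # stands in for a 1-byte pmc_wrpmcrng
        self._locks: DefaultDict[Tuple[int, int], threading.Lock] = defaultdict(threading.Lock)
        self._locks_guard = threading.Lock()

    def write_bit(self, addr_type: int, byte_addr: int, bit: int, value: bool) -> None:
        if not 0 <= bit <= 7:
            raise ValueError("bit index must be 0..7")
        # Fetch (or create) the lock for this byte without racing other threads.
        with self._locks_guard:
            lock = self._locks[(addr_type, byte_addr)]
        with lock:
            current = self._read_byte(addr_type, byte_addr)
            mask = 1 << bit
            # set: OR with the mask; clear: AND with the inverted mask
            updated = (current | mask) if value else (current & ~mask & 0xFF)
            self._write_byte(addr_type, byte_addr, updated)
```
Setting bit 3 of a zeroed byte produces `0x08`, which is why the simulator only ever sees a plain 1-byte write.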
- **A bench CNC** (#54) to capture a real response to the `0x2101` query.
Stream C.2 Wireshark trace gives us the exact byte pattern Fwlib expects.
- **Published FOCAS response specs** for sub-function `0x2101` — not present
in `strangesast/fwlib` headers; likely only in the licensed Developer Kit
binary docs.
- **Blind enumeration** — try N variations of the 6-byte body response until
Fwlib's error code changes again. High cost, low signal.
### Symmetry note
The first two are both blocked on resources we don't have. The third is
~hundreds of cycles with no guarantee of convergence.
The encoder is the `pmc_rdpmcrng` decoder reversed: the read side parses
`(type_a, type_d, datano_s, datano_e)` from the request and emits the
data buffer in the response; the write side parses all five fields plus
the data buffer from the request and emits a status int16 in the
response. Tests `FocasWritePmcTests.PMC_*` exercise the round-trip on
the fake wire client.
### Diminishing-returns checkpoint
## cnc_wrunlockparam — connection-level password unlock (command `0x0105`)
**What we've proven without hardware**:
1. Magic prefix `0xA0A0A0A0` confirmed
2. Handshake request format decoded (`magic + 4-byte negotiation`)
3. Handshake response format that Fwlib accepts (`magic + echo + handle + api`)
4. Post-handshake frame format decoded (`prefix + magic + body`)
5. First post-handshake function code observed (`0x2101` — likely system-info)
6. Error code progression `EW_SOCKET` → `EW_PROTOCOL` confirms our framing is
structurally correct
Issue #271, plan PR F4-d. Some controllers (notably 16i + certain 30i
firmwares with parameter-protect on) gate `cnc_wrparam` and selected
reads behind a connection-level password switch. The driver emits this
frame on connect when `FocasDeviceOptions.Password` is configured, and
re-emits it on any read/write that returns `EW_PASSWD` (then retries the
gated call once).
**What we can't prove without bench CNC or reference docs**:
1. The exact 6-byte response body Fwlib expects for `0x2101`
2. The full list of post-handshake function codes + their body shapes
3. Whether subsequent frames use length prefixes or fixed body sizes
### Request
**Recommendation**: checkpoint here. The framing discoveries above are
preserved in `server/frames.py` + `server/state.py` + `server/focas_server.py`
+ `server/handlers/__init__.py`. When bench-CNC access unblocks Stream C.2's
reference trace, the iteration loop (with the framing work already done)
should converge in hours rather than days.
| Offset | Width | Field |
| --- | --- | --- |
| 0 | `byte[4]` | `password[4]` — 4-byte password buffer. ASCII-encoded from `FocasDeviceOptions.Password`, right-padded with `0x00`, truncated at 4 bytes. |
### Still unknown
The 4-byte fixed slot matches the FANUC published shape — the controller
compares byte-for-byte. Longer / shorter source strings are normalised at
the driver layer before they hit this frame so the wire surface stays
canonical.
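The normalisation described above might look like this in Python. `encode_unlock_password` is a hypothetical name, and rejecting non-ASCII input by letting the encode raise is an assumption; the text does not say how the driver treats it.
```python
PASSWORD_SLOT_BYTES = 4


def encode_unlock_password(password: str) -> bytes:
    """ASCII-encode, truncate at 4 bytes, right-pad with 0x00."""
    raw = password.encode("ascii")  # raises UnicodeEncodeError on non-ASCII input
    return raw[:PASSWORD_SLOT_BYTES].ljust(PASSWORD_SLOT_BYTES, b"\x00")
```
Every input therefore maps to exactly four bytes on the wire, whatever the configured string length.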
- **Response shape** for the post-handshake body request — we can frame the
prefix + magic correctly now, but what the 6-byte body response should
carry (CNC series ID? version? capability flags?) needs further iteration.
- **Function-id numeric values** for the 9 FWLIB calls our driver makes —
one per call, need to be observed separately.
- **Error encoding** on the wire.
### Response
### Next iteration cycles
Same single-int16 envelope as the write frames:
With the handshake working, each subsequent function gets its own probe-and-observe
loop. The simulator now has a `RAW_FRAME_MARKER = 0xFFFF` sentinel that lets a
handler return exact wire bytes (bypassing the scaffold envelope) — use that to
try different post-handshake replies and watch Fwlib's reaction.
| Offset | Width | Field |
| --- | --- | --- |
| 0 | `int16 LE` | `ew_status` — `0` = success (gate now lifted for the lifetime of this FWLIB handle), `EW_PASSWD` = supplied password did not match the controller's slot, `EW_HANDLE` = handle invalid. |
## Stream C work order
### Lifetime
Given what's authoritative vs. guessed, here's the most efficient path:
Unlock is bound to the FWLIB handle: it persists until the handle closes
(disconnect / reconnect). The driver reinvokes unlock on every
`EnsureConnectedAsync` reconnect path so a planned or unplanned wire
restart self-heals without operator intervention. A `BadUserAccessDenied`
on a read/write triggers a single-shot retry: re-emit unlock + redispatch
the gated call once. A second `EW_PASSWD` propagates unchanged so a
mismatched password doesn't loop forever on the wire.
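The single-shot retry can be sketched in Python as below. The helper name and callback shape are hypothetical, the `EW_PASSWD` value is the one quoted in the response mapping earlier in this document, and returning the unlock's own status when the unlock fails is an assumption the text does not spell out.
```python
from typing import Callable

EW_OK = 0
EW_PASSWD = 11  # value as quoted in this document's status mapping


def call_with_unlock_retry(gated_call: Callable[[], int],
                           unlock: Callable[[], int]) -> int:
    """Run a gated call; on EW_PASSWD, unlock once and redispatch once."""
    status = gated_call()
    if status != EW_PASSWD:
        return status
    unlock_status = unlock()
    if unlock_status != EW_OK:
        # Assumption: surface the unlock failure rather than retrying the call.
        return unlock_status
    # Whatever the second attempt returns, including a second EW_PASSWD,
    # propagates unchanged, so a wrong password cannot loop.
    return gated_call()
```
The call is dispatched at most twice and the unlock at most once per invocation.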
### Phase 1 — payload shapes (no hardware required)
### No-log invariant
- [ ] Rewrite `server/handlers/macro.py` response to return 10-byte ODBM:
`short datano, short dummy, int32 mcr_val, short dec_val`
- [ ] Rewrite `server/handlers/param.py` response to return 36-byte IODBPSD:
`short datano, short type, bytes[32] u`
- [ ] Rewrite `server/handlers/pmc.py` response to return 48-byte IODBPMC:
`short type_a, short type_d, uint16 datano_s, uint16 datano_e, bytes[40] u`
- [ ] Add unit tests asserting byte-exact sizes
- [ ] Update validate_harness.py to match the new shapes
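The byte-exact size checks in this checklist could start from a Python sketch like the one below. The struct names and the `pack_odbm` helper are hypothetical, and the big-endian `>` prefix is carried over from the scaffold's existing `struct.Struct(">9h")` as an assumption; the sizes are the ones stated in the checklist.
```python
import struct

ODBM = struct.Struct(">hhih")        # datano, dummy, mcr_val, dec_val
IODBPSD = struct.Struct(">hh32s")    # datano, type, 32-byte union
IODBPMC = struct.Struct(">hhHH40s")  # type_a, type_d, datano_s, datano_e, 40-byte union
ODBST = struct.Struct(">9h")         # nine shorts


def pack_odbm(datano: int, mcr_val: int, dec_val: int) -> bytes:
    """Build an ODBM-shaped response with the reserved field zeroed."""
    return ODBM.pack(datano, 0, mcr_val, dec_val)


def check_sizes() -> None:
    assert ODBM.size == 10
    assert IODBPSD.size == 36
    assert IODBPMC.size == 48
    assert ODBST.size == 18
```
Using an explicit byte-order prefix keeps `struct` in standard-size mode, so no alignment padding is inserted between fields.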
The password is a secret. Wire-client implementations MUST NOT log the
password on either request or response. The current
`FwlibFocasClient.UnlockAsync` constructs an exception that includes
only the `EW_*` return code; the `FocasDeviceOptions` record overrides
its auto-generated `ToString` so any Serilog destructure renders
`Password = ***`. See
[`docs/v2/focas-deployment.md`](../focas-deployment.md)
§ "FOCAS password handling" for the deployment-side guarantees +
rotation runbook.
Effect: when Stream C gets its first Wireshark trace, the payload-layer of the
mock is already correct. Only the framing layer needs iteration.
### Phase 2 — framing (requires hardware)
This is the iterative Wireshark loop — no point starting until the Windows rig
+ licensed Fwlib64.dll + real CNC are all available. See the implementer's
checklist in
[`tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/README.md`](../../../tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/README.md).
### Phase 3 — flip the C# test gate
Once Phase 2 proves Fwlib64 can talk to the mock:
- [ ] Flip `OTOPCUA_FOCAS_SIM_WIRE_COMPAT=1` in the CI env
- [ ] Expand `tests/.../IntegrationTests/Series/WireCompatGatedTests.cs` with
real per-series assertions
- [ ] Update `scripts/e2e/test-focas.ps1` to accept `-ProfileName`
- [ ] Close Stream D
## References
- [`src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FwlibNative.cs`](../../../src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FwlibNative.cs) — P/Invoke surface, authoritative struct layouts
- [`src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FwlibFocasClient.cs`](../../../src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FwlibFocasClient.cs) — reference C# implementation of each FWLIB call
- [`src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FocasStatusMapper.cs`](../../../src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FocasStatusMapper.cs) — EW_* → OPC UA status mapping
- Fanuc FOCAS Developer Kit (licensed, not in repo) — ultimate source of truth
- `strangesast/fwlib` on GitHub — redistributes `fwlib32.h` + runtime binaries; no wire protocol docs
@@ -172,7 +172,7 @@ Lift the existing `GalaxyRuntimeProbeManager` into the new project. Behaviors pe
#### Task B.6 — Named-pipe IPC server with mandatory ACL
Per decision #76 + `driver-stability.md` §"IPC Security":
- Pipe ACL on creation: `ReadWrite | Synchronize` granted only to the OtOpcUa server's service principal SID; LocalSystem and Administrators **explicitly denied**
- Pipe ACL on creation: `ReadWrite | Synchronize` granted only to the OtOpcUa server's service principal SID; LocalSystem **explicitly denied**. Administrators was dropped from the deny list so non-elevated admins on dev boxes aren't blocked via UAC-filtered-token deny-only semantics — the per-connection SID check (§2 of driver-stability.md) remains the real authorization boundary.
- Caller identity verification on each new connection: `GetImpersonationUserName()` cross-checked against configured server service SID; mismatches dropped before any RPC frame is read
- Per-process shared secret: passed by the supervisor at spawn time, required on first frame of every connection
- Heartbeat pipe: separate from data-plane pipe, same ACL
@@ -1,6 +1,8 @@
# Phase 6.1 — Resilience & Observability Runtime
> **Status**: **SHIPPED** 2026-04-19 — Streams A/B/C/D + E data layer merged to `v2` across PRs #78-82. Final exit-gate PR #83 turns the compliance script into real checks (all pass) and records this status update. One deferred piece: Stream E.2/E.3 SignalR hub + Blazor `/hosts` column refresh lands in a visual-compliance follow-up PR on the Phase 6.4 Admin UI branch.
> **Status**: **SHIPPED** 2026-04-19 — Streams A/B/C/D + E data layer merged to `v2` across PRs #78-82. Final exit-gate PR #83 turns the compliance script into real checks (all pass) and records this status update.
>
> **Stream E.2/E.3 closed 2026-04-23** — `FleetStatusPoller` now polls `DriverInstanceResilienceStatus`, detects per-`(DriverInstanceId, HostName)` deltas, and pushes `ResilienceStatusChangedMessage` via `FleetStatusHub` on the fleet group. Admin `/hosts` page subscribes on load and upserts the matching `HostStatusRow` in-memory on receipt, so operator-visible resilience state now reflects the runtime within one poller tick (~5 s) instead of the Admin page's own 10-second refresh. `FleetStatusPollerTests.Poller_pushes_ResilienceStatusChanged_on_delta` covers the first-observation push, the no-delta-no-push invariant, and the mutated-row re-push.
>
> Baseline: 906 solution tests → post-Phase-6.1: 1042 passing (+136 net). One pre-existing Client.CLI Subscribe flake unchanged.
>
@@ -129,7 +131,7 @@ Closes these gaps flagged in the 2026-04-19 audit:
- [ ] Stream B: Tier registry + generalised watchdog + scheduled recycle + wedge detector
- [ ] Stream C: `/healthz` + `/readyz` + structured logging + JSON Serilog sink
- [ ] Stream D: LiteDB cache + Polly fallback in Configuration
- [ ] Stream E: Admin `/hosts` page refresh
- [x] Stream E: Admin `/hosts` page refresh (E.1 in PRs #78-82 with the data layer; E.2/E.3 closed 2026-04-23)
- [ ] Cross-cutting: `phase-6-1-compliance.ps1` exits 0; full solution `dotnet test` passes; exit-gate doc recorded
## Adversarial Review — 2026-04-19 (Codex, thread `019da489-e317-7aa1-ab1f-6335e0be2447`)
@@ -1,10 +1,9 @@
# Phase 6.2 — Authorization Runtime (ACL + LDAP grants)
> **Status**: **SHIPPED (core)** 2026-04-19 — Streams A, B, C (foundation), D (data layer) merged to `v2` across PRs #84-87. Final exit-gate PR #88 turns the compliance stub into real checks (all pass, 2 deferred surfaces tracked).
> **Status**: **FULLY SHIPPED** (updated 2026-04-23 audit). Streams A-D core merged to `v2` across PRs #84-87 + exit-gate PR #88 on 2026-04-19; both named deferrals landed separately and were confirmed against the repo this session:
>
> Deferred follow-ups (tracked separately):
> - Stream C dispatch wiring on the 11 OPC UA operation surfaces (task #143).
> - Stream D Admin UI — RoleGrantsTab, AclsTab Probe-this-permission, SignalR invalidation, draft-diff ACL section + visual-compliance reviewer signoff (task #144).
> - **Task #143 Stream C dispatch wiring** — `DriverNodeManager` calls `AuthorizationGate.IsAllowed(context.UserIdentity, OpcUaOperation.<Op>, scope)` on Read (line 249), Write (line 536) with per-classification `OpcUaOperation.WriteOperate` / `WriteTune` / `WriteConfigure` routed via `WriteAuthzPolicy`, and HistoryRead (4 call sites). `TriePermissionEvaluator` + `PermissionTrieCache` back the gate.
> - **Task #144 Stream D Admin UI** — `RoleGrants.razor` (LDAP group → Admin role mapping) + `AclsTab.razor` (per-cluster node-ACL editor with a probe-this-permission surface via `PermissionProbeService`) + `AclChangeNotifier` SignalR hub for cache invalidation all present and wired.
>
> Baseline pre-Phase-6.2: 1042 solution tests → post-Phase-6.2 core: 1097 passing (+55 net). One pre-existing Client.CLI Subscribe flake unchanged.
>
@@ -1,13 +1,20 @@
# Phase 6.3 — Redundancy Runtime
> **Status**: **SHIPPED (core)** 2026-04-19 — Streams B (ServiceLevelCalculator + RecoveryStateManager) and D core (ApplyLeaseRegistry) merged to `v2` in PR #89. Exit gate in PR #90.
> **Status**: **SHIPPED (core + Stream C)** — original body merged 2026-04-19; audit 2026-04-23 promoted **Stream C (task #147)** into shipped state.
>
> Deferred follow-ups (tracked separately):
> - Stream A — RedundancyCoordinator cluster-topology loader (task #145).
> - Stream C — OPC UA node wiring: ServiceLevel + ServerUriArray + RedundancySupport (task #147).
> - Stream E — Admin UI RedundancyTab + OpenTelemetry metrics + SignalR (task #149).
> - Stream F — client interop matrix + Galaxy MXAccess failover test (task #150).
> - sp_PublishGeneration pre-publish validator rejecting unsupported RedundancyMode values (task #148 part 2 — SQL-side).
> **In** (verified in repo):
> - Stream A — `ClusterTopologyLoader`, `RedundancyCoordinator`, `RedundancyTopology`, `PeerReachability` all present under `src/ZB.MOM.WW.OtOpcUa.Server/Redundancy/`. Coordinator is now also hosted by `Program.cs` via the new `RedundancyPublisherHostedService`, which calls `RefreshAsync` on startup.
> - Stream B: `ServiceLevelCalculator` + `RecoveryStateManager`.
> - **Stream C (task #147) — OPC UA node wiring**. `ServerRedundancyNodeWriter` maintains `Server.ServiceLevel` (i=2267), `Server.ServerRedundancy.RedundancySupport` (i=2994), and `Server.ServerRedundancy.ServerUriArray` (non-transparent subtype) by writing the `PropertyState.Value` + calling `ClearChangeMasks`. `RedundancyPublisherHostedService` drives the publisher on a 1 s tick and fans `OnStateChanged` / `OnServerUriArrayChanged` into the writer. Mapping of `Configuration.RedundancyMode` → Part 4 `RedundancySupport` is Warm/Hot/None (v2 clusters don't enumerate Cold / HotAndMirrored per decision #85). Idempotent per-value dedupe prevents spurious OPC UA notifications. Unit coverage: `ServerRedundancyNodeWriterTests` (4 tests, green).
> - Stream D: `ApplyLeaseRegistry`.
> - Stream E — `RedundancyTab.razor` with SignalR `RoleChanged` wiring (via `FleetStatusPoller` + `FleetStatusHub`) — stale-flag + role-swap banner.
>
> **Closed this session (2026-04-23)**:
> - **Task #148 part 2**: `DraftValidator.ValidateClusterTopology(cluster, nodes)` now catches three pre-publish invariants the SQL CHECK can't see: (a) unsupported `NodeCount`/`RedundancyMode` pairs; (b) `Enabled`-node count vs. declared `NodeCount` mismatch (catches disabled-node drift with mode still Hot/Warm); (c) multiple-Primary per decision #84. Returns every failure in one pass — same shape as `Validate`. 8 new tests in `DraftValidatorTests` green.
> - **Task #150 Stream F**: `docs/v2/redundancy-interop-playbook.md` captures the manual validation matrix against UaExpert + Kepware + AVEVA MXAccess failover. Automating these closed-source GUI clients in PR-CI is out of scope; the automatable half is already covered by `ServiceLevelCalculatorTests` / `RedundancyStatePublisherTests` / `ClusterTopologyLoaderTests` / `ServerRedundancyNodeWriterTests`.
>
> **Remaining (documented limitation, not blocking v2.0)**:
> - Non-transparent redundancy-state node upgrade — the SDK's default `Server.ServerRedundancy` object is the base `ServerRedundancyState`, so `ApplyServerUriArray` currently logs-and-skips. Operators on the rare deployment that needs `ServerUriArray` read-back get a clear warning with the upgrade path. Documented in the interop playbook's "Known limitations" section.
>
> Baseline pre-Phase-6.3: 1097 solution tests → post-Phase-6.3 core: 1137 passing (+40 net).
>
@@ -1,12 +1,17 @@
# Phase 6.4 — Admin UI Completion
> **Status**: **SHIPPED (data layer)** 2026-04-19 — Stream A.2 (UnsImpactAnalyzer + DraftRevisionToken) and Stream B.1 (EquipmentCsvImporter parser) merged to `v2` in PR #91. Exit gate in PR #92.
> **Status**: **SHIPPED (mostly)** 2026-04-19; audit 2026-04-23 confirms what landed separately after the data-layer PR #91:
>
> Deferred follow-ups (Blazor UI + staging tables + address-space wiring):
> - Stream A UI — UnsTab MudBlazor drag/drop + 409 concurrent-edit modal + Playwright smoke (task #153).
> - Stream B follow-up — EquipmentImportBatch staging + FinaliseImportBatch transaction + CSV import UI (task #155).
> - Stream C — DiffViewer refactor into base + 6 section plugins + 1000-row cap + SignalR paging (task #156).
> - Stream D — IdentificationFields.razor + DriverNodeManager OPC 40010 sub-folder exposure (task #157).
> **In** (verified in repo):
> - **Task #153 Stream A UI**: `UnsTab.razor` with drag/drop handlers + concurrent-edit via `DraftRevisionToken` + `UnsImpactAnalyzer`; Playwright smoke test in `tests/ZB.MOM.WW.OtOpcUa.Admin.E2ETests/UnsTabDragDropE2ETests.cs`.
> - **Task #155 Stream B**: `EquipmentImportBatch` entity + migration, `EquipmentImportBatchService.CreateBatchAsync` / `FinaliseBatchAsync` / `DropBatchAsync` / `ListByUserAsync`, `ImportEquipment.razor` UI.
> - **Task #156 Stream C**: `DiffViewer.razor` + `DiffSection.razor` refactor in place.
> - Admin UI `IdentificationFields.razor` surface shipped (part of #157).
>
> **Closed this session (2026-04-23)**:
> - **Task #157 Stream D server-side half** was a stale audit claim. `src/ZB.MOM.WW.OtOpcUa.Core/OpcUa/IdentificationFolderBuilder.cs` ships the OPC 40010 Identification sub-folder materializer (Manufacturer / Model / SerialNumber / HardwareRevision / SoftwareRevision / YearOfConstruction / AssetLocation / ManufacturerUri / DeviceManualUri); `EquipmentNodeWalker.Walk` calls it per equipment; `IdentificationFolderBuilderTests` (158 lines) + two walker-level tests (`Walk_Materializes_Identification_Subfolder_When_AnyFieldPresent`, `Walk_Omits_Identification_Subfolder_When_AllFieldsNull`) cover the null-handling branches. The initial audit grepped only `src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/`; the builder lives in `Core/OpcUa/`.
>
> **Phase 6.4 is now FULLY SHIPPED — no deferred surfaces remain.**
>
> Baseline pre-Phase-6.4: 1137 solution tests → post-Phase-6.4 data layer: 1159 passing (+22).
>
@@ -14,7 +14,7 @@ End-to-end validation that the Phase 7 production wiring chain (#243 / #244 / #2
| SQL Server reachable, `OtOpcUaConfig` DB exists with all migrations applied | `sqlcmd -S "localhost,14330" -d OtOpcUaConfig -U sa -P "..." -Q "SELECT COUNT(*) FROM dbo.__EFMigrationsHistory"` returns ≥ 11 |
| Server's `appsettings.json` `Node:ConfigDbConnectionString` matches your SQL Server | `cat src/ZB.MOM.WW.OtOpcUa.Server/appsettings.json` |
> **Galaxy.Host pipe ACL.** Per `docs/ServiceHosting.md`, the pipe ACL deliberately denies `BUILTIN\Administrators`. **Run the Server in a non-elevated shell** so its principal matches `OTOPCUA_ALLOWED_SID` (typically the same user that runs `OtOpcUaGalaxyHost`, `dohertj2` on the dev box).
> **Galaxy.Host pipe ACL.** The pipe allows the configured `OTOPCUA_ALLOWED_SID` (typically the user that runs `OtOpcUaGalaxyHost`, `dohertj2` on the dev box). Run the Server under the same user; elevation doesn't matter — `PipeAcl.cs` no longer denies `BUILTIN\Administrators` since UAC's deny-only Admins SID would have blocked non-elevated dev-box admins too.
## Setup
@@ -36,11 +36,49 @@ sqlcmd -S "localhost,14330" -d OtOpcUaConfig -U sa -P "OtOpcUaDev_2026!" `
Expected output ends with `Phase 7 smoke seed complete.` plus a Cluster / Node / Generation summary. Idempotent — re-running wipes the prior smoke state and starts clean.
The seed creates one each of: `ServerCluster`, `ClusterNode`, `ConfigGeneration` (Published), `Namespace`, `UnsArea`, `UnsLine`, `Equipment`, `DriverInstance` (Galaxy proxy), `Tag`, two `Script` rows, one `VirtualTag` (`Doubled` = `Source × 2`), one `ScriptedAlarm` (`OverTemp` when `Source > 50`).
The seed creates one each of: `ServerCluster`, `ClusterNode`, `ClusterNodeCredential` (binds the SQL login to the node — without this `sp_GetCurrentGenerationForCluster` returns `Unauthorized: caller X is not bound to NodeId p7-smoke-node`), `ConfigGeneration` (Published), `Namespace`, `UnsArea`, `UnsLine`, `Equipment`, `DriverInstance` (Galaxy proxy), `Tag`, two `Script` rows, one `VirtualTag` (`MachineStatus` = `Source > 0`, Boolean, historized), one `ScriptedAlarm` (`OverTemp` when `Source > 50`).
### 3. Replace the Galaxy attribute placeholder
### 3. (Optional) Swap the Galaxy attribute
`scripts/smoke/seed-phase-7-smoke.sql` inserts a `dbo.Tag.TagConfig` JSON with `FullName = "REPLACE_WITH_REAL_GALAXY_ATTRIBUTE"`. Edit the SQL + re-run, or `UPDATE dbo.Tag SET TagConfig = N'{"FullName":"YourReal.GalaxyAttr","DataType":"Float64"}' WHERE TagId='p7-smoke-tag-source'`. Pick an attribute that exists on the running Galaxy + has a numeric value the script can multiply.
The shipped seed points `dbo.Tag.TagConfig` at `TestMachine_001.TestHistoryValue` — the dev-box Galaxy ships it as Int32, writable (`security_classification = Operate`), and historized (`HistoryExtension` primitive), so every E2E stage has a real live target. To swap to another attribute on a different Galaxy, pick a candidate via the same shape:
```sql
-- Run against the Galaxy Repository DB (ZB).
;WITH dpc AS (
SELECT g.gobject_id, p.package_id, p.derived_from_package_id, 0 AS depth
FROM gobject g INNER JOIN package p ON p.package_id = g.deployed_package_id
WHERE g.is_template = 0 AND g.deployed_package_id <> 0
UNION ALL
SELECT c.gobject_id, p.package_id, p.derived_from_package_id, c.depth + 1
FROM dpc c INNER JOIN package p ON p.package_id = c.derived_from_package_id
WHERE c.derived_from_package_id <> 0 AND c.depth < 10
)
SELECT DISTINCT g.tag_name + '.' + da.attribute_name AS full_ref,
dt.description AS dtype, da.security_classification
FROM dpc
INNER JOIN dynamic_attribute da ON da.package_id = dpc.package_id
INNER JOIN gobject g ON g.gobject_id = dpc.gobject_id
LEFT JOIN data_type dt ON dt.mx_data_type = da.mx_data_type
WHERE da.attribute_name NOT LIKE '[_]%'
AND da.attribute_name NOT LIKE '%.Description'
AND da.mx_data_type IN (1, 2, 3, 4)
AND da.security_classification > 0 -- writable
AND EXISTS (
SELECT 1 FROM primitive_instance pi
INNER JOIN primitive_definition pd
ON pd.primitive_definition_id = pi.primitive_definition_id
AND pd.primitive_name = 'HistoryExtension'
WHERE pi.package_id = dpc.package_id AND pi.primitive_name = da.attribute_name)
ORDER BY full_ref;
```
Then update the seed:
```sql
UPDATE dbo.Tag
SET TagConfig = N'{"FullName":"YourReal.GalaxyAttr","DataType":"Int32"}'
WHERE TagId = 'p7-smoke-tag-source';
```
### 4. Point Server.appsettings at the smoke node
@@ -54,9 +92,38 @@ The seed creates one each of: `ServerCluster`, `ClusterNode`, `ConfigGeneration`
}
```
### 4a. (Optional) Enable LDAP + SecurityProfile for the write stage
Anonymous OPC UA sessions are denied writes against `Operate`-classified tags by the PR 26 server-layer classification gate. To exercise the reverse-bridge + alarm-fires stages fully, the Server has to advertise a `UserName` UserTokenPolicy (any profile other than `None`) and authenticate against LDAP.
```json
{
"OpcUa": {
"SecurityProfile": "Basic256Sha256-Sign",
"Ldap": {
"Enabled": true,
"Server": "localhost",
"Port": 3893,
"SearchBase": "dc=lmxopcua,dc=local",
"ServiceAccountDn": "cn=serviceaccount,dc=lmxopcua,dc=local",
"ServiceAccountPassword": "serviceaccount123",
"GroupToRole": {
"ReadOnly": "ReadOnly",
"WriteOperate": "WriteOperate",
"WriteTune": "WriteTune",
"WriteConfigure": "WriteConfigure",
"AlarmAck": "AlarmAck"
}
}
}
}
```
Dev-box GLAuth ships `writeop` / `writeop123` in the `WriteOperate` group, `admin` / `admin123` across all write groups. See `C:\publish\glauth\auth.md`.
## Run
### 5. Start the Server (non-elevated shell)
### 5. Start the Server
```powershell
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Server
@@ -82,27 +149,39 @@ Any line missing = follow up the failure surface (each step has its own log sign
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- browse -u opc.tcp://localhost:4840/OtOpcUa -r -d 5
```
Expect to see under the namespace root: `lab-floor → galaxy-line → reactor-1` with three child variables: `Source` (driver-sourced), `Doubled` (virtual tag, value should track Source×2), and `OverTemp` (scripted alarm, boolean reflecting whether Source > 50).
Expect to see under the namespace root: `lab-floor → galaxy-line → reactor-1` with three child variables: `Source` (driver-sourced Int32), `MachineStatus` (virtual tag Boolean, `Source > 0`), and `OverTemp` (scripted alarm Boolean, `Source > 50`). NodeIds are path-based per OPC UA Part 3 §5.2.2 — the walker mints them from `{driverId}/{folder-path}/{browseName}` and stores the driver-side FullReference in an internal NodeId→FullRef map, so client subscriptions survive backend address renames.
#### Read the virtual tag
```powershell
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- read -u opc.tcp://localhost:4840/OtOpcUa -n "ns=2;s=p7-smoke-vt-derived"
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- read `
-u opc.tcp://localhost:4840/OtOpcUa `
-n "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/MachineStatus"
```
Expected: a `Float64` value approximately equal to `2 × Source`. Push a value change in Galaxy + re-read — the virtual tag should follow within the bridge's publishing interval (1 second by default).
Expected: `Boolean`. Push a value change into the Source Galaxy attribute and re-read — `MachineStatus` should follow within the bridge's publishing interval (1 second by default).
#### Read the scripted alarm
```powershell
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- read -u opc.tcp://localhost:4840/OtOpcUa -n "ns=2;s=p7-smoke-al-overtemp"
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- read `
-u opc.tcp://localhost:4840/OtOpcUa `
-n "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/OverTemp"
```
Expected: `Boolean`, `false` when Source ≤ 50, `true` when Source > 50.
#### Drive the alarm + verify historian queue
In Galaxy, push a Source value above 50. Within ~1 second, `OverTemp.Read` flips to `true`. The alarm engine emits a transition to `Phase7EngineComposer.RouteToHistorianAsync` → `SqliteStoreAndForwardSink.EnqueueAsync` → drain worker (every 2s) → `GalaxyHistorianWriter.WriteBatchAsync` → Galaxy.Host pipe → Aveva Historian alarm schema.
Push a Source value above 50 — either from Galaxy itself, or via the Server's OPC UA write path using LDAP credentials (step 4a). Within ~1 second, `OverTemp.Read` flips to `true`. The alarm engine emits a transition to `Phase7EngineComposer.RouteToHistorianAsync` → `SqliteStoreAndForwardSink.EnqueueAsync` → drain worker (every 2s) → `GalaxyHistorianWriter.WriteBatchAsync` → Galaxy.Host pipe → Aveva Historian alarm schema.
```powershell
# OPC UA write path — requires LDAP from step 4a + a writeop-class user.
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- write `
-u opc.tcp://localhost:4840/OtOpcUa -S sign `
-n "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/Source" `
-v 75 -U writeop -P writeop123
```
Verify the queue absorbed the event:
@@ -120,14 +199,32 @@ Open the Historian Client (or InTouch alarm summary) — the `OverTemp` activati
- [ ] EF migrations applied through `20260420232000_ExtendComputeGenerationDiffWithPhase7`
- [ ] Smoke seed completes without errors + creates exactly 1 Published generation
- [ ] Server starts in non-elevated shell + logs the Phase 7 composition lines
- [ ] Client.CLI browse shows the UNS tree with Source / Doubled / OverTemp under reactor-1
- [ ] Read on `Doubled` returns `2 × Source` value
- [ ] Server starts + logs the Phase 7 composition lines
- [ ] Client.CLI browse shows the UNS tree with Source / MachineStatus / OverTemp under reactor-1
- [ ] Read on `Source` returns a Good-quality Int32 value (proves MXAccess round-trip)
- [ ] Read on `MachineStatus` returns the live boolean truth of `Source > 0`
- [ ] Read on `OverTemp` returns the live boolean truth of `Source > 50`
- [ ] Pushing Source past 50 in Galaxy flips `OverTemp` to `true` within 1 s
- [ ] `test-galaxy.ps1 -Username writeop -Password writeop123` drives Source past 50 and flips `OverTemp` to `true` within 1 s
- [ ] SQLite queue drains (`COUNT(*)` returns to 0 within 2 s of an alarm transition)
- [ ] Historian shows the `OverTemp` activation event with the rendered message
## Second-run evidence (2026-04-24 dev box)
Full live stack ran end-to-end once the IPC unblocks (commit `d11dd05`), path-based NodeIds (commit `8be82e0`), cold-start engine guards (commit `69e1d32`), and seed retarget to `TestMachine_001.TestHistoryValue` (commit `ec1a590`) landed. Anonymous `scripts/e2e/test-galaxy.ps1` run reaches 3/7:
```
[PASS] source NodeId readable (Galaxy pipe → proxy → server → client chain up)
[PASS] source value = System.Byte[]
[INFO] BadUserAccessDenied — attribute's Galaxy-side ACL blocks writes for this session.
```
The `INFO` stage is correct behaviour — Source is `Operate`-classified and the anonymous session carries no LDAP roles. The Virtual-tag / Subscribe / Alarm / History stages stay at `[FAIL]` for two further environmental reasons once write is unblocked:
1. `TestMachine_001.TestHistoryValue` is driven by whatever Galaxy code runs on the object — idle in the default dev-box state, so no subscription pushes fire.
2. Historian writes require the Aveva Historian SDK to accept the alarm schema event — dev box doesn't have that path live.
Running `./test-galaxy.ps1 -Username writeop -Password writeop123` with step 4a's LDAP + `SecurityProfile = Basic256Sha256-Sign` applied unblocks the reverse-bridge + alarm-fires stages. The virtual-tag, subscribe, and history stages depend on further deployment choices (pick an attribute Galaxy is actively writing to, wire Aveva Historian SDK).
## First-run evidence (2026-04-20 dev box)
Ran the smoke against the live dev environment. Captured log signatures prove the Phase 7 wiring chain executes in production:
@@ -0,0 +1,233 @@
# Modbus tag-addressing reference
Foundational doc for the Modbus addressing grammar shipped across #136-#144.
Covers the address-string parser (`ModbusAddressParser`) that the wire driver
and the Admin UI both consume, the per-tag suffix modifiers, and the family-
native branch.
## Grammar
```
<region><offset>[.<bit>][:<type>[<len>]][:<order>][:<count>]
```
Each field is optional from left to right; the parser fills defaults.
### Region + offset
Three accepted forms — pick whichever matches your tag spreadsheet's
convention. All three resolve to the same `(Region, ushort PduOffset)`
on the wire.
| Form | Example | Means |
|---|---|---|
| Modicon 5-digit | `40001` | Holding register 1 (PDU 0) |
| Modicon 6-digit | `400001` | Holding register 1 (PDU 0); supports up to `465536` (PDU 65535) |
| Mnemonic | `HR1`, `IR1`, `C100`, `DI1` | Same regions; `1`-based register number |
Modicon leading-digit → region:
| Digit | Region | Modbus wire FC |
|---|---|---|
| `0` | Coils | FC01 / FC05 / FC15 |
| `1` | DiscreteInputs | FC02 (read-only) |
| `3` | InputRegisters | FC04 (read-only) |
| `4` | HoldingRegisters | FC03 / FC06 / FC16 |
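The Modicon mapping above can be sketched in a few lines. This is an illustrative Python sketch, not the project's C# `ModbusAddressParser`; the function name `parse_modicon` and the error messages are hypothetical. It assumes the leading digit selects the region and the remaining digits are a 1-based register number.

```python
# Hypothetical sketch of the Modicon 5-/6-digit mapping described above.
REGIONS = {
    "0": "Coils",
    "1": "DiscreteInputs",
    "3": "InputRegisters",
    "4": "HoldingRegisters",
}

def parse_modicon(address):
    """Return (region, zero-based PDU offset) for a 5- or 6-digit Modicon address."""
    if not (address.isascii() and address.isdigit()) or len(address) not in (5, 6):
        raise ValueError(f"not a 5- or 6-digit Modicon address: {address!r}")
    region = REGIONS.get(address[0])
    if region is None:
        raise ValueError(f"unknown Modicon leading digit: {address[0]!r}")
    number = int(address[1:])  # 1-based register number
    if not 1 <= number <= 65536:
        raise ValueError(f"register number out of range: {number}")
    return region, number - 1  # the PDU offset is zero-based
```

The 6-digit form exists only to reach offsets above 9998; both forms land on the same `(region, offset)` pair for low addresses.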
### Bit suffix `.N`
`40001.5` = bit 5 (LSB-first) of HR[0]. Implies `DataType=BitInRegister`;
mixing with an explicit type or array-count is rejected.
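As a quick illustration of LSB-first bit numbering, the following Python sketch extracts one bit from a 16-bit register value. The helper name `bit_in_register` is hypothetical and is not taken from the driver.

```python
def bit_in_register(register_value, bit):
    """Return bit `bit` of a 16-bit register, where bit 0 is the least significant."""
    if not 0 <= bit <= 15:
        raise ValueError("bit index must be 0..15")
    return (register_value >> bit) & 1 == 1
```

With this numbering, `40001.5` would be true when HR[0] holds `0x0020`.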
### Type code `:T`
Codes verified 2026-04-25 against [Wonderware DASMBTCP user
guide](https://cdn.logic-control.com/media/DASMBTCP.pdf) and the
[Ignition Modbus addressing
manual](https://www.docs.inductiveautomation.com/docs/8.1/ignition-modules/opc-ua/opc-ua-drivers/modbus/modbus-addressing).
The `I` / `UI` / `I_64` / `UI_64` / `BCD_32` shapes match Wonderware's
suffix convention and Ignition's underscore-N prefix variants where
those vendors agree.
| Code | Type | Registers | Vendor reference |
|---|---|---|---|
| `BOOL` | Boolean | 1 (region must be Coils / DiscreteInputs) | universal |
| `S` | Int16 | 1 | Wonderware DASMBTCP `S` = 16-bit signed |
| `US` | UInt16 | 1 | Ignition `HRUS` = Unsigned Short |
| `I` | Int32 | 2 | Wonderware DASMBTCP `I` = 32-bit signed; Ignition `HRI` |
| `UI` | UInt32 | 2 | Ignition `HRUI` |
| `I_64` | Int64 | 4 | Ignition `HRI_64` |
| `UI_64` | UInt64 | 4 | Ignition `HRUI_64` |
| `F` | Float32 | 2 | Wonderware `F`; Ignition `HRF` |
| `D` | Float64 | 4 | Ignition `HRD` |
| `BCD` | 16-bit BCD | 1 | Ignition `HRBCD` |
| `BCD_32` | 32-bit BCD | 2 | Ignition `HRBCD_32` |
| `STR<len>` | ASCII string, `len` chars (2 chars / register) | `ceil(len/2)` | analogous to Ignition `HRS<addr>:<len>` |
Default when omitted:
- Coils / DiscreteInputs → `BOOL`
- HoldingRegisters / InputRegisters → `S` (Int16) — matches Ignition's bare-`HR` default
**Codes removed in #146** (silent wrong-data risk, never compatible with the
two reference vendors): `:DI`, `:L`, `:UDI`, `:UL`, `:LI`, `:ULI`, `:LBCD`.
Pre-#146 configs that use these get a clear "Unknown type code" diagnostic at
parse time; rewrite to the post-#146 codes per the table above.
### Byte order `:O`
| Mnemonic | Meaning | Wire |
|---|---|---|
| `ABCD` | Big-endian (Modbus spec default) | `[A,B,C,D]` |
| `CDAB` | Word swap (Siemens, several AB) | `[C,D,A,B]` |
| `BADC` | Byte swap (legacy little-endian-internal devices) | `[B,A,D,C]` |
| `DCBA` | Full reverse (some EtherNet/IP gateways) | `[D,C,B,A]` |
For 8-byte values (Int64 / Float64) the same labels apply pairwise.
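The four mnemonics can be read as "which value byte sits at each wire position". A minimal Python sketch under that reading, limited to 4-byte values (the pairwise 8-byte rule is not modelled here); the helper names are hypothetical:

```python
import struct

def to_big_endian(wire, order):
    """Rearrange 4 wire bytes, laid out as `order` describes, into ABCD order."""
    if len(wire) != 4 or sorted(order) != list("ABCD"):
        raise ValueError("expected 4 bytes and a permutation of ABCD")
    # order[i] names the value byte found at wire position i.
    return bytes(wire[order.index(letter)] for letter in "ABCD")

def decode_uint32(wire, order="ABCD"):
    return struct.unpack(">I", to_big_endian(wire, order))[0]

def decode_float32(wire, order="ABCD"):
    return struct.unpack(">f", to_big_endian(wire, order))[0]
```

For example, a word-swapped device sending `56 78 12 34` with `CDAB` decodes to `0x12345678`.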
### Array count `:N`
`40001:F:5` = `Float32[5]` (consumes HR[0..9]). Array + bit suffix is
rejected. Strings are not arrays.
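The register footprint of a tag follows directly from the type table and the array count. The Python sketch below is illustrative only: `registers_for` is a hypothetical helper, and `BOOL` is left out because it addresses coils or discrete inputs instead of registers.

```python
import math
import re

# Registers consumed by one value of each register-based type code.
REGISTERS_PER_VALUE = {
    "S": 1, "US": 1, "I": 2, "UI": 2, "I_64": 4, "UI_64": 4,
    "F": 2, "D": 4, "BCD": 1, "BCD_32": 2,
}

def registers_for(type_code, count=1):
    """Return how many 16-bit registers a tag of this type and array count spans."""
    if count < 1:
        raise ValueError("count must be at least 1")
    string_match = re.fullmatch(r"STR([1-9]\d*)", type_code)
    if string_match:
        if count != 1:
            raise ValueError("strings are not arrays")
        return math.ceil(int(string_match.group(1)) / 2)  # 2 chars per register
    if type_code not in REGISTERS_PER_VALUE:
        raise ValueError(f"Unknown type code: {type_code}")
    return REGISTERS_PER_VALUE[type_code] * count
```

`registers_for("F", 5)` gives 10, matching the HR[0..9] span quoted above.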
### Composition
The 3-field shorthand `40001:F:5` is parsed as `(type=F, count=5)` because
`5` isn't a valid byte-order mnemonic. Use the explicit 4-field form
`40001:F:CDAB:5` when you need a non-default order.
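One way to picture the disambiguation is a left-to-right pass over the colon-separated fields. The Python sketch below is an assumption about how such a pass could work, not the shipped parser: `split_suffix` is a hypothetical name, and it assumes any field may be omitted as long as the remaining fields keep their type, order, count sequence.

```python
BYTE_ORDERS = {"ABCD", "CDAB", "BADC", "DCBA"}

def split_suffix(address):
    """Split '<head>[:type][:order][:count]' into (head, type, order, count)."""
    head, *fields = address.split(":")
    type_code = order = count = None
    for field in fields:
        if field in BYTE_ORDERS and order is None and count is None:
            order = field
        elif field.isascii() and field.isdigit() and count is None:
            count = int(field)  # a bare number can only be the array count
        elif type_code is None and order is None and count is None:
            type_code = field  # first non-order, non-numeric field is the type
        else:
            raise ValueError(f"unexpected field {field!r} in {address!r}")
    return head, type_code, order, count
```

Because `5` is neither a byte-order mnemonic nor a type code, `40001:F:5` resolves without needing the explicit 4-field form.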
## Family-native syntax (#144)
When the driver instance has `Family != Generic`, the parser tries the
family's native syntax FIRST, then falls back to Modicon / mnemonic.
### DL205 (AutomationDirect DirectLOGIC)
| Form | Example | Mapping |
|---|---|---|
| `Vnnnn` (octal) | `V2000` | HoldingRegisters[1024] (octal 2000 = decimal 1024) |
| `Ynn` (octal) | `Y17` | Coils[2048 + 15] (Y-output base + offset) |
| `Cnn` (octal) | `C100` | Coils[3072 + 64] (C-relay base + offset) |
| `Xnn` (octal) | `X17` | DiscreteInputs[15] |
| `SPnn` (octal) | `SP10` | DiscreteInputs[1024 + 8] |
**Cross-family ambiguity**: `C100` means Coils[99] under `Generic`
(mnemonic) but Coils[3136] under `DL205`. Per-driver Family choice
disambiguates.
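The DL205 table reduces to "octal digits plus a bank base". A small Python sketch of that arithmetic, using the bases from the table above; the function and table names are hypothetical and the real parser may validate more strictly:

```python
# Bank prefix -> (region, base offset), taken from the DL205 table above.
DL205_BANKS = {
    "SP": ("DiscreteInputs", 1024),
    "V": ("HoldingRegisters", 0),
    "Y": ("Coils", 2048),
    "C": ("Coils", 3072),
    "X": ("DiscreteInputs", 0),
}

def parse_dl205(address):
    """Return (region, PDU offset) for a DL205-native address such as 'V2000'."""
    for prefix, (region, base) in DL205_BANKS.items():  # 'SP' is checked first
        digits = address[len(prefix):]
        if address.startswith(prefix) and digits and set(digits) <= set("01234567"):
            return region, base + int(digits, 8)  # digits are octal
    raise ValueError(f"not a DL205 address: {address!r}")
```

`parse_dl205("C100")` yields `("Coils", 3136)`, which is the DL205 side of the cross-family ambiguity noted above.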
### MELSEC (Mitsubishi)
| Form | Example | Mapping (sub-family Q_L_iQR / F_iQF) |
|---|---|---|
| `Dnnn` (decimal) | `D100` | HoldingRegisters[100] |
| `Mnnn` (decimal) | `M50` | Coils[50] |
| `Xnn` | `X20` | DiscreteInputs[32 hex / 16 octal] |
| `Ynn` | `Y20` | Coils[32 hex / 16 octal] |
X / Y digit interpretation depends on `MelsecSubFamily`:
- `Q_L_iQR` → hex (default)
- `F_iQF` → octal
Bank-base offsets default to 0 in the grammar string. Sites with non-zero
"Modbus Device Assignment" bases use the structured tag form.
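The sub-family switch only changes the radix used for X / Y digits. A hedged Python sketch, assuming zero bank bases as stated above; `parse_melsec` is a hypothetical helper name:

```python
def parse_melsec(address, sub_family="Q_L_iQR"):
    """Return (region, PDU offset) for a MELSEC-native address with zero bank bases."""
    prefix, digits = address[:1], address[1:]
    if prefix in ("D", "M"):
        region = "HoldingRegisters" if prefix == "D" else "Coils"
        radix = 10  # D and M are always decimal
    elif prefix in ("X", "Y"):
        region = "DiscreteInputs" if prefix == "X" else "Coils"
        radix = 16 if sub_family == "Q_L_iQR" else 8  # hex vs octal
    else:
        raise ValueError(f"not a MELSEC address: {address!r}")
    allowed = "0123456789ABCDEF"[:radix]
    if not digits or not set(digits.upper()) <= set(allowed):
        raise ValueError(f"invalid digits for radix {radix}: {digits!r}")
    return region, int(digits, radix)
```

The same string `X20` therefore lands on offset 32 under `Q_L_iQR` and offset 16 under `F_iQF`.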
## Driver-instance options
Beyond per-tag addressing, `ModbusDriverOptions` exposes (#139-#143):
### Connection (#139)
- `KeepAlive { Enabled, Time, Interval, RetryCount }` — TCP-level probes.
Defaults match the historical PR 53 wire output (Enabled=true, Time=30s,
Interval=10s, RetryCount=3).
- `IdleDisconnectTimeout` — proactively close + reconnect after this much
socket idle time. Default null = disabled.
- `Reconnect { InitialDelay, MaxDelay, BackoffMultiplier }` — geometric
backoff for the post-drop reconnect loop. Default
`(0, 30s, 2.0)` = immediate first retry, geometric thereafter.
### Protocol (#140)
- `MaxCoilsPerRead` (default 2000) — separate cap for FC01/FC02 coil reads.
- `UseFC15ForSingleCoilWrites` — force FC15 (write multiple coils
qty=1) for single-coil writes. Safety/audit PLCs may require this.
- `UseFC16ForSingleRegisterWrites` — same for FC16 vs FC06.
- `DisableFC23` — kill switch for FC23 (currently unused; reserved).
### Subscribe (#141)
- Per-tag `Deadband` — suppress sub-threshold publishes on numeric tags.
- `WriteOnChangeOnly` (driver-level) — short-circuit identical-value
writes. Cache invalidates on read-divergence.
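A deadband filter can be sketched as a comparison against the last published value. This Python sketch is an assumption about the semantics: it treats the deadband as absolute, compares against the last published (not last sampled) value, and publishes when the change is at least the deadband. `DeadbandFilter` is a hypothetical name.

```python
class DeadbandFilter:
    """Suppress publishes whose change from the last published value is too small."""

    def __init__(self, deadband):
        self.deadband = deadband
        self.last_published = None

    def should_publish(self, value):
        first = self.last_published is None
        if first or abs(value - self.last_published) >= self.deadband:
            self.last_published = value  # only published values move the baseline
            return True
        return False
```

Comparing against the last published value keeps slow drift from being suppressed forever, since the baseline does not creep along with unpublished samples.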
### Multi-unit (#142)
- Per-tag `UnitId` — overrides the driver-level UnitId in the MBAP
header. Required for one-Ethernet-gateway / N-RTU-slave deployments.
- `IPerCallHostResolver.ResolveHost` returns `host:port/unitN` per tag so
per-PLC circuit breakers fire per slave.
- Per-tag `CoalesceProhibited` — escape hatch for #143's planner (read
this tag in isolation regardless of `MaxReadGap`).
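The per-slave breaker key can be pictured as a string built from the connection and the effective unit id. A minimal Python sketch, assuming the `host:port/unitN` shape quoted above with `N` being the decimal unit id; the function name is hypothetical:

```python
def resolve_host(host, port, driver_unit_id, tag_unit_id=None):
    """Build the per-slave host key, letting a per-tag UnitId override the driver's."""
    unit = driver_unit_id if tag_unit_id is None else tag_unit_id
    if not 0 <= unit <= 255:
        raise ValueError("unit id must fit in one byte")
    return f"{host}:{port}/unit{unit}"
```

Two tags behind the same gateway but with different unit ids then produce different keys, so a dead slave does not share a breaker with its siblings.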
### Block-read coalescing (#143)
- `MaxReadGap` (default 0 = off) — gap budget the planner is willing to
bridge between adjacent register tags. With `MaxReadGap=10`, three tags
at HR 100/102/110 collapse into one FC03 of quantity 11.
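The coalescing rule can be sketched as a single sorted sweep. The Python below is an illustrative planner, not the driver's: `plan_reads` is a hypothetical name, the gap is taken as the number of unused registers between two tags, and the 125-register cap is the Modbus limit for one FC03 request.

```python
def plan_reads(tags, max_read_gap, max_registers=125):
    """Merge (start, register_count) tags into (start, quantity) read blocks."""
    ordered = sorted(tags)
    if max_read_gap <= 0:  # 0 = coalescing off, one read per tag
        return [(start, length) for start, length in ordered]
    blocks = []  # each block is [start, end_exclusive]
    for start, length in ordered:
        end = start + length
        if blocks:
            block_start, block_end = blocks[-1]
            gap = start - block_end  # unused registers the read would bridge
            new_end = max(end, block_end)
            if gap <= max_read_gap and new_end - block_start <= max_registers:
                blocks[-1][1] = new_end
                continue
        blocks.append([start, end])
    return [(start, end - start) for start, end in blocks]
```

With `max_read_gap=10`, tags at 100, 102, and 110 become one read of quantity 11; with a budget of 5 the tag at 110 stays on its own.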
### Coalescing auto-recovery (#148 / #150 / #151 / #152)
- A coalesced read that fails with a Modbus exception (write-only or
protected register mid-block) records the failed range as
auto-prohibited. The planner stops re-coalescing across the range; the
per-tag fallback path keeps healthy members working in the same scan.
- **Bisection (#150)**: every re-probe pass narrows multi-register
prohibitions by trying the two halves separately. Over log2(span)
ticks the prohibition pins at the actual offending register(s);
intermediate halves that succeed get cleared.
- **Periodic re-probe (#151)**: opt in via
`AutoProhibitReprobeInterval` (TimeSpan?). Default null = disabled
(prohibitions persist for the driver lifetime; clear on
`ReinitializeAsync`).
- **Per-tag escape hatch**: `CoalesceProhibited` (bool, default false)
on `ModbusTagDefinition`. The planner reads such tags in isolation
regardless of `MaxReadGap`. Use for known-bad addresses you want to
exclude from the auto-discovery loop.
- **Diagnostics (#152)**: `ModbusDriver.GetAutoProhibitedRanges()`
returns a snapshot of every active prohibition as
`ModbusAutoProhibition` records (UnitId / Region / StartAddress /
EndAddress / LastProbedUtc / BisectionPending). Surface in the
driver-diagnostics RPC channel when that wiring lands; for now
consumable by in-process callers (Server health endpoints, log
aggregation).
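The bisection behaviour can be illustrated with one re-probe pass over a list of prohibited ranges. This Python sketch is a simplified model under stated assumptions: ranges are inclusive `(start, end)` pairs, `probe` is a hypothetical callback returning `True` when a read of that range succeeds, and bookkeeping such as `LastProbedUtc` is omitted.

```python
def reprobe_pass(prohibited, probe):
    """Run one re-probe pass: split each range in two and keep only failing halves."""
    remaining = []
    for start, end in prohibited:
        if start == end:
            halves = [(start, end)]  # single register: just re-probe it
        else:
            mid = (start + end) // 2
            halves = [(start, mid), (mid + 1, end)]
        for half in halves:
            if not probe(*half):  # a half that reads fine is cleared
                remaining.append(half)
    return remaining
```

Starting from a failed block 100..110 with one protected register at 105, successive passes narrow the prohibition to 100..105, then 103..105, then 105 alone.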
## JSON DTO shape
The factory accepts both the structured form (legacy) and the new
`AddressString` form per-tag. Mix freely — newer pasted rows use the
grammar string; legacy rows keep the structured fields.
```json
{
"host": "10.1.2.3",
"port": 502,
"unitId": 1,
"family": "DL205",
"keepAlive": { "enabled": true, "timeMs": 30000, "intervalMs": 10000, "retryCount": 3 },
"idleDisconnectMs": 120000,
"reconnect": { "initialDelayMs": 0, "maxDelayMs": 30000, "backoffMultiplier": 2.0 },
"maxCoilsPerRead": 2000,
"writeOnChangeOnly": false,
"maxReadGap": 8,
"tags": [
{ "name": "Temp", "addressString": "V2000:F:CDAB" },
{ "name": "Setpoint", "addressString": "40001:I" },
{ "name": "Outputs", "addressString": "Y0:5" },
{ "name": "AlarmCount", "region": "HoldingRegisters", "address": 200, "dataType": "Int16", "deadband": 5.0 }
]
}
```
## Vendor compatibility caveat
The exact spelling of type codes (e.g. `I` vs `INT`, `BCD` vs `L_BCD`) and
the byte-order mnemonics were synthesised from training-era vendor docs
(Wonderware DASMBTCP, Kepware KEPServerEX, Ignition, Matrikon, OAS).
Before locking the grammar for a production deployment, verify against
the current Kepware "Modbus Ethernet Driver Help" PDF and Ignition's
"Modbus Addressing" user-manual page — if a critical tool's mnemonics
have shifted, add aliases in `ModbusAddressParser.TryParseType` rather
than asking users to rewrite spreadsheets.
@@ -0,0 +1,46 @@
# Multi-host dispatch — per-PLC circuit breakers
Phase 6.1 decision #144 / task #135. Motivation: a single DriverInstance that fronts N PLCs (Modbus with multiple slaves, AB CIP with multiple ControlLogix chassis, etc.) must not let one dead PLC trip the resilience breaker for its healthy siblings.
This note documents the shipped contract so future driver authors don't re-derive it.
## Contract
The resilience pipeline keys on `(DriverInstanceId, HostName, DriverCapability)`. One dead PLC opens only the pipeline keyed on its HostName; healthy sibling PLCs keep their own pipelines intact.
Three participants:
1. **`DriverResiliencePipelineBuilder.GetOrCreate(driverInstanceId, hostName, capability, options)`** — the pipeline cache. First call per key builds a Polly pipeline (timeout → retry → breaker). Subsequent calls return the cached instance. Covered by `DriverResiliencePipelineBuilderTests.Pipeline_IsIsolated_PerHost`.
2. **`CapabilityInvoker.ExecuteAsync(capability, hostName, callSite, ct)`** — takes `hostName` per-call. Threads it straight through to the pipeline builder. Covered by `CapabilityInvokerTests`.
3. **`IPerCallHostResolver.ResolveHost(fullReference)`** — an optional interface a multi-device driver implements. `DriverNodeManager.ResolveHostFor` calls it on every capability dispatch so the host flowing into the invoker comes from the tag's per-PLC metadata, not the driver instance. Single-device drivers don't implement it — `DriverNodeManager` falls back to `DriverInstanceId` as the hostname, which still flows through the same `(instance, host, capability)` key shape (one pipeline per single-device instance).
End-to-end `dead PLC, healthy PLC` scenario proven by `PerCallHostResolverDispatchTests.DeadPlc_DoesNotOpenBreaker_For_HealthyPlc_With_Resolver`.
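The isolation property comes entirely from the cache key. The Python sketch below models that idea with a toy breaker; it is not the Polly-based implementation, and `Breaker`, `PipelineCache`, and `resolve_host_for` are hypothetical stand-ins for the C# types named above.

```python
class Breaker:
    """Toy breaker: opens after `threshold` consecutive failures."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, success):
        self.consecutive_failures = 0 if success else self.consecutive_failures + 1

    @property
    def is_open(self):
        return self.consecutive_failures >= self.threshold


class PipelineCache:
    """One breaker per (driver instance, host, capability) key."""

    def __init__(self):
        self._pipelines = {}

    def get_or_create(self, driver_instance_id, host_name, capability):
        key = (driver_instance_id, host_name, capability)
        if key not in self._pipelines:
            self._pipelines[key] = Breaker()
        return self._pipelines[key]


def resolve_host_for(driver_instance_id, full_reference, resolver=None):
    """Use the driver's resolver when it has one, else fall back to the instance id."""
    return driver_instance_id if resolver is None else resolver(full_reference)
```

A driver without a resolver ends up with the key `(instance, instance, capability)`, which is the single-device case.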
## Driver author checklist
To light up per-PLC circuit breakers on a multi-device driver:
1. **Options model** — extend the driver's options type with an explicit device list. See `AbCipDriverOptions.Devices : IReadOnlyList<AbCipDeviceConfig>`.
2. **Tag → device mapping** — parse the tag's `DeviceId` from `TagConfig`. The driver's per-tag definition records the device HostAddress alongside the wire address. See `AbCipTagDefinition.DeviceHostAddress`.
3. **`IPerCallHostResolver`** — implement it on the driver. `ResolveHost(fullReference)` looks up the tag's definition and returns the device HostAddress. Unknown references should return a deterministic fallback (e.g. the first configured device's host) rather than throw — the invoker handles the mislookup at capability level when the actual read surfaces `BadNodeIdUnknown`.
4. **Health surface**: `IHostConnectivityProbe.GetHostStatuses()` returns one `HostConnectivityStatus` per configured device so the Admin UI fleet page lights the per-PLC status distinctly.
5. **Transport per device** — one network connection per PLC, serialized per device via `SemaphoreSlim` (or equivalent). Do not share a transport across PLCs; the breaker-isolation guarantee disappears if they share a queue.
## Current fleet status (2026-04-24)
| Driver | Per-tag device | `IPerCallHostResolver` | Per-PLC breaker isolation |
|---|---|---|---|
| AB CIP | ✅ `DeviceId` | ✅ | ✅ live |
| AB Legacy | 1 device / instance | — (not needed) | trivial |
| Modbus | 1 device / instance today | — | trivial — multi-device refactor tracked separately |
| S7 | 1 device / instance today | — | trivial — same |
| TwinCAT | 1 device / instance today | — | trivial — same |
| FOCAS | 1 CNC / instance | — (not needed) | trivial |
| Galaxy | 1 Galaxy Host / instance | — (not needed) | trivial — Host recycle runs per instance |
| OPC UA Client | 1 upstream / instance | — (not needed) | trivial |
"Trivial" above means the pipeline key ends up as `(DriverInstanceId, DriverInstanceId, capability)` via `DriverNodeManager.ResolveHostFor`'s fallback — one pipeline per driver instance, which is correct for single-device drivers.
Extending Modbus / S7 / TwinCAT to multi-device follows the AB CIP template verbatim; it's per-driver surgery (schema row + options model + resolver implementation + transport fan-out) rather than shared-infrastructure work.
@@ -689,7 +689,7 @@ Galaxy.Proxy ──→ Galaxy.Shared ←── Galaxy.Host
**Decided:**
- Mono-repo (Decision #31 above).
- `Core.Abstractions` is **internal-only for now** — no standalone NuGet. Keep the contract mutable while the first 8 drivers are being built; revisit publishing after Phase 5 when the shape has stabilized. Design the contract *as if* it will eventually be public (no leaky types, stable names) to minimize churn later.
- `Core.Abstractions` is **internal-only for now** — no standalone NuGet. Keep the contract mutable while the first 8 drivers are being built; revisit publishing after the driver fleet (originally Phase 5, folded into the Phase 3 umbrella — see exit gate) once the shape has stabilized. Design the contract *as if* it will eventually be public (no leaky types, stable names) to minimize churn later.
---
@@ -742,24 +742,30 @@ Each step leaves the system runnable. The generic extraction is effectively free
10. **Build `Galaxy.Proxy`** — .NET 10 in-process proxy implementing IDriver interfaces, forwarding over IPC
11. **Validate parity** — v2 Galaxy driver must pass the same integration tests as v1
**Phase 3 — Modbus TCP driver (prove the abstraction)**
12. **Build `Driver.ModbusTcp`** — NModbus, config-driven tags from central DB, internal poll loop, device-as-folder hierarchy
13. **Add Modbus config screens to Admin** (first driver-specific config UI)
**Phase 3 — Driver fleet (all seven non-Galaxy drivers) — ✅ CLOSED 2026-04-23** (see [`implementation/exit-gate-phase-3.md`](implementation/exit-gate-phase-3.md))
**Phase 4 — PLC drivers**
14. **Build `Driver.AbCip`** — libplctag, ControlLogix/CompactLogix symbolic tags + Admin config screens
15. **Build `Driver.AbLegacy`** — libplctag, SLC 500/MicroLogix file-based addressing + Admin config screens
16. **Build `Driver.S7`** — S7netplus, Siemens S7-300/400/1200/1500 + Admin config screens
17. **Build `Driver.TwinCat`** — Beckhoff.TwinCAT.Ads v6, native ADS notifications, symbol upload + Admin config screens
Originally split across Phase 3 (Modbus alone), Phase 4 (PLC drivers), and
Phase 5 (specialty drivers). In execution, once `Core.Abstractions` had
stabilised under Phase 1 + Phase 2, each driver landed as its own stream
rather than as a gated mini-phase; the phase numbers were folded into a
single umbrella. Shipped:
**Phase 5 — Specialty drivers**
18. **Build `Driver.Focas`** — FANUC FOCAS2 P/Invoke, pre-defined CNC tag set, PMC/macro config + Admin config screens
19. **Build `Driver.OpcUaClient`** — OPC UA client gateway/aggregation, namespace remapping, subscription proxying + Admin config screens
12. **`Driver.Modbus`** — NModbus, config-driven tags, internal poll loop, device-as-folder hierarchy (umbrella closure #210)
13. **`Driver.AbCip`** — libplctag, ControlLogix/CompactLogix symbolic tags (#211, live-booted under #220)
14. **`Driver.AbLegacy`** — libplctag, SLC 500 / MicroLogix / PLC-5 file-based addressing (#213, live-booted under #222)
15. **`Driver.S7`** — S7netplus, Siemens S7-300/400/1200/1500 (#212, live-booted under #220)
16. **`Driver.TwinCAT`** — Beckhoff.TwinCAT.Ads v7, native ADS notifications, symbol upload (factory wired 2026-04-23; wire-live deferred, #221)
17. **`Driver.FOCAS`** — FANUC FOCAS2 P/Invoke via Tier-C out-of-process `Driver.FOCAS.Host` (#220 five-PR split; wire-live deferred, #222 follow-up)
18. **`Driver.OpcUaClient`** — OPC UA client gateway / aggregation, namespace remapping, subscription proxying (scaffold #66; live-boot 5/8 stages via `test-opcuaclient.ps1`)
Supporting infrastructure: `DriverFactoryRegistry` + `DriverInstanceBootstrapper`
(#248); per-driver test-client CLI suite (#249–#251); e2e test scripts with
aggregate runner (#253); server-side factory + seed SQL per driver (#210–#213).
**Decided:**
- **Parity test for Galaxy**: existing v1 IntegrationTests suite + scripted Client.CLI walkthrough (see Section 4 above).
- **Timeline**: no hard deadline. Each phase ships when it's right — tests passing, Galaxy parity bar met. Quality cadence over calendar cadence.
- **FOCAS SDK**: license already secured. Phase 5 can proceed as scheduled; `Fwlib64.dll` available for P/Invoke.
- **FOCAS SDK**: license already secured. FOCAS driver shipped as part of the Phase 3 umbrella with Tier-C host; `Fwlib64.dll` available for P/Invoke (wire-level live-boot gated on lab rig, #222 follow-up).
---
+128
@@ -0,0 +1,128 @@
# Redundancy Interop Playbook (Phase 6.3 Stream F — task #150)
> **Scope**: manual validation that third-party OPC UA clients + AVEVA MXAccess
> observe our non-transparent redundancy signals (ServiceLevel, ServerUriArray,
> RedundancySupport) and fail over to the Backup node when the Primary drops.
>
> **Why manual**: the third-party clients named here are Windows-GUI binaries
> (UaExpert, Kepware QuickClient) or embedded inside AVEVA System Platform.
> Automating any of them into PR-CI is out of scope for v2. This playbook
> captures the minimal dev-box-plus-VM setup and the expected pass criteria so
> the work can be executed repeatably at v2 release readiness and after any
> Phase 6.3 follow-up change.
## Prerequisites
1. Two `OtOpcUa.Server` nodes in one `ServerCluster`:
- Declared as `NodeCount = 2`, `RedundancyMode = Hot` (or `Warm`).
- Each with a distinct `ApplicationUri` (enforced by unique index per
decision #86).
- Each node's `StaticRoutes.xml` points at the other (`ServerCluster.Node[].Host`).
2. `scripts/install/Install-Services.ps1` applied on each node so the
`RedundancyPublisherHostedService` is running.
3. At least one `DriverInstance` with a reachable simulator or PLC so both
servers have a non-empty address space to browse.
4. On the client host:
- `UaExpert` ≥ 1.7 installed
- Kepware `ClientAce QuickClient` (or equivalent) — optional, for a second
client
5. For the AVEVA leg: a `Galaxy.Host` running against an MXAccess deployment
with an external OPC UA client object pointed at the cluster (not at a
single node).
## Expected signals on a running cluster
| Node | `ServiceLevel` | `RedundancySupport` | `ServerUriArray` |
|---|---|---|---|
| Primary, healthy, peer reachable | 200 | `Hot` (or `Warm`) | self + peer |
| Primary, mid-apply | 75 (`PrimaryMidApply`) | same | same |
| Primary, peer UNreachable | 150 (`PrimaryPeerDown`) | same | same |
| Backup, healthy | 100 (`Secondary`) | same | same |
| Either, dwelling in recovery | 50 (`Recovering`) | same | same |
| Either, invariant violation (two Primary, disabled-node mismatch) | 2 (`InvalidTopology`) | same | same |
(The band constants live in `ServiceLevelCalculator.Classify`.)
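
The table above can be read as a small decision function. The following is a hedged Python sketch: the band values are copied from the table, but the `NodeState` shape and the precedence order between conditions are assumptions and may not match `ServiceLevelCalculator.Classify`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeState:
    """Hypothetical input shape; field names are illustrative."""
    is_primary: bool
    topology_valid: bool = True
    recovering: bool = False
    mid_apply: bool = False
    peer_reachable: bool = True


def classify(state: NodeState) -> int:
    """Map a node state to a ServiceLevel byte using the bands in the table."""
    if not state.topology_valid:
        return 2      # InvalidTopology
    if state.recovering:
        return 50     # Recovering (either role)
    if not state.is_primary:
        return 100    # Secondary
    if state.mid_apply:
        return 75     # PrimaryMidApply
    if not state.peer_reachable:
        return 150    # PrimaryPeerDown
    return 200        # Primary, healthy, peer reachable
```

A tester can use this as a lookup when checking rows A1, A2, and A5: the expected value depends only on which of the listed conditions currently holds on the node being observed.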
## Test matrix
Each row is one manual run; pass criterion in the right column.
### Block A — UA protocol signals (UaExpert)
| # | Scenario | Procedure | Pass criterion |
|---|---|---|---|
| A1 | ServiceLevel published | Connect UaExpert to Primary. Browse to `Server.ServerStatus.ServiceLevel`. | Value = 200 (or the expected Band byte per table above) |
| A2 | ServiceLevel updates on peer down | Connect to Primary. Stop Backup (`sc stop OtOpcUa`). Watch `ServiceLevel`. | Transitions 200 → 150 within ~2 s of peer probe timeout |
| A3 | RedundancySupport | Browse to `Server.ServerRedundancy.RedundancySupport`. | Value matches the declared `RedundancyMode` (Warm / Hot / None) |
| A4 | ServerUriArray (non-transparent upgrade) | Requires a redundancy-object-type upgrade follow-up. | When upgrade lands: `ServerUriArray` reports both ApplicationUris, self first |
| A5 | Mid-apply dip | On Primary trigger a `sp_PublishGeneration` apply. | `ServiceLevel` drops to 75 for the apply duration + dwell |
### Block B — Client failover
| # | Scenario | Procedure | Pass criterion |
|---|---|---|---|
| B1 | UaExpert picks Primary by ServiceLevel | In UaExpert configure a Redundancy Group with both endpoint URLs. | Client picks the Primary URL (higher ServiceLevel) |
| B2 | UaExpert cuts over on Primary kill | Kill the Primary's `OtOpcUa` service. | Client session reconnects to Backup within UaExpert's reconnect timeout (default 5 s). Data-change monitored items resume. |
| B3 | UaExpert cuts back when Primary returns | Start the Primary service. Wait ≥ recovery dwell (see `RecoveryStateManager.DwellTime`). | `ServiceLevel` on returning Primary goes through 50 (Recovering) → 200; UaExpert may or may not switch back (client-policy dependent; both are accepted outcomes) |
| B4 | Kepware QuickClient failover | Repeat B1–B3 with Kepware in place of UaExpert. | Same pass criteria; establishes we're not UaExpert-specific |
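
The selection rule that B1 and B2 exercise can be sketched as follows. This is an illustrative Python model of a non-transparent-redundancy client, under the assumption that the client reads `ServiceLevel` from each endpoint and treats an unreachable endpoint as having no value; real clients such as UaExpert apply their own policies.

```python
from typing import Dict, Optional


def pick_endpoint(service_levels: Dict[str, Optional[int]]) -> Optional[str]:
    """Pick the reachable endpoint with the highest ServiceLevel.

    A value of None means the endpoint could not be reached.
    """
    reachable = {url: level for url, level in service_levels.items() if level is not None}
    if not reachable:
        return None
    return max(reachable, key=lambda url: reachable[url])
```

With both nodes up, the Primary (200) wins over the Backup (100). Once the Primary is killed its entry becomes `None` and the Backup is the only candidate, which is the cutover B2 expects.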
### Block C — Galaxy MXAccess failover
This block validates that an AVEVA System Platform app consuming our cluster
via MXAccess tolerates a Primary drop the same way a native OPC UA client does.
The MXAccess toolkit internally wraps the OPC UA Client and does its own
redundancy negotiation; we're asserting that negotiation honors our
`ServiceLevel` signal.
| # | Scenario | Procedure | Pass criterion |
|---|---|---|---|
| C1 | Galaxy binds to Primary on first connect | Bring the cluster up. Start a Galaxy `$MxAccessClient` object pointed at the cluster with both node URLs. | Galaxy reports `QUALITY = Good` + initial values from the Primary |
| C2 | Galaxy redirects on Primary drop | Stop the Primary. | Galaxy's `QUALITY` briefly goes `Uncertain`, then back to `Good`; values continue streaming from the Backup within MXAccess's `ReconnectInterval` (default 20 s) |
| C3 | Galaxy handles mid-apply dip | Trigger a generation apply on the Primary. | Galaxy continues reading — the mid-apply dip is advertisory (ServiceLevel 75), not a session drop; MXAccess should stay bound |
## Recording results
Copy the tables above into a tracking doc per run. The tracking doc shape:
```
Run date: 2026-MM-DD
Cluster: <id> Primary: <node> Backup: <node> Release: <sha>
A1: PASS evidence: UaExpert screenshot uaexpert-a1.png
A2: PASS evidence: ServiceLevel trend grafana-a2.png
```
One pass of every row is the acceptance criterion. Re-run after any Phase 6.3
follow-up ships (especially the non-transparent redundancy-type upgrade, which
flips A4 from "deferred" to "expected pass").
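
Checking a tracking doc against the acceptance criterion can be scripted. The sketch below is a hypothetical Python helper (not part of the repository) that assumes each result line starts with the row id, a colon, and the result word, as in the example shape above.

```python
import re
from typing import FrozenSet, List

# Row ids taken from the Block A / B / C tables in this playbook.
EXPECTED_ROWS = ["A1", "A2", "A3", "A4", "A5",
                 "B1", "B2", "B3", "B4",
                 "C1", "C2", "C3"]
ROW_RE = re.compile(r"^(?P<row>[ABC]\d):\s+(?P<result>\w+)")


def missing_passes(text: str, deferred: FrozenSet[str] = frozenset()) -> List[str]:
    """Return the rows that are not deferred and have no PASS recorded."""
    results = {}
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            results[match["row"]] = match["result"].upper()
    return [row for row in EXPECTED_ROWS
            if row not in deferred and results.get(row) != "PASS"]
```

Passing `deferred=frozenset({"A4"})` models the current state, where A4 is not yet expected to pass; an empty return list means the run meets the acceptance criterion.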
## Known limitations
- **A4 pending**: `Server.ServerRedundancy` on our current SDK build lands as
the base `ServerRedundancyState`, which has no `ServerUriArray` child.
`ServerRedundancyNodeWriter.ApplyServerUriArray` logs-and-skips until the
redundancy-object-type upgrade follow-up lands.
- **Recovery dwell default**: `RecoveryStateManager.DwellTime` defaults to 60 s
in `Program.cs`. Adjust via future config knob if B3 takes too long to
observe.
- **C-block external dependency**: The `Galaxy.Host` side of the redundancy
story is largely out of our code — it's MXAccess's own client-redundancy
policy talking to our published ServiceLevel. A negative result on C1-C3
does not necessarily indicate an OtOpcUa bug; cross-check with UaExpert
(Block A / B) first.
## Automation notes (why this is a playbook, not a test)
- UaExpert and Kepware binaries are closed-source Windows GUIs; they don't
ship headless CLIs for the browse/connect/subscribe flows.
- The OPC Foundation reference SDK *can* drive every scenario, but our own
`Driver.OpcUaClient` tests already cover that client's behaviour; Block B
adds value specifically because these two clients have independent
redundancy implementations we don't control.
- For the sub-set of scenarios that *can* be automated — the self-loopback
case where our own `otopcua-cli` drives Primary + Backup — the existing
`tests/ZB.MOM.WW.OtOpcUa.Server.Tests/RedundancyStatePublisherTests` +
`ServiceLevelCalculatorTests` (unit) + `ClusterTopologyLoaderTests`
(integration) already cover the math + data path. The wire-level assertion
that the values actually land on the right OPC UA nodes is covered by
`ServerRedundancyNodeWriterTests`.
-1012
File diff suppressed because it is too large.
+61 -40
@@ -1,7 +1,7 @@
# v2 Release Readiness
> **Last updated**: 2026-04-19 (all three release blockers CLOSED — Phase 6.3 Streams A/C core shipped)
> **Status**: **RELEASE-READY (code-path)** for v2 GA — all three code-path release blockers are closed. Remaining work is manual (client interop matrix, deployment checklist signoff, OPC UA CTT pass) + hardening follow-ups; see exit-criteria checklist below.
> **Last updated**: 2026-04-24 (Phase 5 driver complement closed — AB CIP, AB Legacy, TwinCAT, FOCAS all shipped; FOCAS Tier-C retired for a pure-managed in-process client)
> **Status**: **RELEASE-READY (code-path)** for v2 GA. All three original code-path release blockers remain closed. Phase 5 is now complete. Remaining work is manual (live-hardware validations, client interop matrix, deployment checklist signoff, OPC UA CTT pass) + hardening follow-ups; see exit-criteria checklist below.
This doc is the single view of where v2 stands against its release criteria. Update it whenever a deferred follow-up closes or a new release blocker is discovered.
@@ -14,67 +14,78 @@ This doc is the single view of where v2 stands against its release criteria. Upd
| Phase 2 — Galaxy driver split (Proxy/Host/Shared) | ✓ | Shipped |
| Phase 3 — OPC UA server + LDAP + security profiles | ✓ | Shipped |
| Phase 4 — Redundancy scaffold (entities + endpoints) | ✓ | Shipped (runtime closes in 6.3) |
| Phase 5 — Drivers | ⚠ partial | Galaxy / Modbus / S7 / OpcUaClient shipped; AB CIP / AB Legacy / TwinCAT / FOCAS deferred (task #120) |
| Phase 6.1 — Resilience & Observability | ✓ | **SHIPPED** (PRs #78–#83) |
| Phase 6.2 — Authorization runtime | ◐ core | **SHIPPED (core)** (PRs #84–#88); dispatch wiring + Admin UI deferred |
| Phase 6.3 — Redundancy runtime | ◐ core | **SHIPPED (core)** (PRs #89–#90); coordinator + UA-node wiring + Admin UI + interop deferred |
| Phase 6.4 — Admin UI completion | ◐ data layer | **SHIPPED (data layer)** (PRs #91–#92); Blazor UI + OPC 40010 address-space wiring deferred |
| Phase 5 — Drivers | ✓ | **Shipped** — Galaxy, Modbus (+ DL205/S7/MELSEC profiles), S7 native, OPC UA Client, AB CIP, AB Legacy, TwinCAT ADS, FOCAS (managed wire client) |
| Phase 6.1 — Resilience & Observability | ✓ | Shipped (PRs #78–#83) |
| Phase 6.2 — Authorization runtime | ◐ core | Core shipped (PRs #84–#88, #94 dispatch wiring); finer-grained Browse/Subscribe/Alarm/Call gating + 3-user interop matrix deferred |
| Phase 6.3 — Redundancy runtime | ◐ core | Core shipped (PRs #89–#90, #98–#99); peer-probe HostedServices, OPC UA variable-node binding, `sp_PublishGeneration` lease wrap, client interop matrix deferred |
| Phase 6.4 — Admin UI completion | ◐ data layer + Identification | Data layer + OPC 40010 Identification folder shipped (PRs #91–#92, Identification audit close-out 2026-04-23); Blazor UI pieces deferred |
**Aggregate test counts:** 906 baseline (pre-Phase-6) → **1159 passing** across Phase 6. One pre-existing Client.CLI `SubscribeCommandTests.Execute_PrintsSubscriptionMessage` flake tracked separately.
**Driver integration-test counts** (end-to-end against live or simulated targets): Modbus 26, FOCAS 9, AbCip 7, OpcUaClient 3, S7 3, AbLegacy 2, TwinCAT 2. Plus Galaxy's separate cross-FX parity/stability suite.
**Aggregate test counts** (2026-04-19 baseline): 1159 passing across the solution. One pre-existing Client.CLI `SubscribeCommandTests.Execute_PrintsSubscriptionMessage` flake tracked separately. Rerun `dotnet test ZB.MOM.WW.OtOpcUa.slnx` after the FOCAS migration commits land to refresh the number.
## Release blockers (must close before v2 GA)
Ordered by severity + impact on production fitness.
All code-path release blockers are closed. The remaining items are live-hardware / manual validations listed under exit criteria.
### ~~Security — Phase 6.2 dispatch wiring~~ (task #143 — **CLOSED** 2026-04-19, PR #94)
**Closed**. `AuthorizationGate` + `NodeScopeResolver` now thread through `OpcUaApplicationHost → OtOpcUaServer → DriverNodeManager`. `OnReadValue` + `OnWriteValue` + all four HistoryRead paths call `gate.IsAllowed(identity, operation, scope)` before the invoker. Production deployments activate enforcement by constructing `OpcUaApplicationHost` with an `AuthorizationGate(StrictMode: true)` + populating the `NodeAcl` table.
**Closed**. `AuthorizationGate` + `NodeScopeResolver` thread through `OpcUaApplicationHost → OtOpcUaServer → DriverNodeManager`. `OnReadValue` + `OnWriteValue` + all four HistoryRead paths call `gate.IsAllowed(identity, operation, scope)` before the invoker. Production deployments activate enforcement by constructing `OpcUaApplicationHost` with an `AuthorizationGate(StrictMode: true)` + populating the `NodeAcl` table.
Additional Stream C surfaces (not release-blocking, hardening only):
Remaining Stream C surfaces (hardening, not release-blocking):
- Browse + TranslateBrowsePathsToNodeIds gating with ancestor-visibility logic per `acl-design.md` §Browse.
- CreateMonitoredItems + TransferSubscriptions gating with per-item `(AuthGenerationId, MembershipVersion)` stamp so revoked grants surface `BadUserAccessDenied` within one publish cycle (decision #153).
- Alarm Acknowledge / Confirm / Shelve gating.
- Call (method invocation) gating.
- Finer-grained scope resolution — current `NodeScopeResolver` returns a flat cluster-level scope. Joining against the live Configuration DB to populate UnsArea / UnsLine / Equipment path is tracked as Stream C.12.
- ~~Browse + TranslateBrowsePathsToNodeIds gating with ancestor-visibility logic per `acl-design.md` §Browse.~~ **Partial, 2026-04-24.** `DriverNodeManager.Browse` override post-filters the `ReferenceDescription` list via a new `FilterBrowseReferences` helper — denied nodes disappear silently per OPC UA convention. Ancestor-visibility implication (Read-grant at `Line/Tag` implying Browse on `Line`) still to ship; needs a subtree-has-any-grant query on the trie evaluator. `TranslateBrowsePathsToNodeIds` surface not yet wired.
- ~~CreateMonitoredItems + TransferSubscriptions gating with per-item `(AuthGenerationId, MembershipVersion)` stamp so revoked grants surface `BadUserAccessDenied` within one publish cycle (decision #153).~~ **Partial, 2026-04-24.** `DriverNodeManager.CreateMonitoredItems` override pre-gates each request and pre-populates `BadUserAccessDenied` into the errors slot for denied items (the base stack honours pre-set errors and skips those items). Decision #153's per-item `(AuthGenerationId, MembershipVersion)` stamp for detecting mid-subscription revocation is still to ship — needs subscription-layer plumbing. TransferSubscriptions not yet wired (same pattern).
- ~~Alarm Acknowledge / Confirm / Shelve gating.~~ **Partial, 2026-04-24.** Acknowledge + Confirm map to dedicated `OpcUaOperation.AlarmAcknowledge` / `AlarmConfirm` via `MapCallOperation`; Shelve falls through to generic `OpcUaOperation.Call` (needs per-instance method NodeId resolution to distinguish — follow-up).
- ~~Call (method invocation) gating.~~ **Closed 2026-04-24.** `DriverNodeManager.Call` override pre-gates each `CallMethodRequest` via `GateCallMethodRequests`. Denied calls return `BadUserAccessDenied` without running the method. Alarm methods map to alarm-specific operation kinds; everything else gates as generic `Call`.
- ~~Finer-grained scope resolution — current `NodeScopeResolver` returns a flat cluster-level scope. Joining against the live Configuration DB to populate UnsArea / UnsLine / Equipment path is tracked as Stream C.12.~~ **Closed 2026-04-24.** `AuthorizationBootstrap` now loads `NodeAcl` rows for the current generation into a `PermissionTrieCache`, builds the gate, and merges every registered driver's `EquipmentNamespaceContent` into a full-path `NodeScopeResolver` index. `OpcUaServerService` calls the bootstrap after the equipment registry is populated, before `OpcUaApplicationHost.StartAsync`. Disabled by default — operators flip `Node:Authorization:Enabled=true` to enforce, `StrictMode=true` to reject anonymous/no-groups identities.
- 3-user integration matrix covering every operation × allow/deny.
These are additional hardening — the three highest-value surfaces (Read / Write / HistoryRead) are now gated, which covers the base-security gap for v2 GA.
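
The pre-gating pattern used for `CreateMonitoredItems` and `Call` (decide per item up front, then let only the allowed items proceed) can be sketched as follows. This is an illustrative Python model with hypothetical names; the real override lives in `DriverNodeManager` and relies on the OPC UA stack honouring pre-set errors.

```python
from typing import Callable, List, Optional

BAD_USER_ACCESS_DENIED = "BadUserAccessDenied"


def pre_gate(node_ids: List[str], is_allowed: Callable[[str], bool]) -> List[Optional[str]]:
    """Return an errors list parallel to node_ids: None = proceed, status = denied."""
    return [None if is_allowed(node_id) else BAD_USER_ACCESS_DENIED
            for node_id in node_ids]


def create_monitored_items(node_ids: List[str],
                           is_allowed: Callable[[str], bool],
                           create_one: Callable[[str], str]) -> List[str]:
    """Create items only for allowed nodes; denied slots keep their pre-set error."""
    errors = pre_gate(node_ids, is_allowed)
    results = []
    for node_id, error in zip(node_ids, errors):
        # A pre-populated error means the item is skipped entirely.
        results.append(error if error is not None else create_one(node_id))
    return results
```

The point of the design is that a denied item never reaches the driver: `create_one` is only invoked for slots whose error is still `None`.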
### ~~Config fallback — Phase 6.1 Stream D wiring~~ (task #136 — **CLOSED** 2026-04-19, PR #96)
**Closed**. `SealedBootstrap` consumes `ResilientConfigReader` + `GenerationSealedCache` + `StaleConfigFlag` end-to-end: bootstrap calls go through the timeout → retry → fallback-to-sealed pipeline; every central-DB success writes a fresh sealed snapshot so the next cache-miss has a known-good fallback; `StaleConfigFlag.IsStale` is now consumed by `HealthEndpointsHost.usingStaleConfig` so `/healthz` body reports reality.
**Closed**. `SealedBootstrap` consumes `ResilientConfigReader` + `GenerationSealedCache` + `StaleConfigFlag` end-to-end; `/healthz` surfaces the stale flag.
Production activation: Program.cs switches `NodeBootstrap → SealedBootstrap` + constructs `OpcUaApplicationHost` with the `StaleConfigFlag` as an optional ctor parameter.
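
The retry-then-fallback flow can be modelled roughly as below. This is a minimal Python sketch under stated assumptions: the per-call timeout is omitted, the cache and flag are plain dicts standing in for `GenerationSealedCache` and `StaleConfigFlag`, and the attempt count is illustrative.

```python
from typing import Any, Callable, Dict


def read_generation(read_central: Callable[[], Any],
                    sealed_cache: Dict[str, Any],
                    stale_flag: Dict[str, bool],
                    attempts: int = 3) -> Any:
    """Try the central DB, then fall back to the last sealed snapshot."""
    last_error = None
    for _ in range(attempts):
        try:
            snapshot = read_central()
        except Exception as exc:  # any central-DB failure counts as a miss
            last_error = exc
            continue
        # Every success reseals the cache so the next failure has a known-good fallback.
        sealed_cache["snapshot"] = snapshot
        stale_flag["is_stale"] = False
        return snapshot
    if "snapshot" in sealed_cache:
        stale_flag["is_stale"] = True  # this is what /healthz would report
        return sealed_cache["snapshot"]
    raise RuntimeError("central DB unreachable and no sealed snapshot") from last_error
```

The stale flag flips back to `False` on the next successful central read, so the health endpoint reflects whether the node is currently serving fallback content.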
Remaining follow-ups (hardening, not release-blocking):
Remaining follow-ups (hardening):
- A `HostedService` that polls `sp_GetCurrentGenerationForCluster` periodically so peer-published generations land in this node's cache without a restart.
- Richer snapshot payload via `sp_GetGenerationContent` so fallback can serve the full generation content (DriverInstance enumeration, ACL rows, etc.) from the sealed cache alone.
- Richer snapshot payload via `sp_GetGenerationContent` so fallback can serve full generation content (DriverInstance enumeration, ACL rows, etc.) from the sealed cache alone.
### ~~Redundancy — Phase 6.3 Streams A/C core~~ (tasks #145 + #147 — **CLOSED** 2026-04-19, PRs #98–#99)
**Closed**. The runtime orchestration layer now exists end-to-end:
- `RedundancyCoordinator` reads `ClusterNode` + peer list at startup (Stream A shipped in PR #98). Invariants enforced: 1-2 nodes (decision #83), unique ApplicationUri (#86), ≤1 Primary in Warm/Hot (#84). Startup fails fast on violation; runtime refresh logs + flips `IsTopologyValid=false` so the calculator falls to band 2 without tearing down.
- `RedundancyStatePublisher` orchestrates topology + apply lease + recovery state + peer reachability through `ServiceLevelCalculator` + emits `OnStateChanged` / `OnServerUriArrayChanged` edge-triggered events (Stream C core shipped in PR #99). The OPC UA `ServiceLevel` Byte variable + `ServerUriArray` String[] variable subscribe to these events.
**Closed**. `RedundancyCoordinator` + `RedundancyStatePublisher` + `PeerReachabilityTracker` orchestrate topology + apply lease + recovery state + peer reachability through `ServiceLevelCalculator` + emit `OnStateChanged` / `OnServerUriArrayChanged` edge-triggered events.
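
The topology invariants enforced by `RedundancyCoordinator` (1-2 nodes, unique ApplicationUri, at most one Primary in Warm/Hot) can be sketched as a pure validation function. This is an illustrative Python model; the tuple shape and the violation messages are assumptions, not the coordinator's actual types.

```python
from typing import List, Tuple


def validate_topology(nodes: List[Tuple[str, str]], mode: str) -> List[str]:
    """Check cluster invariants.

    nodes: (application_uri, role) pairs, role being "Primary" or "Secondary".
    mode:  "None", "Warm", or "Hot".
    Returns a list of violation messages; an empty list means the topology is valid.
    """
    violations = []
    if not 1 <= len(nodes) <= 2:
        violations.append("cluster must have 1 or 2 nodes")
    uris = [uri for uri, _ in nodes]
    if len(set(uris)) != len(uris):
        violations.append("ApplicationUri must be unique per node")
    primaries = sum(1 for _, role in nodes if role == "Primary")
    if mode in ("Warm", "Hot") and primaries > 1:
        violations.append("at most one Primary in Warm/Hot mode")
    return violations
```

At startup a non-empty result would fail fast; at runtime refresh the same result would instead mark the topology invalid so the published ServiceLevel drops to the invalid-topology band.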
Remaining Phase 6.3 surfaces (hardening, not release-blocking):
- `PeerHttpProbeLoop` + `PeerUaProbeLoop` HostedServices that poll the peer + write to `PeerReachabilityTracker` on each tick. Without these the publisher sees `PeerReachability.Unknown` for every peer → Isolated-Primary band (230) even when the peer is up. Safe default (retains authority) but not the full non-transparent-redundancy UX.
- OPC UA variable-node wiring layer: bind the `ServiceLevel` Byte node + `ServerUriArray` String[] node to the publisher's events via `BaseDataVariable.OnReadValue` / direct value push. Scoped follow-up on the Opc.Ua.Server stack integration.
- `sp_PublishGeneration` wraps its apply in `await using var lease = coordinator.BeginApplyLease(...)` so the `PrimaryMidApply` band (200) fires during actual publishes (task #148 part 2).
- Client interop matrix validation — Ignition / Kepware / Aveva OI Gateway (Stream F, task #150). Manual + doc-only work; doesn't block code ship.
- ~~`PeerHttpProbeLoop` + `PeerUaProbeLoop` HostedServices populating `PeerReachabilityTracker` on each tick.~~ **Closed 2026-04-24.** Two-layer probe model shipped: HTTP probe at 2 s / 1 s timeout against `/healthz`; OPC UA probe at 10 s / 5 s timeout via `DiscoveryClient.GetEndpoints`, short-circuiting when HTTP reports the peer unhealthy. Registered on the Server as `AddHostedService<PeerHttpProbeLoop>` + `AddHostedService<PeerUaProbeLoop>`. Publisher now sees accurate `PeerReachability` per peer instead of degrading to `Unknown` → Isolated-Primary band (230).
- OPC UA variable-node wiring: bind `ServiceLevel` Byte + `ServerUriArray` String[] to the publisher's events via `BaseDataVariable.OnReadValue` / direct value push.
- ~~`sp_PublishGeneration` wraps its apply in `await using var lease = coordinator.BeginApplyLease(...)` so the `PrimaryMidApply` band (200) fires during actual publishes (task #148 part 2).~~ **Closed 2026-04-24.** The apply loop now lives in `GenerationRefreshHostedService` — polls `sp_GetCurrentGenerationForCluster` every 5s, opens a lease when a new generation is detected, calls `RedundancyCoordinator.RefreshAsync` inside the `await using`, releases the lease on all exit paths. Replaces the previous "topology never refreshes without a process restart" behaviour.
- Client interop matrix — Ignition / Kepware / Aveva OI Gateway (Stream F, task #150). Manual + doc-only.
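
The two-layer probe model mentioned in the list above can be sketched as follows. This is a hedged Python model: the interval and timeout constants are the values quoted above, while the function shape and the callables standing in for the `/healthz` request and the `GetEndpoints` call are assumptions.

```python
from typing import Callable, Tuple

# Values quoted in the text: HTTP probe every 2 s with 1 s timeout,
# OPC UA probe every 10 s with 5 s timeout.
HTTP_INTERVAL_S, HTTP_TIMEOUT_S = 2, 1
UA_INTERVAL_S, UA_TIMEOUT_S = 10, 5


def probe_peer(http_probe: Callable[[], bool],
               ua_probe: Callable[[], bool]) -> Tuple[bool, bool]:
    """Return (http_healthy, ua_healthy) for one peer.

    The OPC UA probe is skipped when the cheaper HTTP probe already
    reports the peer unhealthy.
    """
    if not http_probe():
        return (False, False)
    return (True, ua_probe())
```

Short-circuiting keeps the more expensive OPC UA round trip off the wire while the peer's health endpoint is already failing.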
### Remaining drivers (task #120)
### ~~Phase 5 driver complement~~ (task #120 — **CLOSED** 2026-04-24)
AB CIP, AB Legacy, TwinCAT ADS, FOCAS drivers are planned but unshipped. Decision pending on whether these are release-blocking for v2 GA or can slip to a v2.1 follow-up.
**Closed**. All four deferred drivers shipped:
- **AB CIP** (PRs #202–#222) — `Driver.AbCip`, `Driver.AbCip.IntegrationTests` (7 tests), AB CIP Cli. Live-boot verified against a ControlLogix rig.
- **AB Legacy** (PRs #202, #223) — `Driver.AbLegacy`, `Driver.AbLegacy.IntegrationTests` (2 tests), AB Legacy Cli. PCCC cip-path workaround for SLC/MicroLogix.
- **TwinCAT ADS** (PRs #205, this branch `task-galaxy-e2e`) — `Driver.TwinCAT`, `Driver.TwinCAT.IntegrationTests` (2 tests), TwinCAT Cli. TCBSD/ESXi fixture for e2e since local Hyper-V / TwinCAT RTIME are mutually exclusive on the dev box.
- **FOCAS** (PRs #173, #199 + this session's migration) — `Driver.FOCAS` with an **in-process managed `FocasWireClient`** that speaks FOCAS/2 over TCP directly. Tier-C isolation retired — `Driver.FOCAS.Host` + `Driver.FOCAS.Shared` + `FwlibNative` P/Invoke + shim DLL + NSSM service all deleted. `Driver.FOCAS.IntegrationTests` covers 9 scenarios (fixed tree identity/axes/program/timers/spindle + user-authored PARAM/MACRO/PMC reads, Browse, Subscribe, IAlarmSource raise/clear, Probe transitions).
Decision recorded: FOCAS is **read-only** against the CNC by design — writes return `BadNotWritable`. See `docs/drivers/FOCAS.md` + `docs/drivers/FOCAS-Test-Fixture.md` for the deployment + coverage map.
## Nice-to-haves (not release-blocking)
- **Admin UI** — Phase 6.1 Stream E.2/E.3 (`/hosts` column refresh), Phase 6.2 Stream D (`RoleGrantsTab` + `AclsTab` Probe), Phase 6.3 Stream E (`RedundancyTab`), Phase 6.4 Streams A/B UI pieces, Stream C DiffViewer, Stream D `IdentificationFields.razor`. Tasks #134, #144, #149, #153, #155, #156, #157.
- **Background services** — Phase 6.1 Stream B.4 `ScheduledRecycleScheduler` HostedService (task #137), Phase 6.1 Stream A analyzer (task #135 — Roslyn analyzer asserting every capability surface routes through `CapabilityInvoker`).
- **Multi-host dispatch** — Phase 6.1 Stream A follow-up (task #135). Currently every driver gets a single pipeline keyed on `driver.DriverInstanceId`; multi-host drivers (Modbus with N PLCs) need per-PLC host resolution so failing PLCs trip per-PLC breakers without poisoning siblings. Decision #144 requires this but we haven't wired it yet.
- **Multi-host dispatch** — Phase 6.1 Stream A follow-up (task #135). Every driver currently gets a single pipeline keyed on `driver.DriverInstanceId`; multi-host drivers (Modbus with N PLCs) need per-PLC host resolution so failing PLCs trip per-PLC breakers without poisoning siblings. Decision #144 requires this, but it is not wired yet.
- **Phase 7** — scripting + alarming + historian sink (plan drafted 2026-04-20 in `docs/v2/implementation/phase-7-*.md`). Out of scope for v2 GA.
## Live-hardware validations (task #54 + task family)
The code ships; these tasks remain open as lab/field verification:
- **#54** — FOCAS live-CNC wire-level smoke against a real FANUC control. The mock's wire responder is PDU-verified against `fwlibe64.dll` upstream but OtOpcUa's managed client has not been pointed at a production CNC.
- **AB CIP live-boot** — already passed on a ControlLogix rig (PR #222). Continue to run ahead of each release.
- **TwinCAT wire-live** — TCBSD/ESXi fixture covers the common path; production PLC verification remains lab-gated.
## Running the release-readiness check
@@ -82,7 +93,12 @@ AB CIP, AB Legacy, TwinCAT ADS, FOCAS drivers are planned but unshipped. Decisio
pwsh ./scripts/compliance/phase-6-all.ps1
```
This meta-runner invokes each `phase-6-N-compliance.ps1` script in sequence and reports an aggregate PASS/FAIL. It is the single-command verification that what we claim is shipped still compiles + tests pass + the plan-level invariants are still satisfied.
This meta-runner invokes each `phase-6-N-compliance.ps1` script in sequence and reports an aggregate PASS/FAIL:
- `phase-6-1-compliance.ps1` — Resilience & Observability
- `phase-6-2-compliance.ps1` — Authorization runtime
- `phase-6-3-compliance.ps1` — Redundancy runtime
- `phase-6-4-compliance.ps1` — Admin UI completion
Exit 0 = every phase passes its compliance checks + no test-count regression.
@@ -92,18 +108,23 @@ v2 GA requires all of the following:
- [ ] All four Phase 6.N compliance scripts exit 0.
- [ ] `dotnet test ZB.MOM.WW.OtOpcUa.slnx` passes with ≤ 1 known-flake failure.
- [ ] Release blockers listed above all closed (or consciously deferred to v2.1 with a written decision).
- [x] Release blockers listed above all closed.
- [x] Phase 5 driver complement shipped (Galaxy, Modbus, S7, OpcUaClient, AbCip, AbLegacy, TwinCAT, FOCAS).
- [ ] Production deployment checklist (separate doc) signed off by Fleet Admin.
- [ ] At least one end-to-end integration run against the live Galaxy on the dev box succeeds.
- [ ] FOCAS live-CNC wire-level smoke (#54) runs clean against a real FANUC control.
- [ ] OPC UA conformance test (CTT or UA Compliance Test Tool) passes against the live endpoint.
- [ ] Non-transparent redundancy cutover validated with at least one production client (Ignition 8.3 recommended — see decision #85).
## Change log
- **2026-04-19** — Release blocker #3 **closed** (PRs #98–#99). Phase 6.3 Streams A + C core shipped: `ClusterTopologyLoader` + `RedundancyCoordinator` + `RedundancyStatePublisher` + `PeerReachabilityTracker`. Code-path release blockers all closed; remaining Phase 6.3 surfaces (peer-probe HostedServices, OPC UA variable-node binding, sp_PublishGeneration lease wrap, client interop matrix) are hardening follow-ups.
- **2026-04-19** — Release blocker #2 **closed** (PR #96). `SealedBootstrap` consumes `ResilientConfigReader` + `GenerationSealedCache` + `StaleConfigFlag`; `/healthz` now surfaces the stale flag. Remaining follow-ups (periodic poller + richer snapshot payload) downgraded to hardening.
- **2026-04-19** — Release blocker #1 **closed** (PR #94). `AuthorizationGate` wired into `DriverNodeManager` Read / Write / HistoryRead dispatch. Remaining Stream C surfaces (Browse / Subscribe / Alarm / Call + finer-grained scope resolution) downgraded to hardening follow-ups — no longer release-blocking.
- **2026-04-19** — Phase 6.4 data layer merged (PRs #91–#92). Phase 6 core complete. Capstone doc created.
- **2026-04-24** — Phase 5 driver complement closed (task #120 CLOSED). AB CIP, AB Legacy, TwinCAT, FOCAS all shipped. FOCAS migration: retired the Tier-C split (`Driver.FOCAS.Host` + `Driver.FOCAS.Shared` + `FwlibNative` + shim DLL deleted) in favour of a pure-managed in-process `FocasWireClient` inlined into `Driver.FOCAS`; driver is now read-only against the CNC by design. Integration test matrix grew to cover Browse / Subscribe / IAlarmSource / Probe end-to-end.
- **2026-04-23** — Phase 6.4 audit close-out. IdentificationFolderBuilder + OPC 40010 Identification folder verified against the shipped code.
- **2026-04-20** — Phase 7 plan drafted (`phase-7-scripting-and-alarming.md`, `phase-7-e2e-smoke.md`). Out of scope for v2 GA.
- **2026-04-19** — Release blocker #3 closed (PRs #98–#99). Phase 6.3 Streams A + C core shipped: `ClusterTopologyLoader` + `RedundancyCoordinator` + `RedundancyStatePublisher` + `PeerReachabilityTracker`. Code-path release blockers all closed; remaining Phase 6.3 surfaces (peer-probe HostedServices, OPC UA variable-node binding, `sp_PublishGeneration` lease wrap, client interop matrix) are hardening follow-ups.
- **2026-04-19** — Release blocker #2 closed (PR #96). `SealedBootstrap` consumes `ResilientConfigReader` + `GenerationSealedCache` + `StaleConfigFlag`; `/healthz` surfaces the stale flag. Remaining follow-ups (periodic poller + richer snapshot payload) downgraded to hardening.
- **2026-04-19** — Release blocker #1 closed (PR #94). `AuthorizationGate` wired into `DriverNodeManager` Read / Write / HistoryRead dispatch. Remaining Stream C surfaces (Browse / Subscribe / Alarm / Call + finer-grained scope resolution) downgraded to hardening follow-ups — no longer release-blocking.
- **2026-04-19** — Phase 6.4 data layer merged (PRs #91-92). Phase 6 core complete.
- **2026-04-19** — Phase 6.3 core merged (PRs #89-90). `ServiceLevelCalculator` + `RecoveryStateManager` + `ApplyLeaseRegistry` land as pure logic; coordinator / UA-node wiring / Admin UI / interop deferred.
- **2026-04-19** — Phase 6.2 core merged (PRs #84-88). `AuthorizationGate` + `TriePermissionEvaluator` + `LdapGroupRoleMapping` land; dispatch wiring + Admin UI deferred.
- **2026-04-19** — Phase 6.1 shipped (PRs #78-83). Polly resilience + Tier A/B/C stability + health endpoints + LiteDB generation-sealed cache + Admin `/hosts` data layer all live.
+31
@@ -0,0 +1,31 @@
# TwinCAT driver — v3 backlog
The v2 TwinCAT driver is considered solid: 28 integration tests (14 `[TwinCATFact]` +
16-case `[TwinCATTheory]`) running live against the TCBSD fixture, 110 unit tests,
three latent driver bugs shaken out (notification cycle units, `STRING(N)` mapper,
bit-indexed BOOL path). Further work is deferred.
Archived from `docs/drivers/TwinCAT-Test-Fixture.md` § Follow-up candidates.
## Deferred items
1. **TC2 coverage** — spin up a TC2 runtime (Windows CE IPC or legacy XAR)
and run the same suite; any delta surfaces. Blocked on hardware.
2. **Notification coalescing under load** — run the subscribe test while
the PLC cycle is saturated (bump `lineSim` complexity, watch for
dropped notifications). Doable on current rig; deferred as lower
priority than v3 feature work.
3. **Multi-hop AMS route** — add a test behind an IPC gateway with a
chained route entry. Blocked on hardware (gateway IPC).
4. **License-rotation automation** — XAR's 7-day trial expires on
schedule. Either automate `TcActivate.exe /reactivate` via a scheduled
task on the VM (not officially supported; reportedly works for some
TC3 builds), or buy a paid runtime license (~$1k one-time per runtime
per CPU) to kill the rotation. Ops item, not code.
5. **Lab rig** — cheapest IPC (CX7000 / CX9020) on a dedicated network;
the only route that covers TC2 + real EtherCAT I/O timing + cycle
jitter under CPU load. Blocked on hardware + budget.
-101
@@ -1,101 +0,0 @@
# TC3 EventLogger spike — managed-wrapper investigation
**Question (b) from the PR 5.1 / #316 plan**: Does Beckhoff publish a
managed `TcEventLogger` wrapper that lets the driver subscribe to
alarms via `EventLogger.AlarmRaised` instead of decoding AMS port 110
notifications by hand?
## TL;DR
**No managed wrapper.** The `Beckhoff.TwinCAT.Ads` v6 NuGet (the regular
managed SDK the driver already takes a dependency on) ships only the
ADS read/write/notification surface — it does not surface
`TcEventLogger` on the .NET side. The C++ TcCOM headers
(`TcEventLogger.h` etc.) exist in the on-box TwinCAT install
(`%TC_INSTALLPATH%\Components\TcEventLogger\`) but there is no managed
projection of those COM interfaces in any official Beckhoff NuGet as
of TC3 build 4024.x.
Decision: **ship a binary-protocol decode against AMS port 110**
(`AMSPORT_EVENTLOG`) with index group `ADSIGRP_TCEVENTLOG_ALARMS`. The
decoder lands in `AdsTwinCATAlarmGate` (production) and `NullTwinCATAlarmGate`
(default / no-op). Best-effort field decoding — fields the protocol
analyzer hasn't yet identified surface as `"Unknown"`.
## What was checked
| Source | Result |
| --- | --- |
| `Beckhoff.TwinCAT.Ads` v6.x NuGet, namespace inventory | `TwinCAT.Ads`, `TwinCAT.Ads.SumCommand`, `TwinCAT.Ads.TypeSystem`, `TwinCAT.TypeSystem`. **No** `TcEventLogger` namespace. |
| `Beckhoff.TwinCAT.Ads.TcpRouter` v6.x NuGet | Router only; no EventLogger surface. |
| Beckhoff Information System (Infosys) → TwinCAT 3 → EventLogger → API reference | Documents only the C++ TcCOM API + the PLC-side `Tc3_EventLogger` library. No managed-language section. |
| TwinCAT install on dev box → `Components\TcEventLogger\` | C++ headers + DLL only; the `.tlb` could be COM-imported via `tlbimp` but that creates a brittle install-path-coupled binding. |
| Public Beckhoff GitHub orgs | `Beckhoff/TwinCAT-Tools-Library` etc. — no managed EventLogger wrapper. |
## Why decode at the wire?
A `tlbimp` projection of the on-box TcCOM `.tlb` would technically work
but introduces three problems:
1. **Install-path coupling** — the `.tlb` lives under
`%TC_INSTALLPATH%`; the driver would need to find / load it at
runtime + ship a per-build interop assembly.
2. **Bitness lock-in** — TcCOM is x86; the driver builds AnyCPU.
3. **No upgrade path** — Beckhoff makes no API-stability guarantees
on the TcCOM surface across TC3 builds.
Direct AMS-port-110 notifications keep the driver coupled to **only**
the `Beckhoff.TwinCAT.Ads` v6 NuGet's stable wire surface. Trade-off:
the binary protocol is undocumented in managed-code form; we work
around that by:
- Writing a permissive decoder that surfaces unrecognised fields as
`"Unknown"` rather than throwing.
- Gating the entire bridge behind the `EnableAlarms` option (default `false`) so deployments
that don't run TcEventLogger pay no cost.
- Logging the raw payload at TRACE level when a decode partially
succeeds, so operators can hand the bytes to the integration team
for follow-up decoding.
## What ships in PR 5.1
- `ITwinCATAlarmGate` interface — driver-internal seam.
- `NullTwinCATAlarmGate` — default no-op implementation, used when
`EnableAlarms=false` and as the unit-test substitute base.
- `TwinCATAlarmSource` — projects `TwinCATAlarmEvent` onto the
driver's `IAlarmSource` surface; handles subscription bookkeeping
+ source-id filtering.
- `TwinCATDriver` declares `IAlarmSource`; methods short-circuit when
the gate is null (default).
- Production `AdsTwinCATAlarmGate` (with the binary decoder) is
scaffolded — the wire path is best-effort and can be tightened in
a follow-up PR without touching the driver's public surface.
## Open questions for the follow-up PR
1. **Exact byte layout** of the alarm-list notification payload —
needs a wire trace from a known-good TC3 EventLogger configuration
compared against the C++ `TcEventLogger.h` struct definitions.
2. **Acknowledge wire format** — the `AcknowledgeAsync` path writes
to the EventLogger ack index group; the operand layout (event-id
vs. condition-id mapping) is best-effort in PR 5.1.
3. **Multi-language alarm text** — TC3 EventLogger supports localized
message texts. The decoder should pick the runtime's configured
language; PR 5.1 falls back to the first text it finds.
4. **Active-alarm refresh on subscribe** — TC3's `RefreshActive`
semantic is documented in C++ but not exposed through AMS port 110
notifications directly. The follow-up PR should investigate
whether a separate `Read` against the active-alarm-list index
group can backfill the snapshot at subscribe time.
## Why land PR 5.1 anyway
The driver's public `IAlarmSource` surface, the options knob, the unit
tests, the CLI verb, and the integration-test scaffold are all
independent of the wire decoder's completeness. Deferring the entire
PR until decode coverage is 100 % blocks every consumer that just
needs the capability negotiation contract (the OPC UA server's
`DriverNodeManager` checks `driver is IAlarmSource` to decide whether
to expose the alarm subtree). Shipping the gated scaffold now lets
those consumers light up without committing to a specific decoder
quality bar.
+274
@@ -0,0 +1,274 @@
# Galaxy / LMX Backend — Restructuring Options
## Context
Today the Galaxy driver is structured very differently from every other driver
in this repo:
- **Galaxy.Proxy** (.NET 10, in-process): tiny shim that frames IPC to the host.
- **Galaxy.Host** (.NET Framework 4.8 **x86**, NSSM-wrapped Windows service):
owns MXAccess COM, the STA pump, the ZB Galaxy Repository SQL queries, the
Wonderware Historian SDK plugin, the per-platform `ScanState` probe manager,
the alarm tracker (`.InAlarm`/`.Priority`/`.DescAttrName`/`.Acked` state
machine + ack writer), recycle policy, and post-mortem MMF.
Other drivers (Modbus, S7, AB CIP, OpcUaClient, TwinCAT, FOCAS Tier-C) are
**in-process Tier-A drivers** in the .NET 10 server. They do data + browse
only; historian and alarming are driver-agnostic concerns at the server layer.
A sibling project, **mxaccessgw**
(`C:\Users\dohertj2\Desktop\mxaccessgw`), already provides:
- A .NET 10 x64 gRPC gateway in front of per-session .NET 4.8 x86 worker
processes that own MXAccess COM, the STA, and event sinks
(`MxGateway.Server` + `MxGateway.Worker`).
- A full MXAccess command + event surface (`Register`, `AddItem`, `Advise`,
`Write`, `WriteSecured`, `OnDataChange`, `OnWriteComplete`, etc.).
- A cached, deploy-gated, paged **Galaxy Repository browse** RPC
(`galaxy_repository.v1`) reading the same ZB tables we read today, with the
query bodies kept byte-identical to OtOpcUa.
- A .NET client library (`clients/dotnet/MxGateway.Client`).
- API-key auth, Blazor dashboard, structured logs, metrics, watchdog/recycle.
The proposal is to **strip Galaxy down to data + browse** — push historian and
alarming out to server-level subsystems where they live for every other driver
— and pick how the slimmed-down driver talks to MXAccess.
---
## What "push historian and alarming out" means
Both options below assume the same scope reduction; they only differ in how
the driver reaches MXAccess.
| Concern | Today (Galaxy.Host) | After |
|---|---|---|
| Galaxy hierarchy browse | `GalaxyRepository` (SQL) inside Host | Driver (Option 1: via gw browse RPC; Option 2: own SQL or worker) |
| Live read / write / subscribe | `MxAccessClient` + STA pump in Host | gw (Option 1) or embedded worker (Option 2) |
| Wonderware Historian SDK | `HistorianDataSource` in Host (x86) | Separate Historian data source plugged into the server's HA service. Likely stays its own .NET 4.8 x86 sidecar because the SDK is x86-only; **independent of the Galaxy driver lifecycle**. |
| Alarm state machine (`.InAlarm`/`.Acked` quartet, transitions, ack writer) | `GalaxyAlarmTracker` in Host | Server-level A&E subsystem subscribes to alarm-bearing attributes the driver advertises and runs the AlarmCondition state machine generically. Driver only flags `IsAlarm=true` in node metadata. |
| `ScanState` per-platform probes | `GalaxyRuntimeProbeManager` in Host | Driver-side: ScanState is just another tag subscription; the driver re-advises one per discovered `$WinPlatform`/`$AppEngine` and reports `HostConnectivityStatus` from the value stream. No special host-side machinery. |
After the strip-down, the Galaxy driver looks like Modbus or OpcUaClient: it
discovers nodes, reads/writes/subscribes, and reports per-host transport
health. Everything else is the server's problem.
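
The `ScanState` row above can be pictured with a small sketch. It is written in Python for brevity rather than in the driver's own language, and every name in it (`ScanStateProbe`, `HostState`, `on_data_change`) is hypothetical. It also assumes that a true `ScanState` with good quality means the host is running, which this document does not spell out.

```python
from dataclasses import dataclass, field
from enum import Enum


class HostState(Enum):
    UNKNOWN = "Unknown"
    RUNNING = "Running"
    STOPPED = "Stopped"


@dataclass
class ScanStateProbe:
    """Derives per-host status from ordinary `<host>.ScanState` value changes."""
    states: dict = field(default_factory=dict)

    def watch(self, host: str) -> str:
        # One plain subscription per discovered platform or engine.
        self.states[host] = HostState.UNKNOWN
        return f"{host}.ScanState"

    def on_data_change(self, tag: str, value, good_quality: bool) -> None:
        host, _, attr = tag.rpartition(".")
        if attr != "ScanState" or host not in self.states:
            return  # not a probe tag registered through watch()
        if not good_quality:
            # Assumption: bad quality says nothing about the host itself.
            self.states[host] = HostState.UNKNOWN
        else:
            self.states[host] = HostState.RUNNING if value else HostState.STOPPED
```

The point of the sketch is that nothing here needs host-side machinery: the probe is just another consumer of the driver's normal data-change callback.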
---
## Option 1 — Tier-A driver against the MxAccess Gateway
`Driver.Galaxy` becomes a regular **in-process .NET 10 driver** in the OtOpcUa
server (no `.Host`, no `.Proxy` split, no x86). It talks to a separately
deployed `MxGateway.Server` over gRPC using `MxGateway.Client`. Browse comes
from `galaxy_repository.v1.DiscoverHierarchy`. Live data comes from
`MxAccessGateway.OpenSession`/`AddItem`/`Advise`/`StreamEvents`.
```
OtOpcUa.Server (.NET 10 x64)
  └── Driver.Galaxy (in-proc, .NET 10)
        └── gRPC ──► MxGateway.Server (.NET 10 x64)
                       └── pipe ──► MxGateway.Worker (.NET 4.8 x86)
                                      └── MXAccess COM (STA)
```
### Pros
- **Architectural parity with other drivers.** No bespoke `Host` service, no
x86 build target, no NSSM wrapper, no STA pump in this repo, no
`PostMortemMmf`/`RecyclePolicy` we maintain ourselves.
- **OtOpcUa server stops needing AVEVA installed on its own host.** The
gateway runs where MXAccess lives; the OPC UA server can live on a different
box, in a container, or on a hardened jump host.
- **One canonical MXAccess surface across the org.** Any future tool — a
diagnostic CLI, a Historian replacement, an integration harness — talks to
the same gw with the same parity guarantees we get.
- **Multi-instance friendly.** Two OtOpcUa servers (warm/hot redundancy) share
one gw and one MXAccess footprint instead of each running their own
`Galaxy.Host` with duplicate Wonderware client identities.
- **Browse + cache for free.** `galaxy_repository.v1` already implements the
hierarchy cache, deploy-time gating, paging, and `WatchDeployEvents` — we
delete `GalaxyRepository.cs`, `GalaxyHierarchyRow.cs`, the change-detection
poll loop, and the matching SQL plumbing.
- **Operability for free.** API-key auth, Blazor dashboard at `/dashboard`,
metrics via `Meter`, structured logs with redaction. We currently have
none of that in `Galaxy.Host`.
- **Future backend swap.** When AVEVA exposes managed NMX or another modern
path, gw routes to it without OtOpcUa changes (gw's stated roadmap).
- **Tighter blast radius.** A hung COM event, a leaking COM object, a
crashing worker — all owned by gw's session/worker isolation, not the
OPC UA server process.
- **Simpler version story for OtOpcUa.** Driver is plain .NET 10; the
bitness/runtime split lives entirely in mxaccessgw's repo.
### Cons
- **Extra deployment dependency.** mxaccessgw is now a service that has to be
installed, monitored, and kept on a compatible protocol version. For a
single-box install this is one more moving piece.
- **Two hops on every call** (driver→gw, gw→worker) instead of one
(proxy→host). Today's hop is MessagePack over a named pipe; the new outer
hop is gRPC over TCP. Per-call overhead is a few hundred microseconds, not
a regression for OPC UA workloads but measurable for very chatty bursts.
- **Auth/secret surface added.** OtOpcUa now holds an API key for gw and
rotates it; gw's SQLite-backed key store has to be managed.
- **Failure model spans two processes we don't own** — gw + worker. Reconnect
logic in our driver has to ride both: gw transport drop, gw session lease
expiry, gw-detected worker crash, plus the worker's own MXAccess reconnect.
All of it is exposed in the gRPC contract, but it's still surface area.
- **Cross-repo protocol coupling.** Bumping `mxaccessgw` major version (gRPC
contract changes, session shape changes) ripples into OtOpcUa releases.
Mitigated by versioned contracts; not free.
- **Galaxy redundancy still has to think about gw.** A redundancy fail-over of
OtOpcUa is independent of the gw's session lifecycle. Need to decide whether
the standby holds an open session or only opens it on takeover.
- **Sensitive writes (`WriteSecured`, `AuthenticateUser`) cross the network**
if gw is remote. TLS + mTLS solves it but adds setup.
---
## Option 2 — Embed mxaccessgw worker, no gateway
`Driver.Galaxy` is still in-process .NET 10, but instead of speaking gRPC to a
gateway service, it directly **launches and supervises one (or more)
`MxGateway.Worker` processes** and talks to them over the same named-pipe
worker protocol gw uses internally
(`docs/WorkerFrameProtocol.md`, `docs/WorkerProcessLauncher.md`). Browse stays
local — driver runs the SQL queries against ZB itself.
```
OtOpcUa.Server (.NET 10 x64)
  └── Driver.Galaxy (in-proc, .NET 10)
        ├── ZB SQL (local, in-proc)
        └── pipe ──► MxGateway.Worker (.NET 4.8 x86, child process)
                       └── MXAccess COM (STA)
```
### Pros
- **One hop, not two.** Driver → worker pipe is the same shape as today's
Proxy → Host pipe. Latency is on par with the current implementation.
- **No new service to deploy.** Worker is launched as a child process the
same way `Galaxy.Host` is launched today (just with mxaccessgw's worker
binary). Single-machine install story stays simple.
- **Keeps the trust boundary local.** No API keys, no TLS, no exposed gRPC
port on the OtOpcUa box.
- **Reuses mxaccessgw's parity-tested worker code** — STA pump, COM lifetime,
event conversion, fault model — without inheriting gw's ASP.NET Core /
Blazor / SQLite footprint.
- **Tighter ownership.** OtOpcUa owns the worker lifecycle; recycle, kill,
restart, post-mortem all decided by the driver, not by an external service
we don't control.
- **Easier to reason about during integration tests.** No second service to
spin up in CI; just a child process per test fixture.
### Cons
- **OtOpcUa server box must still have AVEVA + MXAccess installed**, since
the worker runs locally. The major deployment win of Option 1
(separating where MXAccess runs from where OtOpcUa runs) is lost.
- **OtOpcUa still ships an x86 .NET 4.8 binary alongside it.** Even if we
vendor mxaccessgw's worker rather than write our own, installer complexity
and bitness considerations remain.
- **We re-implement everything gw already gives.** Process supervision,
watchdog, recycle policy, heartbeat, post-mortem — these are exactly what
`Galaxy.Host` does today, and they'd live in our repo again, just calling a
different worker binary.
- **No browse cache, no deploy gating, no `WatchDeployEvents`** — we keep
running our own ZB queries and our own `time_of_last_deploy` poll, or we
port gw's cache code into the driver. Either way it's duplicated logic.
- **No auth, no dashboard, no metrics.** Operability stays where it is today
(i.e., minimal). Adding it ourselves is a separate project.
- **Multiple OtOpcUa instances multiply MXAccess sessions.** Redundancy pair
→ two MXAccess clients on the Galaxy from the same software, vs. Option 1
where one gw arbitrates.
- **Worker protocol coupling without the contract surface.** We depend on
mxaccessgw's worker IPC frame format — a surface that mxaccessgw treats as
*internal* to its own gw↔worker boundary. If they refactor it, we have to
follow. The public gRPC contract (Option 1) is more stable by design.
- **Loses the "common MXAccess access point" benefit.** Other consumers
(CLI, integration harnesses, future tools) can't share state with our
embedded worker.
---
## Status quo (for comparison)
Keep `Galaxy.Host` as it is today and rip out historian + alarming +
probe manager in place. End state: the Host shrinks to `MxAccessClient` + `GalaxyRepository`,
which is roughly what Option 2 ends up looking like — but with our hand-rolled
COM bridge instead of mxaccessgw's worker. Not a serious option once
mxaccessgw exists; we'd be maintaining a parallel implementation of the same
thing.
---
## Recommendation (effort-agnostic)
**Go with Option 1 — Tier-A driver against the MxAccess Gateway.**
The decisive arguments:
1. **It's the only option that aligns Galaxy with how every other driver in
this repo is structured.** The user's stated goal — "keep lmx to data +
browsing, similar to other drivers" — only fully resolves if there is no
`.Host` and no x86 build artifact in this repo at all. Option 2 still has
an x86 child process and supervisor code; it's `Galaxy.Host` with a
different worker binary inside.
2. **It separates *where MXAccess runs* from *where OtOpcUa runs*.** That is
a strategically larger win than a few hundred microseconds of per-call
latency. The OPC UA server stops being chained to AVEVA install footprint,
bitness, and Wonderware client identity — which removes a class of
deployment, redundancy, and CI problems we hit today (e.g., the
`DESKTOP-6JL3KKO` Hyper-V/Docker conflict, the `dohertj2`-only pipe ACL,
the live-Galaxy smoke test prerequisites).
3. **It collapses scope.** A non-trivial fraction of `Galaxy.Host` (browse
cache, deploy-event watch, worker supervision, COM bridge, post-mortem,
recycle, ACL hardening) is reproduced *better* in mxaccessgw. Option 1
deletes our copy. Option 2 keeps it.
4. **It positions historian and alarming for the right home.** Once the
Galaxy driver is "just another driver", historian becomes a server-level
data source (one that can also feed Modbus/S7 history if we ever want it),
and alarming becomes a server-level A&E subsystem. Option 2 nominally
allows the same move, but the temptation to keep them in `Galaxy.Host`
"while we're already there" is real.
5. **It future-proofs against AVEVA's roadmap.** Managed NMX, ASB, or any
replacement that shows up over the next few years gets adopted in
mxaccessgw without a release in this repo.
The case for Option 2 is real but narrow: it's the right call **only** if we
commit to single-box deployments forever, refuse to take a gRPC dependency,
and value local-trust simplicity over the consolidation/operability benefits
gw provides. None of those constraints hold here.
### What flips the recommendation
- If the gw protocol proves unstable, or performance under our subscription
patterns tests worse than expected → revisit Option 2.
- If org-policy forbids running an MXAccess gateway as its own service →
Option 2.
- If Galaxy goes from one of several drivers to *the* primary driver and
raw call-rate matters more than architectural fit → revisit.
Otherwise: Option 1.
---
## Out-of-scope follow-ups (don't decide here, but flag them)
- **Where does the Wonderware Historian SDK live?** Likely its own
.NET 4.8 x86 sidecar exposing a small `IHistorianDataSource` over a pipe or
gRPC, plugged into the OPC UA server's HA service alongside any future
historian sources. Independent of which option above is chosen.
- **Alarm subsystem ownership.** Decide whether the server hosts a generic
AlarmCondition state machine driven by driver-advertised alarm metadata, or
whether each driver continues to emit pre-shaped alarm transitions. Galaxy's
4-attr quartet is a strong forcing function for the generic approach.
- **Redundancy + gw sessions.** Standby OtOpcUa holds an open gw session
(warm) vs. opens on takeover (cold). Affects gw worker count and Galaxy
client-identity collisions.
- **Auth between OtOpcUa and gw.** API key in DPAPI-protected secret file vs.
Windows-auth gRPC. Both supported by gw; pick before rollout.
+476
@@ -0,0 +1,476 @@
# Galaxy → MxAccessGateway Migration Plan
Implements **Option 1** from `lmx_backend.md`: replace the bespoke `Galaxy.Host`
+ `Galaxy.Proxy` IPC pair with an **in-process Tier-A** `Driver.Galaxy` running
in the .NET 10 OtOpcUa server, talking to a separately-deployed
`MxGateway.Server` (mxaccessgw repo) over gRPC for live MXAccess work and
Galaxy Repository browse.
## Outcome
After this work:
- `OtOpcUa.Server` is fully .NET 10 x64 — no x86 build artifacts in this repo.
- `Driver.Galaxy.Host` (Windows service, NSSM-wrapped, .NET 4.8 x86) is
retired. `Driver.Galaxy.Proxy` and `Driver.Galaxy.Shared` are deleted.
AVEVA platform is no longer required on the OtOpcUa box.
- A new in-process `Driver.Galaxy` lives next to `Driver.Modbus`,
`Driver.OpcUaClient`, etc. It implements the same `IDriver` capability set
the proxy implements today, but its body calls `MxGateway.Client`
(`MxGatewayClient`, `MxGatewaySession`, `GalaxyRepositoryClient`).
- Wonderware Historian SDK access moves out of the Galaxy driver into a
driver-agnostic historian data source (`Driver.Historian.Wonderware`,
separate sidecar, .NET 4.8 x86). The OPC UA HA service plugs into it the
same way it would plug into any future historian.
- Alarm condition tracking moves out of the driver into the OPC UA server's
generic A&E subsystem. The driver only flags `IsAlarm=true` on attribute
metadata and forwards live `.InAlarm`/`.Acked`/etc value changes; the
server runs the AlarmCondition state machine.
- Per-platform `ScanState` probes degrade to plain attribute subscriptions —
no special probe manager.
---
## Pre-flight: improvements to land in mxaccessgw first
These are **integration-quality changes** in the mxaccessgw repo that make
the OtOpcUa side dramatically simpler / faster / more robust. They aren't
strictly required to start, but ship enough of them before phase 3 that we're
not designing around gaps.
### gw-1. Galaxy attribute metadata parity
**What's there:** `galaxy_repository.v1.DiscoverHierarchy` returns
`GalaxyObject` with name, parent, category, and dynamic attributes.
**What's missing for OtOpcUa:** every field today's `MxAccessGalaxyBackend`
copies into `GalaxyAttributeInfo` — confirm gw's `Attribute` proto carries:
- `mx_data_type` (int)
- `is_array` (bool)
- `array_dimension` (uint, optional)
- `security_classification` (int)
- `is_historized` (bool, from `HistorizedExtension` primitive)
- `is_alarm` (bool, from `AlarmExtension` primitive)
If any are missing, add them to the proto and the server-side query mapper.
Without `IsAlarm` and `IsHistorized` the OPC UA server can't decide which
nodes get HasHistoricalConfiguration / which become AlarmConditions.
### gw-2. Stable, documented event-stream resume semantics
**What's needed:** the OtOpcUa driver must survive a transient gw transport
drop without losing subscription state or duplicating change events. gw's
`StreamEventsAsync(afterWorkerSequence)` already exposes resumption.
Document the per-session retention window (how long does the worker buffer
events the gateway hasn't acked?) and the "events were dropped, you must
re-subscribe" signal. If retention is bounded by count rather than time,
expose the bound in `OpenSessionReply` so the client can size its own buffer.
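
A minimal sketch of the client-side bookkeeping this implies, in Python for brevity. The `EventCursor` class is hypothetical, and it assumes sequence numbers are contiguous per session, so a jump means events fell out of the retention window. Whether gw actually signals dropped events this way is exactly what gw-2 asks to have documented.

```python
class EventCursor:
    """Client-side bookkeeping for a resumable, sequence-numbered event stream."""

    def __init__(self) -> None:
        self.last_seq = 0               # highest sequence delivered so far
        self.needs_resubscribe = False  # set when a gap is detected

    def accept(self, seq: int) -> bool:
        """Return True if the event should be delivered to subscribers."""
        if seq <= self.last_seq:
            return False  # duplicate replayed after a reconnect
        if seq > self.last_seq + 1:
            # Assumption: a gap means the retention window was exceeded.
            self.needs_resubscribe = True
        self.last_seq = seq
        return True

    def resume_from(self) -> int:
        """Value to pass as the 'after sequence' argument when reconnecting."""
        return self.last_seq
```

On reconnect the driver would pass `resume_from()` as the resume point and drop anything `accept` rejects, which covers both the "no duplicates" and the "detect loss" halves of the requirement.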
### gw-3. Reconnectable sessions
Listed under "post-v1 revisit" in `gateway.md`. Without it, every gw or
OtOpcUa restart re-`Register`s, re-`AddItem`s, re-`Advise`s the entire
address space — for a 50k-tag Galaxy that's a non-trivial cold-start. With
reconnectable sessions, the driver presents its `SessionId` after a restart
and the worker keeps its handles.
If full reconnection is too large, ship a **bulk replay** instead: a single
RPC that takes the full subscription set and the worker performs the
register/add/advise inside one round trip. We can drive it from a
client-side cache rather than gw state. See gw-5 below.
### gw-4. Driver-shaped subscribe primitive
`MxGatewaySession` already has `SubscribeBulkAsync` (one RPC: `Register`
implicit + `AddItem` + `Advise` for a list of tag addresses, returning
per-tag `SubscribeResult`). That's exactly what `ISubscribable.SubscribeAsync`
wants. Confirm it returns enough per-tag detail to surface a partial-failure
list to OPC UA monitored items (good handle, status code, error text).
If not already, expose **`SubscribeBulk` with optional update-rate hint**
forwarded to `SetBufferedUpdateInterval` so the OPC UA publishing interval
becomes a single field on the subscribe call rather than a follow-up RPC.
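
To make the "enough per-tag detail" check concrete, here is a hedged Python sketch of what the driver would do with a bulk result. The `SubscribeResult` fields shown (tag, handle, status code, error text) mirror the wording above but are assumptions about the shape, as is the convention that status `0` means success.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SubscribeResult:
    tag: str
    handle: Optional[int]   # item handle when the subscribe succeeded
    status_code: int        # assumption: 0 means success
    error: str = ""


def split_results(results):
    """Separate live subscriptions from the partial-failure list."""
    good = {
        r.tag: r.handle
        for r in results
        if r.status_code == 0 and r.handle is not None
    }
    failed = [
        (r.tag, r.status_code, r.error)
        for r in results
        if r.tag not in good
    ]
    return good, failed
```

The `failed` list is what would be surfaced per monitored item; if any of the three fields is missing from the real reply, that is the gap to close in gw-4.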
### gw-5. Subscription replay snapshot
Provide an RPC `ReplaySubscriptionsAsync(SessionId, IEnumerable<TagAddress>)`
that re-establishes a list of subscriptions after a session reset and returns
per-tag results. The client stores its tag list locally (the driver already
has it from `Discover`), and the gw worker turns it into one
register/add/advise sequence. This is the minimum surface we need; full
"reattach to a previous session by id" (gw-3) is a richer version of the
same thing.
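
A rough Python sketch of the client half of that idea follows. `SubscriptionCache` is hypothetical, and `subscribe_bulk` is a stand-in callable for whatever single-round-trip RPC gw ends up exposing; it is assumed to return a mapping from tag to new handle, with missing tags treated as failures.

```python
class SubscriptionCache:
    """Remembers the subscribed tag set so it can be replayed after a reset."""

    def __init__(self) -> None:
        self._handles = {}  # tag address -> current handle, or None if not live

    def record(self, tag: str, handle: int) -> None:
        self._handles[tag] = handle

    def forget(self, tag: str) -> None:
        self._handles.pop(tag, None)

    def handle(self, tag: str):
        return self._handles.get(tag)

    def replay(self, subscribe_bulk) -> list:
        """Re-establish every cached tag in one call; return the tags that failed."""
        tags = sorted(self._handles)
        if not tags:
            return []
        new_handles = subscribe_bulk(tags)  # one round trip for the whole set
        failed = []
        for tag in tags:
            handle = new_handles.get(tag)
            self._handles[tag] = handle     # old handles are invalid after a reset
            if handle is None:
                failed.append(tag)
        return failed
```

Because the tag list lives on the client, this works against a brand-new session and does not require gw to keep any state across the reset.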
### gw-6. Transport-health stream
The gw already exposes worker / session health on its dashboard. Add a small
streaming RPC `StreamSessionHealth(SessionId) → stream SessionHealth` so the
OtOpcUa driver can surface "MXAccess transport up/down" to its
`IHostConnectivityProbe` without faking it via probe-tag subscriptions.
Today `MxAccessClient.ConnectionStateChanged` does this in-process; we want
the same signal at the gw boundary.
### gw-7. Optional .NET 10 client polish
- Async-disposable session pattern is already there.
- Add a **typed `MxValue` ⇄ `object` adapter** for the seven Galaxy types
OtOpcUa cares about (Boolean, Int32, Float, Double, String, DateTime,
arrays of the same). Today every consumer writes its own `MxValue.From<T>`
helpers; this shaves boilerplate from the driver.
- Add a **`SubscribeWithCallback`** convenience wrapper that combines
`OpenSession` + `SubscribeBulk` + `StreamEvents` and routes events through
a delegate per tag. Keeps the OPC UA driver from re-implementing the
fan-out / sequencer pattern.
### gw-8. Auth minimums
Document API-key scoping as it applies to OtOpcUa: the server identity needs
`session`, `invoke`, `event`, and `metadata:read` scopes. Provide a CLI to
mint a key bound to those scopes for an OtOpcUa instance.
### gw-9. Performance: bulk paths and value coalescing
- Confirm `SubscribeBulkAsync` is implemented as a single MXAccess
`AddItem`+`Advise` loop on the worker, not N pipe round trips. If not, fix
before we drive 50k-tag Galaxies through it.
- Expose `SetBufferedUpdateInterval` per session so OtOpcUa can request
buffered updates at the OPC UA publishing interval and get one batched
`OnBufferedDataChange` per tick rather than N `OnDataChange` events.
These can all ship in mxaccessgw independently and improve every consumer.
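
The coalescing idea in gw-9 can be illustrated with a short Python sketch. This is not how MXAccess buffered updates are implemented; it is a last-value-wins stand-in, and the real buffered mode may well retain intermediate samples. The `Coalescer` name and its methods are hypothetical.

```python
class Coalescer:
    """Collapses many per-tag changes into one batch per publishing tick."""

    def __init__(self) -> None:
        self._pending = {}  # tag -> (value, timestamp), newest wins

    def on_data_change(self, tag: str, value, timestamp: float) -> None:
        # Assumption: only the latest value per tag matters within a tick.
        self._pending[tag] = (value, timestamp)

    def flush(self) -> dict:
        """Called once per tick; returns the batch and starts a new one."""
        batch, self._pending = self._pending, {}
        return batch
```

With this shape, N changes to the same tag inside one publishing interval cost one entry in the batch instead of N events on the wire.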
---
## OtOpcUa-side improvements to land in parallel
Some are forced by removing `Galaxy.Host`; others are quality-of-life.
### ot-1. Promote `IHistorianDataSource` to a server-level extension point
Today `IHistorianDataSource` is a Galaxy-internal abstraction in
`Driver.Galaxy.Host`. Lift it to `OtOpcUa.Core.Abstractions` (or a similar
home next to `IDriver`) and let the OPC UA HA service consume **any number
of registered data sources** keyed by node namespace. Drivers don't own
historian access; the server mounts data sources alongside drivers. This is
the prerequisite that lets us move Wonderware Historian out of the Galaxy
driver without losing the feature.
### ot-2. Generic alarm condition state machine in the server
Move the `.InAlarm`/`.Priority`/`.DescAttrName`/`.Acked` quartet handling
out of `GalaxyAlarmTracker` into a server-level alarm subsystem keyed off the
`IsAlarm=true` flag drivers set during discovery. The server subscribes to
the four sub-attributes itself and runs the AlarmCondition state machine.
Driver only:
- declares `IsAlarm=true` in `DriverAttributeInfo`,
- forwards plain attribute value changes (already done by `ISubscribable`).
This is also a precondition for future drivers (Modbus DL205 alarm bits,
S7 alarm DBs) to emit alarms without each writing their own tracker.
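
As a hedged illustration of what "runs the AlarmCondition state machine" could mean, the Python sketch below folds `InAlarm` and `Acked` value changes into four states. The state names and the `AlarmCondition` class are hypothetical; a real implementation would also carry `Priority`, the description, timestamps, and the OPC UA condition fields.

```python
from enum import Enum


class AlarmState(Enum):
    INACTIVE = "Inactive"                  # not in alarm, nothing to acknowledge
    ACTIVE_UNACKED = "ActiveUnacked"
    ACTIVE_ACKED = "ActiveAcked"
    INACTIVE_UNACKED = "InactiveUnacked"   # cleared before it was acknowledged


def alarm_state(in_alarm: bool, acked: bool) -> AlarmState:
    if in_alarm:
        return AlarmState.ACTIVE_ACKED if acked else AlarmState.ACTIVE_UNACKED
    return AlarmState.INACTIVE if acked else AlarmState.INACTIVE_UNACKED


class AlarmCondition:
    """Server-side condition fed by plain attribute value changes."""

    def __init__(self, source: str) -> None:
        self.source = source
        self.in_alarm = False
        self.acked = True
        self.state = AlarmState.INACTIVE

    def on_value(self, attribute: str, value):
        """Return the new state if a transition occurred, else None."""
        if attribute == "InAlarm":
            self.in_alarm = bool(value)
        elif attribute == "Acked":
            self.acked = bool(value)
        else:
            return None  # Priority / description do not change the state here
        new_state = alarm_state(self.in_alarm, self.acked)
        if new_state is self.state:
            return None
        self.state = new_state
        return new_state
```

Nothing in the sketch is Galaxy-specific, which is the argument for hosting it in the server: any driver that forwards the same two booleans gets the same transitions.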
### ot-3. Driver capabilities trim
After ot-1 and ot-2, `Driver.Galaxy` no longer needs to implement:
- `IHistoryProvider` (server's HA service handles it via Wonderware
historian data source)
- `IAlarmHistorianWriter` (server's A&E historian, or kept generic — Galaxy
shouldn't own the SQLite path)
- `IAlarmSource` ack route (server-level alarm subsystem writes back via the
driver's `IWritable.WriteAsync`, which the gw already supports)
Keep:
- `IDriver`, `ITagDiscovery`, `IReadable`, `IWritable`, `ISubscribable`,
`IRediscoverable`, `IHostConnectivityProbe`.
### ot-4. Treat `time_of_last_deploy` as `IRediscoverable`'s pump
Replace the Host-side change-detection poll with a managed
`GalaxyRepositoryClient.WatchDeployEventsAsync` consumer in the driver.
Each event raises `OnRediscoveryNeeded` with the new deploy time as the
`scopeHint`. No polling code in this repo.
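
A minimal Python sketch of that consumer, under stated assumptions: `DeployWatcher` is a hypothetical name, the callback stands in for `OnRediscoveryNeeded`, and suppressing repeated events with an unchanged deploy time is an added assumption rather than something gw promises to need.

```python
class DeployWatcher:
    """Turns deploy events into rediscovery requests, one per new deploy time."""

    def __init__(self, on_rediscovery_needed) -> None:
        self._on_rediscovery_needed = on_rediscovery_needed
        self._last_deploy = None

    def on_deploy_event(self, time_of_last_deploy: str) -> bool:
        """Return True if rediscovery was requested for this event."""
        if time_of_last_deploy == self._last_deploy:
            return False  # assumption: ignore repeats of the same deploy
        self._last_deploy = time_of_last_deploy
        # The deploy time doubles as the scope hint for the rediscovery.
        self._on_rediscovery_needed(scope_hint=time_of_last_deploy)
        return True
```

The driver would feed each item from the deploy-event stream into `on_deploy_event`; there is no timer anywhere in the loop.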
### ot-5. Connection pool at the server, not the driver
If the redundancy pair runs two OtOpcUa instances against one gw, each
should reuse a single `GrpcChannel` per process (already gRPC default) but
open its **own session** (one MXAccess client identity per OtOpcUa instance,
not one shared session that fights over Wonderware client state). Encode
the per-instance MXAccess client name in driver config — already partly
there (`OTOPCUA_GALAXY_CLIENT_NAME`); make it explicit in the new driver's
`appsettings.json` shape.
---
## Phased implementation
Each phase is a working, mergeable slice. Keep `Galaxy.Host` running
alongside the new driver until phase 7 — gated by a config switch
`Galaxy:Backend = legacy-host | mxgateway`.
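
The switch itself could be resolved along these lines. This is a Python sketch with a hypothetical `select_backend` helper; treating `legacy-host` as the default when the key is absent is an assumption that matches "keep `Galaxy.Host` running" but is not stated above.

```python
VALID_BACKENDS = ("legacy-host", "mxgateway")


def select_backend(config: dict) -> str:
    """Read Galaxy:Backend from a nested config mapping and validate it."""
    # Assumption: an absent key keeps the legacy path until cut-over.
    value = config.get("Galaxy", {}).get("Backend", "legacy-host")
    if value not in VALID_BACKENDS:
        raise ValueError(
            f"Galaxy:Backend must be one of {VALID_BACKENDS}, got {value!r}"
        )
    return value
```

Failing fast on an unknown value matters here because a typo would otherwise silently leave a deployment on whichever backend happens to be the fallback.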
### Phase 0 — pre-flight (mxaccessgw repo)
Ship gw-1, gw-2, gw-4, gw-9 (the parity, performance, and contract bits the
plan immediately depends on). gw-3, gw-5, gw-6, gw-7 can come during or
after phase 5.
**Exit:** local OtOpcUa dev box can `MxGatewayClient.Create` a client, open a
session, `SubscribeBulkAsync` 100 tags, and observe `OnDataChange` events at
the configured update rate.
### Phase 1 — server-level historian extension point (ot-1)
1. Extract `IHistorianDataSource` (and its DTOs `HistorianSample`,
`HistorianAggregateSample`, `HistoricalEvent`) from
`Driver.Galaxy.Host/Backend/Historian/` into
`src/ZB.MOM.WW.OtOpcUa.Core/Abstractions/Historian/`.
2. Extend the OPC UA HA service to look up a registered
`IHistorianDataSource` per namespace and call into it for `HistoryRead`,
`HistoryReadProcessed`, `HistoryReadAtTime`, `HistoryReadEvents`. Drivers
stop implementing `IHistoryProvider` directly; the server proxies.
3. Add a no-op default registration so drivers without history keep working.
**Exit:** all current Galaxy history reads route through an
`IHistorianDataSource` registered by `Driver.Galaxy.Host` (still legacy)
without behavior change. Other drivers untouched.
### Phase 2 — server-level alarm subsystem (ot-2)
1. Add an `IAlarmConditionDeclaration` API on the address-space builder so
discovery can flag a node as alarm-bearing and supply the four
sub-attribute references.
2. Add a hosted `AlarmConditionService` in the server that, on driver
`Discover`, subscribes to the four sub-attributes via the driver's own
`ISubscribable`, runs the state machine, and emits
`IAlarmSource.OnAlarmEvent` itself. Acks route back through the driver's
`IWritable.WriteAsync` to the `.AckMsg` attribute.
3. Add Galaxy-specific defaults (sub-attribute naming) as a small adapter
so the same service can serve future drivers with different conventions.
**Exit:** Galaxy alarms still work end-to-end; the tracker code that runs
inside `Galaxy.Host` is dead but kept for the legacy-host backend path.
### Phase 3 — Wonderware Historian sidecar (`Driver.Historian.Wonderware`)
1. New solution project: `Driver.Historian.Wonderware`, .NET 4.8 x86,
console app + NSSM (mirrors today's Galaxy.Host packaging exactly,
minus Galaxy responsibilities).
2. Hosts the existing `HistorianDataSource`, `HistorianClusterEndpointPicker`,
`HistorianHealthSnapshot` code lifted from `Galaxy.Host/Backend/Historian/`
and exposes them over a small named-pipe protocol (or local gRPC if
.NET 4.8 cost is acceptable; named pipe is simpler).
3. Add `Driver.Historian.Wonderware.Client` — .NET 10 — implementing
`IHistorianDataSource` against the sidecar.
4. Server registers it as a data source for the `Galaxy` namespace.
**Exit:** OPC UA history reads work via the sidecar with the legacy-host
backend still in place. We've decoupled history from MXAccess.
### Phase 4 — new `Driver.Galaxy` against gw
This is the meat. New project: `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/`, .NET 10,
in-process. Capabilities (post ot-3): `IDriver`, `ITagDiscovery`, `IReadable`,
`IWritable`, `ISubscribable`, `IRediscoverable`, `IHostConnectivityProbe`.
Shape:
```
Driver.Galaxy/
GalaxyDriver.cs # IDriver root
Browse/
GalaxyDiscoverer.cs # consumes GalaxyRepositoryClient.DiscoverHierarchyAsync
DataTypeMap.cs # mx_data_type → DriverDataType
SecurityMap.cs # security_classification → SecurityClassification
Runtime/
GalaxyMxSession.cs # owns one MxGatewaySession; Register + map per-driver client name
SubscriptionRegistry.cs # tag → server/item handles; persists to memory only
EventPump.cs # consumes session.StreamEventsAsync, fans out to OnDataChange
ReconnectSupervisor.cs # gw transport drop / session-lost recovery
DeployWatcher.cs # GalaxyRepositoryClient.WatchDeployEventsAsync → OnRediscoveryNeeded
Health/
HostConnectivityForwarder.cs # gw-6 SessionHealth → IHostConnectivityProbe
Config/
GalaxyDriverOptions.cs # endpoint, ApiKey, ClientName, TLS, retry, intervals
GalaxyDriverFactoryExtensions.cs # AddGalaxyDriver(IServiceCollection)
```
Key behaviors:
- **Discovery** calls `GalaxyRepositoryClient.DiscoverHierarchyAsync()`
once at init and on every `WatchDeployEvents` event, then drives the
address space builder. Same node naming as today (parent contained-name
hierarchy + leaf attributes named `tag_name.AttributeName`).
- **Read**: a one-off `AddItem` + `Advise` + read-after-first-callback
sequence is overkill. Instead, use **`Register` + per-call `AddItem`/`Read`** if gw
exposes a synchronous read; otherwise fall back to a short-lived advise. *Action item:*
confirm gw's read story; if absent, request a synchronous `ReadAsync` RPC
on top of MXAccess `Read` (which exists in the COM API).
- **Write** maps `WriteRequest.Value` to `MxValue` via gw-7 helpers and
calls `WriteAsync(serverHandle, itemHandle, value, userId=0)`. Routes
`WriteSecured` (where `SecurityClassification == SecuredWrite/Verified`)
to `WriteSecuredAsync` once exposed on `MxGatewaySession`.
- **Subscribe** calls `SubscribeBulkAsync` once per `ISubscribable.Subscribe`
call. Stores `(tag → itemHandle, sid)` in `SubscriptionRegistry`. The
single `EventPump` consumes one `StreamEventsAsync` per session and fans
out per `sid`.
- **Unsubscribe** calls `UnsubscribeBulkAsync` and drops registry entries.
- **Reconnect** — when the gRPC channel drops or `StreamEvents` returns,
`ReconnectSupervisor` reopens the session and replays subscriptions via
gw-5 `ReplaySubscriptionsAsync`. The driver flags `DriverState.Degraded`
during recovery; the server keeps publishing last-good values with
`Uncertain` quality.
- **Host connectivity** — single synthesized host entry named after
`OTOPCUA_GALAXY_CLIENT_NAME` driven by gw-6 `SessionHealth` updates
(or, until gw-6 lands, by transport drops).
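
The Subscribe / Unsubscribe behavior above (one registry, one pump, fan-out per `sid`) can be sketched as follows. This is an illustrative Python model, not the driver's implementation: the class and function names are hypothetical, events are reduced to `(item_handle, value)` pairs, and each item handle is assumed to belong to at most one subscription.

```python
from collections import defaultdict


class SubscriptionRegistry:
    """In-memory map between item handles and subscription ids (sid)."""

    def __init__(self):
        self._sid_by_handle = {}
        self._handles_by_sid = defaultdict(set)

    def add(self, sid, item_handles):
        for handle in item_handles:
            self._sid_by_handle[handle] = sid
            self._handles_by_sid[sid].add(handle)

    def remove(self, sid):
        # Drop every handle owned by this subscription.
        for handle in self._handles_by_sid.pop(sid, set()):
            self._sid_by_handle.pop(handle, None)

    def sid_for(self, item_handle):
        return self._sid_by_handle.get(item_handle)


def fan_out(events, registry, callbacks):
    """Route each event to its subscription's callback.

    Returns the number of events that had no registered owner.
    """
    unrouted = 0
    for item_handle, value in events:
        callback = callbacks.get(registry.sid_for(item_handle))
        if callback is None:
            unrouted += 1
            continue
        callback(item_handle, value)
    return unrouted


registry = SubscriptionRegistry()
registry.add(sid=1, item_handles=[10, 11])
registry.add(sid=2, item_handles=[20])

received = {1: [], 2: []}
callbacks = {
    sid: (lambda handle, value, s=sid: received[s].append((handle, value)))
    for sid in received
}

unrouted = fan_out([(10, "a"), (20, "b"), (99, "c")], registry, callbacks)
```

Counting unrouted events rather than raising keeps a late event for an already-removed subscription from stalling the single pump.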
Wire into the server next to other Tier-A drivers in the
`AddDrivers(...)` call site.
**Exit:** flipping `Galaxy:Backend` to `mxgateway` runs the OPC UA server
end-to-end with no `Galaxy.Host` involvement. Live read, live write, live
subscribe pass against the dev Galaxy. Historian + alarms still work via
phases 1-3.
### Phase 5 — parity test matrix
Reuse the existing live-Galaxy integration tests; run each scenario twice:
once with `Galaxy:Backend=legacy-host`, once with `mxgateway`. Compare:
- discovered hierarchy node count + names + datatypes,
- subscribed publish rates (allow ±10% tolerance vs. legacy),
- write success / status codes for each `SecurityClassification`,
- alarm condition transitions (Active / Acked / Inactive) — already
routed through phase 2's server-level subsystem,
- history reads — phase 3 sidecar, identical results both backends,
- reconnect behavior under gw kill, worker kill, network drop, ZB drop.
Document the matrix; resolve every discrepancy or explicitly accept it.
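
Two of the comparisons above (discovered hierarchy, and publish rates within ±10% of legacy) could be scripted roughly as below. This is a hedged sketch in Python rather than the test project's own code: the input shape (a `name -> datatype` dictionary per backend) and both function names are assumptions for illustration.

```python
def hierarchy_deltas(legacy_nodes, mxgw_nodes):
    """Compare two {node name: datatype} maps and report every difference."""
    legacy_names, mxgw_names = set(legacy_nodes), set(mxgw_nodes)
    return {
        "only_legacy": sorted(legacy_names - mxgw_names),
        "only_mxgw": sorted(mxgw_names - legacy_names),
        "datatype_mismatch": sorted(
            name
            for name in legacy_names & mxgw_names
            if legacy_nodes[name] != mxgw_nodes[name]
        ),
    }


def rate_within_tolerance(legacy_rate, mxgw_rate, tolerance=0.10):
    """True when the mxgw publish rate is within ±tolerance of legacy."""
    if legacy_rate == 0:
        return mxgw_rate == 0
    return abs(mxgw_rate - legacy_rate) <= tolerance * legacy_rate


legacy = {"Tank1.Level": "Float", "Tank1.Alarm": "Boolean", "Pump1.Run": "Boolean"}
mxgw = {"Tank1.Level": "Double", "Tank1.Alarm": "Boolean", "Pump2.Run": "Boolean"}

deltas = hierarchy_deltas(legacy, mxgw)
for kind, names in deltas.items():
    print(kind, names)
```

Every non-empty list in the result is a delta that either needs a fix or an explicit acceptance note in the matrix.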
**Exit:** parity matrix has zero unexplained deltas. Performance budget
agreed: e.g. ≤ 2× per-call latency vs. named-pipe baseline at the 95th
percentile, equal or better throughput in `SubscribeBulk` setup time.
### Phase 6 — perf + hardening
- Land gw-9 buffered-update intervals.
- Add OpenTelemetry traces from the driver around every gw call,
correlated via `client_correlation_id`.
- Write soak test: 50k tags subscribed, 24h, count missed events, gw
restarts, OtOpcUa restarts.
- Tune `MxGatewayClientOptions.MaxGrpcMessageBytes`, retry pipeline,
call timeouts based on soak results.
**Exit:** production-acceptable perf numbers documented in
`docs/Galaxy.Driver.md`.
### Phase 7 — retirement
1. Default `Galaxy:Backend = mxgateway` everywhere (sample configs,
install scripts, e2e configs).
2. Delete `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host`,
`src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy`,
`src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared`, and matching tests.
3. Remove `OtOpcUaGalaxyHost` NSSM registration from
`scripts/install/Install-Services.ps1`. Add a registration block for the
Wonderware historian sidecar from phase 3.
4. Remove every x86 .NET 4.8 reference, build target, and CI step from this
repo; remove `mxaccess_documentation.md`-driven dependencies that no
longer apply.
5. Update CLAUDE.md, `docs/v2/dev-environment.md`, `docs/ServiceHosting.md`,
`docs/Redundancy.md` to reflect the new topology.
6. Memory housekeeping: retire `project_galaxy_host_service.md` and
`project_galaxy_host_installed.md`; add a short note about the gw
dependency.
**Exit:** `git grep -i 'Galaxy\.Host'` returns nothing in source.
---
## Configuration shape (new driver)
```jsonc
"Drivers": {
"Galaxy": {
"Type": "Galaxy",
"InstanceId": "galaxy-prod-1",
"Gateway": {
"Endpoint": "https://mxgw.aveva.local:5001",
"ApiKeySecretRef": "galaxy:apiKey", // resolved via existing secret store
"UseTls": true,
"CaCertificatePath": "C:\\publish\\mxgw\\ca.crt",
"ConnectTimeoutSeconds": 10,
"DefaultCallTimeoutSeconds": 5,
"StreamTimeoutSeconds": 0 // unbounded
},
"MxAccess": {
"ClientName": "OtOpcUa-A", // unique per OtOpcUa instance
"PublishingIntervalMs": 1000, // hint for SetBufferedUpdateInterval
"WriteUserId": 0
},
"Repository": {
"DiscoverPageSize": 5000,
"WatchDeployEvents": true
},
"Reconnect": {
"InitialBackoffMs": 500,
"MaxBackoffMs": 30000,
"ReplayOnSessionLost": true
}
}
}
```
The OtOpcUa secret store already handles DPAPI-protected values for LDAP
binds; reuse it for the gw API key. Never put the key in plaintext in the
sample config.
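
To make the `Reconnect` knobs concrete, here is one way the two backoff bounds could translate into a retry schedule. It is a sketch in Python under a stated assumption: the config only names `InitialBackoffMs` and `MaxBackoffMs`, so the doubling between attempts (and the absence of jitter) is an illustrative choice, not something the plan specifies.

```python
def backoff_schedule(initial_ms, max_ms, attempts):
    """Return the delay before each reconnect attempt, in milliseconds.

    Assumption: the delay doubles after every failed attempt and is
    capped at max_ms. Jitter is deliberately left out of this sketch.
    """
    delays = []
    delay = initial_ms
    for _ in range(attempts):
        delays.append(min(delay, max_ms))
        delay = min(delay * 2, max_ms)
    return delays


# Values taken from the sample config above.
schedule = backoff_schedule(initial_ms=500, max_ms=30000, attempts=8)
print(schedule)
```

With the sample values the delay reaches the 30 s cap on the seventh attempt, so a long gateway outage settles into one retry every 30 seconds.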
---
## Risks and mitigations
| Risk | Mitigation |
|---|---|
| gw protocol regression breaks production | Pin gw NuGet to a contract version range; CI runs parity matrix on every gw bump; staged rollout via `Galaxy:Backend` flag. |
| Per-call latency regresses for chatty workloads | Land gw-9 (buffered updates) before phase 5; soak the 95p in phase 6. |
| Reconnect storm after gw restart re-registers 50k tags | Land gw-3 or gw-5 before phase 6; client-side bulk replay throttled by `SubscribeBulkAsync` chunk size. |
| Alarm parity gap from moving tracker server-side | Phase 2 ships before phase 4; parity matrix gates phase 7. |
| Historian sidecar adds a second .NET 4.8 x86 service | Acceptable: it's a *driver-agnostic* component, and it ships only where Wonderware historian access is actually needed. |
| Two OtOpcUa instances both registering as same MXAccess client | `ClientName` is per-instance config (ot-5); install scripts lint that the redundancy pair has distinct names. |
| Cross-machine MXAccess writes traverse plaintext gRPC | Phase 0 enforces `UseTls=true` for any non-loopback `Endpoint`; CI lints the sample configs. |
| gw API key leaked in logs | gw and `MxGatewayClient` already redact `authorization` metadata; phase 6 audit. |
| Memory leak in `EventPump` under high event rate | Bounded channel between `StreamEventsAsync` and per-sub fan-out, drop-newest with a metric counter; soak test catches. |
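
The last mitigation in the table (a bounded channel with drop-newest and a metric counter) can be illustrated with a small model. This Python sketch uses the standard library's `queue.Queue` as a stand-in for the bounded channel; the class name `BoundedEventBuffer` and the plain integer counter are hypothetical placeholders for whatever the driver and its metrics library actually use.

```python
import queue


class BoundedEventBuffer:
    """Fixed-capacity buffer that rejects new events once it is full."""

    def __init__(self, capacity):
        if capacity <= 0:
            # queue.Queue treats maxsize <= 0 as unbounded, which would
            # defeat the purpose of this buffer.
            raise ValueError("capacity must be positive")
        self._queue = queue.Queue(maxsize=capacity)
        self.dropped = 0  # stand-in for a metric counter

    def offer(self, event):
        """Enqueue without blocking; drop the incoming event when full."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self):
        """Remove and return everything currently buffered, oldest first."""
        items = []
        while True:
            try:
                items.append(self._queue.get_nowait())
            except queue.Empty:
                return items


buffer = BoundedEventBuffer(capacity=3)
accepted = [buffer.offer(i) for i in range(5)]
drained = buffer.drain()
print(accepted, buffer.dropped, drained)
```

Dropping the newest event keeps memory flat and the producer non-blocking; the counter is what lets a soak test notice that drops happened at all.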
---
## Cross-cutting deliverables
- **Docs:** `docs/v2/Galaxy.Driver.md` (new), updates to
`docs/v2/dev-environment.md`, `docs/ServiceHosting.md`,
`docs/Redundancy.md`, `CLAUDE.md`.
- **Install scripts:** `scripts/install/Install-Services.ps1` removes
`OtOpcUaGalaxyHost`, adds `OtOpcUaWonderwareHistorian`, no Galaxy
service registration on the OtOpcUa node.
- **e2e:** `scripts/e2e/e2e-config.sample.json` — drop `OTOPCUA_GALAXY_*`
pipe vars, add `Drivers:Galaxy:Gateway:Endpoint` etc.
- **Memory:** retire stale Galaxy.Host entries; add gw dependency entry,
redundancy + client-name guidance.
---
## Order-of-work summary
```
Phase 0 (gw repo): gw-1, gw-2, gw-4, gw-9
Phase 1 (this): ot-1 — historian extension point
Phase 2 (this): ot-2 — alarm subsystem
Phase 3 (this): Driver.Historian.Wonderware sidecar
Phase 4 (this): Driver.Galaxy (new) behind backend flag
— depends on Phase 0, 1, 2
Phase 5 (this+gw): parity matrix
— drives gw-3 / gw-5 / gw-6 / gw-7 if gaps surface
Phase 6 (this): perf + hardening
Phase 7 (this): retire Galaxy.Host / Proxy / Shared
```
Phases 1-3 are independent of each other and can run in parallel. Phase 4
needs all three plus Phase 0. Phase 5 requires Phase 4. Phases 6 and 7 are
sequential after Phase 5.
+1050
File diff suppressed because it is too large
+43 -20
@@ -53,27 +53,47 @@ read-only tag.
## Status
Stages 1 + 2 (driver-side probe + loopback) are verified end-to-end
against the pymodbus / ab_server / python-snap7 fixtures. Stages 3-5
(anything crossing the OtOpcUa server) are **blocked** on server-side
driver factory wiring:
All seven driver factories are registered in
`src/ZB.MOM.WW.OtOpcUa.Server/Program.cs` — Galaxy, FOCAS, Modbus,
AB CIP, AB Legacy, S7, TwinCAT. `DriverInstanceBootstrapper` can
materialise any `DriverType` row from the central Config DB into a
live driver. The factory-wiring block that originally gated stages
3-5 is closed.
- `src/ZB.MOM.WW.OtOpcUa.Server/Program.cs` only registers Galaxy +
FOCAS factories. `DriverInstanceBootstrapper` skips any `DriverType`
without a registered factory — so Modbus / AB CIP / AB Legacy / S7 /
TwinCAT rows in the Config DB are silently no-op'd even when the seed
is perfect.
- No Config DB seed script exists for non-Galaxy drivers; Admin UI is
currently the only path to author one.
Live-boot verification:
Tracking: **#209** (umbrella) → #210 (Modbus), #211 (AB CIP), #212 (S7),
#213 (AB Legacy, also hardware-gated — #222). Each child issue lists
the factory class to write + the seed SQL shape + the verification
command.
- **Galaxy** — 7/7 stages (read / write / subscribe / alarms / history)
against a real Galaxy + `OtOpcUaGalaxyHost` on this dev box.
- **AB CIP, S7** — 5/5 stages each under task #220 against the
`ab_server` + `python-snap7` fixtures.
- **AB Legacy** — 5/5 stages under task #222 against `ab_server` SLC500
/ MicroLogix / PLC-5 profiles (requires the `cip-path /1,0` workaround
for the Docker fixture).
- **Modbus** — 5/5 stages against the `pymodbus` + dl205 profile,
including HR[200] scratch register + per-protocol bidirectional +
subscribe-sees-change stages.
- **TwinCAT** — factory registered; driver features validated against the
TCBSD VM virtual-PLC fixture (FreeBSD + TwinCAT/BSD runtime on ESXi —
bypasses the Hyper-V/RTIME conflict that blocks XAR on the dev box).
`TWINCAT_TRUST_WIRE=1` is still required to run the script —
false-pass-prevention belt, not an "unverified" flag.
- **FOCAS** — factory registered; gated by `FOCAS_TRUST_WIRE=1` pending
the lab-rig CNC (task #222 follow-up).
- **OpcUaClient (gateway)** — eight-stage script (`test-opcuaclient.ps1`)
covers probe / remote read / forward bridge / subscribe / reverse
bridge / browse mirror / alarm / history against the opc-plc Docker
fixture at `opc.tcp://localhost:50000`. Reverse-bridge / alarm /
history stages are opt-in per the parameter docs (opc-plc's default
image has no writable nodes and does not historize).
Until those ship, stages 3-5 will fail with "read failed" (nothing
published at that NodeId) and `[FAIL]` the suite even on a running
server.
Remaining work is **per-protocol seed authoring**: each dev fills in
the NodeIds their server publishes under `e2e-config.json` (sidecar
is `.gitignore`-d; see `e2e-config.sample.json` for the shape). Admin
UI remains the supported path for authoring the matching driver
instance rows in the Config DB.
Tracking: umbrella #209 is closed; remaining TwinCAT / FOCAS work
tracks under their hardware-fixture tasks (#221 / #222).
## Prereqs
@@ -85,7 +105,9 @@ server.
for the simulator matrix — pymodbus / ab_server / python-snap7 /
opc-plc cover Modbus / AB / S7 / OPC UA Client. FOCAS and TwinCAT
have no public simulator; they are gated with env-var skip flags
below.
below. For OpcUaClient, `docker compose -f
tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/Docker/
docker-compose.yml up -d` brings up `opc-plc` on port 50000.
3. **PowerShell 7+**. The runner uses null-coalescing + `Set-StrictMode`;
the Windows-PowerShell-5.1 shell will not parse `test-all.ps1`.
4. **.NET 10 SDK**. Each script either runs `dotnet run --project
@@ -136,7 +158,8 @@ section to skip it.
| Galaxy | — | **PASS** (requires OtOpcUaGalaxyHost + a live Galaxy; 7 stages including alarms + history) |
| S7 | — | **PASS** (python-snap7 fixture) |
| FOCAS | `FOCAS_TRUST_WIRE=1` | **SKIP** (no public simulator — task #222 lab rig) |
| TwinCAT | `TWINCAT_TRUST_WIRE=1` | **SKIP** (needs XAR or standalone Router — task #221) |
| TwinCAT | `TWINCAT_TRUST_WIRE=1` | **SKIP** by default; features **validated** against the TCBSD VM fixture — set the env var to run |
| OpcUaClient | — | **PASS** stages 1-4 + browse (opc-plc Docker fixture); stages 5/7/8 are opt-in (require writable / alarm / historizing upstream) |
| Phase 7 | — | **PASS** if the Modbus instance seeds a `VT_DoubledHR100` virtual tag + `AlarmHigh` scripted alarm |
Set the `*_TRUST_WIRE` env vars to `1` when you've pointed the script at
+5 -2
@@ -422,8 +422,11 @@ function Write-Summary {
[Parameter(Mandatory)] [string]$Title,
[Parameter(Mandatory)] [array]$Results
)
$passed = ($Results | Where-Object { $_.Passed }).Count
$failed = ($Results | Where-Object { -not $_.Passed }).Count
# @(...) forces an array even when Where-Object matches 0 or 1 items,
# otherwise .Count trips Set-StrictMode -Version 3.0 ("property 'Count'
# cannot be found on this object") on $null or on a single hashtable.
$passed = @($Results | Where-Object { $_.Passed }).Count
$failed = @($Results | Where-Object { -not $_.Passed }).Count
Write-Host ""
Write-Host "=== $Title summary: $passed/$($Results.Count) passed ===" `
-ForegroundColor $(if ($failed -eq 0) { "Green" } else { "Red" })
+13 -4
@@ -34,7 +34,7 @@
},
"focas": {
"$comment": "Gated behind FOCAS_TRUST_WIRE=1 — no public simulator. Point at a real CNC + ensure Fwlib32.dll is on PATH.",
"$comment": "Gated behind FOCAS_TRUST_WIRE=1 for real-CNC runs, or pass -ProfileName to run against the focas-mock Docker fixture. Managed wire client — no native dependencies.",
"host": "192.168.1.20",
"port": 8193,
"address": "R100",
@@ -61,9 +61,18 @@
},
"opcuaclient": {
"$comment": "Optional upstream-redundancy probe (PR-14). When both primaryUrl and secondaryUrl are set, test-opcuaclient.ps1 runs an extra bridged read while both upstreams are reachable. Leave keys absent to skip the redundancy stage. The OtOpcUa server's DriverConfig for the OpcUaClient instance must already have Redundancy.Enabled=true + the same EndpointUrls list; this script doesn't reconfigure the driver.",
"primaryUrl": "opc.tcp://localhost:50000",
"secondaryUrl": "opc.tcp://localhost:50002"
"$comment": "OPC UA Client (gateway) driver. Default opc-plc Docker fixture exposes ns=3;s=FastUInt1 as a ticker. The `bridgeNodeId` is the local mirror of remoteNodeId after the OpcUaClient driver's DiscoverAsync runs — dev-specific. Stages 5/7/8 are opt-in: supply writable* NodeIds to enable reverse-bridge, alarmNodeId to enable alarm, historyNodeId to enable history (opc-plc does not historize by default — a Prosys / UA Expert sample server is needed for stage 8).",
"remoteUrl": "opc.tcp://localhost:50000",
"remoteNodeId": "ns=3;s=FastUInt1",
"bridgeNodeId": "ns=2;s=OpcUaClient/FastUInt1",
"bridgeRootNodeId": "ns=2;s=OpcUaClient",
"browseDepth": 3,
"browseMinNodes": 5,
"changeWaitSec": 8,
"writableRemoteNodeId": "",
"writableBridgeNodeId": "",
"alarmNodeId": "",
"historyNodeId": ""
},
"phase7": {
-210
@@ -1,210 +0,0 @@
#Requires -Version 7.0
<#
.SYNOPSIS
End-to-end CLI test for AB CIP HSBY failover routing (PR abcip-5.2). Subscribes to
a tag through the OtOpcUa OPC UA server, flips the active chassis mid-stream via
the paired-fixture's hsby-mux sidecar HTTP endpoint, and asserts the subscribe
stream survives the failover (no permanent loss of notifications + the post-flip
data carries the partner-side update).
.DESCRIPTION
Paired-fixture variant of test-abcip.ps1. Where test-abcip.ps1 runs against a
single ab_server instance, this script assumes a paired fixture with two
ab_server instances (primary + partner) and an hsby-mux sidecar exposing
/flip {"active": "primary" | "partner"} over HTTP.
Five assertions:
- HsbyInitialActive primary is Active at start (hsby-mux primes it)
- HsbyResolveActive driver-diagnostics surfaces AbCip.HsbyActive == 1
- HsbyFailoverFlip POST {"active": "partner"} → AbCip.HsbyActive == 2
- HsbySubscribeSurvives subscribe stream stays open across the flip + sees
an updated value from the partner side
- HsbyFailoverCount AbCip.HsbyFailoverCount increments by 1
.PARAMETER PrimaryGateway
ab://host[:port]/cip-path of the primary chassis. Default ab://127.0.0.1/1,0.
.PARAMETER PartnerGateway
ab://host[:port]/cip-path of the partner chassis. Default ab://127.0.0.2/1,0.
.PARAMETER HsbyMuxUrl
Base URL of the paired-fixture's hsby-mux sidecar. Default http://localhost:7080.
Endpoints used:
GET /role returns {"primary":"Active","partner":"Standby"}
POST /flip {"active":"primary"|"partner"} flips role tag values on each chassis
.PARAMETER OpcUaUrl
OtOpcUa server endpoint. Default opc.tcp://localhost:4840.
.PARAMETER BridgeNodeId
NodeId at which the server publishes the tag exercised by the subscribe assertion.
Required.
.PARAMETER TagPath
Logix symbolic path the bridge tag points at. Default 'TestDINT'.
.PARAMETER DriverInstanceId
DriverInstance ID for the AB CIP driver under test. Used to scope the
driver-diagnostics RPC. Default 'abcip-hsby'.
.EXAMPLE
./test-abcip-hsby.ps1 -BridgeNodeId 'ns=2;s=AbCip/Bridge/TestDINT'
#>
param(
[string]$PrimaryGateway = "ab://127.0.0.1/1,0",
[string]$PartnerGateway = "ab://127.0.0.2/1,0",
[string]$HsbyMuxUrl = "http://localhost:7080",
[string]$OpcUaUrl = "opc.tcp://localhost:4840",
[Parameter(Mandatory)] [string]$BridgeNodeId,
[string]$TagPath = "TestDINT",
[string]$DriverInstanceId = "abcip-hsby"
)
$ErrorActionPreference = "Stop"
. "$PSScriptRoot/_common.ps1"
$abcipCli = Get-CliInvocation `
-ProjectFolder "src/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Cli" `
-ExeName "otopcua-abcip-cli"
$opcUaCli = Get-CliInvocation `
-ProjectFolder "src/ZB.MOM.WW.OtOpcUa.Client.CLI" `
-ExeName "otopcua-cli"
$results = @()
function Invoke-HsbyFlip {
param([string]$Active)
$body = @{ active = $Active } | ConvertTo-Json -Compress
try {
Invoke-RestMethod -Uri "$HsbyMuxUrl/flip" -Method Post -Body $body -ContentType 'application/json'
} catch {
throw "hsby-mux at $HsbyMuxUrl/flip rejected the request: $($_.Exception.Message)"
}
}
function Get-HsbyDiagnosticValue {
param([string]$Counter)
# Pull driver-diagnostics through the OPC UA Admin RPC surface. The CLI returns
# a raw JSON blob; we grep out the named counter so the assertion is robust to
# other counters the driver surfaces.
$diagArgs = @($opcUaCli.PrefixArgs) + @(
"driver-diagnostics", "-u", $OpcUaUrl, "-d", $DriverInstanceId)
$diagOut = & $opcUaCli.File @diagArgs 2>&1
$joined = ($diagOut -join "`n")
if ($joined -match "${Counter}.*?:\s*([\d\.]+)") {
return [double]$matches[1]
}
return $null
}
# ---- HsbyInitialActive — hsby-mux primes primary as Active ----
Write-Header "HsbyInitialActive (POST $HsbyMuxUrl/flip {active=primary})"
try {
Invoke-HsbyFlip -Active "primary" | Out-Null
Start-Sleep -Seconds 3 # role-probe loop default tick is 2s
$active = Get-HsbyDiagnosticValue -Counter "AbCip.HsbyActive"
$passed = ($active -eq 1.0)
$results += [PSCustomObject]@{
Name = "HsbyInitialActive"
Passed = $passed
Detail = if ($passed) { "AbCip.HsbyActive=1 after priming primary" } else { "AbCip.HsbyActive=$active (expected 1)" }
}
} catch {
$results += [PSCustomObject]@{
Name = "HsbyInitialActive"; Passed = $false; Detail = $_.Exception.Message
}
}
# ---- HsbyResolveActive — driver routing reads through the primary ----
Write-Header "HsbyResolveActive (read $TagPath via primary)"
$readArgs = @("read") + @("-g", $PrimaryGateway, "-f", "ControlLogix") + @("-t", $TagPath, "--type", "DInt")
$readOut = & $abcipCli.Exe @($abcipCli.Args + $readArgs) 2>&1
$readOk = ($readOut -join "`n") -notmatch "(error|fail)"
$results += [PSCustomObject]@{
Name = "HsbyResolveActive"
Passed = $readOk
Detail = if ($readOk) { "primary read completed without error" } else { "read failed: $($readOut -join ' ')" }
}
# ---- HsbySubscribeSurvives + HsbyFailoverFlip + HsbyFailoverCount ----
Write-Header "HsbyFailoverFlip + HsbySubscribeSurvives (subscribe across flip)"
$failoverBaseline = Get-HsbyDiagnosticValue -Counter "AbCip.HsbyFailoverCount"
if ($null -eq $failoverBaseline) { $failoverBaseline = 0 }
$duration = 12
$subOut = New-TemporaryFile
$subErr = New-TemporaryFile
$subArgs = @($opcUaCli.PrefixArgs) + @(
"subscribe", "-u", $OpcUaUrl, "-n", $BridgeNodeId, "-i", "200", "--duration", "$duration")
$subProc = Start-Process -FilePath $opcUaCli.File -ArgumentList $subArgs `
-NoNewWindow -PassThru `
-RedirectStandardOutput $subOut.FullName `
-RedirectStandardError $subErr.FullName
# Let the subscribe settle + accumulate primary-side notifications.
Start-Sleep -Seconds 3
# Mid-stream flip — primary→Standby, partner→Active.
try {
Invoke-HsbyFlip -Active "partner" | Out-Null
} catch {
Stop-Process -Id $subProc.Id -Force -ErrorAction SilentlyContinue
$results += [PSCustomObject]@{
Name = "HsbyFailoverFlip"; Passed = $false; Detail = "hsby-mux flip rejected: $($_.Exception.Message)"
}
}
# Wait for the role-probe loop to catch up (default tick 2s + ProbeIntervalMs slack).
Start-Sleep -Seconds 4
# Drive a write through the partner so the subscribe sees a fresh value.
$flipValue = Get-Random -Minimum 70000 -Maximum 79999
$writeArgs = @("write") + @("-g", $PartnerGateway, "-f", "ControlLogix") + @("-t", $TagPath, "--type", "DInt", "-v", $flipValue)
& $abcipCli.Exe @($abcipCli.Args + $writeArgs) | Out-Null
$activeAfter = Get-HsbyDiagnosticValue -Counter "AbCip.HsbyActive"
$flipPassed = ($activeAfter -eq 2.0)
$results += [PSCustomObject]@{
Name = "HsbyFailoverFlip"
Passed = $flipPassed
Detail = if ($flipPassed) { "AbCip.HsbyActive=2 after flip" } else { "AbCip.HsbyActive=$activeAfter (expected 2)" }
}
# Stop the subscribe + harvest the stream.
$subProc.WaitForExit(($duration + 5) * 1000) | Out-Null
if (-not $subProc.HasExited) { Stop-Process -Id $subProc.Id -Force }
$subText = (Get-Content $subOut.FullName -Raw) + (Get-Content $subErr.FullName -Raw)
Remove-Item $subOut.FullName, $subErr.FullName -ErrorAction SilentlyContinue
# Stream survival = at least one notification *after* the flip carries the new
# partner-side value. The post-flip write of $flipValue is the canary.
$saw = $subText -match "$flipValue"
$results += [PSCustomObject]@{
Name = "HsbySubscribeSurvives"
Passed = $saw
Detail = if ($saw) {
"subscribe stream surfaced post-flip value $flipValue from partner chassis"
} else {
"subscribe stream did not see the post-flip canary $flipValue — output: $subText"
}
}
# ---- HsbyFailoverCount — counter incremented by ≥ 1 ----
Write-Header "HsbyFailoverCount"
$failoverAfter = Get-HsbyDiagnosticValue -Counter "AbCip.HsbyFailoverCount"
if ($null -eq $failoverAfter) { $failoverAfter = 0 }
$counterOk = ($failoverAfter - $failoverBaseline) -ge 1
$results += [PSCustomObject]@{
Name = "HsbyFailoverCount"
Passed = $counterOk
Detail = if ($counterOk) {
"AbCip.HsbyFailoverCount went from $failoverBaseline → $failoverAfter"
} else {
"AbCip.HsbyFailoverCount unchanged ($failoverBaseline → $failoverAfter); expected at least 1 increment"
}
}
Write-Summary -Title "AB CIP HSBY failover e2e" -Results $results
if ($results | Where-Object { -not $_.Passed }) { exit 1 }
+1 -200
@@ -30,30 +30,6 @@
.PARAMETER BridgeNodeId
NodeId at which the server publishes the TagPath.
.PARAMETER FastBridgeNodeId
Optional NodeId for a Tag declared with ScanRateMs <= 100. When supplied
alongside SlowBridgeNodeId the script runs the per-tag scan-rate assertion
(PR abcip-4.1).
.PARAMETER SlowBridgeNodeId
Optional NodeId for a Tag declared with ScanRateMs >= 1000. Pair with
FastBridgeNodeId to enable the scan-rate assertion.
.PARAMETER SystemConnectionStatusNodeId
Optional NodeId for the synthetic _System/_ConnectionStatus variable
emitted by AB CIP discovery (PR abcip-4.3). When supplied, the script
runs the SystemTagBrowse assertion: reads the value through the OPC UA
server + asserts it surfaces one of the canonical HostState strings.
NodeId form: ns=<n>;s=AbCip/<gateway>/_System/_ConnectionStatus.
.PARAMETER RefreshTagDbNodeId
Optional NodeId for the writeable _System/_RefreshTagDb trigger added in
PR abcip-4.4. When supplied, the script runs the RefreshTagDbWrite
assertion: writes True through the OPC UA server + reads back, asserting
the trigger latches to False (Kepware-style "always idle" semantics) and
the write itself surfaces Good. NodeId form:
ns=<n>;s=AbCip/<gateway>/_System/_RefreshTagDb.
#>
param(
@@ -61,19 +37,7 @@ param(
[string]$Family = "ControlLogix",
[string]$TagPath = "TestDINT",
[string]$OpcUaUrl = "opc.tcp://localhost:4840",
[Parameter(Mandatory)] [string]$BridgeNodeId,
[string]$FastBridgeNodeId,
[string]$SlowBridgeNodeId,
# PR abcip-4.3 — NodeId for the synthetic _System/_ConnectionStatus variable that
# discovery emits under each device. Optional — when wired, runs the
# SystemTagBrowse assertion that browses + reads the system folder through the OPC UA
# server. NodeId form: ns=<n>;s=AbCip/<gateway>/_System/_ConnectionStatus.
[string]$SystemConnectionStatusNodeId,
# PR abcip-4.4 — NodeId for the writeable _System/_RefreshTagDb refresh-trigger.
# Mirrors the SystemConnectionStatusNodeId knob: optional, only runs the
# RefreshTagDbWrite assertion when supplied. NodeId form:
# ns=<n>;s=AbCip/<gateway>/_System/_RefreshTagDb.
[string]$RefreshTagDbNodeId
[Parameter(Mandatory)] [string]$BridgeNodeId
)
$ErrorActionPreference = "Stop"
@@ -130,168 +94,5 @@ $results += Test-SubscribeSeesChange `
-DriverWriteArgs (@("write") + $commonAbCip + @("-t", $TagPath, "--type", "DInt", "-v", $subValue)) `
-ExpectedValue "$subValue"
# PR abcip-3.2 — Symbolic-vs-Logical sanity assertion. Reads the same tag with both
# addressing modes through the CLI's --addressing-mode flag. Logical-mode against ab_server
# falls back to Symbolic on the wire (libplctag wrapper limitation; see AbCip-Performance.md
# §Addressing mode), so the assertion is "both modes complete + return the same value" — not
# a perf comparison. Skipped on Micro800 (driver downgrades Logical → Symbolic with warning,
# making both reads identical-by-design + uninteresting to compare here).
if ($Family -ne "Micro800") {
$symValue = Get-Random -Minimum 40000 -Maximum 49999
Write-Host "AB CIP e2e: priming gateway with $symValue then reading via Symbolic + Logical"
$writeArgs = @("write") + $commonAbCip + @("-t", $TagPath, "--type", "DInt", "-v", $symValue)
& $abcipCli.Exe @($abcipCli.Args + $writeArgs) | Out-Null
$symRead = & $abcipCli.Exe @($abcipCli.Args + @("read") + $commonAbCip + @("-t", $TagPath, "--type", "DInt", "--addressing-mode", "Symbolic"))
$logRead = & $abcipCli.Exe @($abcipCli.Args + @("read") + $commonAbCip + @("-t", $TagPath, "--type", "DInt", "--addressing-mode", "Logical"))
$symMatched = ($symRead -join "`n") -match "$symValue"
$logMatched = ($logRead -join "`n") -match "$symValue"
$passed = $symMatched -and $logMatched
$results += [PSCustomObject]@{
Name = "AddressingModeSanity"
Passed = $passed
Detail = if ($passed) { "Symbolic + Logical both returned $symValue" } else { "Sym=$symMatched Log=$logMatched" }
}
}
# PR abcip-4.1 — per-tag scan-rate divergence assertion. Runs only when both fast + slow
# NodeIds are wired; otherwise this knob is skipped on the existing single-NodeId fixture.
# The assertion is "fast bucket sees > 5x as many notifications as slow bucket" — the
# unit + integration tests cover the bucketing math, this just proves the multi-rate split
# survives end-to-end through the OPC UA server's Subscription / MonitoredItem path.
if ($FastBridgeNodeId -and $SlowBridgeNodeId) {
Write-Header "Per-tag scan rate (FastBridge=$FastBridgeNodeId, SlowBridge=$SlowBridgeNodeId)"
$duration = 8
$fastOut = New-TemporaryFile
$slowOut = New-TemporaryFile
$fastErr = New-TemporaryFile
$slowErr = New-TemporaryFile
$fastArgs = @($opcUaCli.PrefixArgs) + @("subscribe", "-u", $OpcUaUrl, "-n", $FastBridgeNodeId, "-i", "100", "--duration", "$duration")
$slowArgs = @($opcUaCli.PrefixArgs) + @("subscribe", "-u", $OpcUaUrl, "-n", $SlowBridgeNodeId, "-i", "1000", "--duration", "$duration")
$fastProc = Start-Process -FilePath $opcUaCli.File -ArgumentList $fastArgs `
-NoNewWindow -PassThru `
-RedirectStandardOutput $fastOut.FullName `
-RedirectStandardError $fastErr.FullName
$slowProc = Start-Process -FilePath $opcUaCli.File -ArgumentList $slowArgs `
-NoNewWindow -PassThru `
-RedirectStandardOutput $slowOut.FullName `
-RedirectStandardError $slowErr.FullName
Start-Sleep -Seconds 2
# Drive a single PLC change so even stable tags get *one* notification during the window
# (initial-data push + 1 change). The cadence assertion below relies on the fast tag
# accumulating sampling-interval-driven events even between explicit changes.
$tickValue = Get-Random -Minimum 50000 -Maximum 59999
$writeArgs = @("write") + $commonAbCip + @("-t", $TagPath, "--type", "DInt", "-v", $tickValue)
& $abcipCli.Exe @($abcipCli.Args + $writeArgs) | Out-Null
$fastProc.WaitForExit(($duration + 5) * 1000) | Out-Null
$slowProc.WaitForExit(($duration + 5) * 1000) | Out-Null
if (-not $fastProc.HasExited) { Stop-Process -Id $fastProc.Id -Force }
if (-not $slowProc.HasExited) { Stop-Process -Id $slowProc.Id -Force }
$fastText = (Get-Content $fastOut.FullName -Raw) + (Get-Content $fastErr.FullName -Raw)
$slowText = (Get-Content $slowOut.FullName -Raw) + (Get-Content $slowErr.FullName -Raw)
Remove-Item $fastOut.FullName, $slowOut.FullName, $fastErr.FullName, $slowErr.FullName -ErrorAction SilentlyContinue
# Each data-change line matches `=\s*<value>\s*(<status>)` per Test-SubscribeSeesChange.
$fastMatches = ([regex]::Matches($fastText, "=\s*\S+\s*\(")).Count
$slowMatches = ([regex]::Matches($slowText, "=\s*\S+\s*\(")).Count
$passed = ($fastMatches -ge 5) -and ($fastMatches -gt ($slowMatches * 5))
$detail = if ($passed) {
"fast=$fastMatches notifications vs slow=$slowMatches (>5x ratio achieved)"
} else {
"fast=$fastMatches slow=$slowMatches — expected fast > slow*5"
}
$results += [PSCustomObject]@{ Name = "PerTagScanRate"; Passed = $passed; Detail = $detail }
}
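# Illustrative sketch, not part of the original script: the counting rule above applied to
# canned text instead of live CLI output. The helper name Get-NotificationCount and the
# sample line shape "Tag = <value> (Good)" are assumptions made for this example only.

```powershell
function Get-NotificationCount {
    param([string]$Text)
    # Same pattern the script uses: "=", a value token, then the "(" that opens the status.
    return ([regex]::Matches($Text, "=\s*\S+\s*\(")).Count
}

# Canned output: 12 fast notifications and 2 slow ones.
$fastSample = (1..12 | ForEach-Object { "FastTag = $_ (Good)" }) -join "`n"
$slowSample = (1..2 | ForEach-Object { "SlowTag = $_ (Good)" }) -join "`n"

$fastCount = Get-NotificationCount -Text $fastSample
$slowCount = Get-NotificationCount -Text $slowSample

# Pass only when the fast bucket has at least 5 hits and more than 5x the slow bucket.
$ratioOk = ($fastCount -ge 5) -and ($fastCount -gt ($slowCount * 5))
Write-Host "fast=$fastCount slow=$slowCount ratioOk=$ratioOk"
```

# The -ge 5 floor keeps the check from passing trivially when both subscriptions are
# nearly silent (for example fast=1, slow=0 would otherwise satisfy the ratio alone).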
# PR abcip-4.2 — write-coalesce assertion. Writes the same value twice through the OPC UA
# server and verifies the PLC-side state reflects only one wire write. The driver-side
# diagnostics counter (AbCip.WritesSuppressed) is the authoritative signal, but ab_server
# itself doesn't expose a "writes received" counter so this script-level check is intentionally
# observational — it primes the tag with a baseline, writes the same value twice, and reads
# back to confirm the value matches without surfacing additional state changes. The unit + integration
# tests do the strict "exactly N suppressions" math; this is the e2e shape proof.
$coalesceValue = Get-Random -Minimum 60000 -Maximum 69999
Write-Header "WriteCoalesce (baseline=$coalesceValue, two redundant writes)"
$writeArgs = @("write") + $commonAbCip + @("-t", $TagPath, "--type", "DInt", "-v", $coalesceValue)
& $abcipCli.Exe @($abcipCli.Args + $writeArgs) | Out-Null
& $abcipCli.Exe @($abcipCli.Args + $writeArgs) | Out-Null
& $abcipCli.Exe @($abcipCli.Args + $writeArgs) | Out-Null
$readArgs = @("read") + $commonAbCip + @("-t", $TagPath, "--type", "DInt")
$readOut = & $abcipCli.Exe @($abcipCli.Args + $readArgs)
$coalesceMatch = ($readOut -join "`n") -match "$coalesceValue"
$results += [PSCustomObject]@{
Name = "WriteCoalesce"
Passed = $coalesceMatch
Detail = if ($coalesceMatch) {
"three identical writes of $coalesceValue produced the expected readback (driver-side WritesSuppressed counter exposed via driver-diagnostics RPC)"
} else {
"three identical writes did not converge on $coalesceValue — got '$readOut'"
}
}
# PR abcip-4.3 — _System/_ConnectionStatus browse-and-read assertion. Reads the live
# diagnostic snapshot via the OPC UA Client CLI; the value comes straight from the
# AbCipSystemTagSource (no libplctag round-trip). When the probe loop is healthy + the
# gateway is reachable, the value should be "Running"; on a stopped fixture it would be
# "Stopped". The assertion accepts any of the four canonical states, plus the "Unknown"
# transient that surfaces before the first probe iteration completes.
if ($SystemConnectionStatusNodeId) {
Write-Header "SystemTagBrowse (_System/_ConnectionStatus from $SystemConnectionStatusNodeId)"
$sysReadArgs = @($opcUaCli.PrefixArgs) + @("read", "-u", $OpcUaUrl, "-n", $SystemConnectionStatusNodeId)
$sysOut = & $opcUaCli.File @sysReadArgs 2>&1
$sysJoined = ($sysOut -join "`n")
$sysMatched = $sysJoined -match "Running|Stopped|Unknown|Faulted"
$results += [PSCustomObject]@{
Name = "SystemTagBrowse"
Passed = $sysMatched
Detail = if ($sysMatched) {
"_ConnectionStatus surfaced one of Running / Stopped / Unknown / Faulted via OPC UA"
} else {
"_ConnectionStatus did not surface a recognised HostState — got '$sysJoined'"
}
}
}
# PR abcip-4.4 — _RefreshTagDb write-then-verify assertion. Writes True through the
# OPC UA server (the live driver intercepts the write + dispatches to RebrowseAsync
# against the cached IAddressSpaceBuilder) + reads back, asserting Kepware-style
# latch semantics: the trigger always reads False the moment the dispatch returns.
# Pairs with the existing rebrowse step driven by the AbCip CLI (issue #233) — both
# surfaces hit the same RebrowseAsync entry point, just from different sides of the
# OPC UA wire.
if ($RefreshTagDbNodeId) {
Write-Header "RefreshTagDbWrite (_System/_RefreshTagDb from $RefreshTagDbNodeId)"
$writeArgs = @($opcUaCli.PrefixArgs) + @(
"write", "-u", $OpcUaUrl, "-n", $RefreshTagDbNodeId, "-v", "true", "--type", "Boolean")
$writeOut = & $opcUaCli.File @writeArgs 2>&1
$writeJoined = ($writeOut -join "`n")
# The OPC UA Client CLI surfaces "Good" on success; a non-Good result still
# round-trips the literal status code so we can match generously.
$writeOk = $writeJoined -match "Good"
$readArgs = @($opcUaCli.PrefixArgs) + @("read", "-u", $OpcUaUrl, "-n", $RefreshTagDbNodeId)
$readOut = & $opcUaCli.File @readArgs 2>&1
$readJoined = ($readOut -join "`n")
# Kepware-style trigger reads always return false — assert the trigger isn't
# latched to true after the write. Match case-insensitively because the OPC UA
# Client CLI may render the value as "False" or "false".
$readFalse = $readJoined -imatch "false"
$passed = $writeOk -and $readFalse
$results += [PSCustomObject]@{
Name = "RefreshTagDbWrite"
Passed = $passed
Detail = if ($passed) {
"_RefreshTagDb write returned Good and read-back surfaced false — Kepware-style latch held"
} else {
"RefreshTagDb write/verify failed — write='$writeJoined' read='$readJoined'"
}
}
}
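# Illustrative sketch, not the driver's implementation: a toy model of the latch semantics
# the RefreshTagDbWrite stage asserts. The class name RefreshTrigger and its members are
# hypothetical; the real dispatch goes through RebrowseAsync inside the driver.

```powershell
class RefreshTrigger {
    [int]$DispatchCount = 0

    # Writing $true dispatches one refresh; writing $false is a no-op.
    [void] Write([bool]$Value) {
        if ($Value) { $this.DispatchCount++ }
    }

    # The trigger never latches: a read reports $false regardless of earlier writes.
    [bool] Read() {
        return $false
    }
}

$trigger = [RefreshTrigger]::new()
$trigger.Write($true)
$readBack = $trigger.Read()
Write-Host "dispatches=$($trigger.DispatchCount) readBack=$readBack"
```

# In this model a write of True has a side effect (one dispatch) but no stored state,
# which is why the e2e stage checks the read-back for false rather than for the written value.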
Write-Summary -Title "AB CIP e2e" -Results $results
if ($results | Where-Object { -not $_.Passed }) { exit 1 }
+1 -244
@@ -29,41 +29,6 @@
.PARAMETER BridgeNodeId
NodeId at which the server publishes the Address.
.PARAMETER DiagnosticsRequestCountNodeId
Optional NodeId for the synthetic _Diagnostics/<host>/RequestCount variable
emitted by AB Legacy discovery (PR ablegacy-10 / #253). When supplied, the
script runs the DiagnosticsRequestCount assertion: reads the user-tag
BridgeNodeId N times through the OPC UA server, then reads the diagnostic
counter and asserts the value is at least N (a probe loop or a parallel
client may have bumped it by more, so the comparison is `>=`). NodeId form:
ns=<n>;s=AbLegacy/<gateway>/_Diagnostics/RequestCount. Mirrors the
-SystemConnectionStatusNodeId knob on test-abcip.ps1.
.PARAMETER DiagnosticsDemoteCountNodeId
Optional NodeId for the synthetic _Diagnostics/<host>/DemoteCount variable
emitted by AB Legacy discovery (PR ablegacy-12 / #255). When supplied, the
script runs the auto-demote assertion: kills the simulator container so
reads start failing, hammers the user-tag BridgeNodeId at least
FailureThreshold times to trip the demotion, then reads the diagnostic
counter and asserts the value increased by >= 1. NodeId form:
ns=<n>;s=AbLegacy/<gateway>/_Diagnostics/DemoteCount. The simulator
must support `docker stop otopcua-ab-server-slc500` for the kill stage.
.PARAMETER FailureThresholdForDemote
Failure threshold the server is configured with (default 3). The
demote assertion writes/reads N+1 times against the killed simulator
to guarantee the threshold trips even if some reads beat the kill.
.PARAMETER DhPlusStation
PR ablegacy-13 / #256 — DH+ node address (octal 0..77 == decimal 0..63)
of a PLC-5 reachable through a 1756-DHRIO module. **Documentation
parameter only; there is no automated assertion**: libplctag's ab_server
does not simulate the DHRIO + DH+ + PLC-5 stack, so wire-level coverage
requires real hardware. When supplied alongside a `-Gateway` of the form
`ab://<host>/1,<slot>,2,<station-octal>` and `-PlcType Plc5`, the value
here is recorded in the run log so reproducibility is auditable. See
docs/drivers/AbLegacy-DH-Bridging.md for the manual smoke procedure.
#>
param(
@@ -71,14 +36,7 @@ param(
[string]$PlcType = "Slc500",
[string]$Address = "N7:5",
[string]$OpcUaUrl = "opc.tcp://localhost:4840",
[Parameter(Mandatory)] [string]$BridgeNodeId,
[string]$DiagnosticsRequestCountNodeId,
[string]$DiagnosticsDemoteCountNodeId,
[int]$FailureThresholdForDemote = 3,
# PR ablegacy-13 / #256 — DH+ station via 1756-DHRIO bridging. Doc-only:
# no automated assertion (no Docker fixture covers DH+). See script header
# comment + docs/drivers/AbLegacy-DH-Bridging.md.
[string]$DhPlusStation
[Parameter(Mandatory)] [string]$BridgeNodeId
)
$ErrorActionPreference = "Stop"
@@ -137,206 +95,5 @@ $results += Test-SubscribeSeesChange `
-DriverWriteArgs (@("write") + $commonAbLegacy + @("-a", $Address, "-t", "Int", "-v", $subValue)) `
-ExpectedValue "$subValue"
# PR 7 — contiguous array read smoke. The default `--tag=N7[120]` in the Docker
# fixture's docker-compose.yml has plenty of room for `,10`; against real hardware
# the seeded N7 file just needs at least 10 words. Asserts the CLI exits 0 (the
# driver issued one PCCC frame for the whole block) — the per-element values are
# whatever the device currently holds.
Write-Header "Array contiguous read"
$arrayResult = Invoke-Cli -Cli $abLegacyCli `
-Args (@("read") + $commonAbLegacy + @("-a", "N7:0,10", "-t", "Int"))
if ($arrayResult.ExitCode -eq 0) {
Write-Pass "array read N7:0,10 succeeded"
$results += @{ Passed = $true }
} else {
Write-Fail "array read N7:0,10 exit=$($arrayResult.ExitCode)"
Write-Host $arrayResult.Output
$results += @{ Passed = $false; Reason = "array read exit $($arrayResult.ExitCode)" }
}
# PR 8 — deadband subscribe assertion. Subscribe with --deadband-absolute 5,
# write three small deltas (each within the 5-unit deadband), assert exactly
# one notification fires (the first-seen sample). The fourth write breaks
# above the threshold and the subscription should fire again.
Write-Header "Deadband subscribe (--deadband-absolute 5)"
$baseValue = Get-Random -Minimum 100 -Maximum 200
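# Build the argument list in one parenthesised expression: in argument mode a bare `+`
# would be passed to the CLI as a literal argument instead of concatenating the arrays.
& $abLegacyCli.File (@($abLegacyCli.PrefixArgs) + @("write") + $commonAbLegacy + @("-a", $Address, "-t", "Int", "-v", $baseValue)) | Out-Null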
$subscribeProc = Start-Process -FilePath $abLegacyCli.File `
-ArgumentList ($abLegacyCli.PrefixArgs + @("subscribe") + $commonAbLegacy `
+ @("-a", $Address, "-t", "Int", "-i", "200", "--deadband-absolute", "5")) `
-PassThru -RedirectStandardOutput "$env:TEMP/ablegacy-deadband.out" `
-RedirectStandardError "$env:TEMP/ablegacy-deadband.err"
Start-Sleep -Seconds 2
# Three small deltas within deadband.
& $abLegacyCli.File (@($abLegacyCli.PrefixArgs) + @("write") + $commonAbLegacy + @("-a", $Address, "-t", "Int", "-v", ($baseValue + 1))) | Out-Null
Start-Sleep -Milliseconds 500
& $abLegacyCli.File (@($abLegacyCli.PrefixArgs) + @("write") + $commonAbLegacy + @("-a", $Address, "-t", "Int", "-v", ($baseValue + 2))) | Out-Null
Start-Sleep -Milliseconds 500
& $abLegacyCli.File (@($abLegacyCli.PrefixArgs) + @("write") + $commonAbLegacy + @("-a", $Address, "-t", "Int", "-v", ($baseValue + 3))) | Out-Null
Start-Sleep -Milliseconds 500
Stop-Process -Id $subscribeProc.Id -Force -ErrorAction SilentlyContinue
$subscribeOutput = Get-Content "$env:TEMP/ablegacy-deadband.out" -ErrorAction SilentlyContinue
# Count `=` lines (the SubscribeCommand format prints one per OnDataChange). Expect exactly 1
# (the first-seen sample at $baseValue) — none of the +1/+2/+3 deltas crosses the 5 absolute.
$notifyLines = @($subscribeOutput | Where-Object { $_ -match " = " })
if ($notifyLines.Count -eq 1) {
Write-Pass "deadband subscribe emitted 1 notification (initial only); 3 sub-threshold writes suppressed"
$results += @{ Passed = $true }
} else {
Write-Fail "deadband subscribe expected 1 notification; got $($notifyLines.Count)"
Write-Host ($subscribeOutput -join "`n")
$results += @{ Passed = $false; Reason = "deadband notify count $($notifyLines.Count)" }
}
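# Illustrative sketch, not the driver's code: a toy absolute-deadband filter showing why
# the +1/+2/+3 writes above are expected to stay silent. The function name
# Select-DeadbandSample is hypothetical, and comparing against the last *reported* value
# is an assumption of this sketch.

```powershell
function Select-DeadbandSample {
    param([double[]]$Samples, [double]$Deadband)
    $lastReported = $null
    foreach ($sample in $Samples) {
        # The first sample always reports; later ones only when they move past the deadband.
        if ($null -eq $lastReported -or [math]::Abs($sample - $lastReported) -gt $Deadband) {
            $sample
            $lastReported = $sample
        }
    }
}

# Base value plus three sub-threshold deltas: only the initial sample is reported.
$quiet = @(Select-DeadbandSample -Samples 150, 151, 152, 153 -Deadband 5)
# A later jump of 10 crosses the 5-unit deadband and reports again.
$noisy = @(Select-DeadbandSample -Samples 150, 151, 152, 153, 160 -Deadband 5)
Write-Host "quiet=$($quiet.Count) noisy=$($noisy.Count)"
```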
# PR ablegacy-10 / #253 — diagnostic-counter round-trip assertion. After N reads
# against the user-tag BridgeNodeId the auto-emitted _Diagnostics/<host>/RequestCount
# counter must be >= N. The exact equality isn't asserted because a probe loop /
# parallel client may have bumped the counter — the spec is "every read counts".
if ($DiagnosticsRequestCountNodeId) {
Write-Header "DiagnosticsRequestCount (_Diagnostics/RequestCount from $DiagnosticsRequestCountNodeId)"
$diagN = 5
# Read the first counter snapshot to baseline; the assertion compares delta against
# the N OPC UA reads we issue between snapshots so a noisy probe loop doesn't
# invalidate the test.
$baselineOut = & $opcUaCli.File @($opcUaCli.PrefixArgs) `
@("read", "-u", $OpcUaUrl, "-n", $DiagnosticsRequestCountNodeId) 2>&1
$baseline = 0
if (($baselineOut -join "`n") -match '(\d+)') { $baseline = [int64]$Matches[1] }
for ($i = 0; $i -lt $diagN; $i++) {
& $opcUaCli.File @($opcUaCli.PrefixArgs) `
@("read", "-u", $OpcUaUrl, "-n", $BridgeNodeId) | Out-Null
}
$afterOut = & $opcUaCli.File @($opcUaCli.PrefixArgs) `
@("read", "-u", $OpcUaUrl, "-n", $DiagnosticsRequestCountNodeId) 2>&1
$after = 0
if (($afterOut -join "`n") -match '(\d+)') { $after = [int64]$Matches[1] }
$delta = $after - $baseline
if ($delta -ge $diagN) {
Write-Pass "DiagnosticsRequestCount delta $delta >= $diagN OPC UA reads"
$results += @{ Passed = $true }
} else {
Write-Fail "DiagnosticsRequestCount delta $delta < $diagN OPC UA reads (baseline=$baseline after=$after)"
$results += @{ Passed = $false; Reason = "diag delta $delta < $diagN" }
}
}
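# Illustrative sketch, not part of the original script: the baseline/delta comparison above
# with a parser anchored on a value label. A bare '(\d+)' takes the first digit run in the
# output, which would be wrong if the CLI also echoes the NodeId or a timestamp. The
# "Value: <n>" line shape and the helper name Get-CounterValue are assumptions; the real
# CLI output format is not shown in this diff.

```powershell
function Get-CounterValue {
    param([string[]]$Output)
    $joined = $Output -join "`n"
    # Anchor on the assumed "Value:" label so digits elsewhere in the output are ignored.
    if ($joined -match 'Value:\s*(\d+)') { return [int64]$Matches[1] }
    return [int64]0
}

# Canned CLI output for the two snapshots (hypothetical shape).
$before = Get-CounterValue -Output @(
    "Node: ns=2;s=AbLegacy/demo/_Diagnostics/RequestCount", "Value: 41", "Status: Good")
$afterReads = Get-CounterValue -Output @("Value: 47", "Status: Good")

$demoReads = 5
$demoDelta = $afterReads - $before
$deltaOk = $demoDelta -ge $demoReads
Write-Host "baseline=$before after=$afterReads delta=$demoDelta ok=$deltaOk"
```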
# ablegacy-11 / #254 — RSLogix CSV import smoke. Builds an in-memory canonical CSV
# (one row per N/F/B/L/ST/T/C/R file letter), invokes `import-rslogix --emit
# appsettings-fragment` against it, parses the resulting JSON, and asserts the Tags
# array carries exactly 8 entries. Doesn't talk to the PLC — purely offline parser
# coverage.
Write-Header "RSLogix CSV import"
$importCsvPath = Join-Path $env:TEMP "ablegacy-rslogix-canonical-$([guid]::NewGuid()).csv"
$importJsonPath = Join-Path $env:TEMP "ablegacy-rslogix-fragment-$([guid]::NewGuid()).json"
@"
Symbol,Address,Description,DataType,Scope
MotorSpeed,N7:0,Motor speed setpoint,INT,Global
TankLevel,F8:0,Tank level (gallons),REAL,Global
RunFlag,B3:0/0,Run command flag,BOOL,Global
TotalCount,L9:0,Total piece count,LINT,Global
RecipeName,ST10:0,"Recipe name, free-form text",STRING,Global
DwellTimer,T4:0.ACC,Dwell timer accumulator,TIMER,Global
PieceCounter,C5:0.ACC,Piece counter accumulator,COUNTER,Global
StateMachine,R6:0.LEN,State-machine control length,CONTROL,Global
"@ | Set-Content -Path $importCsvPath -Encoding UTF8
try {
$importResult = Invoke-Cli -Cli $abLegacyCli `
-Args @("import-rslogix", "--file", $importCsvPath, "--device", $Gateway,
"--emit", "appsettings-fragment", "--output", $importJsonPath)
if ($importResult.ExitCode -ne 0) {
Write-Fail "import-rslogix exit=$($importResult.ExitCode): $($importResult.Output)"
$results += @{ Passed = $false; Reason = "import-rslogix exit $($importResult.ExitCode)" }
}
elseif (-not (Test-Path $importJsonPath)) {
Write-Fail "import-rslogix produced no output file at $importJsonPath"
$results += @{ Passed = $false; Reason = "no output file" }
}
else {
$fragment = Get-Content $importJsonPath -Raw | ConvertFrom-Json
$tagCount = @($fragment.Tags).Count
if ($tagCount -eq 8) {
Write-Pass "import-rslogix emitted $tagCount tag(s) — matches CSV row count"
$results += @{ Passed = $true }
} else {
Write-Fail "import-rslogix emitted $tagCount tag(s); expected 8"
$results += @{ Passed = $false; Reason = "tag count $tagCount" }
}
}
}
finally {
Remove-Item -Path $importCsvPath -ErrorAction SilentlyContinue
Remove-Item -Path $importJsonPath -ErrorAction SilentlyContinue
}
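# Illustrative sketch, independent of the import-rslogix command: the row count the stage
# above expects can be cross-checked with the built-in ConvertFrom-Csv cmdlet, which also
# shows that the quoted Description containing a comma stays one field. The two-row sample
# is a shortened copy of the canonical CSV; the variable names are hypothetical.

```powershell
$demoCsv = @"
Symbol,Address,Description,DataType,Scope
MotorSpeed,N7:0,Motor speed setpoint,INT,Global
RecipeName,ST10:0,"Recipe name, free-form text",STRING,Global
"@

# @() keeps the result an array even when only one data row is present.
$demoRows = @($demoCsv | ConvertFrom-Csv)
Write-Host "rows=$($demoRows.Count) description='$($demoRows[1].Description)'"
```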
# PR ablegacy-12 / #255 — auto-demote round-trip. Kill the simulator container,
# hammer the bridge NodeId past the failure threshold, then assert the
# DemoteCount diagnostic incremented. Restart the simulator at the end so the
# next run gets a clean baseline. Gated on -DiagnosticsDemoteCountNodeId so
# environments without docker-side control of the simulator can opt out.
if ($DiagnosticsDemoteCountNodeId) {
Write-Header "AutoDemote (kill simulator + observe DemoteCount from $DiagnosticsDemoteCountNodeId)"
$baselineDemoteOut = & $opcUaCli.File @($opcUaCli.PrefixArgs) `
@("read", "-u", $OpcUaUrl, "-n", $DiagnosticsDemoteCountNodeId) 2>&1
$baselineDemote = 0
if (($baselineDemoteOut -join "`n") -match '(\d+)') { $baselineDemote = [int64]$Matches[1] }
# Best-effort container kill — prefer the slc500 profile name; fall back to
# micrologix / plc5 in case the operator pointed the e2e at a different family.
$simContainers = @("otopcua-ab-server-slc500", "otopcua-ab-server-micrologix", "otopcua-ab-server-plc5")
$killed = $false
foreach ($c in $simContainers) {
$stop = docker stop $c 2>$null
if ($LASTEXITCODE -eq 0 -and $stop) {
Write-Host "Stopped $c"
$killed = $true
break
}
}
if (-not $killed) {
Write-Fail "AutoDemote: no ab_server container found via 'docker stop' — skipping demote assertion"
$results += @{ Passed = $false; Reason = "no simulator container to kill" }
}
else {
# Hammer past the threshold. Each read against a now-unreachable simulator
# surfaces BadCommunicationError; FailureThreshold consecutive ones trip
# the demotion. We add 2 extra to absorb timing slack (one read may be
# in-flight when the kill lands).
$hammerCount = $FailureThresholdForDemote + 2
for ($i = 0; $i -lt $hammerCount; $i++) {
& $opcUaCli.File @($opcUaCli.PrefixArgs) `
@("read", "-u", $OpcUaUrl, "-n", $BridgeNodeId) 2>&1 | Out-Null
}
Start-Sleep -Seconds 1
$afterDemoteOut = & $opcUaCli.File @($opcUaCli.PrefixArgs) `
@("read", "-u", $OpcUaUrl, "-n", $DiagnosticsDemoteCountNodeId) 2>&1
$afterDemote = 0
if (($afterDemoteOut -join "`n") -match '(\d+)') { $afterDemote = [int64]$Matches[1] }
$deltaDemote = $afterDemote - $baselineDemote
if ($deltaDemote -ge 1) {
Write-Pass "AutoDemote DemoteCount delta $deltaDemote >= 1 after $hammerCount failed reads"
$results += @{ Passed = $true }
} else {
Write-Fail "AutoDemote DemoteCount delta $deltaDemote < 1 (baseline=$baselineDemote after=$afterDemote)"
$results += @{ Passed = $false; Reason = "demote delta $deltaDemote" }
}
# Restart the simulator so subsequent test runs have a clean baseline.
# Best-effort — if docker-compose isn't on the path the operator can
# bring it back manually via the Docker/docker-compose.yml profile.
try { docker start (docker ps -aq -f "name=otopcua-ab-server-") | Out-Null } catch { }
}
}
Write-Summary -Title "AB Legacy e2e" -Results $results
if ($results | Where-Object { -not $_.Passed }) { exit 1 }
+30 -1
@@ -189,6 +189,33 @@ if ($galaxy) {
}
else { $summary["galaxy"] = "SKIP (no config entry)" }
# ---------------------------------------------------------------------------
# OPC UA Client (gateway driver)
# ---------------------------------------------------------------------------
$opcuaclient = Get-Or $config "opcuaclient"
if ($opcuaclient) {
Write-Header "== OPC UA CLIENT =="
Run-Suite "opcuaclient" {
& "$PSScriptRoot/test-opcuaclient.ps1" `
-RemoteUrl (Get-Or $opcuaclient "remoteUrl" "opc.tcp://localhost:50000") `
-OpcUaUrl (Get-Or $opcuaclient "opcUaUrl" $OpcUaUrl) `
-RemoteNodeId (Get-Or $opcuaclient "remoteNodeId" "ns=3;s=FastUInt1") `
-BridgeNodeId $opcuaclient["bridgeNodeId"] `
-WritableRemoteNodeId (Get-Or $opcuaclient "writableRemoteNodeId" "") `
-WritableBridgeNodeId (Get-Or $opcuaclient "writableBridgeNodeId" "") `
-BridgeRootNodeId (Get-Or $opcuaclient "bridgeRootNodeId" "i=85") `
-BrowseDepth (Get-Or $opcuaclient "browseDepth" 3) `
-BrowseMinNodes (Get-Or $opcuaclient "browseMinNodes" 5) `
-AlarmNodeId (Get-Or $opcuaclient "alarmNodeId" "") `
-AlarmWaitSec (Get-Or $opcuaclient "alarmWaitSec" 15) `
-HistoryNodeId (Get-Or $opcuaclient "historyNodeId" "") `
-HistoryLookbackSec (Get-Or $opcuaclient "historyLookbackSec" 3600) `
-ChangeWaitSec (Get-Or $opcuaclient "changeWaitSec" 8)
}
}
else { $summary["opcuaclient"] = "SKIP (no config entry)" }
$phase7 = Get-Or $config "phase7"
if ($phase7) {
Write-Header "== PHASE 7 virtual tags + scripted alarms =="
@@ -220,7 +247,9 @@ $summary.GetEnumerator() | ForEach-Object {
Write-Host (" {0,-10} {1}" -f $_.Key, $_.Value) -ForegroundColor $color
}
$failed = ($summary.Values | Where-Object { $_ -eq "FAIL" }).Count
# @() wrap — Where-Object returns $null / a single scalar for 0-or-1 matches,
# and .Count on either trips Set-StrictMode -Version 3.0.
$failed = @($summary.Values | Where-Object { $_ -eq "FAIL" }).Count
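# Illustrative sketch, not part of the original script: the @() wrap on a small stand-in
# summary table. $demoSummary and its suite names are made up for the example.

```powershell
Set-StrictMode -Version 3.0

$demoSummary = [ordered]@{
    modbus = "PASS"
    s7     = "FAIL"
    focas  = "SKIP (no config entry)"
}

# One match: the wrap turns the single scalar into a one-element array.
$oneFail = @($demoSummary.Values | Where-Object { $_ -eq "FAIL" }).Count
# Zero matches: the wrap turns the empty pipeline into an empty array.
$noFail = @($demoSummary.Values | Where-Object { $_ -eq "TIMEOUT" }).Count
Write-Host "oneFail=$oneFail noFail=$noFail"
```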
if ($failed -gt 0) {
Write-Host "$failed suite(s) failed." -ForegroundColor Red
exit 1
+148 -131
@@ -1,16 +1,35 @@
#Requires -Version 7.0
#Requires -Version 7.0
<#
.SYNOPSIS
End-to-end CLI test for the FOCAS (Fanuc CNC) driver.
.DESCRIPTION
**Hardware-gated.** There is no public FOCAS simulator; the driver's
FwlibFocasClient P/Invokes Fanuc's licensed Fwlib32.dll. Against a dev
box without the DLL on PATH the test will skip with a clear message.
Against a real CNC with the DLL present it runs probe / driver-loopback /
server-bridge the same way the other scripts do.
Runs the CLI against either the managed wire client (the default: Driver.FOCAS.Cli
dials the CNC on TCP:8193 directly, with no native dependencies) or the focas-mock
Docker fixture. Hardware-gated by default because the default CncHost is
127.0.0.1; set FOCAS_TRUST_WIRE=1 once -CncHost points at a real CNC, or pass
-ProfileName to run against the Docker sim.
Set FOCAS_TRUST_WIRE=1 when -CncHost points at a real CNC to un-gate.
The script also supports three nice-to-have modes shipped 2026-04-24:
-Series per-series matrix mode. Accepts a comma-separated list; the
core stages are run once per series, swapping the -Address to
the supplied per-series probe. Fails fast if any series's
configured address is outside the documented range (the driver
itself enforces that at InitializeAsync).
-ProfileName for use with the Python Docker simulator (see
docs/v2/implementation/focas-simulator-plan.md). Selects a
docker-compose profile + matching -Series. When set, the
FOCAS_TRUST_WIRE gate is considered satisfied because the sim
is a legitimate non-hardware target.
-HandleLeakCycles <int> stress stage that opens + closes <N> sessions
via the CLI's `probe` command with a short sleep between
cycles. Exercises the Tier-C supervisor's handle-recycle path
without touching user data. Typical values: 100-1000. A CNC's
FWLIB handle pool is finite (~5-10), so this shakes out
handle-leak bugs if either side forgets to free.
.PARAMETER CncHost
IP or hostname of the CNC. Default 127.0.0.1; override for real runs.
@@ -19,76 +38,79 @@
FOCAS TCP port. Default 8193.
.PARAMETER Address
FOCAS address to exercise. Default R100 (PMC R-file register).
FOCAS address to exercise. Default R100 (PMC R-file register). Ignored
when -Series is set and the series profile supplies its own probe.
.PARAMETER Series
Comma-separated list of CNC series to run the matrix against. Known:
ZeroI_D, ZeroI_F, ZeroI_MF, ZeroI_TF, Sixteen_i, Thirty_i, ThirtyOne_i,
ThirtyTwo_i, PowerMotion_i. When empty the script runs a single pass
without a series constraint.
.PARAMETER ProfileName
docker-compose profile name from tests/.../Docker/profiles/. When set,
the script assumes the Python simulator is the target + un-gates
FOCAS_TRUST_WIRE.
.PARAMETER HandleLeakCycles
Run a handle-leak stress stage with <N> open/close cycles. 0 = skip.
.PARAMETER OpcUaUrl
OtOpcUa server endpoint.
.PARAMETER BridgeNodeId
NodeId at which the server publishes the Address.
.PARAMETER Write
Issue #268 (F4-a) + #269 (F4-b) — opts the script into write stages.
Without -Write the script runs read-only probe / loopback / bridge
coverage. With -Write the script additionally exercises the F4-b
cnc_wrparam + cnc_wrmacro round-trip stages against the configured
-ParamAddress / -MacroAddress (default safe values). The wire writes
fire only when FOCAS_TRUST_WIRE=1 (already gated above) AND the
operator explicitly requests the write path.
.PARAMETER ParamAddress
Parameter address for the F4-b write stage (default PARAM:1815).
Only used when -Write is supplied. Pick a parameter that's safe to
scribble on for your CNC setup; the default is benign for a stock
Fanuc 30i but every site differs.
.PARAMETER MacroAddress
Macro variable for the F4-b write stage (default MACRO:500). Macro
writes are the lowest-risk write surface (no parameter-write switch
needed, no MDI mode required) so this stage runs whenever -Write is
supplied.
.PARAMETER PmcBitAddress
PMC bit address for the F4-c bit-write round-trip stage (default
R100.3). Only fires when -Write is supplied AND the operator
double-opts in via FOCAS_PMC_WRITE=1, mirroring the FOCAS_PARAM_WRITE
gate. PMC writes have a higher blast radius than PARAM/MACRO (a
mistargeted bit can move motion or latch a feedhold) so the gate is
off by default; see docs/v2/focas-deployment.md "Write safety / PMC
pre-checks".
.PARAMETER CncPassword
Issue #271 (F4-d) — optional CNC connection-level password emitted via
cnc_wrunlockparam on connect. Required only when the controller gates
parameter writes behind a password switch (16i + some 30i firmwares
with parameter-protect on). Threaded through to every CLI invocation
in the -Write stage as --cnc-password. PASSWORD INVARIANT: never
logged; the CLI's Serilog config does not destructure this flag.
See docs/v2/focas-deployment.md § "FOCAS password handling" for the
no-log invariant + rotation runbook.
#>
param(
[string]$CncHost = "127.0.0.1",
[int]$CncPort = 8193,
[string]$Address = "R100",
[string]$Series = "",
[string]$ProfileName = "",
[int]$HandleLeakCycles = 0,
[string]$OpcUaUrl = "opc.tcp://localhost:4840",
[Parameter(Mandatory)] [string]$BridgeNodeId,
[switch]$Write,
[string]$ParamAddress = "PARAM:1815",
[string]$MacroAddress = "MACRO:500",
[string]$PmcBitAddress = "R100.3",
[string]$CncPassword = ""
[Parameter(Mandatory)] [string]$BridgeNodeId
)
$ErrorActionPreference = "Stop"
. "$PSScriptRoot/_common.ps1"
if (-not ($env:FOCAS_TRUST_WIRE -eq "1" -or $env:FOCAS_TRUST_WIRE -eq "true")) {
Write-Skip "FOCAS_TRUST_WIRE not set — no public simulator exists (task #222 tracks the lab rig). Set =1 when -CncHost points at a real CNC with Fwlib32.dll on PATH."
$simGated = -not [string]::IsNullOrWhiteSpace($ProfileName)
if (-not $simGated -and -not ($env:FOCAS_TRUST_WIRE -eq "1" -or $env:FOCAS_TRUST_WIRE -eq "true")) {
Write-Skip "FOCAS_TRUST_WIRE not set. Pass -ProfileName <profile> to run against the Docker mock in tests/.../Driver.FOCAS.IntegrationTests/Docker/, or set FOCAS_TRUST_WIRE=1 when -CncHost points at a real CNC."
exit 0
}
if ($simGated) {
Write-Info "Sim mode — profile '$ProfileName'. FOCAS_TRUST_WIRE gate bypassed."
}
# Per-series probe addresses — each one is inside the authoritative range for
# that series (docs/v2/focas-version-matrix.md). Picking one representative per
# kind (PMC / parameter / macro) is enough to exercise the driver's validator.
$seriesProbes = @{
"ZeroI_D" = "R100"
"ZeroI_F" = "R100"
"ZeroI_MF" = "R100"
"ZeroI_TF" = "R100"
"Sixteen_i" = "R100"
"Thirty_i" = "R100"
"ThirtyOne_i" = "R100"
"ThirtyTwo_i" = "R100"
"PowerMotion_i"= "R100"
}
$seriesList = @()
if (-not [string]::IsNullOrWhiteSpace($Series)) {
$seriesList = @($Series.Split(',') | ForEach-Object { $_.Trim() } | Where-Object { $_ })
$unknown = @($seriesList | Where-Object { -not $seriesProbes.ContainsKey($_) })
if ($unknown.Count -gt 0) {
Write-Fail "Unknown -Series entries: $($unknown -join ', '). Known: $($seriesProbes.Keys -join ', ')."
exit 2
}
}
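# Illustrative sketch, not part of the original script: the split / trim / validate steps
# above run on a deliberately messy input. The helper name Split-SeriesList and the
# two-entry $knownSeries table are hypothetical stand-ins for $seriesProbes.

```powershell
$knownSeries = @{ "ZeroI_D" = "R100"; "Thirty_i" = "R100" }

function Split-SeriesList {
    param([string]$Series)
    if ([string]::IsNullOrWhiteSpace($Series)) { return @() }
    # Split on commas, trim each entry, and drop entries that end up empty.
    return @($Series.Split(',') | ForEach-Object { $_.Trim() } | Where-Object { $_ })
}

# Leading/trailing spaces and an empty slot are tolerated; "Bogus" is not a known series.
$parsedSeries = @(Split-SeriesList -Series " ZeroI_D, ,Thirty_i,Bogus ")
$unknownSeries = @($parsedSeries | Where-Object { -not $knownSeries.ContainsKey($_) })
Write-Host "parsed=$($parsedSeries -join '|') unknown=$($unknownSeries -join '|')"
```

# Wrapping both the call and the filter in @() keeps .Count safe for zero or one entries.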
$focasCli = Get-CliInvocation `
-ProjectFolder "src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Cli" `
-ExeName "otopcua-focas-cli"
@@ -96,108 +118,103 @@ $opcUaCli = Get-CliInvocation `
-ProjectFolder "src/ZB.MOM.WW.OtOpcUa.Client.CLI" `
-ExeName "otopcua-cli"
$commonFocas = @("-h", $CncHost, "-p", $CncPort)
# F4-d (issue #271) — thread the CNC connection password through to every CLI
# invocation. The CLI's --cnc-password flag emits cnc_wrunlockparam on connect
# and the driver's per-call retry path re-issues unlock + retries once on
# EW_PASSWD. PASSWORD INVARIANT: the password is NOT logged here. Write-Host
# and Test-* helpers never destructure $commonFocas, but we still avoid
# Write-Host'ing the array directly; the CLI's Serilog config also redacts.
if (-not [string]::IsNullOrWhiteSpace($CncPassword)) {
$commonFocas += @("--cnc-password", $CncPassword)
}
$results = @()
$allResults = @()
$results += Test-Probe `
-Cli $focasCli `
-ProbeArgs (@("probe") + $commonFocas + @("-a", $Address, "--type", "Int16"))
function Invoke-FocasCore {
param(
[string]$Label,
[string]$ProbeAddress
)
$writeValue = Get-Random -Minimum 1 -Maximum 9999
$results += Test-DriverLoopback `
Write-Header "FOCAS stages — $Label"
$commonFocas = @("-h", $CncHost, "-p", $CncPort)
$results = @()
$results += Test-Probe `
-Cli $focasCli `
-WriteArgs (@("write") + $commonFocas + @("-a", $Address, "-t", "Int16", "-v", $writeValue)) `
-ReadArgs (@("read") + $commonFocas + @("-a", $Address, "-t", "Int16")) `
-ProbeArgs (@("probe") + $commonFocas + @("-a", $ProbeAddress, "--type", "Int16"))
$writeValue = Get-Random -Minimum 1 -Maximum 9999
$results += Test-DriverLoopback `
-Cli $focasCli `
-WriteArgs (@("write") + $commonFocas + @("-a", $ProbeAddress, "-t", "Int16", "-v", $writeValue)) `
-ReadArgs (@("read") + $commonFocas + @("-a", $ProbeAddress, "-t", "Int16")) `
-ExpectedValue "$writeValue"
$bridgeValue = Get-Random -Minimum 10000 -Maximum 19999
$results += Test-ServerBridge `
$bridgeValue = Get-Random -Minimum 10000 -Maximum 19999
$results += Test-ServerBridge `
-DriverCli $focasCli `
-DriverWriteArgs (@("write") + $commonFocas + @("-a", $Address, "-t", "Int16", "-v", $bridgeValue)) `
-DriverWriteArgs (@("write") + $commonFocas + @("-a", $ProbeAddress, "-t", "Int16", "-v", $bridgeValue)) `
-OpcUaCli $opcUaCli `
-OpcUaUrl $OpcUaUrl `
-OpcUaNodeId $BridgeNodeId `
-ExpectedValue "$bridgeValue"
$reverseValue = Get-Random -Minimum 20000 -Maximum 29999
$results += Test-OpcUaWriteBridge `
$reverseValue = Get-Random -Minimum 20000 -Maximum 29999
$results += Test-OpcUaWriteBridge `
-OpcUaCli $opcUaCli `
-OpcUaUrl $OpcUaUrl `
-OpcUaNodeId $BridgeNodeId `
-DriverCli $focasCli `
-DriverReadArgs (@("read") + $commonFocas + @("-a", $Address, "-t", "Int16")) `
-DriverReadArgs (@("read") + $commonFocas + @("-a", $ProbeAddress, "-t", "Int16")) `
-ExpectedValue "$reverseValue"
$subValue = Get-Random -Minimum 30000 -Maximum 32766
$results += Test-SubscribeSeesChange `
$subValue = Get-Random -Minimum 30000 -Maximum 32766
$results += Test-SubscribeSeesChange `
-OpcUaCli $opcUaCli `
-OpcUaUrl $OpcUaUrl `
-OpcUaNodeId $BridgeNodeId `
-DriverCli $focasCli `
-DriverWriteArgs (@("write") + $commonFocas + @("-a", $Address, "-t", "Int16", "-v", $subValue)) `
-DriverWriteArgs (@("write") + $commonFocas + @("-a", $ProbeAddress, "-t", "Int16", "-v", $subValue)) `
-ExpectedValue "$subValue"
if ($Write) {
# F4-b — macro + parameter round-trip writes. Both stages use the same
# write-then-read shape the existing PMC stages use; the per-tag value
# comes back through Test-DriverLoopback's read step.
#
# Macro writes run unconditionally when -Write is supplied — no MDI / no
# parameter-write switch dependency, lowest-risk write surface on a CNC.
$macroValue = Get-Random -Minimum 100 -Maximum 9999
$results += Test-DriverLoopback `
-Cli $focasCli `
-WriteArgs (@("write") + $commonFocas + @("-a", $MacroAddress, "-t", "Int32", "-v", $macroValue)) `
-ReadArgs (@("read") + $commonFocas + @("-a", $MacroAddress, "-t", "Int32")) `
-ExpectedValue "$macroValue"
return $results
}
# Parameter writes only fire when the operator double-opts in via
# FOCAS_PARAM_WRITE=1. The CNC must be in MDI mode + parameter-write
# switch enabled or every write returns EW_PASSWD (BadUserAccessDenied);
# without an opt-in the script won't even attempt the write. F4-d will
# land an OPC UA-side unlock workflow that lets this stage run without
# the pendant.
if ($env:FOCAS_PARAM_WRITE -eq "1" -or $env:FOCAS_PARAM_WRITE -eq "true") {
$paramValue = Get-Random -Minimum 100 -Maximum 9999
$results += Test-DriverLoopback `
function Invoke-HandleLeakStage {
param([int]$Cycles)
Write-Header "FOCAS handle-leak stress — $Cycles cycles"
$commonFocas = @("-h", $CncHost, "-p", $CncPort)
$failed = 0
for ($i = 1; $i -le $Cycles; $i++) {
$probe = Test-Probe `
-Cli $focasCli `
-WriteArgs (@("write") + $commonFocas + @("-a", $ParamAddress, "-t", "Int32", "-v", $paramValue)) `
-ReadArgs (@("read") + $commonFocas + @("-a", $ParamAddress, "-t", "Int32")) `
-ExpectedValue "$paramValue"
} else {
Write-Host "[skip] FOCAS_PARAM_WRITE not set — parameter-write stage requires the CNC to be in MDI mode + parameter-write switch enabled (see docs/v2/focas-deployment.md 'Write safety')."
-ProbeArgs (@("probe") + $commonFocas + @("-a", $Address, "--type", "Int16"))
if (-not $probe.Passed) {
$failed++
# First 3 failures are informative; the rest just tally.
if ($failed -le 3) {
Write-Fail "cycle $i failed: $($probe.Reason)"
}
# F4-c — PMC bit round-trip. PMC writes have a higher blast radius
# than PARAM/MACRO (a mistargeted bit can move motion or latch a
# feedhold) so the stage is gated on a separate FOCAS_PMC_WRITE=1
# opt-in. The bit write exercises the driver's read-modify-write
# path: write 'on' -> read returns 'on'; write 'off' -> read returns
# 'off'. Both halves run so a regression in either branch is caught.
if ($env:FOCAS_PMC_WRITE -eq "1" -or $env:FOCAS_PMC_WRITE -eq "true") {
$results += Test-DriverLoopback `
-Cli $focasCli `
-WriteArgs (@("write") + $commonFocas + @("-a", $PmcBitAddress, "-t", "Bit", "-v", "on")) `
-ReadArgs (@("read") + $commonFocas + @("-a", $PmcBitAddress, "-t", "Bit")) `
-ExpectedValue "True"
$results += Test-DriverLoopback `
-Cli $focasCli `
-WriteArgs (@("write") + $commonFocas + @("-a", $PmcBitAddress, "-t", "Bit", "-v", "off")) `
-ReadArgs (@("read") + $commonFocas + @("-a", $PmcBitAddress, "-t", "Bit")) `
-ExpectedValue "False"
}
# Tiny delay so a broken loop can't DDoS the CNC; FWLIB handles take a
# few tens of ms to recycle in practice.
Start-Sleep -Milliseconds 50
}
$passed = $Cycles - $failed
if ($failed -eq 0) {
Write-Pass "handle-leak stress: $passed/$Cycles cycles succeeded"
return @{ Passed = $true; Reason = "$passed/$Cycles" }
} else {
Write-Host "[skip] FOCAS_PMC_WRITE not set — PMC bit-write round-trip is off by default because a mistargeted PMC bit can move motion or latch a feedhold (see docs/v2/focas-deployment.md 'PMC pre-checks')."
Write-Fail "handle-leak stress: $failed/$Cycles cycles failed"
return @{ Passed = $false; Reason = "$failed/$Cycles failed" }
}
}
Write-Summary -Title "FOCAS e2e" -Results $results
if ($results | Where-Object { -not $_.Passed }) { exit 1 }
if ($seriesList.Count -eq 0) {
$allResults += Invoke-FocasCore -Label "single" -ProbeAddress $Address
} else {
foreach ($series in $seriesList) {
$probeAddr = $seriesProbes[$series]
Write-Info "Running matrix pass for series '$series' with address $probeAddr"
$allResults += Invoke-FocasCore -Label $series -ProbeAddress $probeAddr
}
}
if ($HandleLeakCycles -gt 0) {
$allResults += Invoke-HandleLeakStage -Cycles $HandleLeakCycles
}
Write-Summary -Title "FOCAS e2e" -Results $allResults
if ($allResults | Where-Object { -not $_.Passed }) { exit 1 }
+39 -19
@@ -49,16 +49,20 @@
OtOpcUa server endpoint. Default opc.tcp://localhost:4840.
.PARAMETER SourceNodeId
NodeId of the driver-sourced Galaxy tag (numeric, writable preferred).
Default matches the Phase 7 seed `ns=2;s=p7-smoke-tag-source`.
NodeId of the driver-sourced Galaxy tag (numeric, writable preferred). NodeIds
are path-based per OPC UA Part 3 §5.2.2; the default matches the Phase 7 seed,
walking `p7-smoke-galaxy` (DriverInstanceId) → `lab-floor` → `galaxy-line` →
`reactor-1` → `Source` (Tag.Name).
.PARAMETER VirtualNodeId
NodeId of the VirtualTag computed as Source × 2 (Phase 7 scripting).
Default matches the Phase 7 seed `ns=2;s=p7-smoke-vt-derived`.
NodeId of the VirtualTag that computes MachineStatus = (Source > 0) (Phase 7
scripting). Same path-based scheme, ending in the VirtualTag.Name
(`MachineStatus`). The tag is historized so the write/subscribe exercise
doubles as a historian-sink check.
.PARAMETER AlarmNodeId
NodeId of the scripted-alarm Condition (fires when Source > 50).
Default matches the Phase 7 seed `ns=2;s=p7-smoke-al-overtemp`.
NodeId of the scripted-alarm Condition (fires when Source > 50). Same
path-based scheme, ending in ScriptedAlarm.Name (`OverTemp`).
.PARAMETER AlarmTriggerValue
Value written to -SourceNodeId to push it over the alarm threshold.
@@ -90,13 +94,20 @@
param(
[string]$OpcUaUrl = "opc.tcp://localhost:4840",
[string]$SourceNodeId = "ns=2;s=p7-smoke-tag-source",
[string]$VirtualNodeId = "ns=2;s=p7-smoke-vt-derived",
[string]$AlarmNodeId = "ns=2;s=p7-smoke-al-overtemp",
[string]$SourceNodeId = "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/Source",
[string]$VirtualNodeId = "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/MachineStatus",
[string]$AlarmNodeId = "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/OverTemp",
[string]$AlarmTriggerValue = "75",
[int]$ChangeWaitSec = 10,
[int]$AlarmWaitSec = 10,
[int]$HistoryLookbackSec = 3600
[int]$HistoryLookbackSec = 3600,
# The default Phase 7 seed uses a Galaxy attribute with
# security_classification=Operate. Anonymous OPC UA sessions are denied writes
# against Operate-classified tags (PR 26 / docs/Security.md). Supply an LDAP
# user with WriteOperate to exercise the reverse-bridge stage — e.g.
# `-Username writeop -Password writeop123` against the dev-box GLAuth.
[string]$Username = "",
[string]$Password = ""
)
$ErrorActionPreference = "Stop"
@@ -106,6 +117,13 @@ $opcUaCli = Get-CliInvocation `
-ProjectFolder "src/ZB.MOM.WW.OtOpcUa.Client.CLI" `
-ExeName "otopcua-cli"
# Auth-extension helper — appends `-U / -P` to the CLI args when credentials
# were supplied. Stays empty for anonymous runs so the default smoke path
# doesn't require an LDAP round-trip.
$authArgs = @()
if ($Username) { $authArgs += @("-U", $Username) }
if ($Password) { $authArgs += @("-P", $Password) }
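The conditional credential append above can be sketched in isolation. This is a minimal illustration, not the script's own code: the `Get-AuthArgs` function name and the `ns=2;s=Example` NodeId are hypothetical, while the `-U` / `-P` flags and the `writeop` credentials are the ones the diff itself mentions.

```powershell
# Hypothetical helper; the script builds $authArgs inline instead.
function Get-AuthArgs {
    param([string]$Username = "", [string]$Password = "")
    $extra = @()
    if ($Username) { $extra += @("-U", $Username) }
    if ($Password) { $extra += @("-P", $Password) }
    # The unary comma keeps an empty or short array from being unrolled on return.
    return ,$extra
}

# Example NodeId is an assumption, used only to build an argument list.
$base = @("read", "-u", "opc.tcp://localhost:4840", "-n", "ns=2;s=Example")

# Anonymous run: nothing is appended, so the argument list is unchanged.
$anon = $base + (Get-AuthArgs)

# Authenticated run: -U / -P land at the end of the argument list.
$auth = $base + (Get-AuthArgs -Username "writeop" -Password "writeop123")
```

When the concatenation is passed to a command parameter, it has to be wrapped in parentheses, as the diff does with `-Args (@(...) + $authArgs)`. Without them PowerShell parses `+` and `$authArgs` as separate arguments rather than as part of one array expression.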
$results = @()
# ---------------------------------------------------------------------------
@@ -115,7 +133,7 @@ $results = @()
# ---------------------------------------------------------------------------
Write-Header "Probe"
$probe = Invoke-Cli -Cli $opcUaCli -Args @("read", "-u", $OpcUaUrl, "-n", $SourceNodeId)
$probe = Invoke-Cli -Cli $opcUaCli -Args (@("read", "-u", $OpcUaUrl, "-n", $SourceNodeId) + $authArgs)
if ($probe.ExitCode -eq 0 -and $probe.Output -match "Status:\s+0x00000000") {
Write-Pass "source NodeId readable (Galaxy pipe → proxy → server → client chain up)"
$results += @{ Passed = $true }
@@ -132,7 +150,7 @@ if ($probe.ExitCode -eq 0 -and $probe.Output -match "Status:\s+0x00000000") {
# ---------------------------------------------------------------------------
Write-Header "Source read"
$sourceRead = Invoke-Cli -Cli $opcUaCli -Args @("read", "-u", $OpcUaUrl, "-n", $SourceNodeId)
$sourceRead = Invoke-Cli -Cli $opcUaCli -Args (@("read", "-u", $OpcUaUrl, "-n", $SourceNodeId) + $authArgs)
$sourceValue = $null
if ($sourceRead.ExitCode -eq 0 -and $sourceRead.Output -match "Value:\s+([^\r\n]+)") {
$sourceValue = $Matches[1].Trim()
@@ -156,7 +174,7 @@ if ([string]::IsNullOrEmpty($VirtualNodeId)) {
Write-Skip "VirtualNodeId not supplied — skipping Phase 7 bridge check"
} else {
Write-Header "Virtual-tag bridge"
$vtRead = Invoke-Cli -Cli $opcUaCli -Args @("read", "-u", $OpcUaUrl, "-n", $VirtualNodeId)
$vtRead = Invoke-Cli -Cli $opcUaCli -Args (@("read", "-u", $OpcUaUrl, "-n", $VirtualNodeId) + $authArgs)
if ($vtRead.ExitCode -eq 0 -and $vtRead.Output -match "Value:\s+([^\r\n]+)") {
$vtValue = $Matches[1].Trim()
Write-Pass "virtual-tag value = $vtValue (source was $sourceValue)"
@@ -179,7 +197,7 @@ $stdout = New-TemporaryFile
$stderr = New-TemporaryFile
$subArgs = @($opcUaCli.PrefixArgs) + @(
"subscribe", "-u", $OpcUaUrl, "-n", $SourceNodeId,
"-i", "500", "--duration", "$ChangeWaitSec")
"-i", "500", "--duration", "$ChangeWaitSec") + $authArgs
$subProc = Start-Process -FilePath $opcUaCli.File `
-ArgumentList $subArgs -NoNewWindow -PassThru `
-RedirectStandardOutput $stdout.FullName `
@@ -191,8 +209,10 @@ $subOut = (Get-Content $stdout.FullName -Raw) + (Get-Content $stderr.FullName -R
Remove-Item $stdout.FullName, $stderr.FullName -ErrorAction SilentlyContinue
# Any `=` followed by `(Good)` line after the initial subscribe-confirmation
# indicates at least one data-change tick arrived.
$changeLines = ($subOut -split "`n") | Where-Object { $_ -match "=\s+.*\(Good\)" }
# indicates at least one data-change tick arrived. The `@(...)` forces an array
# so `.Count` works on the 0-match + single-match cases that Set-StrictMode
# -Version 3.0 otherwise flags as `property 'Count' cannot be found`.
$changeLines = @(($subOut -split "`n") | Where-Object { $_ -match "=\s+.*\(Good\)" })
if ($changeLines.Count -gt 0) {
Write-Pass "$($changeLines.Count) data-change events observed"
$results += @{ Passed = $true }
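The array coercion described in the comment above can be checked on its own. The sketch below uses made-up subscribe output (the sample lines are assumptions, not real CLI output) and shows that wrapping the pipeline in `@(...)` gives a usable `.Count` for both the single-match and the zero-match case.

```powershell
Set-StrictMode -Version 3.0

# Made-up output: one confirmation line and one data-change line.
$oneHit = "Subscribed to ns=2;s=X`n[t1] ns=2;s=X = 1 (Good)"
$noHit  = "Subscribed to ns=2;s=X"

# @(...) forces an array even when Where-Object emits one item or none.
$single = @(($oneHit -split "`n") | Where-Object { $_ -match "=\s+.*\(Good\)" })
$empty  = @(($noHit  -split "`n") | Where-Object { $_ -match "=\s+.*\(Good\)" })

Write-Host "single-match count: $($single.Count)"
Write-Host "zero-match count:   $($empty.Count)"
```

The `ns=2` inside the NodeId does not trigger the filter, because the pattern requires whitespace immediately after the `=`.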
@@ -210,8 +230,8 @@ if ($changeLines.Count -gt 0) {
Write-Header "Reverse bridge (OPC UA write)"
$writeValue = [int]$AlarmTriggerValue # reuse the alarm trigger value — two stages for one write
$w = Invoke-Cli -Cli $opcUaCli -Args @(
"write", "-u", $OpcUaUrl, "-n", $SourceNodeId, "-v", "$writeValue")
$w = Invoke-Cli -Cli $opcUaCli -Args (@(
"write", "-u", $OpcUaUrl, "-n", $SourceNodeId, "-v", "$writeValue") + $authArgs)
if ($w.ExitCode -ne 0) {
# Connection/protocol failure — still a test failure.
Write-Fail "write CLI exit=$($w.ExitCode)"
@@ -226,7 +246,7 @@ if ($w.ExitCode -ne 0) {
} elseif ($w.Output -match "Write successful") {
# Read back — Galaxy poll interval + MXAccess advise may need a second or two to settle.
Start-Sleep -Seconds 2
$r = Invoke-Cli -Cli $opcUaCli -Args @("read", "-u", $OpcUaUrl, "-n", $SourceNodeId)
$r = Invoke-Cli -Cli $opcUaCli -Args (@("read", "-u", $OpcUaUrl, "-n", $SourceNodeId) + $authArgs)
if ($r.Output -match "Value:\s+$([Regex]::Escape("$writeValue"))\b") {
Write-Pass "write propagated — source reads back $writeValue"
$results += @{ Passed = $true }
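The read-back check combines `[Regex]::Escape` with a trailing `\b`, so a written value only matches when it appears as a whole token. A small sketch with invented CLI output strings (the `Value:` line format is taken from the regex in the diff; the sample values are assumptions):

```powershell
$writeValue = 75

# Same pattern shape as the read-back check in the diff.
$pattern = "Value:\s+$([Regex]::Escape("$writeValue"))\b"

$exact    = "Value:    75" -match $pattern   # the value that was written
$longer   = "Value:    750" -match $pattern  # a longer number starting with 75
$prefixed = "Value:    175" -match $pattern  # a number merely ending in 75

Write-Host "exact=$exact longer=$longer prefixed=$prefixed"
```

Escaping matters less for plain integers than for values containing regex metacharacters such as `.` or `+`; it keeps the pattern literal regardless of what is written.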
+353 -201
@@ -2,108 +2,164 @@
<#
.SYNOPSIS
End-to-end CLI test for the OPC UA Client (gateway) driver bridged through
the OtOpcUa server. Stages: probe, read, subscribe, topology-change.
the OtOpcUa server.
.DESCRIPTION
The OPC UA Client driver reads from an upstream OPC UA server (default:
Microsoft's opc-plc simulator on opc.tcp://localhost:50000) and re-exposes
its address space through the local OtOpcUa server. This script drives
the bridged path end-to-end via `otopcua-cli`.
The OpcUaClient driver is unique in the fleet: it's a gateway that connects
to ANOTHER OPC UA server and re-exposes its address space through the local
OtOpcUa server. So there's no protocol-specific driver CLI; both directions
of this test use `otopcua-cli` against two different endpoints:
Four stages:
remote = the upstream OPC UA server the driver connects to (opc-plc fixture
by default, opc.tcp://localhost:50000)
local = the OtOpcUa server itself, which mirrors remote nodes through the
OpcUaClient driver instance (opc.tcp://localhost:4840)
1. Probe            otopcua-cli connect succeeds against the OtOpcUa
                    server; confirms the gateway is up.
2. Bridged read     otopcua-cli read on the bridged NodeId returns a
                    Good value with a non-null payload; proves the
                    IReadable.ReadAsync path round-trips through the
                    driver to the upstream simulator.
3. Subscribe        otopcua-cli subscribe observes a data change within
                    N seconds (opc-plc's StepUp ticks once per second by
                    default, so this should always see a change).
4. Topology change  assert the auto-reimport-on-ModelChangeEvent path
                    is wired up. We can't easily fire a real upstream
                    model change without elevated opc-plc access, so
                    this stage prints the option settings + asserts the
                    driver's diagnostic surface reflects WatchModelChanges
                    is enabled (or skips with INFO when the upstream
                    doesn't expose ModelChangeEventType).
Eight stages cover the driver's full capability surface:
Requires:
- a running OtOpcUa server whose config DB has an OpcUaClient
DriverInstance bound to opc-plc (or another upstream server)
- the upstream OPC UA simulator reachable at $UpstreamUrl
- a Tag bridged from upstream NodeId $UpstreamNodeId to local
$BridgedNodeId
1. Remote probe           otopcua-cli connect to the upstream. Confirms the
                          simulator / target server is reachable and
                          speaking UA Secure Channel.
2. Remote read            otopcua-cli read of -RemoteNodeId on the upstream.
                          Captures the current value + confirms the node
                          exists. Baseline for the forward-bridge stage.
3. Forward bridge         otopcua-cli read of -BridgeNodeId on the LOCAL
                          server. Proves the driver discovered + mirrored
                          the remote node into the local address space and
                          the read path is live (IReadable via session).
4. Subscribe-sees-change  subscribe on local -BridgeNodeId in the
                          background. opc-plc's tickers (FastUInt1, StepUp)
                          mutate autonomously, so no driver poke is needed;
                          a data-change event should arrive within the
                          subscription window. Covers ISubscribable +
                          upstream subscription transfer.
5. Reverse bridge         otopcua-cli write to local -WritableBridgeNodeId,
                          then otopcua-cli read of -WritableRemoteNodeId
                          directly on the upstream. Confirms writes flow
                          through the driver to the remote (IWritable).
                          Opt-in: opc-plc's default image has no writable
                          nodes without `--sn`; pass -WritableBridgeNodeId
                          AND -WritableRemoteNodeId to enable.
6. Browse mirror          otopcua-cli browse of the local -BridgeRootNodeId
                          at depth -BrowseDepth. Asserts at least
                          -BrowseMinNodes descendants appear. Covers
                          ITagDiscovery → local-namespace projection.
7. Alarm fires            otopcua-cli alarms subscription on local
                          -AlarmNodeId. opc-plc with `--alm` cycles a
                          TripAlarm autonomously; assert an Active alarm
                          event surfaces. Covers IAlarmSource → OPC UA A&E
                          projection. Opt-in via -AlarmNodeId.
8. History read           historyread on local -HistoryNodeId over a
                          lookback window. Covers IHistoryProvider →
                          upstream HistoryRead dispatch. Opt-in via
                          -HistoryNodeId. Note: opc-plc's default image
                          does not historize; a historizing upstream
                          (Prosys, UaExpert sample server) is required.
Prereqs:
1. Upstream OPC UA server reachable at -RemoteUrl. Default expects the
opc-plc Docker fixture (`tests/.../Driver.OpcUaClient.IntegrationTests/
Docker/docker-compose.yml`): `docker compose up -d` before running.
2. OtOpcUa server running at -OpcUaUrl with an OpcUaClient DriverInstance
in its Config DB whose EndpointUrl = -RemoteUrl. The server's
DiscoverAsync populates the mirrored namespace at startup; the
-BridgeNodeId / -BridgeRootNodeId you pass must correspond to whatever
NodeIds that discovery produced on your local server.
3. To exercise stages 5 / 7 / 8, the upstream must expose writable nodes /
alarm conditions / history. opc-plc alone doesn't cover all three; see the
parameter docs below for the combinations that work with opc-plc.
.PARAMETER RemoteUrl
Upstream OPC UA server endpoint (the server the driver connects to).
Default matches the opc-plc Docker fixture: opc.tcp://localhost:50000.
.PARAMETER OpcUaUrl
Endpoint URL of the OtOpcUa server. Default opc.tcp://localhost:4840.
Local OtOpcUa server endpoint. Default opc.tcp://localhost:4840.
.PARAMETER UpstreamUrl
Endpoint URL of the upstream OPC UA server (for documentation; the bridge
itself is wired in the OtOpcUa server config). Default opc.tcp://localhost:50000.
.PARAMETER RemoteNodeId
NodeId on the upstream used for stages 1-2. Default ns=3;s=FastUInt1, an
opc-plc ticker that increments every 100 ms.
.PARAMETER BridgedNodeId
Local NodeId the OtOpcUa server exposes for the upstream tag. Required;
set per your server config (e.g. 'ns=2;s=/warsaw/opc-plc/StepUp').
.PARAMETER BridgeNodeId
NodeId on the LOCAL server that mirrors -RemoteNodeId after the OpcUaClient
driver discovers it. Dev-specific: whatever the local DiscoverAsync produced
for the upstream node. No default; mandatory for stages 3-4.
.PARAMETER UpstreamNodeId
The upstream NodeId being bridged (informational only; default
'ns=3;s=StepUp' which is opc-plc's monotonically-increasing UInt32).
.PARAMETER WritableRemoteNodeId
Writable NodeId on the upstream for the reverse-bridge stage. opc-plc's
default image has no writable nodes; add `--sn=1` to the compose command to
expose `ns=3;s=SlowUInt1` as writable (or similar per opc-plc docs). Omit to
skip stage 5.
.PARAMETER WritableBridgeNodeId
Matching local mirror of -WritableRemoteNodeId. Omit to skip stage 5.
.PARAMETER BridgeRootNodeId
Root NodeId on the local server under which the mirrored upstream sits. The
browse stage walks from this node down to -BrowseDepth. Default i=85
(ObjectsFolder) works but produces a lot of output; pass a narrower root
for faster / more targeted coverage.
.PARAMETER BrowseDepth
Max depth for the browse stage. Default 3.
.PARAMETER BrowseMinNodes
Minimum number of descendants expected under -BridgeRootNodeId. Default 5.
.PARAMETER AlarmNodeId
NodeId of the ConditionType on the local server for the alarm-fires stage.
opc-plc with `--alm` exposes e.g. TripAlarm conditions; the local mirror path
of that condition goes here. Omit to skip stage 7.
.PARAMETER AlarmWaitSec
Seconds to wait for the alarm to cycle. opc-plc's TripAlarm fires on its own
cadence; 15 s usually covers one cycle. Default 15.
.PARAMETER HistoryNodeId
NodeId on the local server whose history to query. Omit to skip stage 8.
.PARAMETER HistoryLookbackSec
Seconds back from now to query history. Default 3600.
.PARAMETER ChangeWaitSec
How long the subscribe stage waits for a data-change. Default 10s.
.PARAMETER ReverseConnect
When set, the script asserts the gateway is configured for reverse-connect
(server-initiated) mode. The OtOpcUa server's DriverConfig for the OpcUaClient
instance must already have ReverseConnect.Enabled=true + ListenerUrl set; this
script doesn't reconfigure the driver, only verifies the bridged path still
reads end-to-end with the listener up. The reverse-connect topology is opaque
to the downstream OPC UA client (us), so the read assertion is identical to
the dial-mode path; the value of running the script in this mode is to catch
regressions where reverse-connect breaks the post-init capability surface.
.PARAMETER ReverseListenerUrl
Documentation-only. The listener URL the gateway is expected to be bound to
when -ReverseConnect is set; printed in the run banner so operators can
cross-check their server config. Default opc.tcp://0.0.0.0:4844.
Seconds the subscribe-sees-change stage waits for a natural ticker update.
opc-plc's FastUInt1 ticks every 100 ms so a short window suffices. Default 8.
.EXAMPLE
.\test-opcuaclient.ps1 -BridgedNodeId "ns=2;s=/warsaw/opc-plc/StepUp"
# Bare-minimum: stages 1-4 + browse, against the opc-plc compose fixture.
# Requires the local OtOpcUa server to have discovered opc-plc and placed
# FastUInt1 under (for example) ns=2;s=OpcUaClient/FastUInt1.
./scripts/e2e/test-opcuaclient.ps1 -BridgeNodeId "ns=2;s=OpcUaClient/FastUInt1"
.EXAMPLE
# OT-DMZ deployment: the upstream dials the gateway. The script flow is the
# same — we still drive the bridged read through the OtOpcUa server — but the
# banner reflects the reverse-connect topology.
.\test-opcuaclient.ps1 -BridgedNodeId "ns=2;s=/warsaw/opc-plc/StepUp" -ReverseConnect
# Full matrix — all eight stages. Requires an opc-plc image with --sn (for
# writable) + --alm (for alarms; default compose has this) + a historizing
# upstream (opc-plc does not; Prosys does).
./scripts/e2e/test-opcuaclient.ps1 `
-BridgeNodeId "ns=2;s=OpcUaClient/FastUInt1" `
-WritableRemoteNodeId "ns=3;s=SlowUInt1" `
-WritableBridgeNodeId "ns=2;s=OpcUaClient/SlowUInt1" `
-BridgeRootNodeId "ns=2;s=OpcUaClient" `
-AlarmNodeId "ns=2;s=OpcUaClient/TripAlarm" `
-HistoryNodeId "ns=2;s=OpcUaClient/StepUp"
#>
param(
[string]$RemoteUrl = "opc.tcp://localhost:50000",
[string]$OpcUaUrl = "opc.tcp://localhost:4840",
[string]$UpstreamUrl = "opc.tcp://localhost:50000",
[Parameter(Mandatory)] [string]$BridgedNodeId,
[string]$UpstreamNodeId = "ns=3;s=StepUp",
[int]$ChangeWaitSec = 10,
[switch]$ReverseConnect,
[string]$ReverseListenerUrl = "opc.tcp://0.0.0.0:4844",
# PR-12: HistoryReadEvents passthrough check. Requires the upstream to be running
# in alarm-history mode (opc-plc --alm) AND the OtOpcUa server to expose a notifier
# node bridged to the upstream's events source. The CLI doesn't have a dedicated
# event-history command yet; this stage runs a regular historyread against the
# bridged notifier and confirms the gateway round-trips the request without
# surfacing BadHistoryOperationUnsupported, which would indicate the filter-aware
# ReadEventsAsync path lost wiring.
[switch]$HistoryEvents,
[string]$EventsNotifierNodeId = "i=2253",
# PR-14: upstream-redundancy probe. Passes the primary + secondary URLs
# straight through to the gateway driver via DriverConfig (operator must have
# already wired Redundancy.Enabled=true on the OpcUaClient instance — this
# script doesn't reconfigure the driver, only verifies the bridged read still
# works while both upstreams are reachable, and that the driver's redundancy
# diagnostics are non-null). Stage is no-op when neither URL is provided.
[string]$PrimaryUrl,
[string]$SecondaryUrl
[string]$RemoteNodeId = "ns=3;s=FastUInt1",
[Parameter(Mandatory)] [string]$BridgeNodeId,
[string]$WritableRemoteNodeId = "",
[string]$WritableBridgeNodeId = "",
[string]$BridgeRootNodeId = "i=85",
[int]$BrowseDepth = 3,
[int]$BrowseMinNodes = 5,
[string]$AlarmNodeId = "",
[int]$AlarmWaitSec = 15,
[string]$HistoryNodeId = "",
[int]$HistoryLookbackSec = 3600,
[int]$ChangeWaitSec = 8
)
$ErrorActionPreference = "Stop"
@@ -113,128 +169,224 @@ $opcUaCli = Get-CliInvocation `
-ProjectFolder "src/ZB.MOM.WW.OtOpcUa.Client.CLI" `
-ExeName "otopcua-cli"
if ($ReverseConnect) {
Write-Host "[INFO] -ReverseConnect set: gateway is expected to be bound to listener $ReverseListenerUrl"
Write-Host "[INFO] Upstream OPC UA server should be configured with --rc=$ReverseListenerUrl (or equivalent on a real server)"
}
$results = @()
# Stage 1: probe
$results += Test-Probe `
-Name "OpcUaClient probe" `
-Cmd $opcUaCli `
-Args @("connect", "-u", $OpcUaUrl)
# ---------------------------------------------------------------------------
# Stage 1 — Remote probe. `otopcua-cli connect` exits 0 when the Secure Channel
# + Session handshake to the upstream complete cleanly. A failure here means
# opc-plc isn't running or the endpoint is unreachable — nothing downstream is
# worth trying.
# ---------------------------------------------------------------------------
# Stage 2: bridged read
$results += Test-Probe `
-Name "OpcUaClient bridged read" `
-Cmd $opcUaCli `
-Args @("read", "-u", $OpcUaUrl, "-n", $BridgedNodeId)
# Stage 3: subscribe-sees-change
Write-Host "[INFO] Subscribing to $BridgedNodeId for ${ChangeWaitSec}s..."
$subResults = & $opcUaCli.Cmd @($opcUaCli.Args + @(
"subscribe", "-u", $OpcUaUrl, "-n", $BridgedNodeId,
"-i", "500", "--duration", "$ChangeWaitSec"))
if ($LASTEXITCODE -eq 0 -and $subResults -match "DataChange|StepUp|value=") {
$results += [pscustomobject]@{ Stage = "Subscribe-sees-change"; Status = "PASS" }
Write-Header "Remote probe"
$probe = Invoke-Cli -Cli $opcUaCli -Args @("connect", "-u", $RemoteUrl)
if ($probe.ExitCode -eq 0 -and $probe.Output -match "Connection successful") {
Write-Pass "upstream $RemoteUrl reachable + speaks UA"
$results += @{ Passed = $true }
} else {
$results += [pscustomobject]@{ Stage = "Subscribe-sees-change"; Status = "FAIL" }
}
# Stage 4: topology change (auto-reimport on ModelChangeEvent)
#
# The OPC UA Client driver subscribes to BaseModelChangeEventType on the
# upstream Server node (i=2253) at the end of InitializeAsync, then debounces
# events over OpcUaClientDriverOptions.ModelChangeDebounce (default 5s) and
# triggers ReinitializeAsync.
#
# Driving a real upstream ModelChangeEvent from outside the simulator is
# upstream-specific:
# - opc-plc: invoke OpcPlc.AddSlowNode via OPC UA Call (requires a session
# directly to opc-plc, not via the gateway, since the gateway exposes
# mirrored read/write paths only for variables — methods are mirrored
# under PR-9 but call permissions on the simulator's namespace may
# not allow downstream invocation).
# - production server: deploy a topology-change to the upstream server +
# observe the local re-import.
#
# This stage is therefore documentation-only by default. Set
# $env:OPCUACLIENT_TOPOLOGY_TRIGGER_CMD to a command that drives a real
# topology change on the upstream and we'll execute it + wait for the
# debounced re-import.
$triggerCmd = $env:OPCUACLIENT_TOPOLOGY_TRIGGER_CMD
if ($triggerCmd) {
Write-Host "[INFO] Driving topology change via: $triggerCmd"
& cmd.exe /c $triggerCmd
Start-Sleep -Seconds 8 # debounce window + re-import duration
# After re-import the bridged node should still be readable (or, if
# the upstream removed the node, the read should return BadNodeIdUnknown).
# Either way the gateway must remain healthy.
$results += Test-Probe `
-Name "Topology-change re-read" `
-Cmd $opcUaCli `
-Args @("read", "-u", $OpcUaUrl, "-n", $BridgedNodeId)
} else {
Write-Host "[INFO] Topology-change stage skipped (set OPCUACLIENT_TOPOLOGY_TRIGGER_CMD to drive a real upstream model change)."
$results += [pscustomobject]@{ Stage = "Topology-change"; Status = "SKIP" }
}
# Stage 5 (gated): HistoryReadEvents passthrough
#
# PR-12 lands the filter-aware IHistoryProvider.ReadEventsAsync overload on the
# OPC UA Client driver. End-to-end coverage requires:
# (a) the upstream in alarm-history mode (opc-plc --alm or a real server);
# (b) the OtOpcUa server forwarding HistoryReadEvents to the gateway driver.
# Gated behind -HistoryEvents because the default opc-plc fixture image isn't
# launched with --alm. When set, the stage issues a historyread against the
# bridged notifier ($EventsNotifierNodeId) and confirms the gateway returns
# the request without BadHistoryOperationUnsupported.
# Stage 6 (gated): upstream-redundancy probe (PR-14)
#
# When -PrimaryUrl + -SecondaryUrl are both supplied, the script runs an extra
# read against the bridged NodeId and reports whether the gateway is still
# answering. The actual ServiceLevel-driven failover is observable only on the
# server side (driver-diagnostics RPC reports RedundancyFailoverCount); this
# stage is a smoke check that the bridged path keeps round-tripping while
# both upstreams are reachable. Drive a real failover by writing to the
# primary's ServiceLevel node from outside this script.
if ($PrimaryUrl -and $SecondaryUrl) {
Write-Host "[INFO] Upstream redundancy probe: primary=$PrimaryUrl secondary=$SecondaryUrl"
$results += Test-Probe `
-Name "OpcUaClient redundancy bridged-read" `
-Cmd $opcUaCli `
-Args @("read", "-u", $OpcUaUrl, "-n", $BridgedNodeId)
} else {
if (-not $PrimaryUrl -and -not $SecondaryUrl) {
Write-Host "[INFO] Upstream redundancy stage skipped (set -PrimaryUrl and -SecondaryUrl to enable)."
$results += [pscustomobject]@{ Stage = "Upstream-redundancy"; Status = "SKIP" }
}
}
if ($HistoryEvents) {
Write-Host "[INFO] HistoryEvents stage: issuing historyread against $EventsNotifierNodeId"
$start = (Get-Date).ToUniversalTime().AddMinutes(-30).ToString("o")
$end = (Get-Date).ToUniversalTime().AddMinutes(1).ToString("o")
$eventOut = & $opcUaCli.Cmd @($opcUaCli.Args + @(
"historyread", "-u", $OpcUaUrl, "-n", $EventsNotifierNodeId,
"--start", $start, "--end", $end))
if ($LASTEXITCODE -eq 0 -and $eventOut -notmatch "BadHistoryOperationUnsupported") {
$results += [pscustomobject]@{ Stage = "HistoryReadEvents"; Status = "PASS" }
} elseif ($eventOut -match "BadHistoryOperationUnsupported") {
Write-Host "[INFO] Upstream returned BadHistoryOperationUnsupported — re-run with --alm + a notifier that has event history."
$results += [pscustomobject]@{ Stage = "HistoryReadEvents"; Status = "SKIP" }
} else {
$results += [pscustomobject]@{ Stage = "HistoryReadEvents"; Status = "FAIL" }
}
}
Write-Host ""
Write-Host "=== test-opcuaclient.ps1 results ==="
$results | Format-Table -AutoSize
$failed = $results | Where-Object { $_.Status -eq "FAIL" }
if ($failed) {
Write-Fail "upstream connect failed (exit=$($probe.ExitCode))"
Write-Host $probe.Output
$results += @{ Passed = $false; Reason = "remote probe failed" }
# Fail fast: if the upstream is down every other stage will cascade.
Write-Summary -Title "OpcUaClient e2e" -Results $results
exit 1
}
exit 0
# ---------------------------------------------------------------------------
# Stage 2 — Remote read. Pulls the current value of -RemoteNodeId directly from
# the upstream. Recorded for later stages to compare against, and confirms the
# chosen NodeId actually exists on this upstream.
# ---------------------------------------------------------------------------
Write-Header "Remote read"
$remoteRead = Invoke-Cli -Cli $opcUaCli -Args @("read", "-u", $RemoteUrl, "-n", $RemoteNodeId)
$remoteValue = $null
if ($remoteRead.ExitCode -eq 0 -and $remoteRead.Output -match "Value:\s+([^\r\n]+)") {
$remoteValue = $Matches[1].Trim()
Write-Pass "remote $RemoteNodeId = $remoteValue"
$results += @{ Passed = $true }
} else {
Write-Fail "remote read of $RemoteNodeId failed"
Write-Host $remoteRead.Output
$results += @{ Passed = $false; Reason = "remote read failed" }
}
# ---------------------------------------------------------------------------
# Stage 3 — Forward bridge. Read -BridgeNodeId on the LOCAL server. If the
# OpcUaClient driver is live + its discovery mapped -RemoteNodeId into the
# local namespace, this should return a Good value. For ticker nodes like
# FastUInt1 we don't require exact equality with stage 2 (the ticker has
# likely advanced between reads); a Good-status read is the real signal.
# ---------------------------------------------------------------------------
Write-Header "Forward bridge (remote → local)"
$localRead = Invoke-Cli -Cli $opcUaCli -Args @("read", "-u", $OpcUaUrl, "-n", $BridgeNodeId)
if ($localRead.ExitCode -eq 0 -and $localRead.Output -match "Status:\s+0x00000000" -and $localRead.Output -match "Value:\s+([^\r\n]+)") {
$localValue = $Matches[1].Trim()
Write-Pass "local bridge $BridgeNodeId = $localValue (remote was $remoteValue)"
$results += @{ Passed = $true }
} else {
Write-Fail "local bridge read failed — driver instance may not be configured or discovery hasn't run"
Write-Host $localRead.Output
$results += @{ Passed = $false; Reason = "forward bridge failed" }
}
# ---------------------------------------------------------------------------
# Stage 4 — Subscribe sees change. opc-plc's FastUInt1 ticks autonomously so we
# don't need to drive a write. A properly wired OpcUaClient driver forwards
# remote MonitoredItem data-change callbacks to the local server, which then
# publishes them to our subscribe client. If nothing arrives within the
# window, either the remote node isn't a ticker OR the upstream subscription
# chain is broken (probe state, keep-alive, SDK publish queue).
# ---------------------------------------------------------------------------
Write-Header "Subscribe sees change"
$stdout = New-TemporaryFile
$stderr = New-TemporaryFile
$subArgs = @($opcUaCli.PrefixArgs) + @(
"subscribe", "-u", $OpcUaUrl, "-n", $BridgeNodeId,
"-i", "200", "--duration", "$ChangeWaitSec")
$subProc = Start-Process -FilePath $opcUaCli.File `
-ArgumentList $subArgs -NoNewWindow -PassThru `
-RedirectStandardOutput $stdout.FullName `
-RedirectStandardError $stderr.FullName
Write-Info "subscription started (pid $($subProc.Id)) for ${ChangeWaitSec}s"
$subProc.WaitForExit(($ChangeWaitSec + 5) * 1000) | Out-Null
if (-not $subProc.HasExited) { Stop-Process -Id $subProc.Id -Force }
$subOut = (Get-Content $stdout.FullName -Raw) + (Get-Content $stderr.FullName -Raw)
Remove-Item $stdout.FullName, $stderr.FullName -ErrorAction SilentlyContinue
# SubscribeCommand prints `[timestamp] <NodeId> = <value> (0xNNNNNNNN)` per
# data-change event. 0x00000000 == Good; anything else is a non-Good status
# we intentionally don't count (a quality drop isn't a "saw the change").
$changeLines = @(($subOut -split "`n") | Where-Object { $_ -match "=\s+\S.*\(0x00000000\)" })
if ($changeLines.Count -gt 0) {
Write-Pass "$($changeLines.Count) data-change events observed on bridge"
$results += @{ Passed = $true }
} else {
Write-Fail "no data-change events in ${ChangeWaitSec}s — upstream node may be static, or subscription chain broken"
Write-Host $subOut
$results += @{ Passed = $false; Reason = "no data-change" }
}
# ---------------------------------------------------------------------------
# Stage 5 — Reverse bridge. Only runs when both writable NodeIds are supplied.
# Writes on the local bridge side, reads directly on the upstream to verify
# the write crossed the driver. 2s settle accounts for the driver's next poll
# (non-idempotent writes on upstream side may take a tick to propagate).
# ---------------------------------------------------------------------------
if ([string]::IsNullOrEmpty($WritableBridgeNodeId) -or [string]::IsNullOrEmpty($WritableRemoteNodeId)) {
Write-Header "Reverse bridge (local → remote)"
Write-Skip "WritableBridgeNodeId / WritableRemoteNodeId not supplied — opc-plc default has no writable nodes. Add --sn=N to the compose and re-run with both params set."
} else {
Write-Header "Reverse bridge (local → remote)"
$writeValue = Get-Random -Minimum 1 -Maximum 9999
$w = Invoke-Cli -Cli $opcUaCli -Args @(
"write", "-u", $OpcUaUrl, "-n", $WritableBridgeNodeId, "-v", "$writeValue")
if ($w.ExitCode -ne 0 -or $w.Output -notmatch "Write successful") {
Write-Fail "local-side write failed"
Write-Host $w.Output
$results += @{ Passed = $false; Reason = "reverse-bridge write failed" }
} else {
Write-Info "local write ok, waiting 2s for the driver to propagate"
Start-Sleep -Seconds 2
$r = Invoke-Cli -Cli $opcUaCli -Args @("read", "-u", $RemoteUrl, "-n", $WritableRemoteNodeId)
if ($r.ExitCode -eq 0 -and $r.Output -match "Value:\s+$([Regex]::Escape("$writeValue"))\b") {
Write-Pass "remote reads back $writeValue"
$results += @{ Passed = $true }
} else {
Write-Fail "remote value did not reflect $writeValue"
Write-Host $r.Output
$results += @{ Passed = $false; Reason = "reverse-bridge readback mismatch" }
}
}
}
# ---------------------------------------------------------------------------
# Stage 6 — Browse mirror. Walks -BridgeRootNodeId to -BrowseDepth levels. The
# BrowseCommand emits one line per encountered node; we count the lines that
# carry a `(NodeId:` marker and compare against -BrowseMinNodes. A naked
# i=85 root always has something; a narrower dev-specific root is stricter.
# ---------------------------------------------------------------------------
Write-Header "Browse mirror"
$br = Invoke-Cli -Cli $opcUaCli -Args @(
"browse", "-u", $OpcUaUrl, "-n", $BridgeRootNodeId,
"-r", "-d", "$BrowseDepth")
if ($br.ExitCode -ne 0) {
Write-Fail "browse failed (exit=$($br.ExitCode))"
Write-Host $br.Output
$results += @{ Passed = $false; Reason = "browse failed" }
} else {
# BrowseCommand prints one line per node: `[Type] Name (NodeId: xxx)` with
# indentation for depth. Count every line carrying a NodeId marker.
$nodeLines = @(($br.Output -split "`n") | Where-Object { $_ -match "\(NodeId:" })
$count = $nodeLines.Count
if ($count -ge $BrowseMinNodes) {
Write-Pass "$count descendants under $BridgeRootNodeId (>= $BrowseMinNodes)"
$results += @{ Passed = $true }
} else {
Write-Fail "only $count descendants — expected >= $BrowseMinNodes"
Write-Host $br.Output
$results += @{ Passed = $false; Reason = "browse under-populated" }
}
}
# ---------------------------------------------------------------------------
# Stage 7 — Alarm fires. opc-plc with --alm (set in the compose) cycles a
# TripAlarm Condition autonomously. The local alarm subscription should
# surface at least one Active transition within the wait window. Opt-in:
# requires the user to know the local mirror of the upstream alarm Condition.
# ---------------------------------------------------------------------------
if ([string]::IsNullOrEmpty($AlarmNodeId)) {
Write-Header "Alarm fires"
Write-Skip "AlarmNodeId not supplied — skipping alarm stage"
} else {
Write-Header "Alarm fires"
$stdout = New-TemporaryFile
$stderr = New-TemporaryFile
$allArgs = @($opcUaCli.PrefixArgs) + @(
"alarms", "-u", $OpcUaUrl, "-n", $AlarmNodeId, "-i", "500", "--refresh")
$proc = Start-Process -FilePath $opcUaCli.File `
-ArgumentList $allArgs -NoNewWindow -PassThru `
-RedirectStandardOutput $stdout.FullName `
-RedirectStandardError $stderr.FullName
Write-Info "alarm subscription started (pid $($proc.Id)), waiting ${AlarmWaitSec}s for opc-plc alarm cycle"
Start-Sleep -Seconds $AlarmWaitSec
if (-not $proc.HasExited) { Stop-Process -Id $proc.Id -Force }
$out = (Get-Content $stdout.FullName -Raw) + (Get-Content $stderr.FullName -Raw)
Remove-Item $stdout.FullName, $stderr.FullName -ErrorAction SilentlyContinue
if ($out -match "ALARM\b" -and $out -match "Active\b") {
Write-Pass "alarm condition fired with Active state"
$results += @{ Passed = $true }
} else {
Write-Fail "no Active alarm event observed in ${AlarmWaitSec}s — check opc-plc compose has --alm + the AlarmNodeId is the local mirror of the upstream Condition"
Write-Host $out
$results += @{ Passed = $false; Reason = "no alarm event" }
}
}
# ---------------------------------------------------------------------------
# Stage 8 — History read. IHistoryProvider dispatch to the upstream's
# HistoryRead service. opc-plc does NOT historize by default, so this stage
# SKIPs when -HistoryNodeId is empty. Against a historizing upstream (Prosys,
# UA Expert sample server, AVEVA Historian) point -HistoryNodeId at the local
# mirror of a historized node.
# ---------------------------------------------------------------------------
if ([string]::IsNullOrEmpty($HistoryNodeId)) {
Write-Header "History read"
Write-Skip "HistoryNodeId not supplied — opc-plc default does not historize; supply a historized-upstream mirror NodeId to enable."
} else {
$results += Test-HistoryHasSamples `
-OpcUaCli $opcUaCli `
-OpcUaUrl $OpcUaUrl `
-NodeId $HistoryNodeId `
-LookbackSec $HistoryLookbackSec
}
Write-Summary -Title "OpcUaClient e2e" -Results $results
if ($results | Where-Object { -not $_.Passed }) { exit 1 }
@@ -96,28 +96,5 @@ $results += Test-SubscribeSeesChange `
-DriverWriteArgs (@("write") + $commonS7 + @("-a", $Address, "-t", "Int16", "-v", $subValue)) `
-ExpectedValue "$subValue"
# PR-S7-D2 / #300 — UDT-member round-trip. Exercises the byte offsets the
# driver's UDT fan-out uses when expanding a UDT-typed parent tag into per-
# member scalar leaves: Real at DB1.DBD400 and Int16 at DB1.DBW404 match the
# `MyUdt` layout seeded by Docker/profiles/s7_1500.json's udt_layout meta-seed
# and declared by S7_1500UdtFanOutTests. The CLI itself is UDT-unaware so the
# e2e step writes / reads at the explicit member byte offsets — proves the
# wire-level path the fan-out emits is sound end-to-end.
$udtPressureAddress = $Address.Substring(0, $Address.IndexOf('.')) + ".DBD400"
$udtPressureValue = "27.5"
$results += Test-DriverLoopback `
-Cli $s7Cli `
-WriteArgs (@("write") + $commonS7 + @("-a", $udtPressureAddress, "-t", "Float32", "-v", $udtPressureValue)) `
-ReadArgs (@("read") + $commonS7 + @("-a", $udtPressureAddress, "-t", "Float32")) `
-ExpectedValue $udtPressureValue
$udtStatusAddress = $Address.Substring(0, $Address.IndexOf('.')) + ".DBW404"
$udtStatusValue = Get-Random -Minimum 100 -Maximum 999
$results += Test-DriverLoopback `
-Cli $s7Cli `
-WriteArgs (@("write") + $commonS7 + @("-a", $udtStatusAddress, "-t", "Int16", "-v", $udtStatusValue)) `
-ReadArgs (@("read") + $commonS7 + @("-a", $udtStatusAddress, "-t", "Int16")) `
-ExpectedValue "$udtStatusValue"
Write-Summary -Title "S7 e2e" -Results $results
if ($results | Where-Object { -not $_.Passed }) { exit 1 }
@@ -1,108 +0,0 @@
<#
.SYNOPSIS
Registers the OtOpcUaFocasHost Windows service. Optional companion to
Install-Services.ps1; only run this on nodes where FOCAS driver instances will run
with Tier-C process isolation enabled.
.DESCRIPTION
FOCAS PR #220 / Tier-C isolation plan. Wraps OtOpcUa.Driver.FOCAS.Host.exe (net48 x86)
as a Windows service using NSSM, running under the same service account as the main
OtOpcUa service so the named-pipe ACL works. Passes the per-process shared secret via
environment variable at service-start time so it never hits disk.
.PARAMETER InstallRoot
Where the FOCAS Host binaries live (typically
C:\Program Files\OtOpcUa\Driver.FOCAS.Host).
.PARAMETER ServiceAccount
Service account SID or DOMAIN\name. Must match the main OtOpcUa server account so the
PipeAcl match succeeds.
.PARAMETER FocasSharedSecret
Per-process secret passed via env var. Generated freshly per install if not supplied.
.PARAMETER FocasBackend
Backend selector for the Host process. One of:
fwlib32 (real Fanuc Fwlib32.dll integration; requires licensed DLL on PATH)
fake (in-memory; smoke-test mode)
unconfigured (the default; returns structured errors; use until hardware is wired)
.PARAMETER FocasPipeName
Pipe name the Host listens on. Default: OtOpcUaFocas.
.EXAMPLE
.\Install-FocasHost.ps1 -InstallRoot 'C:\Program Files\OtOpcUa\Driver.FOCAS.Host' `
-ServiceAccount 'OTOPCUA\svc-otopcua' -FocasBackend fwlib32
#>
[CmdletBinding()]
param(
[Parameter(Mandatory)] [string]$InstallRoot,
[Parameter(Mandatory)] [string]$ServiceAccount,
[string]$FocasSharedSecret,
[ValidateSet('fwlib32','fake','unconfigured')] [string]$FocasBackend = 'unconfigured',
[string]$FocasPipeName = 'OtOpcUaFocas',
[string]$ServiceName = 'OtOpcUaFocasHost',
[string]$NssmPath = 'C:\Program Files\nssm\nssm.exe'
)
$ErrorActionPreference = 'Stop'
function Resolve-Sid {
param([string]$Account)
if ($Account -match '^S-\d-\d+') { return $Account }
try {
$nt = New-Object System.Security.Principal.NTAccount($Account)
return $nt.Translate([System.Security.Principal.SecurityIdentifier]).Value
} catch {
throw "Could not resolve '$Account' to a SID. Pass an explicit SID or check the account name."
}
}
if (-not (Test-Path $NssmPath)) {
throw "nssm.exe not found at '$NssmPath'. Install NSSM or pass -NssmPath."
}
$hostExe = Join-Path $InstallRoot 'OtOpcUa.Driver.FOCAS.Host.exe'
if (-not (Test-Path $hostExe)) {
throw "FOCAS Host binary not found at '$hostExe'. Publish the Driver.FOCAS.Host project first."
}
if (-not $FocasSharedSecret) {
$FocasSharedSecret = [System.Guid]::NewGuid().ToString('N')
Write-Host "Generated FocasSharedSecret — store it alongside the OtOpcUa service config."
}
$allowedSid = Resolve-Sid $ServiceAccount
# Idempotent install — remove + re-create if present.
$existing = Get-Service -Name $ServiceName -ErrorAction SilentlyContinue
if ($existing) {
Write-Host "Removing existing '$ServiceName' service..."
& $NssmPath stop $ServiceName confirm | Out-Null
& $NssmPath remove $ServiceName confirm | Out-Null
}
& $NssmPath install $ServiceName $hostExe | Out-Null
& $NssmPath set $ServiceName DisplayName 'OT-OPC-UA FOCAS Host (Tier-C isolated Fwlib32)' | Out-Null
& $NssmPath set $ServiceName Description 'Out-of-process Fwlib32.dll host for OtOpcUa FOCAS driver. Crash-isolated from the main OPC UA server.' | Out-Null
& $NssmPath set $ServiceName ObjectName $ServiceAccount | Out-Null
& $NssmPath set $ServiceName Start SERVICE_AUTO_START | Out-Null
& $NssmPath set $ServiceName AppStdout (Join-Path $env:ProgramData 'OtOpcUa\focas-host-stdout.log') | Out-Null
& $NssmPath set $ServiceName AppStderr (Join-Path $env:ProgramData 'OtOpcUa\focas-host-stderr.log') | Out-Null
& $NssmPath set $ServiceName AppRotateFiles 1 | Out-Null
& $NssmPath set $ServiceName AppRotateBytes 10485760 | Out-Null
& $NssmPath set $ServiceName AppEnvironmentExtra `
"OTOPCUA_FOCAS_PIPE=$FocasPipeName" `
"OTOPCUA_ALLOWED_SID=$allowedSid" `
"OTOPCUA_FOCAS_SECRET=$FocasSharedSecret" `
"OTOPCUA_FOCAS_BACKEND=$FocasBackend" | Out-Null
& $NssmPath set $ServiceName DependOnService OtOpcUa | Out-Null
Write-Host "Installed '$ServiceName' under '$ServiceAccount' (SID=$allowedSid)."
Write-Host "Pipe: \\.\pipe\$FocasPipeName Backend: $FocasBackend"
Write-Host "Start the service with: Start-Service $ServiceName"
Write-Host ""
Write-Host "NOTE: the Fwlib32 backend requires the licensed Fwlib32.dll on PATH"
Write-Host "alongside the Host exe. See docs/v2/focas-deployment.md."
@@ -0,0 +1,87 @@
# Integration runners
Scripts that orchestrate multi-component integration-test loops —
each one wires up docker fixtures, support binaries, and `dotnet test`
in sequence so a developer (or a CI agent) can get from "freshly
cloned repo" to "green integration suite" with one command.
Unlike `scripts/e2e/test-*.ps1` (which drive the built server through
the CLI for black-box coverage), scripts in this folder operate
**below** the server layer — they bring up the raw fixtures the
driver-level `IntegrationTests` projects need.
## Scripts
| Script | Purpose |
|--------|---------|
| [`run-focas.ps1`](run-focas.ps1) | FOCAS driver: builds shim DLLs + starts focas-mock docker + copies shim into test bin + runs `WireCompatGatedTests` + `FocasSimSmokeTests` + tears down docker |
## run-focas.ps1
### Prerequisites
- **Windows + PowerShell 7+**
- **.NET 10 SDK** — `dotnet --version` prints 10.x
- **Native C compiler** — one of:
- Visual Studio Build Tools with the C++ workload (then run from an
"x64 Native Tools Command Prompt for VS" shell), or
- Zig (`zig.exe` on PATH) as a drop-in alternative
- **Docker Desktop** running, OR pass `-SkipDocker` and run the mock
externally
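
The prerequisite list above can be checked before a run. The following is a minimal sketch, not part of `run-focas.ps1`: the function name `Test-FocasPrerequisite` and the command names probed (`dotnet`, `docker`, `cl.exe`, `zig.exe`) are assumptions drawn from the list, and the check only confirms that each command resolves on PATH, not that its version is suitable.

```powershell
# Hypothetical helper (not part of run-focas.ps1): report which prerequisite
# commands can be resolved on PATH.
function Test-FocasPrerequisite {
    param(
        # Commands that must all be present; defaults are assumptions.
        [string[]]$Required = @('dotnet', 'docker'),
        # Any one of these satisfies the native C compiler requirement.
        [string[]]$AnyOfCompilers = @('cl.exe', 'zig.exe')
    )

    $missing = @()
    foreach ($name in $Required) {
        if (-not (Get-Command $name -ErrorAction SilentlyContinue)) {
            $missing += $name
        }
    }

    # First compiler that resolves, or $null when none do.
    $compiler = $AnyOfCompilers |
        Where-Object { Get-Command $_ -ErrorAction SilentlyContinue } |
        Select-Object -First 1
    if (-not $compiler) {
        $missing += "C compiler ($($AnyOfCompilers -join ' or '))"
    }

    [pscustomobject]@{
        Ok       = ($missing.Count -eq 0)
        Missing  = $missing
        Compiler = $compiler
    }
}

$check = Test-FocasPrerequisite
if (-not $check.Ok) {
    Write-Warning "Missing prerequisites: $($check.Missing -join ', ')"
}
```

When the mock runs externally (`-SkipDocker`), `docker` could be dropped from `-Required`.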
### One-shot run
```powershell
cd C:\Users\dohertj2\Desktop\lmxopcua
pwsh .\scripts\integration\run-focas.ps1
```
That's the default invocation: thirtyone profile, debug build,
docker cleans up on exit.
### Development iteration
Re-run the tests without rebuilding the shim or restarting docker:
```powershell
# First run bootstraps everything + keeps the mock up.
pwsh .\scripts\integration\run-focas.ps1 -KeepDocker
# Iterate on test bodies without re-doing the slow steps.
pwsh .\scripts\integration\run-focas.ps1 -SkipShimBuild -SkipDocker
```
### Per-series runs
```powershell
pwsh .\scripts\integration\run-focas.ps1 -Profile thirty # 30i series
pwsh .\scripts\integration\run-focas.ps1 -Profile zerod # 0i-D
pwsh .\scripts\integration\run-focas.ps1 -Profile powermotion
```
Full profile list is in
`tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/docker-compose.yml`.
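
Several per-series runs can be chained and their exit codes collected. The sketch below is one possible wrapper, not something the repository ships: `Invoke-PerProfile` and the `$runner` script block are hypothetical names, and the runner assumes it is invoked from the repo root.

```powershell
# Hypothetical wrapper: run a command once per profile and collect exit codes.
function Invoke-PerProfile {
    param(
        [string[]]$Profiles,
        # Script block that performs one run and returns its exit code.
        [scriptblock]$Runner
    )
    foreach ($p in $Profiles) {
        $code = & $Runner $p
        [pscustomobject]@{ Profile = $p; ExitCode = $code }
    }
}

# Example runner. Out-Host keeps the script's console output out of the
# value returned to Invoke-PerProfile.
$runner = {
    param($name)
    pwsh .\scripts\integration\run-focas.ps1 -Profile $name | Out-Host
    $LASTEXITCODE
}

# Usage (left commented out so that loading this sketch starts nothing):
# $results = Invoke-PerProfile -Profiles 'thirty', 'zerod', 'powermotion' -Runner $runner
# if ($results | Where-Object ExitCode -ne 0) { exit 1 }
```

Passing the runner as a script block keeps the loop independent of how a single run is launched, so it can be exercised with a stub.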
### Exit codes
| Code | Meaning |
|------|---------|
| 0 | All tests passed or cleanly skipped |
| 1 | `dotnet test` reported failures |
| 2 | The runner itself crashed (missing file, unexpected exception) |
| 3 | No C compiler detected for shim build |
| 4 | Docker CLI not on PATH |
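
A caller can translate the codes in the table above into log messages. This is a minimal sketch: the function name `Get-RunFocasExitMeaning` is hypothetical, and the message strings are copied from the table rather than taken from the script.

```powershell
# Hypothetical helper: map a run-focas.ps1 exit code to the meaning listed
# in the table above.
function Get-RunFocasExitMeaning {
    param([int]$Code)
    switch ($Code) {
        0 { 'All tests passed or cleanly skipped' }
        1 { 'dotnet test reported failures' }
        2 { 'The runner itself crashed' }
        3 { 'No C compiler detected for shim build' }
        4 { 'Docker CLI not on PATH' }
        default { "Unrecognised exit code $Code" }
    }
}

# Usage after a run (left commented out so that loading this sketch starts nothing):
# pwsh .\scripts\integration\run-focas.ps1
# Write-Host (Get-RunFocasExitMeaning -Code $LASTEXITCODE)
```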
### CI integration
Wire this into the project's CI runner (Gitea Actions, Jenkins,
whatever's hosting this repo) by calling:
```yaml
- name: FOCAS integration
shell: pwsh
run: ./scripts/integration/run-focas.ps1
```
The script is idempotent; a previous run's `docker compose down`
failure won't block the next one.
@@ -0,0 +1,123 @@
#Requires -Version 7.0
<#
.SYNOPSIS
Orchestrates the FOCAS driver integration-test loop: bring up the
focas-mock Docker container, run the managed wire-client integration
tests, tear down Docker.
.DESCRIPTION
The FOCAS integration fixture now needs just two things running
together:
1. A single focas-mock container listening on :8193 (one service,
no per-series compose profile ceremony; the mock's native
FOCAS Ethernet responder handles every call the managed driver
issues).
2. The integration-test assembly built. The managed
`WireFocasClient` dials the mock directly; there is no shim
DLL, no P/Invoke, no test-bin DLL copy step.
This script handles both and cleans up on exit.
Designed to run unattended on a build agent or on a developer box.
Exit code matches the test suite (0 = all pass or skip-clean,
non-zero when any integration test failed).
.PARAMETER Profile
focas-mock profile name to seed at startup (e.g. `ThirtyOne_i`,
`Sixteen_i`, `fwlib30i64`). Defaults to `ThirtyOne_i`. The fixture
resolves aliases via `FocasCncSeries`, and tests that need per-series
state can call `fixture.LoadProfileAsync` directly at test start to
override the default.
.PARAMETER SkipDocker
Skip docker up/down. Use when the mock is already running from
another shell.
.PARAMETER Configuration
Build configuration: Debug or Release. Default: Debug.
.PARAMETER KeepDocker
Don't tear down the docker stack on exit. Useful for iterating on
tests.
#>
param(
[string]$Profile = "ThirtyOne_i",
[switch]$SkipDocker,
[ValidateSet("Debug", "Release")]
[string]$Configuration = "Debug",
[switch]$KeepDocker
)
Set-StrictMode -Version 3.0
$ErrorActionPreference = "Stop"
$repoRoot = (Resolve-Path (Join-Path $PSScriptRoot "..\..")).Path
$integTests = Join-Path $repoRoot "tests\ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests"
$dockerYml = Join-Path $integTests "Docker\docker-compose.yml"
function Write-Step { param([string]$Msg) Write-Host ""; Write-Host "=== $Msg ===" -ForegroundColor Cyan }
function Write-Info { param([string]$Msg) Write-Host "[INFO] $Msg" -ForegroundColor Gray }
function Write-Fail { param([string]$Msg) Write-Host "[FAIL] $Msg" -ForegroundColor Red }
$cleanupScripts = @()
trap {
Write-Host ""
Write-Fail "run-focas.ps1 crashed: $_"
foreach ($c in $cleanupScripts) { try { & $c } catch { Write-Host "cleanup failed: $_" -ForegroundColor DarkYellow } }
exit 2
}
Write-Step "Build FOCAS IntegrationTests ($Configuration)"
dotnet build (Join-Path $integTests "ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests.csproj") `
--configuration $Configuration --nologo --verbosity minimal
if ($LASTEXITCODE -ne 0) { throw "dotnet build failed (exit $LASTEXITCODE)" }
if (-not $SkipDocker) {
Write-Step "docker compose up"
if (-not (Get-Command docker -ErrorAction SilentlyContinue)) {
Write-Fail "docker CLI not on PATH. Install Docker Desktop or pass -SkipDocker + run the mock externally."
exit 4
}
docker compose -f $dockerYml up -d --build --wait 2>&1 | Write-Host
if ($LASTEXITCODE -ne 0) { throw "docker compose up failed (exit $LASTEXITCODE)" }
if (-not $KeepDocker) {
$cleanupScripts += {
Write-Step "docker compose down"
docker compose -f $dockerYml down --remove-orphans 2>&1 | Write-Host
}
}
Write-Info "probing localhost:8193..."
$tcp = [System.Net.Sockets.TcpClient]::new()
try {
$ok = $tcp.ConnectAsync("127.0.0.1", 8193).Wait([TimeSpan]::FromSeconds(5))
if (-not $ok -or -not $tcp.Connected) {
throw "TCP probe to localhost:8193 failed after docker compose --wait succeeded"
}
}
finally { $tcp.Dispose() }
Write-Info "mock is accepting connections"
}
else {
Write-Step "Docker (skipped)"
}
Write-Step "dotnet test (wire-backend integration)"
$env:OTOPCUA_FOCAS_SIM_PROFILE = $Profile
dotnet test (Join-Path $integTests "ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests.csproj") `
--configuration $Configuration --no-build --nologo --verbosity minimal
$testExit = $LASTEXITCODE
foreach ($c in $cleanupScripts) { & $c }
if ($testExit -ne 0) {
Write-Fail "integration tests failed with exit $testExit"
exit $testExit
}
Write-Host ""
Write-Host "run-focas.ps1 completed successfully." -ForegroundColor Green
exit 0
@@ -80,11 +80,6 @@ VALUES (@Gen, @EqId, @EqUuid, @DrvId, @LineId, 'ab-sim', 'abcip-001', 1);
-- AB CIP DriverInstance — single ControlLogix device at the ab_server fixture
-- gateway. DriverConfig shape mirrors AbCipDriverConfigDto.
--
-- The second device entry (CompactLogix L2 example, commented out) demonstrates
-- the PR abcip-3.1 ConnectionSize override knob. Uncomment + point at a real
-- 5069-L2 to verify the narrow-buffer Forward Open path; ab_server itself
-- doesn't enforce the narrow cap (see docs/drivers/AbServer-Test-Fixture.md §5).
INSERT dbo.DriverInstance(GenerationId, DriverInstanceId, ClusterId, NamespaceId,
Name, DriverType, DriverConfig, Enabled)
VALUES (@Gen, @DrvId, @ClusterId, @NsId, 'ab-server-smoke', 'AbCip', N'{
@@ -95,14 +90,6 @@ VALUES (@Gen, @DrvId, @ClusterId, @NsId, 'ab-server-smoke', 'AbCip', N'{
"PlcFamily": "ControlLogix",
"DeviceName": "ab-server"
}
/*
, {
"HostAddress": "ab://10.0.0.7/1,0",
"PlcFamily": "CompactLogix",
"DeviceName": "compactlogix-l2-narrow",
"ConnectionSize": 504
}
*/
],
"Probe": { "Enabled": true, "IntervalMs": 5000, "TimeoutMs": 2000 },
"Tags": [
@@ -31,11 +31,10 @@ DECLARE @LineId nvarchar(64) = 'ablegacy-smoke-line';
DECLARE @EqId nvarchar(64) = 'ablegacy-smoke-eq';
DECLARE @EqUuid uniqueidentifier = '5A1D2030-5A1D-4203-A5A1-D20305A1D203';
DECLARE @TagId nvarchar(64) = 'ablegacy-smoke-tag-n7_5';
DECLARE @ArrTagId nvarchar(64) = 'ablegacy-smoke-tag-n7_block';
BEGIN TRAN;
DELETE FROM dbo.Tag WHERE TagId IN (@TagId, @ArrTagId);
DELETE FROM dbo.Tag WHERE TagId IN (@TagId);
DELETE FROM dbo.Equipment WHERE EquipmentId = @EqId;
DELETE FROM dbo.UnsLine WHERE UnsLineId = @LineId;
DELETE FROM dbo.UnsArea WHERE UnsAreaId = @AreaId;
@@ -80,28 +79,15 @@ VALUES (@Gen, @EqId, @EqUuid, @DrvId, @LineId, 'slc-sim', 'ablegacy-001', 1);
-- AB Legacy DriverInstance — SLC 500 target. Replace the placeholder gateway
-- `192.168.1.10` with the real PLC / RSEmulate host before running.
--
-- PR 9 / #252 demo: the device row carries `"TimeoutMs": 500` + `"Retries": 1`,
-- both overriding the driver-wide `TimeoutMs: 2000` / `Retries: 0` defaults.
-- For real chassis tune per family (SLC 5/01 ≈ 5000, SLC 5/05 ≈ 2000,
-- MicroLogix 1100 ≈ 3000); see docs/Driver.AbLegacy.Cli.md for the cheat sheet.
INSERT dbo.DriverInstance(GenerationId, DriverInstanceId, ClusterId, NamespaceId,
Name, DriverType, DriverConfig, Enabled)
VALUES (@Gen, @DrvId, @ClusterId, @NsId, 'ablegacy-smoke', 'AbLegacy', N'{
"TimeoutMs": 2000,
"Retries": 0,
"Devices": [
{
"HostAddress": "ab://127.0.0.1:44818/1,0",
"PlcFamily": "Slc500",
"DeviceName": "slc-500",
"TimeoutMs": 500,
"Retries": 1,
"Demote": {
"FailureThreshold": 3,
"DemoteForMs": 30000,
"Enabled": true
}
"DeviceName": "slc-500"
}
],
"Probe": { "Enabled": true, "IntervalMs": 5000, "TimeoutMs": 2000, "ProbeAddress": "S:0" },
@@ -112,16 +98,7 @@ VALUES (@Gen, @DrvId, @ClusterId, @NsId, 'ablegacy-smoke', 'AbLegacy', N'{
"Address": "N7:5",
"DataType": "Int",
"Writable": true,
"WriteIdempotent": true,
"AbsoluteDeadband": 5
},
{
"Name": "N7_Block",
"DeviceHostAddress": "ab://127.0.0.1:44818/1,0",
"Address": "N7:0,10",
"DataType": "Int",
"Writable": false,
"ArrayLength": 10
"WriteIdempotent": true
}
]
}', 1);
@@ -131,17 +108,6 @@ INSERT dbo.Tag(GenerationId, TagId, DriverInstanceId, EquipmentId, Name, DataTyp
VALUES (@Gen, @TagId, @DrvId, @EqId, 'N7_5', 'Int16', 'ReadWrite',
N'{"FullName":"N7_5","Address":"N7:5","DataType":"Int"}', 1);
-- PR 7 — array contiguous-block tag. The TagConfig JSON carries the address suffix
-- + ArrayLength override; the driver picks both up at discovery time and emits the
-- DriverAttributeInfo with IsArray=true + ArrayDim=10 so the generic node manager
-- materialises a 1-D Int16 array variable. The dbo.Tag schema doesn't carry
-- IsArray/ArrayDim columns — the array shape is fully driver-side metadata.
-- Read-only because the smoke harness only exercises array reads.
INSERT dbo.Tag(GenerationId, TagId, DriverInstanceId, EquipmentId, Name, DataType,
AccessLevel, TagConfig, WriteIdempotent)
VALUES (@Gen, @ArrTagId, @DrvId, @EqId, 'N7_Block', 'Int16', 'Read',
N'{"FullName":"N7_Block","Address":"N7:0,10","DataType":"Int","ArrayLength":10}', 0);
EXEC dbo.sp_PublishGeneration @ClusterId = @ClusterId, @DraftGenerationId = @Gen,
@Notes = N'AB Legacy smoke — task #213';
@@ -157,18 +123,3 @@ PRINT 'NOTE: default points at the ab_server slc500 Docker fixture with a /1,0';
PRINT ' cip-path (required by ab_server). For real SLC/MicroLogix/PLC-5';
PRINT ' hardware, edit the DriverConfig HostAddress to end with /<empty>';
PRINT ' e.g. "ab://<plc-ip>:44818/" and re-run this seed.';
PRINT '';
PRINT 'PR ablegacy-10 / #253 — diagnostic counters auto-emit per device under';
PRINT ' AbLegacy/<host>/_Diagnostics/<name>. No dbo.Tag rows needed — the';
PRINT ' driver registers them at DiscoverAsync time. Nine counters per device:';
PRINT ' RequestCount, ResponseCount, ErrorCount, RetryCount, LastErrorCode,';
PRINT ' LastErrorMessage, CommFailures, DemoteCount, LastDemotedUtc. See';
PRINT ' docs/drivers/AbLegacy-Diagnostics.md for the full surface + reset';
PRINT ' semantics.';
PRINT '';
PRINT 'PR ablegacy-12 / #255 — auto-demote on comm failure: 3 consecutive';
PRINT ' failed reads / probes mark the device Demoted for DemoteFor=PT30S';
PRINT ' (30 s); reads against a demoted device short-circuit with';
PRINT ' BadCommunicationError so one slow PLC can''t starve the driver.';
PRINT ' Tune via the Demote block on each Devices[] row. DemoteCount +';
PRINT ' LastDemotedUtc on the _Diagnostics folder surface flapping links.';
@@ -71,6 +71,13 @@ INSERT dbo.ClusterNode(NodeId, ClusterId, RedundancyRole, Host, OpcUaPort, Dashb
VALUES (@NodeId, @ClusterId, 'Primary', 'localhost', 4840, 5000,
'urn:OtOpcUa:p7-smoke-node', 200, 1, 'p7-smoke');
-- sp_GetCurrentGenerationForCluster gates access by SUSER_SNAME() against
-- ClusterNodeCredential; without this binding the Server bootstrap fails with
-- `Unauthorized: caller sa is not bound to NodeId p7-smoke-node`. Dev Docker
-- SQL runs with `sa`; production deploys would rotate to a per-node login.
INSERT dbo.ClusterNodeCredential(NodeId, Kind, Value, Enabled, CreatedBy)
VALUES (@NodeId, N'SqlLogin', N'sa', 1, N'p7-smoke');
-- 2. Generation (created Draft, flipped to Published at the end so insert order
-- constraints (one Draft per cluster, etc.) don't fight us).
DECLARE @Gen bigint;
@@ -106,30 +113,43 @@ VALUES (@Gen, @DrvId, @ClusterId, @NsId, 'galaxy-smoke', 'Galaxy', N'{
}', 1);
-- 6. One driver-sourced Tag bound to the Equipment. TagConfig is the Galaxy
-- fullRef ("DelmiaReceiver_001.DownloadPath" style); replace with a real
-- attribute on this Galaxy. The script paths below use
-- /lab-floor/galaxy-line/reactor-1/Source which the EquipmentNodeWalker
-- emits + the DriverSubscriptionBridge maps to this driver fullRef.
-- fullRef; the EquipmentNodeWalker reads the JSON's FullName field and
-- hands that string to the Galaxy driver as the MXAccess reference.
-- Default points at TestMachine_001.TestHistoryValue — the live dev-box
-- Galaxy ships it as Int32 + writable (security_classification=Operate)
-- + historized (HistoryExtension primitive), so the E2E script can
-- exercise read + write + subscribe + alarm + history against the
-- same attribute. Swap to any other writable historized attribute on
-- this Galaxy by re-running `gr/queries/attributes_extended.sql` with
-- `WHERE is_historized=1 AND security_classification > 0`.
INSERT dbo.Tag(GenerationId, TagId, DriverInstanceId, EquipmentId, Name, DataType,
AccessLevel, TagConfig, WriteIdempotent)
VALUES (@Gen, @TagId, @DrvId, @EqId, 'Source', 'Float64', 'Read',
N'{"FullName":"REPLACE_WITH_REAL_GALAXY_ATTRIBUTE","DataType":"Float64"}', 0);
VALUES (@Gen, @TagId, @DrvId, @EqId, 'Source', 'Int32', 'ReadWrite',
N'{"FullName":"TestMachine_001.TestHistoryValue","DataType":"Int32"}', 0);
-- 7. Scripts (SourceHash is SHA-256 of SourceCode, computed externally — using
-- a placeholder here; the engine recomputes on first use anyway).
--
-- MachineStatus predicate — the dynamic "is the machine running?" status
-- derived from the raw Source value. Boolean: true when Source > 0. Matches
-- the shape Aveva operators typically want on a machine-status tile (green
-- when above zero, otherwise grey) without needing a separate threshold
-- attribute. Rename + re-threshold here to mirror site semantics.
INSERT dbo.Script(GenerationId, ScriptId, Name, SourceCode, SourceHash, Language)
VALUES
(@Gen, @VtScript, 'doubled-source',
N'return ((double)ctx.GetTag("/lab-floor/galaxy-line/reactor-1/Source").Value) * 2.0;',
(@Gen, @VtScript, 'machine-status-predicate',
N'return System.Convert.ToInt32(ctx.GetTag("/lab-floor/galaxy-line/reactor-1/Source").Value) > 0;',
'0000000000000000000000000000000000000000000000000000000000000000', 'CSharp'),
(@Gen, @AlScript, 'overtemp-predicate',
N'return ((double)ctx.GetTag("/lab-floor/galaxy-line/reactor-1/Source").Value) > 50.0;',
N'return System.Convert.ToInt32(ctx.GetTag("/lab-floor/galaxy-line/reactor-1/Source").Value) > 50;',
'0000000000000000000000000000000000000000000000000000000000000000', 'CSharp');
-- 8. VirtualTag — derived value computed by Roslyn each time Source changes.
-- 8. VirtualTag — MachineStatus boolean computed by Roslyn each time Source
-- changes. Historized so the dashboard can plot a running/idle timeline
-- next to the raw TestHistoryValue trend from Aveva Historian.
INSERT dbo.VirtualTag(GenerationId, VirtualTagId, EquipmentId, Name, DataType,
ScriptId, ChangeTriggered, TimerIntervalMs, Historize, Enabled)
VALUES (@Gen, @VtId, @EqId, 'Doubled', 'Float64', @VtScript, 1, NULL, 0, 1);
VALUES (@Gen, @VtId, @EqId, 'MachineStatus', 'Boolean', @VtScript, 1, NULL, 1, 1);
-- 9. ScriptedAlarm — Active when Source > 50.
INSERT dbo.ScriptedAlarm(GenerationId, ScriptedAlarmId, EquipmentId, Name, AlarmType,
@@ -51,6 +51,7 @@ else
<li class="nav-item"><button class="nav-link @Tab("uns")" @onclick='() => _tab = "uns"'>UNS Structure</button></li>
<li class="nav-item"><button class="nav-link @Tab("namespaces")" @onclick='() => _tab = "namespaces"'>Namespaces</button></li>
<li class="nav-item"><button class="nav-link @Tab("drivers")" @onclick='() => _tab = "drivers"'>Drivers</button></li>
<li class="nav-item"><button class="nav-link @Tab("tags")" @onclick='() => _tab = "tags"'>Tags</button></li>
<li class="nav-item"><button class="nav-link @Tab("acls")" @onclick='() => _tab = "acls"'>ACLs</button></li>
<li class="nav-item"><button class="nav-link @Tab("redundancy")" @onclick='() => _tab = "redundancy"'>Redundancy</button></li>
<li class="nav-item"><button class="nav-link @Tab("audit")" @onclick='() => _tab = "audit"'>Audit</button></li>
@@ -89,6 +90,10 @@ else
{
<DriversTab GenerationId="@_currentDraft.GenerationId" ClusterId="@ClusterId"/>
}
else if (_tab == "tags" && _currentDraft is not null)
{
<TagsTab GenerationId="@_currentDraft.GenerationId" ClusterId="@ClusterId"/>
}
else if (_tab == "acls" && _currentDraft is not null)
{
<AclsTab GenerationId="@_currentDraft.GenerationId" ClusterId="@ClusterId"/>
@@ -1,3 +1,5 @@
@using System.Text.Json
@using ZB.MOM.WW.OtOpcUa.Admin.Components.Pages.Modbus
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@inject DriverInstanceService DriverSvc
@@ -17,7 +19,21 @@ else
<tbody>
@foreach (var d in _drivers)
{
<tr><td><code>@d.DriverInstanceId</code></td><td>@d.Name</td><td>@d.DriverType</td><td><code>@d.NamespaceId</code></td></tr>
<tr>
<td><code>@d.DriverInstanceId</code></td>
<td>@d.Name</td>
<td>
@if (string.Equals(d.DriverType, "Focas", StringComparison.OrdinalIgnoreCase))
{
<a href="/drivers/focas/@d.DriverInstanceId">@d.DriverType</a>
}
else
{
@d.DriverType
}
</td>
<td><code>@d.NamespaceId</code></td>
</tr>
}
</tbody>
</table>
@@ -36,13 +52,14 @@ else
<label class="form-label">DriverType</label>
<select class="form-select" @bind="_type">
<option>Galaxy</option>
<option>ModbusTcp</option>
<option>Modbus</option>
<option>AbCip</option>
<option>AbLegacy</option>
<option>S7</option>
<option>Focas</option>
<option>OpcUaClient</option>
</select>
<div class="form-text">Type string must match the driver's registered factory name; this dropdown wraps the canonical names.</div>
</div>
<div class="col-md-6">
<label class="form-label">Namespace</label>
@@ -51,9 +68,19 @@ else
</select>
</div>
<div class="col-12">
@if (string.Equals(_type, "Modbus", StringComparison.OrdinalIgnoreCase))
{
@* #147 — typed editor for Modbus drivers. The generic textarea is a fall-back
for driver types that haven't yet shipped a typed editor. *@
<label class="form-label">Modbus options (typed editor)</label>
<ModbusOptionsEditor Model="_modbusOptions"/>
}
else
{
<label class="form-label">DriverConfig JSON (schemaless per driver type)</label>
<textarea class="form-control font-monospace" rows="6" @bind="_config"></textarea>
<div class="form-text">Phase 1: generic JSON editor — per-driver schema validation arrives in each driver's phase (decision #94).</div>
}
</div>
</div>
@if (_error is not null) { <div class="alert alert-danger mt-3">@_error</div> }
@@ -73,11 +100,17 @@ else
private List<Namespace>? _namespaces;
private bool _showForm;
private string _name = string.Empty;
private string _type = "ModbusTcp";
private string _type = "Modbus";
private string _nsId = string.Empty;
private string _config = "{}";
private string? _error;
// #147 — typed editor model for Modbus drivers. Defaults match ModbusDriverOptions
// defaults so an unedited form produces config equivalent to the historical
// pre-typed-editor wire output. Serialised to _config on Save when type=Modbus.
private ModbusOptionsEditor.ModbusOptionsViewModel _modbusOptions = new();
private static readonly JsonSerializerOptions ModbusJsonOptions = new() { WriteIndented = true };
protected override async Task OnParametersSetAsync() => await ReloadAsync();
private async Task ReloadAsync()
@@ -97,11 +130,57 @@ else
}
try
{
await DriverSvc.AddAsync(GenerationId, ClusterId, _nsId, _name, _type, _config, CancellationToken.None);
// #147 — for Modbus drivers serialize the typed editor model into the DriverConfig
// JSON column. Other driver types still use the raw textarea contents until each
// ships its own typed editor (decision #94 — per-driver schema validation arrives
// per driver phase).
var configJson = string.Equals(_type, "Modbus", StringComparison.OrdinalIgnoreCase)
? SerializeModbusOptions(_modbusOptions)
: _config;
await DriverSvc.AddAsync(GenerationId, ClusterId, _nsId, _name, _type, configJson, CancellationToken.None);
_name = string.Empty; _config = "{}";
_modbusOptions = new();
_showForm = false;
await ReloadAsync();
}
catch (Exception ex) { _error = ex.Message; }
}
/// <summary>
/// Maps the view-model field names onto the JSON shape <c>ModbusDriverFactoryExtensions</c>
/// consumes. Hand-rolled because the DTO uses millisecond / byte field flavours that the
/// view model exposes as TimeSpan-derived integers; a System.Text.Json round-trip would
/// emit the .NET-native names instead.
/// </summary>
private static string SerializeModbusOptions(ModbusOptionsEditor.ModbusOptionsViewModel m) =>
JsonSerializer.Serialize(new
{
host = m.Host,
port = m.Port,
unitId = m.UnitId,
family = m.Family.ToString(),
melsecSubFamily = m.MelsecSubFamily.ToString(),
keepAlive = new
{
enabled = m.KeepAliveEnabled,
timeMs = m.KeepAliveTimeSec * 1000,
intervalMs = m.KeepAliveIntervalSec * 1000,
retryCount = m.KeepAliveRetryCount,
},
reconnect = new
{
initialDelayMs = m.ReconnectInitialDelayMs,
maxDelayMs = m.ReconnectMaxDelayMs,
backoffMultiplier = m.ReconnectBackoffMultiplier,
},
maxRegistersPerRead = m.MaxRegistersPerRead,
maxRegistersPerWrite = m.MaxRegistersPerWrite,
maxCoilsPerRead = m.MaxCoilsPerRead,
maxReadGap = m.MaxReadGap,
useFC15ForSingleCoilWrites = m.UseFC15ForSingleCoilWrites,
useFC16ForSingleRegisterWrites = m.UseFC16ForSingleRegisterWrites,
writeOnChangeOnly = m.WriteOnChangeOnly,
tags = Array.Empty<object>(),
}, ModbusJsonOptions);
}
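A minimal sketch of the hand-rolled mapping in SerializeModbusOptions, assuming a cut-down view model. DemoModbusOptions, DemoSerializer and the default values here are hypothetical stand-ins and carry only a few of the real fields. It illustrates the point the summary comment makes: System.Text.Json emits anonymous-type property names exactly as written, so seconds-based view-model fields can be projected onto millisecond-named JSON keys.

```csharp
using System;
using System.Text.Json;

// Hypothetical, trimmed stand-in for ModbusOptionsEditor.ModbusOptionsViewModel.
public sealed class DemoModbusOptions
{
    public string Host { get; set; } = "127.0.0.1";
    public int Port { get; set; } = 502;
    public bool KeepAliveEnabled { get; set; } = true;
    public int KeepAliveTimeSec { get; set; } = 30;
}

public static class DemoSerializer
{
    private static readonly JsonSerializerOptions Options = new() { WriteIndented = true };

    // Projects the view model onto the wire shape; the anonymous-type
    // member names become the JSON keys verbatim.
    public static string Serialize(DemoModbusOptions m) =>
        JsonSerializer.Serialize(new
        {
            host = m.Host,
            port = m.Port,
            keepAlive = new
            {
                enabled = m.KeepAliveEnabled,
                timeMs = m.KeepAliveTimeSec * 1000, // seconds in the UI, ms on the wire
            },
        }, Options);
}

public static class Program
{
    public static void Main() =>
        Console.WriteLine(DemoSerializer.Serialize(new DemoModbusOptions()));
}
```

Serializing the view model directly would presumably emit keys such as KeepAliveTimeSec instead, which is why the projection step exists.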
@@ -26,6 +26,15 @@
reservation conflict exists.
</div>
<div class="alert alert-secondary small mb-3">
<strong>Per-tag addressing for Modbus drivers</strong> isn't part of equipment import —
tags are configured at the driver-instance level via the
<a href="/clusters/@ClusterId/draft/@GenerationId">Drivers tab</a>. Use the
<a href="/modbus/address-preview" target="_blank">address-preview tool</a> to sanity-check
grammar strings (<code>40001:F:CDAB</code>, <code>HR1:I</code>, <code>V2000</code> for
DL205 family, etc.) before pasting them into the driver config.
</div>
<div class="card mb-3">
<div class="card-body">
<div class="row g-3">
@@ -0,0 +1,372 @@
@using System.Text.Json
@using ZB.MOM.WW.OtOpcUa.Admin.Components.Pages.Modbus
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@using ZB.MOM.WW.OtOpcUa.Configuration.Enums
@using ZB.MOM.WW.OtOpcUa.Driver.Modbus
@inject TagService TagSvc
@inject DriverInstanceService DriverSvc
@inject EquipmentService EquipmentSvc
@*
#155 — interactive Tag CRUD scoped to a draft generation. Conditional editor: when the
selected DriverInstance is Modbus, the address input switches to ModbusAddressEditor (#145)
so users get the live-parse preview + grammar validation. Other driver types fall back to
a generic JSON textarea, matching the DriversTab pattern from #147.
*@
<div class="d-flex justify-content-between mb-3">
<h4>Tags (draft gen @GenerationId)</h4>
<button class="btn btn-primary btn-sm" @onclick="StartAdd">Add tag</button>
</div>
<div class="row g-3 mb-3">
<div class="col-md-4">
<label class="form-label small text-muted">Filter by driver</label>
<select class="form-select form-select-sm" @bind="_filterDriverId" @bind:after="ReloadAsync">
<option value="">— all drivers —</option>
@if (_drivers is not null)
{
@foreach (var d in _drivers)
{
<option value="@d.DriverInstanceId">@d.Name (@d.DriverType)</option>
}
}
</select>
</div>
</div>
@if (_tags is null) { <p>Loading…</p> }
else if (_tags.Count == 0 && !_showForm) { <p class="text-muted">No tags in this filter.</p> }
else if (_tags.Count > 0)
{
<table class="table table-sm">
<thead>
<tr><th>Name</th><th>Driver</th><th>Equipment</th><th>DataType</th><th>Access</th><th>TagConfig</th><th></th></tr>
</thead>
<tbody>
@foreach (var t in _tags)
{
<tr>
<td>@t.Name</td>
<td><code>@t.DriverInstanceId</code></td>
<td>@(t.EquipmentId ?? "—")</td>
<td>@t.DataType</td>
<td>@t.AccessLevel</td>
<td class="font-monospace small text-truncate" style="max-width:18rem">@t.TagConfig</td>
<td>
<button class="btn btn-sm btn-outline-secondary me-1" @onclick="() => StartEdit(t)">Edit</button>
<button class="btn btn-sm btn-outline-danger" @onclick="() => DeleteAsync(t.TagRowId)">Remove</button>
</td>
</tr>
}
</tbody>
</table>
}
@if (_showForm)
{
<div class="card mt-3">
<div class="card-body">
<h5>@(_editMode ? "Edit tag" : "New tag")</h5>
<div class="row g-3">
<div class="col-md-4">
<label class="form-label">Name</label>
<input class="form-control" @bind="_draft.Name"/>
</div>
<div class="col-md-4">
<label class="form-label">DriverInstance</label>
<select class="form-select" @bind="_draft.DriverInstanceId" @bind:after="OnDriverChanged">
<option value="">— select driver —</option>
@if (_drivers is not null)
{
@foreach (var d in _drivers) { <option value="@d.DriverInstanceId">@d.Name (@d.DriverType)</option> }
}
</select>
</div>
<div class="col-md-4">
<label class="form-label">Equipment (optional)</label>
<select class="form-select" @bind="_draft.EquipmentId">
<option value="">— none (folder-path mode) —</option>
@if (_equipment is not null)
{
@foreach (var e in _equipment) { <option value="@e.EquipmentId">@e.Name</option> }
}
</select>
</div>
<div class="col-md-4">
<label class="form-label">DataType</label>
<input class="form-control" @bind="_draft.DataType" placeholder="Boolean / Int32 / Float / etc."/>
</div>
<div class="col-md-4">
<label class="form-label">AccessLevel</label>
<select class="form-select" @bind="_draft.AccessLevel">
@foreach (var a in Enum.GetValues<TagAccessLevel>())
{
<option value="@a">@a</option>
}
</select>
</div>
<div class="col-md-4">
<div class="form-check mt-4">
<input type="checkbox" class="form-check-input" @bind="_draft.WriteIdempotent"/>
<label class="form-check-label">WriteIdempotent</label>
</div>
</div>
</div>
<div class="mt-3">
@if (_isModbus)
{
<ModbusAddressEditor @bind-AddressString="_modbusAddress"
@bind-AddressString:after="OnAddressChanged"/>
}
else
{
<label class="form-label">TagConfig (driver-specific JSON or string)</label>
<textarea class="form-control font-monospace" rows="3" @bind="_draft.TagConfig"
placeholder='@("{\"address\": ...}")'></textarea>
}
</div>
@* #156 — advanced Modbus fields. Collapsed by default; the basic form covers the
typical edit workflow. Expander surfaces Deadband (#141) / UnitId override (#142) /
CoalesceProhibited (#143) for the multi-slave / noisy-analog / protected-hole
deployments. Save-side flushes these into TagConfig as a structured JSON object
that ModbusTagDto's BuildTag honours alongside the address string. *@
@if (_isModbus)
{
<div class="mt-3">
<button type="button" class="btn btn-sm btn-link p-0"
@onclick="() => _showAdvanced = !_showAdvanced">
@(_showAdvanced ? "▼ Advanced" : "▶ Advanced") (Deadband / UnitId override / CoalesceProhibited)
</button>
</div>
@if (_showAdvanced)
{
<div class="row g-3 mt-1 ps-3 border-start">
<div class="col-md-4">
<label class="form-label small">Deadband
<span class="text-muted">(numeric scalar types only)</span>
</label>
<input type="number" step="any" class="form-control form-control-sm"
@bind="_advancedDeadband" @bind:after="OnAdvancedChanged"
placeholder="e.g. 0.5"/>
<div class="form-text">Suppress publishes when |new - last| &lt; threshold.</div>
</div>
<div class="col-md-4">
<label class="form-label small">UnitId override
<span class="text-muted">(0–255, blank = use driver default)</span>
</label>
<input type="number" min="0" max="255" class="form-control form-control-sm"
@bind="_advancedUnitId" @bind:after="OnAdvancedChanged"
placeholder="leave blank for driver-level"/>
<div class="form-text">Per-tag MBAP unit ID. Required when fronting a multi-slave gateway.</div>
</div>
<div class="col-md-4">
<label class="form-label small">CoalesceProhibited</label>
<div class="form-check mt-1">
<input type="checkbox" class="form-check-input"
@bind="_advancedCoalesceProhibited" @bind:after="OnAdvancedChanged"/>
<label class="form-check-label">Read in isolation (#143)</label>
</div>
<div class="form-text">Use when surrounding registers are write-only or fault on read.</div>
</div>
</div>
}
}
@if (_error is not null) { <div class="alert alert-danger mt-3">@_error</div> }
<div class="mt-3">
<button class="btn btn-sm btn-primary" @onclick="SaveAsync">Save</button>
<button class="btn btn-sm btn-secondary ms-2" @onclick="Cancel">Cancel</button>
</div>
</div>
</div>
}
@code {
[Parameter] public long GenerationId { get; set; }
[Parameter] public string ClusterId { get; set; } = string.Empty;
private List<Tag>? _tags;
private List<DriverInstance>? _drivers;
private List<Equipment>? _equipment;
private string _filterDriverId = string.Empty;
private bool _showForm;
private bool _editMode;
private Tag _draft = NewBlankDraft();
private string? _error;
private bool _isModbus;
private string? _modbusAddress;
// #156 — advanced Modbus fields. Bound separately from _draft.TagConfig because they
// round-trip through a structured JSON shape, not a single string. Synced into TagConfig
// by OnAdvancedChanged / OnAddressChanged (whichever fires).
private bool _showAdvanced;
private double? _advancedDeadband;
private byte? _advancedUnitId;
private bool _advancedCoalesceProhibited;
private static Tag NewBlankDraft() => new()
{
TagId = string.Empty, DriverInstanceId = string.Empty, Name = string.Empty,
DataType = "Int32", AccessLevel = TagAccessLevel.Read, TagConfig = string.Empty,
};
protected override async Task OnParametersSetAsync()
{
_drivers = await DriverSvc.ListAsync(GenerationId, CancellationToken.None);
_equipment = await EquipmentSvc.ListAsync(GenerationId, CancellationToken.None);
await ReloadAsync();
}
private async Task ReloadAsync()
{
_tags = await TagSvc.ListAsync(GenerationId,
string.IsNullOrWhiteSpace(_filterDriverId) ? null : _filterDriverId,
equipmentId: null,
CancellationToken.None);
}
private void StartAdd()
{
_draft = NewBlankDraft();
_editMode = false;
_modbusAddress = null;
_isModbus = false;
_error = null;
_showForm = true;
ResetAdvanced();
}
private void ResetAdvanced()
{
_showAdvanced = false;
_advancedDeadband = null;
_advancedUnitId = null;
_advancedCoalesceProhibited = false;
}
private void StartEdit(Tag row)
{
_draft = new Tag
{
TagRowId = row.TagRowId,
GenerationId = row.GenerationId,
TagId = row.TagId,
DriverInstanceId = row.DriverInstanceId,
DeviceId = row.DeviceId,
EquipmentId = row.EquipmentId,
Name = row.Name,
FolderPath = row.FolderPath,
DataType = row.DataType,
AccessLevel = row.AccessLevel,
WriteIdempotent = row.WriteIdempotent,
PollGroupId = row.PollGroupId,
TagConfig = row.TagConfig,
};
_editMode = true;
OnDriverChanged();
// Try to extract addressString + advanced fields from existing JSON config so the
// form pre-fills correctly when an operator hits Edit on an existing row.
ResetAdvanced();
if (_isModbus) HydrateModbusFromTagConfig(row.TagConfig);
_error = null;
_showForm = true;
}
private void HydrateModbusFromTagConfig(string tagConfig)
{
try
{
using var doc = JsonDocument.Parse(tagConfig);
var root = doc.RootElement;
if (root.TryGetProperty("addressString", out var addr) && addr.ValueKind == JsonValueKind.String)
_modbusAddress = addr.GetString();
if (root.TryGetProperty("deadband", out var db) && db.ValueKind is JsonValueKind.Number)
_advancedDeadband = db.GetDouble();
if (root.TryGetProperty("unitId", out var uid) && uid.ValueKind is JsonValueKind.Number)
_advancedUnitId = uid.GetByte();
if (root.TryGetProperty("coalesceProhibited", out var cp) && cp.ValueKind is JsonValueKind.True or JsonValueKind.False)
_advancedCoalesceProhibited = cp.GetBoolean();
// Auto-expand the advanced panel when any of those fields was actually set so
// operators see immediately what's been configured.
if (_advancedDeadband.HasValue || _advancedUnitId.HasValue || _advancedCoalesceProhibited)
_showAdvanced = true;
}
catch { /* Malformed JSON falls back to empty advanced state. */ }
}
private void OnDriverChanged()
{
var driver = _drivers?.FirstOrDefault(d => d.DriverInstanceId == _draft.DriverInstanceId);
_isModbus = driver is not null
&& string.Equals(driver.DriverType, "Modbus", StringComparison.OrdinalIgnoreCase);
}
private void OnAddressChanged() => RefreshTagConfigJson();
private void OnAdvancedChanged() => RefreshTagConfigJson();
/// <summary>
/// Re-serializes the current address + advanced fields into TagConfig as a structured
/// JSON object. ModbusTagDto's BuildTag honours every field — addressString drives
/// the parser, while the structured bits (deadband / unitId / coalesceProhibited)
/// pass through directly. Fields with default / empty values are omitted from the
/// JSON to keep diffs in the existing draft-diff viewer clean.
/// </summary>
private void RefreshTagConfigJson()
{
if (string.IsNullOrWhiteSpace(_modbusAddress)
&& !_advancedDeadband.HasValue
&& !_advancedUnitId.HasValue
&& !_advancedCoalesceProhibited)
{
return;
}
var payload = new Dictionary<string, object?>();
if (!string.IsNullOrWhiteSpace(_modbusAddress)) payload["addressString"] = _modbusAddress;
if (_advancedDeadband.HasValue) payload["deadband"] = _advancedDeadband.Value;
if (_advancedUnitId.HasValue) payload["unitId"] = _advancedUnitId.Value;
if (_advancedCoalesceProhibited) payload["coalesceProhibited"] = true;
_draft.TagConfig = JsonSerializer.Serialize(payload);
}
private void Cancel()
{
_showForm = false;
_editMode = false;
}
private async Task SaveAsync()
{
_error = null;
try
{
if (string.IsNullOrWhiteSpace(_draft.Name) || string.IsNullOrWhiteSpace(_draft.DriverInstanceId))
{
_error = "Name and DriverInstance are required.";
return;
}
if (_editMode)
await TagSvc.UpdateAsync(_draft, CancellationToken.None);
else
await TagSvc.CreateAsync(GenerationId, _draft, CancellationToken.None);
_showForm = false;
_editMode = false;
await ReloadAsync();
}
catch (Exception ex) { _error = ex.Message; }
}
private async Task DeleteAsync(Guid id)
{
await TagSvc.DeleteAsync(id, CancellationToken.None);
await ReloadAsync();
}
}
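The TagConfig round-trip in RefreshTagConfigJson and HydrateModbusFromTagConfig can be sketched in isolation roughly as follows. TagConfigDemo and its method names are hypothetical; only the JSON keys (addressString, deadband, unitId, coalesceProhibited) come from the component above. Unset fields are omitted on write and tolerated as missing on read.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class TagConfigDemo
{
    // Builds the structured TagConfig JSON, skipping unset or default fields
    // so that an otherwise untouched tag produces a small diff.
    public static string Build(string? address, double? deadband, byte? unitId, bool coalesceProhibited)
    {
        var payload = new Dictionary<string, object?>();
        if (!string.IsNullOrWhiteSpace(address)) payload["addressString"] = address;
        if (deadband.HasValue) payload["deadband"] = deadband.Value;
        if (unitId.HasValue) payload["unitId"] = unitId.Value;
        if (coalesceProhibited) payload["coalesceProhibited"] = true;
        return JsonSerializer.Serialize(payload);
    }

    // Reads the optional unitId back; returns null when the key is absent,
    // not a number, or outside the byte range.
    public static byte? ReadUnitId(string tagConfig)
    {
        using var doc = JsonDocument.Parse(tagConfig);
        if (doc.RootElement.ValueKind != JsonValueKind.Object) return null;
        if (!doc.RootElement.TryGetProperty("unitId", out var uid)) return null;
        if (uid.ValueKind != JsonValueKind.Number) return null;
        if (!uid.TryGetByte(out var value)) return null;
        return value;
    }
}

public static class Program
{
    public static void Main()
    {
        var json = TagConfigDemo.Build("40001:F:CDAB", deadband: null, unitId: 3, coalesceProhibited: false);
        Console.WriteLine(json);
        Console.WriteLine(TagConfigDemo.ReadUnitId(json));
    }
}
```

Using TryGetByte rather than GetByte is a choice made for this sketch: it avoids relying on an exception for out-of-range values, whereas the component above leans on its surrounding catch block for the same case.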
@@ -0,0 +1,224 @@
@page "/drivers/focas/{InstanceId}"
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@inject FocasDriverDetailService DetailSvc
<h1 class="mb-3">FOCAS driver <code>@InstanceId</code></h1>
@if (_loading)
{
<p>Loading…</p>
}
else if (_detail is null)
{
<div class="alert alert-warning">
No FOCAS driver instance with id <code>@InstanceId</code> was found.
<div class="small text-muted mt-1">
Either the id is wrong, or the instance's <code>DriverType</code> is not "Focas". The list of drivers per cluster draft is on the <a href="/clusters">Clusters</a> page.
</div>
</div>
}
else
{
<div class="row g-3 mb-4">
<div class="col-md-3"><div class="card"><div class="card-body">
<h6 class="text-muted mb-1">Name</h6>
<div class="fs-5">@_detail.Instance.Name</div>
</div></div></div>
<div class="col-md-3"><div class="card"><div class="card-body">
<h6 class="text-muted mb-1">Cluster</h6>
<div class="fs-5"><code>@_detail.Instance.ClusterId</code></div>
</div></div></div>
<div class="col-md-3"><div class="card"><div class="card-body">
<h6 class="text-muted mb-1">Namespace</h6>
<div class="fs-5"><code>@_detail.Instance.NamespaceId</code></div>
</div></div></div>
<div class="col-md-3"><div class="card @(_detail.Instance.Enabled ? "border-success" : "border-secondary")"><div class="card-body">
<h6 class="text-muted mb-1">Enabled</h6>
<div class="fs-5">@(_detail.Instance.Enabled ? "Yes" : "No")</div>
</div></div></div>
</div>
@if (_detail.ParseError is not null)
{
<div class="alert alert-danger">
<strong>DriverConfig JSON failed to parse:</strong> @_detail.ParseError
<div class="small text-muted mt-1">
Falling back to raw-JSON view below; the per-section tables are hidden because the shape couldn't be projected.
</div>
</div>
}
else if (_detail.Config is not null)
{
<h2 class="h5 mt-4">Devices</h2>
@if (_detail.Config.Devices is null || _detail.Config.Devices.Count == 0)
{
<p class="text-muted">No devices configured.</p>
}
else
{
<table class="table table-sm align-middle">
<thead><tr><th>HostAddress</th><th>DeviceName</th><th>Series</th></tr></thead>
<tbody>
@foreach (var d in _detail.Config.Devices)
{
<tr>
<td><code>@d.HostAddress</code></td>
<td>@(d.DeviceName ?? "—")</td>
<td>@(string.IsNullOrEmpty(d.Series) ? "Unknown" : d.Series)</td>
</tr>
}
</tbody>
</table>
}
<h2 class="h5 mt-4">Tags</h2>
@if (_detail.Config.Tags is null || _detail.Config.Tags.Count == 0)
{
<p class="text-muted">No tags configured.</p>
}
else
{
<p class="small text-muted">@_detail.Config.Tags.Count tag(s) configured.</p>
<table class="table table-sm align-middle">
<thead><tr><th>Name</th><th>Device</th><th>Address</th><th>DataType</th><th>Writable</th></tr></thead>
<tbody>
@foreach (var t in _detail.Config.Tags)
{
<tr>
<td>@t.Name</td>
<td><code class="small">@t.DeviceHostAddress</code></td>
<td><code>@t.Address</code></td>
<td>@t.DataType</td>
<td>@(t.Writable ? "Yes" : "No")</td>
</tr>
}
</tbody>
</table>
}
<h2 class="h5 mt-4">Driver behaviour</h2>
<table class="table table-sm align-middle" style="max-width: 640px;">
<tbody>
<tr>
<th style="width: 30%;">Probe</th>
<td>
@if (_detail.Config.Probe is { } probe)
{
<span class="badge @(probe.Enabled ? "bg-success" : "bg-secondary")">@(probe.Enabled ? "Enabled" : "Disabled")</span>
<span class="ms-2 small text-muted">Interval: @(probe.Interval ?? "default")</span>
}
else { <span class="text-muted">default (enabled)</span> }
</td>
</tr>
<tr>
<th>Alarm projection</th>
<td>
@if (_detail.Config.AlarmProjection is { } ap)
{
<span class="badge @(ap.Enabled ? "bg-success" : "bg-secondary")">@(ap.Enabled ? "Enabled" : "Disabled")</span>
<span class="ms-2 small text-muted">PollInterval: @(ap.PollInterval ?? "default")</span>
}
else { <span class="text-muted">disabled (default)</span> }
</td>
</tr>
<tr>
<th>Handle recycling</th>
<td>
@if (_detail.Config.HandleRecycle is { } hr)
{
<span class="badge @(hr.Enabled ? "bg-warning text-dark" : "bg-secondary")">@(hr.Enabled ? "Enabled" : "Disabled")</span>
<span class="ms-2 small text-muted">Interval: @(hr.Interval ?? "default (01:00:00)")</span>
}
else { <span class="text-muted">disabled (default)</span> }
</td>
</tr>
</tbody>
</table>
}
<h2 class="h5 mt-4">Host status</h2>
@if (_detail.HostStatuses.Count == 0)
{
<div class="alert alert-secondary small">
No <code>DriverHostStatus</code> rows yet for this instance. The Server publishes its first
tick ~2 s after the driver starts — if this stays empty after a minute, check that the Server is running and the instance is in a published generation.
</div>
}
else
{
<table class="table table-sm table-hover align-middle">
<thead>
<tr>
<th>Node</th>
<th>Host</th>
<th>State</th>
<th class="text-end" title="Consecutive failures">Fail#</th>
<th>Breaker last opened</th>
<th>Last recycled</th>
<th>Last seen</th>
<th>Detail</th>
</tr>
</thead>
<tbody>
@foreach (var r in _detail.HostStatuses)
{
<tr class="@(IsStale(r) ? "table-warning" : "")">
<td><code>@r.NodeId</code></td>
<td>@r.HostName</td>
<td><span class="badge @StateBadge(r.State)">@r.State</span></td>
<td class="text-end small">@r.ConsecutiveFailures</td>
<td class="small">@FormatUtc(r.LastCircuitBreakerOpenUtc)</td>
<td class="small">@FormatUtc(r.LastRecycleUtc)</td>
<td class="small @(IsStale(r) ? "text-warning" : "")">@FormatAge(r.LastSeenUtc)</td>
<td class="text-truncate small" style="max-width: 240px;" title="@r.Detail">@r.Detail</td>
</tr>
}
</tbody>
</table>
}
<h2 class="h5 mt-4">Raw DriverConfig JSON</h2>
<pre class="small bg-light border p-3"><code>@_detail.Instance.DriverConfig</code></pre>
<div class="mt-4 small text-muted">
Docs: <code>docs/drivers/FOCAS.md</code> (getting started) · <code>docs/v2/focas-deployment.md</code> (NSSM + pipe ACL) · <code>docs/drivers/FOCAS-Test-Fixture.md</code> (test coverage).
</div>
}
@code {
[Parameter] public string InstanceId { get; set; } = string.Empty;
private FocasDriverDetail? _detail;
private bool _loading = true;
protected override async Task OnParametersSetAsync()
{
_loading = true;
try { _detail = await DetailSvc.GetAsync(InstanceId, CancellationToken.None); }
finally { _loading = false; }
}
private static bool IsStale(FocasHostStatusRow r) =>
DateTime.UtcNow - r.LastSeenUtc > TimeSpan.FromSeconds(30);
private static string StateBadge(string state) => state switch
{
"Running" => "bg-success",
"Faulted" => "bg-danger",
"Starting" => "bg-info",
"Stopped" => "bg-secondary",
_ => "bg-secondary",
};
private static string FormatUtc(DateTime? utc) =>
utc is null ? "—" : utc.Value.ToString("yyyy-MM-dd HH:mm 'UTC'");
private static string FormatAge(DateTime utc)
{
var age = DateTime.UtcNow - utc;
if (age.TotalSeconds < 60) return $"{(int)age.TotalSeconds}s ago";
if (age.TotalMinutes < 60) return $"{(int)age.TotalMinutes}m ago";
if (age.TotalHours < 48) return $"{(int)age.TotalHours}h ago";
return utc.ToString("yyyy-MM-dd HH:mm 'UTC'");
}
}
@@ -1,9 +1,12 @@
@page "/hosts"
@using Microsoft.AspNetCore.SignalR.Client
@using Microsoft.EntityFrameworkCore
@using ZB.MOM.WW.OtOpcUa.Admin.Hubs
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Enums
@inject IServiceScopeFactory ScopeFactory
@implements IDisposable
@inject NavigationManager Nav
@implements IAsyncDisposable
<h1 class="mb-4">Driver host status</h1>
@@ -128,6 +131,7 @@ else
private bool _refreshing;
private DateTime? _lastRefreshUtc;
private Timer? _timer;
private HubConnection? _hub;
protected override async Task OnInitializedAsync()
{
@@ -136,6 +140,44 @@ else
state: null,
dueTime: TimeSpan.FromSeconds(RefreshIntervalSeconds),
period: TimeSpan.FromSeconds(RefreshIntervalSeconds));
await ConnectHubAsync();
}
// Phase 6.1 Stream E.2 — subscribe to FleetStatusHub so resilience deltas upsert the
// matching row without waiting for the next RefreshIntervalSeconds tick. The 10 s
// poll stays as a safety net in case the hub connection is down.
private async Task ConnectHubAsync()
{
var hubUrl = Nav.ToAbsoluteUri("/hubs/fleet");
_hub = new HubConnectionBuilder().WithUrl(hubUrl).WithAutomaticReconnect().Build();
_hub.On<ResilienceStatusChangedMessage>("ResilienceStatusChanged", OnResilienceChanged);
try
{
await _hub.StartAsync();
await _hub.SendAsync("SubscribeFleet");
}
catch
{
// Hub is best-effort; polling refresh is the fallback. Swallow connect errors
// so the page still renders against the initial RefreshAsync pass.
}
}
private async Task OnResilienceChanged(ResilienceStatusChangedMessage msg)
{
if (_rows is null) return;
var idx = _rows.FindIndex(r =>
r.DriverInstanceId == msg.DriverInstanceId && r.HostName == msg.HostName);
if (idx < 0) return;
var prior = _rows[idx];
_rows[idx] = prior with
{
ConsecutiveFailures = msg.ConsecutiveFailures,
LastCircuitBreakerOpenUtc = msg.LastCircuitBreakerOpenUtc,
CurrentBulkheadDepth = msg.CurrentBulkheadDepth,
LastRecycleUtc = msg.LastRecycleUtc,
};
await InvokeAsync(StateHasChanged);
}
private async Task RefreshAsync()
@@ -180,5 +222,12 @@ else
return t.ToString("yyyy-MM-dd HH:mm 'UTC'");
}
public void Dispose() => _timer?.Dispose();
public async ValueTask DisposeAsync()
{
_timer?.Dispose();
if (_hub is not null)
{
try { await _hub.DisposeAsync(); } catch { }
}
}
}
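The in-place row update performed by OnResilienceChanged can be sketched on its own as below. HostRow, ResilienceDelta and UpsertDemo are hypothetical, trimmed shapes; the real row and message types carry more fields. The sketch shows the FindIndex lookup combined with a record `with` expression, so that only the resilience counters change and every other column keeps its polled value.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, trimmed shapes for the polled row and the hub delta.
public sealed record HostRow(string DriverInstanceId, string HostName, string State, int ConsecutiveFailures);
public sealed record ResilienceDelta(string DriverInstanceId, string HostName, int ConsecutiveFailures);

public static class UpsertDemo
{
    // Replaces the matching row with a copy carrying the new counter.
    // Returns false when no row matches, leaving the list untouched
    // (the next polling refresh is expected to pick the host up).
    public static bool Apply(List<HostRow> rows, ResilienceDelta msg)
    {
        var idx = rows.FindIndex(r =>
            r.DriverInstanceId == msg.DriverInstanceId && r.HostName == msg.HostName);
        if (idx < 0) return false;

        rows[idx] = rows[idx] with { ConsecutiveFailures = msg.ConsecutiveFailures };
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        var rows = new List<HostRow> { new("drv-1", "cnc-01", "Running", 0) };
        UpsertDemo.Apply(rows, new ResilienceDelta("drv-1", "cnc-01", 4));
        Console.WriteLine(rows[0]);
    }
}
```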
@@ -0,0 +1,79 @@
@using ZB.MOM.WW.OtOpcUa.Driver.Modbus
@*
#145 — Live address-string parser preview for Modbus tag editing. Bound to a string
AddressString; on every input keystroke the parser runs and surfaces the resolved
breakdown (Region, PduOffset, DataType, Bit, ByteOrder, ArrayCount, StringLength) or
the parse error. Family flag drives the parser's family-native branch (#144).
Re-uses the same ModbusAddressParser the wire driver uses, so grammar drift is
impossible by construction. Internal-namespace component called from the larger
DriverInstance editor.
*@
<div class="mb-3">
<label class="form-label">Address string</label>
<input type="text" class="form-control @(IsValid ? "is-valid" : Diagnostic is null ? "" : "is-invalid")"
value="@AddressString"
@oninput="@OnInputChanged"
placeholder="e.g. 40001:F:CDAB:5"/>
@if (IsValid && _parsed is not null)
{
<div class="form-text text-success">
<strong>Parsed:</strong>
Region=<code>@_parsed.Region</code>
Offset=<code>@_parsed.Offset</code>
Type=<code>@_parsed.DataType</code>
@if (_parsed.Bit.HasValue) { <text>Bit=<code>@_parsed.Bit</code></text> }
@if (_parsed.ByteOrder != ModbusByteOrder.BigEndian) { <text>Order=<code>@_parsed.ByteOrder</code></text> }
@if (_parsed.ArrayCount.HasValue) { <text>Array[<code>@_parsed.ArrayCount</code>]</text> }
@if (_parsed.StringLength > 0) { <text>StrLen=<code>@_parsed.StringLength</code></text> }
</div>
}
else if (Diagnostic is not null)
{
<div class="invalid-feedback">@Diagnostic</div>
}
</div>
@code {
[Parameter] public string? AddressString { get; set; }
[Parameter] public EventCallback<string?> AddressStringChanged { get; set; }
[Parameter] public ModbusFamily Family { get; set; } = ModbusFamily.Generic;
[Parameter] public MelsecFamily MelsecSubFamily { get; set; } = MelsecFamily.Q_L_iQR;
[Parameter] public EventCallback<ParsedModbusAddress?> ParsedChanged { get; set; }
private ParsedModbusAddress? _parsed;
private string? Diagnostic;
private bool IsValid => _parsed is not null && Diagnostic is null;
protected override void OnParametersSet() => Reparse();
private async Task OnInputChanged(ChangeEventArgs e)
{
AddressString = e.Value as string;
await AddressStringChanged.InvokeAsync(AddressString);
Reparse();
await ParsedChanged.InvokeAsync(_parsed);
}
private void Reparse()
{
if (string.IsNullOrWhiteSpace(AddressString))
{
_parsed = null;
Diagnostic = null;
return;
}
if (ModbusAddressParser.TryParse(AddressString, Family, MelsecSubFamily, out var parsed, out var err))
{
_parsed = parsed;
Diagnostic = null;
}
else
{
_parsed = null;
Diagnostic = err;
}
}
}
@@ -0,0 +1,85 @@
@page "/modbus/address-preview"
@using ZB.MOM.WW.OtOpcUa.Driver.Modbus
@*
#149 — standalone preview / sanity-check tool for Modbus address strings. The Admin UI
doesn't yet have a per-tag CRUD surface (tags are seeded via SQL or arrive at runtime
through ITagDiscovery), so the ModbusAddressEditor component shipped in #145 needs a
page where operators can paste an address string and confirm it parses to what they
expect before committing it to a config row.
Doubles as a "did the parser ship correctly" smoke target for QA + a copy-pasteable
grammar reference for users skimming the docs.
*@
<PageTitle>Modbus address preview</PageTitle>
<div class="container py-4">
<h1>Modbus address preview</h1>
<p class="text-muted">
Paste an address string and watch the parser break it down field by field. Useful for
sanity-checking a tag spreadsheet row before adding it to a driver's <code>DriverConfig</code>.
Full grammar: <a href="https://github.com/" target="_blank">docs/v2/modbus-addressing.md</a>.
</p>
<div class="row g-3">
<div class="col-md-4">
<label class="form-label">PLC family hint (drives the family-native branch)</label>
<select class="form-select" @bind="_family">
@foreach (var f in Enum.GetValues<ModbusFamily>())
{
<option value="@f">@f</option>
}
</select>
</div>
@if (_family == ModbusFamily.MELSEC)
{
<div class="col-md-4">
<label class="form-label">MELSEC sub-family</label>
<select class="form-select" @bind="_melsecSubFamily">
@foreach (var f in Enum.GetValues<MelsecFamily>())
{
<option value="@f">@f</option>
}
</select>
</div>
}
</div>
<div class="mt-4">
<ModbusAddressEditor @bind-AddressString="_address"
Family="_family"
MelsecSubFamily="_melsecSubFamily"/>
</div>
<h3 class="mt-5">Quick-reference grammar</h3>
<pre class="bg-light p-3 rounded small">@_grammarReference</pre>
</div>
@code {
private string? _address;
private ModbusFamily _family = ModbusFamily.Generic;
private MelsecFamily _melsecSubFamily = MelsecFamily.Q_L_iQR;
// Held as a const string rather than inline markup so the Razor parser doesn't try to
// interpret the angle-bracket grammar tokens as element open/close.
private const string _grammarReference = @"<region><offset>[.<bit>][:<type>[<len>]][:<order>][:<count>]
Examples (post-#146 type codes):
40001 HoldingRegisters[0], Int16
400001 same, 6-digit form
40001:F Float32 (HR[0..1])
40001:F:CDAB Float32 word-swapped
40001:STR20 20-char ASCII string
40001:S:5 Int16[5] array (3-field shorthand)
40001:I:CDAB:10 Int32[10] word-swapped (4-field strict)
40001.5 bit 5 of HR[0]
HR1:I Int32 via mnemonic region (matches Wonderware)
C100 Coil 100 (mnemonic, 1-based)
V2000:F:CDAB DL205 V-memory at PDU 1024 (Family=DL205)
D100:I MELSEC D-register 100, Int32 (Family=MELSEC)
Type codes: BOOL, S (Int16), US (UInt16), I (Int32), UI (UInt32),
I_64 (Int64), UI_64 (UInt64), F, D, BCD, BCD_32, STR<n>
Byte order: ABCD (BE default), CDAB (word-swap), BADC (byte-swap), DCBA (full reverse)";
}
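As a rough illustration of the first two rows of the quick-reference grammar (40001 and 400001 both resolving to HoldingRegisters[0]), here is a toy resolver for the purely numeric form. This is not ModbusAddressParser: ToyModbusAddress is hypothetical, it handles no bit, type, order or count suffixes, and the region names other than HoldingRegisters follow the conventional 0/1/3/4 leading-digit scheme as an assumption.

```csharp
using System;
using System.Globalization;

public static class ToyModbusAddress
{
    // Resolves a 5- or 6-digit numeric address into a region name and a
    // zero-based offset. The leading digit picks the region; the rest is a
    // 1-based register or coil number.
    public static bool TryResolve(string text, out string region, out int offset)
    {
        region = string.Empty;
        offset = 0;
        if (text is null || (text.Length != 5 && text.Length != 6)) return false;
        foreach (var c in text)
        {
            if (c < '0' || c > '9') return false;
        }

        var name = text[0] switch
        {
            '0' => "Coils",
            '1' => "DiscreteInputs",
            '3' => "InputRegisters",
            '4' => "HoldingRegisters",
            _ => string.Empty,
        };
        if (name.Length == 0) return false;

        var number = int.Parse(text.Substring(1), CultureInfo.InvariantCulture);
        if (number < 1) return false; // 40000 has no zero-based equivalent

        region = name;
        offset = number - 1;
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        foreach (var s in new[] { "40001", "400001", "30010", "50001" })
        {
            var ok = ToyModbusAddress.TryResolve(s, out var region, out var offset);
            Console.WriteLine(ok ? $"{s} -> {region}[{offset}]" : $"{s} -> not resolved");
        }
    }
}
```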
@@ -0,0 +1,120 @@
@page "/modbus/diagnostics/{DriverInstanceId}"
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@inject DriverDiagnosticsClient Diagnostics
@*
#154 — operator-facing view of the Server's auto-prohibition state for a Modbus driver.
Fetches via DriverDiagnosticsClient (HttpClient against the Server's HealthEndpointsHost).
Refreshes on demand; auto-refresh is a future task once a SignalR diag channel exists.
*@
<PageTitle>Modbus diagnostics — @DriverInstanceId</PageTitle>
<div class="container py-4">
<h1>Modbus auto-prohibitions</h1>
<p class="text-muted">
Driver instance <code>@DriverInstanceId</code>. Live snapshot of coalesced ranges
the planner has learned to read individually (#148 / #150 / #151 / #152).
</p>
<div class="mb-3">
<button class="btn btn-sm btn-outline-primary" @onclick="LoadAsync" disabled="@_loading">
@(_loading ? "Loading…" : "Refresh")
</button>
@if (_lastRefreshed is not null)
{
<span class="text-muted ms-3 small">Last refreshed @_lastRefreshed.Value.ToLocalTime().ToString("HH:mm:ss")</span>
}
</div>
@if (_error is not null)
{
<div class="alert alert-danger">@_error</div>
}
else if (_response is null)
{
<p class="text-muted">Click <strong>Refresh</strong> to load.</p>
}
else if (_response.Count == 0)
{
<div class="alert alert-success">No auto-prohibitions. The planner is coalescing freely.</div>
}
else
{
<table class="table table-sm">
<thead>
<tr>
<th>Unit</th>
<th>Region</th>
<th>Start</th>
<th>End</th>
<th>Span</th>
<th>Status</th>
<th>Last probed</th>
</tr>
</thead>
<tbody>
@foreach (var r in _response.Ranges.OrderBy(r => r.UnitId).ThenBy(r => r.Region).ThenBy(r => r.StartAddress))
{
<tr>
<td><code>@r.UnitId</code></td>
<td><code>@r.Region</code></td>
<td><code>@r.StartAddress</code></td>
<td><code>@r.EndAddress</code></td>
<td>@(r.EndAddress - r.StartAddress + 1)</td>
<td>
@if (r.BisectionPending)
{
<span class="badge bg-warning text-dark">BISECTING</span>
}
else
{
<span class="badge bg-danger">ISOLATED</span>
}
</td>
<td class="small text-muted">@FormatTimeSince(r.LastProbedUtc)</td>
</tr>
}
</tbody>
</table>
}
</div>
@code {
[Parameter] public string DriverInstanceId { get; set; } = string.Empty;
private ModbusAutoProhibitionsResponse? _response;
private string? _error;
private bool _loading;
private DateTime? _lastRefreshed;
private async Task LoadAsync()
{
_loading = true;
_error = null;
try
{
_response = await Diagnostics.GetModbusAutoProhibitedRangesAsync(DriverInstanceId);
_lastRefreshed = DateTime.UtcNow;
if (_response is null)
_error = $"Server reports driver '{DriverInstanceId}' is not present or is not a Modbus driver.";
}
catch (Exception ex)
{
_error = $"Fetch failed: {ex.Message}";
}
finally
{
_loading = false;
}
}
private static string FormatTimeSince(DateTime utc)
{
var span = DateTime.UtcNow - utc;
if (span.TotalSeconds < 60) return $"{(int)span.TotalSeconds}s ago";
if (span.TotalMinutes < 60) return $"{(int)span.TotalMinutes}m ago";
if (span.TotalHours < 24) return $"{(int)span.TotalHours}h ago";
return $"{(int)span.TotalDays}d ago";
}
}
@@ -0,0 +1,169 @@
@using ZB.MOM.WW.OtOpcUa.Driver.Modbus
@*
#145 — Driver-instance options panel for the Modbus driver. Surfaces every option group
added by #136-#144 so users can configure the driver via the UI rather than hand-editing
DriverConfig JSON. Bound to a ModbusOptionsViewModel; the parent page round-trips that
model to the DriverConfig.json column on save.
*@
<div class="modbus-options-editor">
<h5>Connection</h5>
<div class="row mb-3">
<div class="col-sm-6">
<label class="form-label">Host</label>
<input class="form-control" @bind="Model.Host"/>
</div>
<div class="col-sm-3">
<label class="form-label">Port</label>
<input type="number" class="form-control" @bind="Model.Port"/>
</div>
<div class="col-sm-3">
<label class="form-label">Default UnitId</label>
<input type="number" class="form-control" @bind="Model.UnitId"/>
</div>
</div>
<h5>Family (#144)</h5>
<div class="row mb-3">
<div class="col-sm-6">
<label class="form-label">PLC family</label>
<select class="form-select" @bind="Model.Family">
@foreach (var f in Enum.GetValues<ModbusFamily>())
{
<option value="@f">@f</option>
}
</select>
</div>
@if (Model.Family == ModbusFamily.MELSEC)
{
<div class="col-sm-6">
<label class="form-label">MELSEC sub-family</label>
<select class="form-select" @bind="Model.MelsecSubFamily">
@foreach (var f in Enum.GetValues<MelsecFamily>())
{
<option value="@f">@f</option>
}
</select>
</div>
}
</div>
<h5>Keep-alive (#139)</h5>
<div class="row mb-3">
<div class="col-sm-3">
<div class="form-check mt-4">
<input type="checkbox" class="form-check-input" @bind="Model.KeepAliveEnabled"/>
<label class="form-check-label">Enabled</label>
</div>
</div>
<div class="col-sm-3">
<label class="form-label">Time (s)</label>
<input type="number" class="form-control" @bind="Model.KeepAliveTimeSec"/>
</div>
<div class="col-sm-3">
<label class="form-label">Interval (s)</label>
<input type="number" class="form-control" @bind="Model.KeepAliveIntervalSec"/>
</div>
<div class="col-sm-3">
<label class="form-label">Retry count</label>
<input type="number" class="form-control" @bind="Model.KeepAliveRetryCount"/>
</div>
</div>
<h5>Reconnect (#139)</h5>
<div class="row mb-3">
<div class="col-sm-4">
<label class="form-label">Initial delay (ms)</label>
<input type="number" class="form-control" @bind="Model.ReconnectInitialDelayMs"/>
</div>
<div class="col-sm-4">
<label class="form-label">Max delay (ms)</label>
<input type="number" class="form-control" @bind="Model.ReconnectMaxDelayMs"/>
</div>
<div class="col-sm-4">
<label class="form-label">Backoff multiplier</label>
<input type="number" step="0.1" class="form-control" @bind="Model.ReconnectBackoffMultiplier"/>
</div>
</div>
<h5>Protocol (#140)</h5>
<div class="row mb-3">
<div class="col-sm-3">
<label class="form-label">Max regs / read</label>
<input type="number" class="form-control" @bind="Model.MaxRegistersPerRead"/>
</div>
<div class="col-sm-3">
<label class="form-label">Max regs / write</label>
<input type="number" class="form-control" @bind="Model.MaxRegistersPerWrite"/>
</div>
<div class="col-sm-3">
<label class="form-label">Max coils / read</label>
<input type="number" class="form-control" @bind="Model.MaxCoilsPerRead"/>
</div>
<div class="col-sm-3">
<label class="form-label">Max read gap (#143)</label>
<input type="number" class="form-control" @bind="Model.MaxReadGap"/>
</div>
</div>
<div class="row mb-3">
<div class="col-sm-4">
<div class="form-check">
<input type="checkbox" class="form-check-input" @bind="Model.UseFC15ForSingleCoilWrites"/>
<label class="form-check-label">Use FC15 for single coil</label>
</div>
</div>
<div class="col-sm-4">
<div class="form-check">
<input type="checkbox" class="form-check-input" @bind="Model.UseFC16ForSingleRegisterWrites"/>
<label class="form-check-label">Use FC16 for single reg</label>
</div>
</div>
<div class="col-sm-4">
<div class="form-check">
<input type="checkbox" class="form-check-input" @bind="Model.WriteOnChangeOnly"/>
<label class="form-check-label">Write-on-change only (#141)</label>
</div>
</div>
</div>
</div>
@code {
[Parameter, EditorRequired] public ModbusOptionsViewModel Model { get; set; } = default!;
/// <summary>
/// UI binding model. Maps 1:1 onto the JSON DTO the driver factory accepts; serialised
/// to DriverConfig.json by the calling save handler. Defaults match
/// <c>ModbusDriverOptions</c> defaults so unedited rows produce the historical wire
/// output verbatim.
/// </summary>
public sealed class ModbusOptionsViewModel
{
public string Host { get; set; } = "127.0.0.1";
public int Port { get; set; } = 502;
public byte UnitId { get; set; } = 1;
public ModbusFamily Family { get; set; } = ModbusFamily.Generic;
public MelsecFamily MelsecSubFamily { get; set; } = MelsecFamily.Q_L_iQR;
public bool KeepAliveEnabled { get; set; } = true;
public int KeepAliveTimeSec { get; set; } = 30;
public int KeepAliveIntervalSec { get; set; } = 10;
public int KeepAliveRetryCount { get; set; } = 3;
public int ReconnectInitialDelayMs { get; set; } = 0;
public int ReconnectMaxDelayMs { get; set; } = 30000;
public double ReconnectBackoffMultiplier { get; set; } = 2.0;
public int MaxRegistersPerRead { get; set; } = 125;
public int MaxRegistersPerWrite { get; set; } = 123;
public int MaxCoilsPerRead { get; set; } = 2000;
public int MaxReadGap { get; set; } = 0;
public bool UseFC15ForSingleCoilWrites { get; set; } = false;
public bool UseFC16ForSingleRegisterWrites { get; set; } = false;
public bool WriteOnChangeOnly { get; set; } = false;
}
}
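The summary on `ModbusOptionsViewModel` says the model maps 1:1 onto the JSON DTO and that defaults keep unedited rows unchanged. The sketch below illustrates that round-trip with `System.Text.Json`. `OptionsSketch` is a trimmed, hypothetical stand-in with four of the properties, and the property naming of the real DriverConfig JSON is an assumption here.

```csharp
using System.Text.Json;

// Hypothetical, trimmed stand-in for the view model (four properties only).
public sealed class OptionsSketch
{
    public string Host { get; set; } = "127.0.0.1";
    public int Port { get; set; } = 502;
    public byte UnitId { get; set; } = 1;
    public int MaxRegistersPerRead { get; set; } = 125;
}

public static class OptionsRoundTrip
{
    // Serialises the model the way a save handler might before storing it.
    public static string ToJson(OptionsSketch model) => JsonSerializer.Serialize(model);

    // Properties missing from the JSON keep their initializer defaults.
    public static OptionsSketch FromJson(string json)
        => JsonSerializer.Deserialize<OptionsSketch>(json) ?? new OptionsSketch();
}
```

Because missing properties fall back to the initializers, a stored `{}` would load with port 502 and a 125-register read limit in this sketch.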
@@ -37,3 +37,18 @@ public sealed record NodeStateChangedMessage(
string? LastAppliedError,
DateTime? LastAppliedAt,
DateTime? LastSeenAt);
/// <summary>
/// Pushed by <c>FleetStatusPoller</c> when it observes a change in a
/// <c>DriverInstanceResilienceStatus</c> row. Closes the last Phase 6.1 Stream E.2/E.3
/// deferral — lets the Admin <c>/hosts</c> page upsert the matching row without the
/// 10-second polling round-trip. Keyed on (DriverInstanceId, HostName); the client
/// updates the row whose pair matches both values.
/// </summary>
public sealed record ResilienceStatusChangedMessage(
string DriverInstanceId,
string HostName,
int ConsecutiveFailures,
DateTime? LastCircuitBreakerOpenUtc,
int CurrentBulkheadDepth,
DateTime? LastRecycleUtc);
@@ -41,9 +41,20 @@ builder.Services.AddDbContext<OtOpcUaConfigDbContext>(opt =>
builder.Services.AddScoped<ClusterService>();
builder.Services.AddScoped<GenerationService>();
builder.Services.AddScoped<EquipmentService>();
builder.Services.AddScoped<TagService>();
builder.Services.AddScoped<UnsService>();
builder.Services.AddScoped<NamespaceService>();
builder.Services.AddScoped<DriverInstanceService>();
builder.Services.AddScoped<FocasDriverDetailService>();
// #154 — Server diagnostics client. Default base URL points at the same machine's
// HealthEndpointsHost (loopback :4841); deployments with remote Servers set
// "DriverDiagnostics:ServerBaseUrl" in appsettings.json.
builder.Services.AddHttpClient<DriverDiagnosticsClient>(client =>
{
var baseUrl = builder.Configuration["DriverDiagnostics:ServerBaseUrl"] ?? "http://localhost:4841/";
client.BaseAddress = new Uri(baseUrl);
});
builder.Services.AddScoped<NodeAclService>();
builder.Services.AddScoped<PermissionProbeService>();
builder.Services.AddScoped<AclChangeNotifier>();
@@ -0,0 +1,61 @@
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// #154 — Admin-side client for the Server's driver-diagnostics HTTP endpoints. Wraps
/// <see cref="HttpClient"/> so Blazor pages can fetch per-driver runtime state from a
/// remote Server process. The base URL is configured at registration time
/// (typically read from <c>appsettings.json</c> at startup).
/// </summary>
/// <remarks>
/// One client instance per Server endpoint. Multi-server deployments register multiple
/// keyed clients. Errors propagate as exceptions; pages catch them and surface them to
/// the operator rather than swallowing them.
/// </remarks>
public sealed class DriverDiagnosticsClient
{
private readonly HttpClient _http;
public DriverDiagnosticsClient(HttpClient http) => _http = http;
/// <summary>
/// Fetch the current Modbus auto-prohibition list for the named driver instance.
/// Returns null when the Server reports the driver doesn't exist or isn't a Modbus
/// driver. Throws on transport / serialization failures.
/// </summary>
public async Task<ModbusAutoProhibitionsResponse?> GetModbusAutoProhibitedRangesAsync(
string driverInstanceId, CancellationToken ct = default)
{
var resp = await _http.GetAsync(
$"/diagnostics/drivers/{Uri.EscapeDataString(driverInstanceId)}/modbus/auto-prohibited", ct)
.ConfigureAwait(false);
if (resp.StatusCode is System.Net.HttpStatusCode.NotFound or System.Net.HttpStatusCode.BadRequest)
return null;
resp.EnsureSuccessStatusCode();
return await resp.Content.ReadFromJsonAsync<ModbusAutoProhibitionsResponse>(cancellationToken: ct).ConfigureAwait(false);
}
}
/// <summary>
/// Server response shape for the Modbus auto-prohibition diagnostic. Mirrors the JSON the
/// <c>HealthEndpointsHost</c> serialises; fields are flat strings/numbers so the
/// Admin-side client doesn't take a dependency on the Driver.Modbus assembly's
/// <c>ModbusAutoProhibition</c> record.
/// </summary>
public sealed class ModbusAutoProhibitionsResponse
{
public string DriverInstanceId { get; set; } = string.Empty;
public int Count { get; set; }
public List<ModbusAutoProhibitionRow> Ranges { get; set; } = new();
}
public sealed class ModbusAutoProhibitionRow
{
public byte UnitId { get; set; }
public string Region { get; set; } = string.Empty;
public ushort StartAddress { get; set; }
public ushort EndAddress { get; set; }
public DateTime LastProbedUtc { get; set; }
public bool BisectionPending { get; set; }
}
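The response DTOs above are read with `ReadFromJsonAsync`, which uses the web (camelCase, case-insensitive) serializer defaults. The sketch below parses a payload of that shape. The sample JSON, the `HoldingRegisters` region name, and the `RowSketch` / `ResponseSketch` / `PayloadSketch` types are hypothetical stand-ins, not the Server's actual output.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical stand-ins mirroring the two DTOs in this file.
public sealed class RowSketch
{
    public byte UnitId { get; set; }
    public string Region { get; set; } = string.Empty;
    public ushort StartAddress { get; set; }
    public ushort EndAddress { get; set; }
    public DateTime LastProbedUtc { get; set; }
    public bool BisectionPending { get; set; }
}

public sealed class ResponseSketch
{
    public string DriverInstanceId { get; set; } = string.Empty;
    public int Count { get; set; }
    public List<RowSketch> Ranges { get; set; } = new();
}

public static class PayloadSketch
{
    // Invented sample payload; field values are illustrative only.
    public const string Sample = """
        {"driverInstanceId":"modbus-1","count":1,"ranges":[
          {"unitId":1,"region":"HoldingRegisters","startAddress":100,"endAddress":109,
           "lastProbedUtc":"2026-04-29T20:00:00Z","bisectionPending":false}]}
        """;

    private static readonly JsonSerializerOptions Web = new(JsonSerializerDefaults.Web);

    public static ResponseSketch Parse(string json)
        => JsonSerializer.Deserialize<ResponseSketch>(json, Web)
           ?? throw new JsonException("Payload was JSON null.");

    // Same inclusive span arithmetic the diagnostics page renders.
    public static int Span(RowSketch r) => r.EndAddress - r.StartAddress + 1;
}
```

For the sample row, addresses 100 through 109 give an inclusive span of 10 registers.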
@@ -0,0 +1,123 @@
using System.Text.Json;
using System.Text.Json.Serialization;
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// Per-instance detail view for FOCAS driver rows. Loads the latest
/// <see cref="DriverInstance"/> row for the requested <c>DriverInstanceId</c> (most-recent
/// draft wins when multiple rows exist across generations), parses the schemaless
/// <c>DriverConfig</c> JSON into <see cref="FocasDriverConfigView"/>, and joins the
/// per-device <see cref="DriverHostStatus"/> rows so the Admin page can render host
/// state + consecutive-failure counters next to each configured device.
/// </summary>
public sealed class FocasDriverDetailService(OtOpcUaConfigDbContext db)
{
private static readonly JsonSerializerOptions JsonOpts = new()
{
PropertyNameCaseInsensitive = true,
NumberHandling = JsonNumberHandling.AllowReadingFromString,
};
public async Task<FocasDriverDetail?> GetAsync(string driverInstanceId, CancellationToken ct = default)
{
if (string.IsNullOrWhiteSpace(driverInstanceId)) return null;
var instance = await db.DriverInstances.AsNoTracking()
.Where(d => d.DriverInstanceId == driverInstanceId
&& d.DriverType.ToLower() == "focas")
.OrderByDescending(d => d.GenerationId)
.FirstOrDefaultAsync(ct);
if (instance is null) return null;
FocasDriverConfigView? config = null;
string? parseError = null;
try { config = JsonSerializer.Deserialize<FocasDriverConfigView>(instance.DriverConfig, JsonOpts); }
catch (JsonException ex) { parseError = ex.Message; }
var hostStatuses = await (from s in db.DriverHostStatuses.AsNoTracking()
where s.DriverInstanceId == driverInstanceId
join r in db.DriverInstanceResilienceStatuses.AsNoTracking()
on new { s.DriverInstanceId, s.HostName }
equals new { r.DriverInstanceId, r.HostName } into rj
from r in rj.DefaultIfEmpty()
orderby s.HostName
select new FocasHostStatusRow(
s.NodeId,
s.HostName,
s.State.ToString(),
s.StateChangedUtc,
s.LastSeenUtc,
s.Detail,
r != null ? r.ConsecutiveFailures : 0,
r != null ? r.LastCircuitBreakerOpenUtc : null,
r != null ? r.LastRecycleUtc : null)).ToListAsync(ct);
return new FocasDriverDetail(instance, config, parseError, hostStatuses);
}
}
/// <summary>Projected view of a FOCAS driver's parsed config. Unknown fields are ignored.</summary>
public sealed record FocasDriverConfigView
{
public List<FocasDeviceView>? Devices { get; set; }
public List<FocasTagView>? Tags { get; set; }
public FocasProbeView? Probe { get; set; }
public FocasAlarmProjectionView? AlarmProjection { get; set; }
public FocasHandleRecycleView? HandleRecycle { get; set; }
}
public sealed record FocasDeviceView
{
public string? HostAddress { get; set; }
public string? DeviceName { get; set; }
public string? Series { get; set; }
}
public sealed record FocasTagView
{
public string? Name { get; set; }
public string? DeviceHostAddress { get; set; }
public string? Address { get; set; }
public string? DataType { get; set; }
public bool Writable { get; set; } = true;
}
public sealed record FocasProbeView
{
public bool Enabled { get; set; } = true;
public string? Interval { get; set; }
}
public sealed record FocasAlarmProjectionView
{
public bool Enabled { get; set; }
public string? PollInterval { get; set; }
}
public sealed record FocasHandleRecycleView
{
public bool Enabled { get; set; }
public string? Interval { get; set; }
}
/// <summary>Composite payload returned to the Admin page.</summary>
public sealed record FocasDriverDetail(
DriverInstance Instance,
FocasDriverConfigView? Config,
string? ParseError,
IReadOnlyList<FocasHostStatusRow> HostStatuses);
public sealed record FocasHostStatusRow(
string NodeId,
string HostName,
string State,
DateTime StateChangedUtc,
DateTime LastSeenUtc,
string? Detail,
int ConsecutiveFailures,
DateTime? LastCircuitBreakerOpenUtc,
DateTime? LastRecycleUtc);
@@ -0,0 +1,71 @@
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// #155 — Tag CRUD scoped to a draft generation. Tags are the canonical signal definitions
/// (one row per OPC UA variable) the Server materialises into the address space at startup.
/// Mirrors the shape of <see cref="EquipmentService"/>; writes are restricted to draft
/// generations only (published generations are immutable per the validation pipeline).
/// </summary>
public sealed class TagService(OtOpcUaConfigDbContext db)
{
/// <summary>Lists all tags in a generation, ordered by name. Optional driver / equipment filter.</summary>
public Task<List<Tag>> ListAsync(long generationId,
string? driverInstanceId = null,
string? equipmentId = null,
CancellationToken ct = default)
{
var query = db.Tags.AsNoTracking().Where(t => t.GenerationId == generationId);
if (!string.IsNullOrWhiteSpace(driverInstanceId))
query = query.Where(t => t.DriverInstanceId == driverInstanceId);
if (!string.IsNullOrWhiteSpace(equipmentId))
query = query.Where(t => t.EquipmentId == equipmentId);
return query.OrderBy(t => t.Name).ToListAsync(ct);
}
/// <summary>
/// Creates a new tag row in the given draft. When the caller leaves TagId blank it is
/// generated as a GUID; the human-friendly Name is the user-facing identifier.
/// </summary>
public async Task<Tag> CreateAsync(long draftId, Tag input, CancellationToken ct)
{
input.GenerationId = draftId;
if (string.IsNullOrWhiteSpace(input.TagId))
input.TagId = Guid.NewGuid().ToString("N");
db.Tags.Add(input);
await db.SaveChangesAsync(ct);
return input;
}
public async Task UpdateAsync(Tag updated, CancellationToken ct)
{
var existing = await db.Tags
.FirstOrDefaultAsync(t => t.TagRowId == updated.TagRowId, ct)
?? throw new InvalidOperationException($"Tag row {updated.TagRowId} not found");
// Editable fields. TagId / GenerationId are immutable; the Validation pipeline rejects
// changes that would break referential integrity (sp_ValidateDraft per decision #110).
existing.Name = updated.Name;
existing.DriverInstanceId = updated.DriverInstanceId;
existing.DeviceId = updated.DeviceId;
existing.EquipmentId = updated.EquipmentId;
existing.FolderPath = updated.FolderPath;
existing.DataType = updated.DataType;
existing.AccessLevel = updated.AccessLevel;
existing.WriteIdempotent = updated.WriteIdempotent;
existing.PollGroupId = updated.PollGroupId;
existing.TagConfig = updated.TagConfig;
await db.SaveChangesAsync(ct);
}
public async Task DeleteAsync(Guid tagRowId, CancellationToken ct)
{
var existing = await db.Tags.FirstOrDefaultAsync(t => t.TagRowId == tagRowId, ct);
if (existing is null) return;
db.Tags.Remove(existing);
await db.SaveChangesAsync(ct);
}
}
@@ -16,8 +16,8 @@
<PackageReference Include="Novell.Directory.Ldap.NETStandard" Version="3.6.0"/>
<PackageReference Include="Microsoft.AspNetCore.SignalR.Client" Version="10.0.0"/>
<PackageReference Include="Serilog.AspNetCore" Version="9.0.0"/>
<PackageReference Include="OpenTelemetry.Extensions.Hosting" Version="1.15.2"/>
<PackageReference Include="OpenTelemetry.Exporter.Prometheus.AspNetCore" Version="1.15.2-beta.1"/>
<PackageReference Include="OpenTelemetry.Extensions.Hosting" Version="1.15.3"/>
<PackageReference Include="OpenTelemetry.Exporter.Prometheus.AspNetCore" Version="1.15.3-beta.1"/>
</ItemGroup>
<ItemGroup>
@@ -25,6 +25,7 @@
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Core\ZB.MOM.WW.OtOpcUa.Core.csproj"/>
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Core.Scripting\ZB.MOM.WW.OtOpcUa.Core.Scripting.csproj"/>
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian\ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian.csproj"/>
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Driver.Modbus.Addressing\ZB.MOM.WW.OtOpcUa.Driver.Modbus.Addressing.csproj"/>
</ItemGroup>
<ItemGroup>
@@ -1,4 +1,5 @@
using System.Text.RegularExpressions;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Validation;
@@ -173,4 +174,65 @@ public static class DraftValidator
di.DriverInstanceId));
}
}
/// <summary>
/// Phase 6.3 Stream A.2 + task #148 part 2 — managed pre-publish guard for cluster
/// topology vs. <see cref="ServerCluster.RedundancyMode"/>. The SQL
/// <c>CK_ServerCluster_RedundancyMode_NodeCount</c> CHECK already enforces the
/// (NodeCount, RedundancyMode) pair on the row itself, but it cannot see the
/// <see cref="ClusterNode.Enabled"/> flag on child nodes — an operator can toggle
/// nodes off (effective count = 1) while leaving RedundancyMode at Hot and the
/// constraint stays green. This check catches that drift before publish so the
/// runtime doesn't boot into a topology the <see cref="Enums.RedundancyMode"/> claims
/// is invalid.
/// </summary>
/// <remarks>
/// Called from the publish pipeline separately from <see cref="Validate"/> because the
/// cluster/nodes rows aren't generation-versioned — they don't belong on
/// <see cref="DraftSnapshot"/>. Returns every failing rule in one pass, same shape as
/// <see cref="Validate"/>.
/// </remarks>
public static IReadOnlyList<ValidationError> ValidateClusterTopology(
ServerCluster cluster,
IReadOnlyList<ClusterNode> clusterNodes)
{
ArgumentNullException.ThrowIfNull(cluster);
ArgumentNullException.ThrowIfNull(clusterNodes);
var errors = new List<ValidationError>();
var enabledNodes = clusterNodes.Count(n => n.Enabled);
// Declared count must match declared mode (belt around the SQL CHECK).
var declaredOk = (cluster.NodeCount, cluster.RedundancyMode) switch
{
(1, RedundancyMode.None) => true,
(2, RedundancyMode.Warm) => true,
(2, RedundancyMode.Hot) => true,
_ => false,
};
if (!declaredOk)
errors.Add(new("ClusterRedundancyModeInvalid",
$"Cluster '{cluster.ClusterId}' declares NodeCount={cluster.NodeCount} + RedundancyMode={cluster.RedundancyMode}. " +
$"Supported combinations: (1, None), (2, Warm), (2, Hot).",
cluster.ClusterId));
// Enabled-node count must match declared count. Disabling a node (leaving one enabled)
// while the mode stays Hot/Warm would boot the runtime into the InvalidTopology band.
if (enabledNodes != cluster.NodeCount)
errors.Add(new("ClusterEnabledNodeCountMismatch",
$"Cluster '{cluster.ClusterId}' declares NodeCount={cluster.NodeCount} but has {enabledNodes} Enabled nodes. " +
$"Toggle the missing node(s) back on or change RedundancyMode/NodeCount to match.",
cluster.ClusterId));
// Primary uniqueness — decision #84. Two Primary nodes is always an invariant violation
// regardless of mode; catch it here so publish fails loud rather than the runtime
// demoting both to ServiceLevelBand.InvalidTopology at boot.
var primaryCount = clusterNodes.Count(n => n.Enabled && n.RedundancyRole == RedundancyRole.Primary);
if (primaryCount > 1)
errors.Add(new("ClusterMultiplePrimary",
$"Cluster '{cluster.ClusterId}' has {primaryCount} Enabled Primary nodes. At most one Primary per cluster.",
cluster.ClusterId));
return errors;
}
}
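The three rules in `ValidateClusterTopology` can be exercised in isolation. The sketch below restates them over simplified stand-in types (`Mode`, `NodeSketch`, `TopologySketch` are hypothetical) and returns only the error codes. It is an illustration of the rule set, not the validator itself.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for RedundancyMode and ClusterNode.
public enum Mode { None, Warm, Hot }
public sealed record NodeSketch(bool Enabled, bool IsPrimary);

public static class TopologySketch
{
    // Returns every failing rule's code in one pass, in the same order as the validator.
    public static List<string> Check(int nodeCount, Mode mode, IReadOnlyList<NodeSketch> nodes)
    {
        var codes = new List<string>();

        // Rule 1: declared (NodeCount, RedundancyMode) pair must be supported.
        var declaredOk = (nodeCount, mode) switch
        {
            (1, Mode.None) => true,
            (2, Mode.Warm) => true,
            (2, Mode.Hot) => true,
            _ => false,
        };
        if (!declaredOk) codes.Add("ClusterRedundancyModeInvalid");

        // Rule 2: enabled nodes must match the declared count.
        if (nodes.Count(n => n.Enabled) != nodeCount)
            codes.Add("ClusterEnabledNodeCountMismatch");

        // Rule 3: at most one enabled Primary.
        if (nodes.Count(n => n.Enabled && n.IsPrimary) > 1)
            codes.Add("ClusterMultiplePrimary");

        return codes;
    }
}
```

A two-node Hot cluster with one node disabled reports only `ClusterEnabledNodeCountMismatch`, which is the drift the SQL CHECK cannot see.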
@@ -45,13 +45,6 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions;
/// Set when <paramref name="Source"/> is <see cref="NodeSourceKind.ScriptedAlarm"/> —
/// stable logical id the ScriptedAlarmEngine addresses by. Null otherwise.
/// </param>
/// <param name="Description">
/// Human-readable description for this attribute. When non-null + non-empty the generic
/// node-manager surfaces the value as the OPC UA <c>Description</c> attribute on the
/// Variable node so SCADA / engineering clients see the field comment from the source
/// project (Studio 5000 tag descriptions, Galaxy attribute help text, etc.). Defaults to
/// null so drivers that don't carry descriptions are unaffected.
/// </param>
public sealed record DriverAttributeInfo(
string FullName,
DriverDataType DriverDataType,
@@ -63,8 +56,7 @@ public sealed record DriverAttributeInfo(
bool WriteIdempotent = false,
NodeSourceKind Source = NodeSourceKind.Driver,
string? VirtualTagId = null,
string? ScriptedAlarmId = null,
string? Description = null);
string? ScriptedAlarmId = null);
/// <summary>
/// Per ADR-002 — discriminates which runtime subsystem owns this node's Read/Write/
@@ -25,7 +25,7 @@ public enum DriverCapability
/// <summary><see cref="ITagDiscovery.DiscoverAsync"/>. Retries by default.</summary>
Discover,
/// <summary><see cref="ISubscribable.SubscribeAsync(IReadOnlyList{string}, TimeSpan, CancellationToken)"/> and unsubscribe. Retries by default.</summary>
/// <summary><see cref="ISubscribable.SubscribeAsync"/> and unsubscribe. Retries by default.</summary>
Subscribe,
/// <summary><see cref="IHostConnectivityProbe"/> probe loop. Retries by default.</summary>
@@ -25,11 +25,4 @@ public enum DriverDataType
/// <summary>Galaxy-style attribute reference encoded as an OPC UA String.</summary>
Reference,
/// <summary>
/// OPC UA <c>Duration</c> — a Double-encoded period in milliseconds. Subtype of Double
/// in the address space; surfaced as <see cref="System.TimeSpan"/> in the driver layer.
/// Used by IEC 61131-3 <c>TIME</c> / <c>TOD</c> attributes (TwinCAT et al.).
/// </summary>
Duration,
}
@@ -7,26 +7,10 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions;
/// <param name="State">Current driver-instance state.</param>
/// <param name="LastSuccessfulRead">Timestamp of the most recent successful equipment read; null if never.</param>
/// <param name="LastError">Most recent error message; null when state is Healthy.</param>
/// <param name="Diagnostics">
/// Optional driver-attributable counters/metrics surfaced for the <c>driver-diagnostics</c>
/// RPC (introduced for Modbus task #154). Drivers populate the dictionary with stable,
/// well-known keys (e.g. <c>PublishRequestCount</c>, <c>NotificationsPerSecond</c>);
/// Core treats it as opaque metadata. Defaulted to an empty read-only dictionary so
/// existing drivers and call-sites that don't construct this field stay back-compat.
/// </param>
public sealed record DriverHealth(
DriverState State,
DateTime? LastSuccessfulRead,
string? LastError,
IReadOnlyDictionary<string, double>? Diagnostics = null)
{
/// <summary>Driver-attributable counters, empty when the driver doesn't surface any.</summary>
public IReadOnlyDictionary<string, double> DiagnosticsOrEmpty
=> Diagnostics ?? EmptyDiagnostics;
private static readonly IReadOnlyDictionary<string, double> EmptyDiagnostics
= new Dictionary<string, double>(0);
}
string? LastError);
/// <summary>Driver-instance lifecycle state.</summary>
public enum DriverState
@@ -0,0 +1,19 @@
namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions;
/// <summary>
/// Point-in-time state of a single historian cluster node, included inside
/// <see cref="HistorianHealthSnapshot.Nodes"/> when the backend is clustered.
/// </summary>
/// <param name="Name">Node identifier — backend-specific (typically a hostname).</param>
/// <param name="IsHealthy">True when the node is currently considered usable for reads.</param>
/// <param name="CooldownUntil">When the next retry against an unhealthy node is allowed; null when no cooldown is active.</param>
/// <param name="FailureCount">Consecutive failures observed against this node since the last success.</param>
/// <param name="LastError">Diagnostic text from the last failure against this node; null when no failures.</param>
/// <param name="LastFailureTime">UTC of the last failure against this node; null when no failures.</param>
public sealed record HistorianClusterNodeState(
string Name,
bool IsHealthy,
DateTime? CooldownUntil,
int FailureCount,
string? LastError,
DateTime? LastFailureTime);
@@ -0,0 +1,32 @@
namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions;
/// <summary>
/// Point-in-time runtime health of a historian data source. Returned by
/// <see cref="IHistorianDataSource.GetHealthSnapshot"/> and projected onto the
/// server status dashboard.
/// </summary>
/// <param name="TotalQueries">Lifetime count of read calls received.</param>
/// <param name="TotalSuccesses">Subset of <paramref name="TotalQueries"/> that completed without error.</param>
/// <param name="TotalFailures">Subset of <paramref name="TotalQueries"/> that ended in error.</param>
/// <param name="ConsecutiveFailures">Failures since the last success — non-zero means the source is currently degraded.</param>
/// <param name="LastSuccessTime">UTC of the most recent successful read; null if none yet.</param>
/// <param name="LastFailureTime">UTC of the most recent failed read; null if none yet.</param>
/// <param name="LastError">Diagnostic text from the most recent failure; null when no failures recorded.</param>
/// <param name="ProcessConnectionOpen">True when the source's process-data connection is currently established.</param>
/// <param name="EventConnectionOpen">True when the source's event-data connection is currently established. Some backends share one connection — implementations may report the same value here as <paramref name="ProcessConnectionOpen"/>.</param>
/// <param name="ActiveProcessNode">Cluster node currently serving process reads; null when no node is active or the backend is non-clustered.</param>
/// <param name="ActiveEventNode">Cluster node currently serving event reads; null when no node is active or the backend is non-clustered.</param>
/// <param name="Nodes">Per-cluster-node state. Empty when the backend is non-clustered.</param>
public sealed record HistorianHealthSnapshot(
long TotalQueries,
long TotalSuccesses,
long TotalFailures,
int ConsecutiveFailures,
DateTime? LastSuccessTime,
DateTime? LastFailureTime,
string? LastError,
bool ProcessConnectionOpen,
bool EventConnectionOpen,
string? ActiveProcessNode,
string? ActiveEventNode,
IReadOnlyList<HistorianClusterNodeState> Nodes);
@@ -0,0 +1,74 @@
namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions;
/// <summary>
/// Server-side historian data source. Registered with the server's history router
/// and resolved per OPC UA namespace, independent of any driver's lifecycle.
/// </summary>
/// <remarks>
/// Distinct from <see cref="IHistoryProvider"/>:
/// <list type="bullet">
/// <item><see cref="IHistoryProvider"/> is a *driver capability* — the server
/// dispatches to it via the driver instance.</item>
/// <item><see cref="IHistorianDataSource"/> is a *server registration* — the
/// server resolves it via namespace and calls it directly, so a single
/// historian (e.g. Wonderware) can serve many drivers' nodes, and drivers can
/// restart without dropping history availability.</item>
/// </list>
/// All values returned use the shared <see cref="DataValueSnapshot"/> /
/// <see cref="HistoricalEvent"/> shapes; backend-specific quality / type encodings
/// are translated to OPC UA <c>StatusCode</c> uints inside the data source.
/// </remarks>
public interface IHistorianDataSource : IDisposable
{
/// <summary>
/// Read raw historical samples for a single tag over a time range.
/// </summary>
Task<HistoryReadResult> ReadRawAsync(
string fullReference,
DateTime startUtc,
DateTime endUtc,
uint maxValuesPerNode,
CancellationToken cancellationToken);
/// <summary>
/// Read processed (interval-bucketed) samples — average / min / max / count / etc.
/// A bucket with no source data returns a sample whose
/// <see cref="DataValueSnapshot.StatusCode"/> indicates BadNoData.
/// </summary>
Task<HistoryReadResult> ReadProcessedAsync(
string fullReference,
DateTime startUtc,
DateTime endUtc,
TimeSpan interval,
HistoryAggregateType aggregate,
CancellationToken cancellationToken);
/// <summary>
/// Read one sample per requested timestamp — OPC UA HistoryReadAtTime service.
/// Implementations interpolate or return prior-boundary samples per their
/// backend's policy. The returned list MUST be the same length and order as
/// <paramref name="timestampsUtc"/>; gaps are returned as Bad-quality snapshots.
/// </summary>
Task<HistoryReadResult> ReadAtTimeAsync(
string fullReference,
IReadOnlyList<DateTime> timestampsUtc,
CancellationToken cancellationToken);
/// <summary>
/// Read historical alarm / event records — OPC UA HistoryReadEvents service.
/// Distinct from any live event stream; sources here come from the historian's
/// event log. <paramref name="sourceName"/> is null to return all sources.
/// </summary>
Task<HistoricalEventsResult> ReadEventsAsync(
string? sourceName,
DateTime startUtc,
DateTime endUtc,
int maxEvents,
CancellationToken cancellationToken);
/// <summary>
/// Point-in-time health snapshot for diagnostics and dashboards. Pure
/// observation; never blocks on backend I/O.
/// </summary>
HistorianHealthSnapshot GetHealthSnapshot();
}
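The `ReadAtTimeAsync` contract above requires one result per requested timestamp, in request order, with gaps reported as Bad-quality snapshots. The sketch below shows that alignment step over an in-memory sample map. `SnapshotSketch` is a hypothetical stand-in for `DataValueSnapshot` (whose real shape is not shown in this diff), and the choice of the OPC UA `Bad_NoData` code for gaps is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in; the real DataValueSnapshot shape is not shown here.
public sealed record SnapshotSketch(object? Value, uint StatusCode, DateTime SourceTimestampUtc);

public static class AtTimeSketch
{
    public const uint Good = 0x00000000;
    public const uint BadNoData = 0x809B0000; // OPC UA Bad_NoData; assumed gap marker

    // Produces exactly one snapshot per requested timestamp, preserving request order.
    public static List<SnapshotSketch> Align(
        IReadOnlyList<DateTime> timestampsUtc,
        IReadOnlyDictionary<DateTime, object> samples)
    {
        var result = new List<SnapshotSketch>(timestampsUtc.Count);
        foreach (var ts in timestampsUtc)
        {
            result.Add(samples.TryGetValue(ts, out var value)
                ? new SnapshotSketch(value, Good, ts)       // exact hit
                : new SnapshotSketch(null, BadNoData, ts)); // gap: Bad-quality placeholder
        }
        return result;
    }
}
```

A real data source would interpolate or use prior-boundary samples per its backend's policy. The sketch only matches exact timestamps to keep the length-and-order guarantee visible.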

Some files were not shown because too many files have changed in this diff