582 Commits

Author SHA1 Message Date
Joseph Doherty
7922e573b1 PR 4.6 — DeployWatcher (IRediscoverable scaffold)
DeployWatcher consumes GalaxyRepositoryClient.WatchDeployEventsAsync,
suppresses the bootstrap event, and raises RediscoveryEventArgs whenever
time_of_last_deploy actually changes. Reconnect-on-error with capped
exponential backoff. GalaxyDriver wiring (IRediscoverable.OnRediscoveryNeeded
event + StartAsync inside InitializeAsync) lands in a follow-up so this PR
doesn't conflict with the parallel runtime track.
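
The watcher behaviour described above can be sketched roughly as follows. This is an illustrative Python sketch, not the repository's C# code: the function names, the base delay and the cap are all assumptions, and the first observed value is assumed to be the bootstrap event.

```python
from typing import Iterable, Iterator, Optional


def backoff_delays(base: float = 1.0, cap: float = 30.0) -> Iterator[float]:
    """Yield reconnect delays that double per attempt and never exceed `cap`.

    `base` and `cap` are hypothetical defaults, not values from the commit.
    """
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * 2, cap)


def rediscovery_points(deploy_times: Iterable[str]) -> list[str]:
    """Return the time_of_last_deploy values that would raise a rediscovery event.

    The first value seen is treated as the bootstrap event and suppressed;
    later values raise only when they differ from the previous one.
    """
    raised: list[str] = []
    last: Optional[str] = None
    for index, value in enumerate(deploy_times):
        if index > 0 and value != last:
            raised.append(value)
        last = value
    return raised
```

A caller would pull the next delay from `backoff_delays` after each failed reconnect and start a fresh generator once the stream is healthy again.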

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:33:37 -04:00
Joseph Doherty
ce004c80ab PR 4.4 — ISubscribable + EventPump
Subscription path online. GalaxyDriver implements ISubscribable; subscribes
batches via gw SubscribeBulkAsync, runs a single shared EventPump consumer
of StreamEventsAsync, fans out OnDataChange events to every driver
subscription that observes the changed gw item handle.

Files:
- Runtime/GalaxySubscriptionHandle.cs — record implementing ISubscriptionHandle.
- Runtime/SubscriptionRegistry.cs — bookkeeping with forward (subscriptionId
  → bindings) and reverse (itemHandle → list of subscriptionIds) maps. The
  reverse map is the fan-out index so a single OnDataChange dispatches to
  every subscription that observes the changed handle.
- Runtime/IGalaxySubscriber.cs — driver-side seam: SubscribeBulk +
  UnsubscribeBulk + StreamEventsAsync. Production wraps GalaxyMxSession;
  tests substitute a fake driving synthetic MxEvents.
- Runtime/GatewayGalaxySubscriber.cs — production. Forwards to
  MxGatewaySession; bufferedUpdateIntervalMs is captured for now and
  becomes a SetBufferedUpdateInterval call once gw issue #102 / gw-9 lands
  (PR 6.3 picks this up).
- Runtime/EventPump.cs — long-running background consumer of
  StreamEventsAsync. Decodes MxValue + maps quality byte/MxStatusProxy via
  StatusCodeMap. Fan-out per subscriber resolves through the registry;
  handler exceptions are caught + logged and never break the dispatch loop.
  Filters out non-OnDataChange families (write-complete and operation-
  complete come back via InvokeAsync's reply path, not the event stream).
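
The forward/reverse bookkeeping in SubscriptionRegistry can be pictured with a small sketch. It is written in Python for brevity and is only an approximation under assumptions: the method names are hypothetical, and handle 0 is treated as the failed-entry marker mentioned in this commit.

```python
from collections import defaultdict


class SubscriptionRegistry:
    """Sketch of a registry with a forward map and a reverse fan-out index."""

    def __init__(self) -> None:
        self._next_id = 0
        # forward: subscription id -> item handles it observes
        self._forward: dict[int, list[int]] = {}
        # reverse: item handle -> subscription ids (the fan-out index)
        self._reverse: dict[int, list[int]] = defaultdict(list)

    def register(self, item_handles: list[int]) -> int:
        """Allocate a monotonic id and index every live handle."""
        self._next_id += 1
        sub_id = self._next_id
        # Handle 0 marks a failed gateway entry; keep it out of the fan-out.
        live = [h for h in item_handles if h != 0]
        self._forward[sub_id] = live
        for handle in live:
            self._reverse[handle].append(sub_id)
        return sub_id

    def remove(self, sub_id: int) -> list[int]:
        """Drop a subscription and return the handles it was observing."""
        handles = self._forward.pop(sub_id, [])
        for handle in handles:
            subscribers = self._reverse[handle]
            subscribers.remove(sub_id)
            if not subscribers:
                del self._reverse[handle]
        return handles

    def subscribers_of(self, item_handle: int) -> list[int]:
        """Return every subscription that should receive a change on this handle."""
        return list(self._reverse.get(item_handle, []))
```

Removing from the registry before calling the gateway, as UnsubscribeAsync is described as doing, means `subscribers_of` stops returning the id immediately, so late events are filtered without waiting on the network call.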

GalaxyDriver:
- Adds ISubscribable. SubscribeAsync allocates a subscription id, calls
  SubscribeBulk, builds the binding list (failed gw entries get
  ItemHandle=0 + a per-tag warn log), registers, and returns the handle.
  EventPump is started lazily on first subscribe; one pump per driver
  shared across all subscriptions.
- UnsubscribeAsync removes from the registry first (so stale events are
  filtered immediately) then calls UnsubscribeBulk best-effort. Foreign
  handles throw ArgumentException.
- ReadAsync NotSupportedException message updated: it no longer points at
  PR 4.4 (production reads are deferred to a small follow-up that wraps the
  pump as a one-shot reader).
- Dispose tears down the pump first, then the repository client, then
  clears state.
- Internal ctor extended with optional subscriber parameter.

Tests (15 new, 109 Galaxy total):
- SubscriptionRegistryTests: monotonic id allocation, single+multi
  subscription fan-out, failed-handle exclusion, removal isolation, count
  invariants.
- GalaxyDriverSubscribeTests: handle allocation + value-change dispatch,
  multi-subscription fan-out, failed-tag silence, unsubscribe drops gw
  handle and stops dispatch, foreign handle throws, no-subscriber throws,
  empty-tag-list returns handle without calling gw.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:33:27 -04:00
Joseph Doherty
a617086da1 PR 4.3 — IWritable + secured-write routing
Write path online. GalaxyDriver implements IWritable; routes by
SecurityClassification — SecuredWrite / VerifiedWrite tags go through
MxCommandKind.WriteSecured, everything else through MxGatewaySession.
WriteAsync. Per-tag classifications are captured during ITagDiscovery via a
SecurityCapturingBuilder wrapper that intercepts Variable() calls without
the discoverer needing to know about the driver's internal state.

Files:
- Runtime/MxValueEncoder.cs — boxed CLR value → MxValue. Covers seven Galaxy
  scalar types (bool/int8-32/uint8-32 → Int32, int64/uint64 → Int64, float,
  double, string, DateTime/DateTimeOffset → Timestamp) and 1-D array
  variants. Inverse of MxValueDecoder; round-trip pinned by tests.
  DateTime.Local converts to UTC; unsupported types throw ArgumentException.
- Runtime/IGalaxyDataWriter.cs — driver-side seam. Tests inject a fake to
  capture routing decisions; production path uses GatewayGalaxyDataWriter.
- Runtime/GatewayGalaxyDataWriter.cs — production. Lazy-AddItem caches
  itemHandles, encodes value, routes Write vs WriteSecured, translates
  MxCommandReply (ProtocolStatus → BadCommunicationError; first
  MxStatusProxy in statuses[] via StatusCodeMap.FromMxStatus). Per-tag
  exception isolation: one bad write doesn't fail the batch.
- GalaxyDriver: now implements IWritable. Discovery wraps the supplied
  IAddressSpaceBuilder in SecurityCapturingBuilder which records each
  attribute's SecurityClass into _securityByFullRef before delegating.
  WriteAsync resolves classification per tag (FreeAccess default for
  unknown tags — matches the legacy backend), routes through the injected
  writer. Throws NotSupportedException with PR 4.4 pointer when no writer
  is wired (production path requires GalaxyMxSession.Connect from PR 4.4).
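
The routing rule (SecuredWrite / VerifiedWrite go to WriteSecured, everything else to a plain write, unknown tags default to FreeAccess) can be sketched as below. This is an illustrative Python sketch: the enum lists only the three classifications named in this commit, and `route_writes` with its tuple output is a hypothetical shape, not the driver's API.

```python
from enum import Enum


class SecurityClassification(Enum):
    # Only the values named in the commit message; the real enum may have more.
    FREE_ACCESS = "FreeAccess"
    SECURED_WRITE = "SecuredWrite"
    VERIFIED_WRITE = "VerifiedWrite"


_SECURED = {
    SecurityClassification.SECURED_WRITE,
    SecurityClassification.VERIFIED_WRITE,
}


def route_writes(requests, security_by_full_ref):
    """Return (full_ref, command, value) tuples for a batch of writes.

    `requests` is a list of (full_ref, value) pairs. Tags that were never
    captured during discovery fall back to FreeAccess.
    """
    plan = []
    for full_ref, value in requests:
        classification = security_by_full_ref.get(
            full_ref, SecurityClassification.FREE_ACCESS
        )
        command = "WriteSecured" if classification in _SECURED else "Write"
        plan.append((full_ref, command, value))
    return plan
```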

Tests (32 new, 94 Galaxy total):
- MxValueEncoder: every scalar type, narrowing checks (sbyte/short/byte/
  ushort fit Int32; uint within Int32 range; ulong within Int64),
  DateTime.Local → UTC conversion, array variants for bool/double/string/
  DateTime, Dimensions populated, unsupported-type throws ArgumentException,
  encoder/decoder round-trip pin.
- GalaxyDriverWriteTests: WriteAsync routes through fake writer with
  values intact; theory exercises every SecurityClassification value through
  the discovery-then-write path; unknown-tag defaults to FreeAccess; empty-
  request short-circuit; no-writer fail-loud; post-dispose throws.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:24:22 -04:00
Joseph Doherty
85bdf0d58b PR 4.2 — IReadable abstraction + StatusCodeMap + MxValueDecoder
Read path scaffold + the byte→uint quality mapping table that the parity
matrix (PR 5.x) pins. PR 4.4 supplies the production GW-backed reader; this
PR ships the abstraction and the supporting infrastructure so 4.4 just
plugs the implementation in.

Files:
- Runtime/StatusCodeMap.cs — explicit OPC DA quality byte → OPC UA
  StatusCode uint mapping. Extends the legacy Galaxy.Host
  HistorianQualityMapper with named constants (Good / GoodLocalOverride,
  Uncertain + 4 substatuses, Bad + 7 substatuses, BadInternalError) and an
  MxStatusProxy → uint helper that honors success flag → detail byte →
  detected_by transport-error fallback. Unknown bytes fall back to category
  bucket with a once-per-session diagnostic log so field captures can
  extend the table.
- Runtime/MxValueDecoder.cs — gateway MxValue → boxed CLR value for the
  seven Galaxy data types (Boolean, Int32, Int64, Float32, Float64, String,
  DateTime) plus their array variants. Honors MxValue.IsNull and
  RawValue passthrough.
- Runtime/IGalaxyDataReader.cs — driver-side seam for one-shot reads. PR
  4.4 ships the production wrapper around MxGatewaySession.SubscribeBulk +
  StreamEvents + UnsubscribeBulk; this PR exposes the contract so
  GalaxyDriver.ReadAsync wires through it.
- Runtime/GalaxyMxSession.cs — wrapper around MxGatewaySession that owns
  the Register handle. ConnectAsync opens session + Register; AttachForTests
  lets tests bypass real gw construction. PR 4.3/4.4/4.5 add write,
  subscribe, and reconnect surfaces.
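
The category-bucket fallback for unknown quality bytes might look like the sketch below. It is written in Python as an illustration and assumes the standard OPC DA layout, where the top two bits of the quality byte carry the Good / Uncertain / Bad category, and the OPC UA severity codes 0x00000000, 0x40000000 and 0x80000000. The function names, the abbreviated table and the `seen_unknown` set are hypothetical.

```python
GOOD = 0x00000000
UNCERTAIN = 0x40000000
BAD = 0x80000000

# Abbreviated known table; the real map has many more named substatuses.
KNOWN = {0xC0: GOOD, 0x40: UNCERTAIN, 0x00: BAD}


def category_bucket(quality: int) -> int:
    """Map an OPC DA quality byte to a bare OPC UA severity via its top two bits."""
    major = quality & 0xC0
    if major == 0xC0:
        return GOOD
    if major == 0x40:
        return UNCERTAIN
    return BAD  # 0x00 (Bad) and the reserved 0x80 pattern


def map_quality(quality: int, seen_unknown: set, log=print) -> int:
    """Look up the exact byte first, then fall back to the category bucket.

    Each unknown byte is reported once per `seen_unknown` set, which stands
    in for the once-per-session diagnostic log.
    """
    if quality in KNOWN:
        return KNOWN[quality]
    if quality not in seen_unknown:
        seen_unknown.add(quality)
        log(f"unmapped quality byte 0x{quality:02X}, using category bucket")
    return category_bucket(quality)
```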

GalaxyDriver:
- Implements IReadable. ReadAsync routes through the injected
  IGalaxyDataReader (test seam) when present; production path throws
  NotSupportedException pointing at PR 4.4 — protects deployments running
  this PR from silent wrong reads while signaling that the legacy-host
  backend (Galaxy:Backend=legacy-host) handles reads in the meantime.
- Internal ctor extended with optional dataReader parameter (default null,
  preserves PR 4.0/4.1 callers).

Tests: 42 new — exhaustive byte→uint table for StatusCodeMap (15 known
codes + category-bucket fallback for unknowns + MxStatusProxy precedence
rules + OPC UA top-byte invariants), every MxValue oneof case for the
decoder (bool/int32/int64/float/double/string/timestamp/3 array variants/
raw bytes/null), GalaxyDriver IReadable wiring (route-through, empty-
request, no-reader-throws, post-dispose-throws, status-code preservation).
62 Galaxy tests total pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:15:42 -04:00
Joseph Doherty
ecba5cedf9 PR 4.1 — ITagDiscovery via GalaxyRepositoryClient + AlarmRefBuilder
Browse path online. GalaxyDriver now implements ITagDiscovery against the
gateway's GalaxyRepositoryClient (PR 0.1's mxaccessgw browse RPC) and feeds
the address-space builder one folder per gobject + one variable per dynamic
attribute, with alarm-bearing attributes carrying all five sub-attribute refs
the server-level AlarmConditionService (PR 2.2) needs.

Files:
- Browse/IGalaxyHierarchySource.cs — driver-side seam between the discoverer
  and the gateway. Test fakes return canned hierarchies so the discoverer's
  translation logic is exercised without a real gRPC channel.
- Browse/GatewayGalaxyHierarchySource.cs — production wrapper around
  GalaxyRepositoryClient.DiscoverHierarchyAsync (paged internally).
- Browse/GalaxyDiscoverer.cs — translates GalaxyObject → IAddressSpaceBuilder
  calls. Browse name = contained_name (falls back to tag_name); full
  reference = attr.full_tag_reference when set, else tag_name + "." +
  attribute_name. Skips objects/attributes with empty identity.
- Browse/DataTypeMap.cs — mx_data_type → DriverDataType (port from legacy
  GalaxyProxyDriver.MapDataType, same fallback to String for unknown codes).
- Browse/SecurityMap.cs — security_classification → SecurityClassification
  (port from legacy GalaxyProxyDriver.MapSecurity).
- Browse/AlarmRefBuilder.cs — populates the five sub-attribute refs by
  Galaxy convention (.InAlarm/.Priority/.DescAttrName/.Acked/.AckMsg). The
  same convention the legacy GalaxyAlarmTracker hard-coded; concentrated
  here so PR 2.2's service receives complete AlarmConditionInfo rows.
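
The naming rules above (browse-name fallback, full-reference defaulting, and the five alarm sub-attribute suffixes) can be sketched as follows. This is a Python illustration only; the function names and the dict keys are hypothetical, while the suffixes are the ones listed in this commit.

```python
ALARM_SUFFIXES = (".InAlarm", ".Priority", ".DescAttrName", ".Acked", ".AckMsg")


def browse_name(contained_name: str, tag_name: str) -> str:
    """Prefer contained_name; fall back to tag_name when it is empty."""
    return contained_name or tag_name


def full_reference(tag_name: str, attribute_name: str, full_tag_reference: str = "") -> str:
    """Use the gateway-supplied reference when set, else tag_name.attribute_name."""
    return full_tag_reference or f"{tag_name}.{attribute_name}"


def alarm_refs(alarm_full_ref: str) -> dict:
    """Build the five sub-attribute references by appending each suffix."""
    return {suffix[1:]: alarm_full_ref + suffix for suffix in ALARM_SUFFIXES}
```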

GalaxyDriver:
- Added internal ctor accepting IGalaxyHierarchySource? for test injection.
  Default lazily builds GatewayGalaxyHierarchySource around a
  GalaxyRepositoryClient constructed from options on first DiscoverAsync.
- Owned GalaxyRepositoryClient disposed in Dispose.
- ApiKey resolution is currently a passthrough of ApiKeySecretRef — PR 4.W
  (or follow-up) wires DPAPI-backed secret resolution.

csproj: path-based ProjectReference to mxaccessgw (the user is shipping
that repo on a parallel track; both repos sit side-by-side on the dev box).
Tests project also references MxGateway.Contracts directly to construct
GalaxyObject / GalaxyAttribute fixtures.

Tests: 10 new in Browse/GalaxyDiscovererTests.cs covering folder-per-object,
variable-per-attribute, full-ref defaulting + gw-supplied override, browse-
name fallback, every metadata field propagation, alarm sub-attribute ref
population, non-alarm rows skip MarkAsAlarmCondition, empty-identity skips,
empty-attribute-name skips, end-to-end through GalaxyDriver.DiscoverAsync.
20 total Galaxy tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:06:02 -04:00
Joseph Doherty
f6a4f919e2 PR 4.0 — Driver.Galaxy project skeleton + factory
New in-process .NET 10 driver project at
src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/. The Tier-A replacement for
Driver.Galaxy.Host + Driver.Galaxy.Proxy. PR 4.0 ships only the IDriver
shape + factory + options; capability bodies (browse, read, write,
subscribe, deploy-watch, host probes) land in PRs 4.1–4.7.

Files:
- Driver.Galaxy.csproj — net10 x64, AnyCPU+x64 platforms, references
  Core.Abstractions + Core. No MxGatewayClient ProjectReference yet — that
  comes in PR 4.2 once the gw NuGet package is wired (the user is
  shipping mxaccessgw on a parallel track).
- Config/GalaxyDriverOptions.cs — nested record hierarchy
  (Gateway/MxAccess/Repository/Reconnect) mirroring the JSON shape spelled
  out in lmx_mxgw_impl.md PR 4.0 acceptance section.
- GalaxyDriver.cs — minimal IDriver impl. Initialize/Shutdown toggle
  DriverHealth between Healthy/Unknown; Reinitialize bumps the timestamp;
  GetMemoryFootprint=0 (PR 4.4 wires SubscriptionRegistry size);
  FlushOptionalCachesAsync no-op. Logs intent on lifecycle calls so
  partial deployments are diagnosable.
- GalaxyDriverFactoryExtensions.cs — JSON parser, default fill-ins,
  validation throw on missing required fields. Driver type name
  "GalaxyMxGateway" intentionally distinct from legacy "Galaxy" so both
  factories coexist during parity testing (Phase 5). PR 4.W's
  Galaxy:Backend switch picks one or the other.

Tests:
- 10 tests in Driver.Galaxy.Tests covering minimal-config defaults, full
  override path, three required-field error cases, factory registration
  via DriverFactoryRegistry.TryGet, lifecycle health transitions
  (Init → Shutdown → Reinit), Dispose idempotency, and post-disposal
  ObjectDisposedException.

slnx: registers the new Driver.Galaxy + Driver.Galaxy.Tests projects.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:57:31 -04:00
Joseph Doherty
854827090a PR 3.W — Phase 3 wire-up: Wonderware sidecar DI registration
Solution + DI plumbing to complete Phase 3. With this PR the .NET 10 server
can boot with the Wonderware historian sidecar in the loop, gated by config
so existing deployments are unaffected.

slnx: registers Driver.Historian.Wonderware (net48 sidecar),
Driver.Historian.Wonderware.Client (net10 client), and both test projects.

Server.csproj: adds ProjectReference to the .NET 10 client.

Program.cs: reads Historian:Wonderware:* configuration. When Enabled=true,
constructs a WonderwareHistorianClient singleton and:
  - Registers it as IAlarmHistorianWriter so the SqliteStoreAndForwardSink
    drain (task #248) can pick it up.
  - Registers a WonderwareHistorianBootstrap hosted service that, on
    StartAsync, calls IHistoryRouter.Register(prefix, client) under the
    configured DriverInstancePrefix (default "galaxy") — lets the
    HistoryRead* dispatch in DriverNodeManager find the sidecar via
    longest-prefix-match resolution.
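
Longest-prefix-match resolution, which PR 1.2 describes as case-insensitive, can be sketched roughly like this. It is a Python illustration with hypothetical names; the real IHistoryRouter also handles disposal, which is left out here.

```python
class HistoryRouter:
    """Sketch: resolve a tag reference to the source with the longest matching prefix."""

    def __init__(self) -> None:
        self._sources = {}  # lower-cased prefix -> source

    def register(self, prefix: str, source) -> None:
        self._sources[prefix.lower()] = source

    def resolve(self, full_ref: str):
        """Return the registered source for `full_ref`, or None if nothing matches."""
        key = full_ref.lower()
        best = None
        for prefix in self._sources:
            if key.startswith(prefix) and (best is None or len(prefix) > len(best)):
                best = prefix
        return None if best is None else self._sources[best]
```

With "galaxy" and "galaxy.area1" both registered, a reference under "galaxy.area1" resolves to the more specific source, and an unmatched reference returns None so the caller can fall back to the legacy adapter.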

When Enabled=false (the default), DriverNodeManager keeps using its
internal LegacyDriverHistoryAdapter for the read path and the existing
NullAlarmHistorianSink stays in place — drop-in compatible with every
deployment that hasn't moved off Galaxy.Host yet.

42 server integration tests + 10 client tests pass. Full solution build
clean (0/0).

Note: scripts/install/Install-Services.ps1 and
src/.../Server/appsettings.json carry intermixed user WIP and are NOT
committed in this PR. Equivalent edits applied locally:

  Install-Services.ps1: new -InstallWonderwareHistorian switch installs the
  OtOpcUaWonderwareHistorian service alongside OtOpcUaGalaxyHost;
  generates a fresh historian shared secret; OtOpcUa service depends on
  both when historian sidecar is installed.

  Server/appsettings.json: new Historian.Wonderware section with
  Enabled=false default, PipeName/SharedSecret/PeerName/
  DriverInstancePrefix/ConnectTimeoutSeconds/CallTimeoutSeconds keys.

Both pieces should land in a follow-up commit once the user's WIP on those
files clears.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:48:47 -04:00
Joseph Doherty
14947fde51 PR 3.4 — Wonderware historian sidecar .NET 10 client
New project Driver.Historian.Wonderware.Client (net10 x64) implements both
Core.Abstractions.IHistorianDataSource (read paths consumed by the server's
IHistoryRouter) and Core.AlarmHistorian.IAlarmHistorianWriter (alarm-event
drain consumed by SqliteStoreAndForwardSink) against the sidecar's PR 3.3
pipe protocol.

Wire-format files (Framing/MessageKind, Hello, Contracts, FrameReader,
FrameWriter) are byte-identical mirrors of the sidecar's net48 originals —
the sidecar can't be referenced as a ProjectReference because of the
runtime/bitness gap, so we duplicate and pin the wire bytes via tests.

PipeChannel owns one bidirectional NamedPipeClientStream + Hello handshake +
serializes calls. Single in-flight at a time (semaphore); transport failures
trigger one in-flight reconnect-and-retry before propagating. Connect is
abstracted behind a Func<CancellationToken, Task<Stream>> so tests inject
in-process pipes.
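
The single-in-flight, reconnect-and-retry-once behaviour can be sketched as below. This Python illustration uses a lock in place of the semaphore and treats `ConnectionError` as the transport failure; both choices, and the `invoke` name, are assumptions. The Hello handshake is omitted.

```python
import threading


class PipeChannel:
    """Sketch of a serialized channel with one reconnect-and-retry per call."""

    def __init__(self, connect):
        self._connect = connect  # injected factory, so tests can supply a fake
        self._stream = None
        self._lock = threading.Lock()

    def invoke(self, call):
        """Run `call(stream)`; on a transport failure reconnect once and retry."""
        with self._lock:  # only one call in flight at a time
            for attempt in (1, 2):
                try:
                    if self._stream is None:
                        self._stream = self._connect()
                    return call(self._stream)
                except ConnectionError:
                    self._stream = None  # force a reconnect on the next attempt
                    if attempt == 2:
                        raise
```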

WonderwareHistorianClient maps:
- HistorianSampleDto.Quality (raw OPC DA byte) → OPC UA StatusCode uint via
  QualityMapper (port of HistorianQualityMapper from sidecar).
- HistorianAggregateSampleDto.Value=null → BadNoData (0x800E0000).
- WriteAlarmEventsReply.PerEventOk[i]=true → Ack, false → RetryPlease.
  Whole-call failure or transport exception → RetryPlease for every event in
  the batch (drain worker handles backoff).
- AlarmHistorianEvent → AlarmHistorianEventDto with severity bucketed via
  AlarmSeverity-to-ushort mapping (Low=250, Medium=500, High=700, Crit=900).
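
The reply-to-outcome and severity mappings above can be sketched as follows. This is a Python illustration: the severity numbers and the Ack / RetryPlease names come from this commit, while the function name and the use of `None` to represent a whole-call failure are assumptions.

```python
# Severity buckets as listed in the commit message.
SEVERITY = {"Low": 250, "Medium": 500, "High": 700, "Crit": 900}


def write_outcomes(batch_size: int, per_event_ok):
    """Map a WriteAlarmEvents reply to one outcome per event.

    `per_event_ok` is the list of per-event success flags, or None when the
    whole call failed (error reply or transport exception), in which case
    every event in the batch is marked for retry.
    """
    if per_event_ok is None:
        return ["RetryPlease"] * batch_size
    return ["Ack" if ok else "RetryPlease" for ok in per_event_ok]
```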

GetHealthSnapshot tracks transport success + sidecar-reported failure
separately; ConsecutiveFailures rises on operation-level errors, not just
transport drops.

10 round-trip tests via FakeSidecarServer (in-process net10 fake using the
client's own framing): byte→uint quality mapping, null-bucket BadNoData,
at-time order preservation, event-field round-trip, sidecar error surfacing,
WriteBatch per-event status, whole-call retry-please mapping, Hello
shared-secret rejection, transport-drop reconnect-and-retry, health snapshot
counters.

PR 3.W will register this client as IHistorianDataSource + IAlarmHistorianWriter
in OpcUaServerService DI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:40:56 -04:00
Joseph Doherty
9f7a4ac769 PR 3.3 — Wonderware sidecar pipe protocol + dispatcher
Sidecar now serves a length-prefixed, kind-tagged MessagePack pipe protocol
mirroring Galaxy.Host's: 4-byte BE length + 1-byte MessageKind + body, 16 MiB
cap. Hello handshake validates per-process shared secret + protocol major
version + caller SID via ImpersonateNamedPipeClient before any work frame
runs.
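
The frame layout (4-byte big-endian length, 1-byte MessageKind, body, 16 MiB cap) can be sketched as below. This Python illustration assumes the length field counts only the body and that the cap applies to the body size; the commit message does not say which, and the function names are hypothetical.

```python
import io
import struct

MAX_FRAME_BYTES = 16 * 1024 * 1024  # 16 MiB cap


def write_frame(stream, kind: int, body: bytes) -> None:
    """Write one frame: big-endian uint32 length, uint8 kind, then the body."""
    if len(body) > MAX_FRAME_BYTES:
        raise ValueError("frame body exceeds 16 MiB cap")
    stream.write(struct.pack(">IB", len(body), kind))
    stream.write(body)


def read_frame(stream):
    """Read one frame and return (kind, body); reject oversized or truncated frames."""
    header = stream.read(5)
    if len(header) != 5:
        raise EOFError("truncated frame header")
    length, kind = struct.unpack(">IB", header)
    if length > MAX_FRAME_BYTES:
        raise ValueError("declared length exceeds 16 MiB cap")
    body = stream.read(length)
    if len(body) != length:
        raise EOFError("truncated frame body")
    return kind, body
```

Checking the declared length before reading the body is what keeps a malformed or hostile header from forcing a large allocation.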

Five contract pairs ship in this PR:

  ReadRawRequest          ↔ ReadRawReply
  ReadProcessedRequest    ↔ ReadProcessedReply
  ReadAtTimeRequest       ↔ ReadAtTimeReply
  ReadEventsRequest       ↔ ReadEventsReply
  WriteAlarmEventsRequest ↔ WriteAlarmEventsReply

Timestamps cross the wire as DateTime ticks (long) to dodge MessagePack's
DateTime kind/timezone quirks; both sides convert with DateTime(ticks, Utc).
Sample values cross as MessagePack-serialized byte[] so the .NET 10 client
(PR 3.4) deserializes per the tag's mx_data_type without the sidecar needing
to know OPC UA types.
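
The ticks convention can be illustrated with a short conversion sketch. It is in Python and relies on the .NET definition of a tick as 100 nanoseconds counted from 0001-01-01T00:00:00; the helper names are hypothetical, and inputs are assumed to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone

_TICK_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)
TICKS_PER_SECOND = 10_000_000  # one tick is 100 ns


def to_ticks(moment: datetime) -> int:
    """Convert a timezone-aware datetime to UTC ticks."""
    delta = moment.astimezone(timezone.utc) - _TICK_EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * TICKS_PER_SECOND + delta.microseconds * 10


def from_ticks(ticks: int) -> datetime:
    """Convert UTC ticks back to an aware datetime (sub-microsecond part dropped)."""
    return _TICK_EPOCH + timedelta(microseconds=ticks // 10)
```

Sending a plain integer and rebuilding the value as UTC on both sides sidesteps any ambiguity about local versus UTC kind in the serializer.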

HistorianFrameHandler dispatches by MessageKind to IHistorianDataSource (the
PR 3.2 lifted interface) for reads, and to a new IAlarmEventWriter strategy
for the alarm-event persistence path. Per-call exceptions surface as
Success=false replies so a single bad request doesn't kill the connection.
WriteAlarmEvents replies carry per-event success flags; the SQLite
store-and-forward sink retries failed slots on the next drain tick.

Program.cs spins the pipe server when OTOPCUA_HISTORIAN_ENABLED=true. Pipe-
only mode (default false) preserves PR 3.1's smoke-test behaviour: the host
still validates env vars and waits for Ctrl-C, but doesn't initialize the
Wonderware SDK.

Sidecar test project gains 8 round-trip tests (37 total now): every contract
pair round-trips through FrameReader/FrameWriter via in-memory streams, the
handler surfaces historian exceptions cleanly, WriteAlarmEvents per-event
status flows through, and the no-writer-configured path returns a clean
error reply.

Added MessagePack 2.5.187 to the sidecar csproj.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:27:17 -04:00
Joseph Doherty
bc7ec746c5 PR 1+2.W — Wire HistoryRouter + AlarmConditionService into DI
Server-side singletons threaded through OpcUaApplicationHost → OtOpcUaServer
→ DriverNodeManager construction. New ctor parameters are last-position
optional with null defaults so every existing test construction site
(OpcUaServerIntegrationTests, AlarmSubscribeIntegrationTests, etc.) keeps
working unchanged.

Program.cs:
  AddSingleton<IHistoryRouter, HistoryRouter>();
  AddSingleton<AlarmConditionService>();

The router stays empty after this PR. DriverNodeManager's internal
LegacyDriverHistoryAdapter handles every driver that still implements
IHistoryProvider; PR 3.W will register the Wonderware sidecar as a router
source; PR 7.2 retires the legacy fallback entirely.

44 alarm + history + integration tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:13:51 -04:00
Joseph Doherty
9365beb966 PR 3.2 — Lift Wonderware Historian SDK code to sidecar
Move all historian implementation files from Driver.Galaxy.Host/Backend/Historian/
to Driver.Historian.Wonderware/Backend/. Sidecar now owns the aahClientManaged /
aahClientCommon SDK references; Galaxy.Host project-references the sidecar so
MxAccessGalaxyBackend keeps building until PR 7.2 retires Galaxy.Host entirely.

10 source files moved (preserving git history via git mv):
  IHistorianDataSource, HistorianDataSource, HistorianClusterEndpointPicker,
  HistorianClusterNodeState, HistorianConfiguration, HistorianEventDto,
  HistorianHealthSnapshot, HistorianQualityMapper, HistorianSample,
  IHistorianConnectionFactory.

2 historian test classes moved alongside (HistorianClusterEndpointPickerTests,
HistorianQualityMapperTests). Sidecar test project now hosts 29 tests (1 PR 3.1
smoke + 28 moved historian tests, all passing).

Galaxy.Host's remaining 6 historian-flavored tests (HistorianWiringTests,
HistoryReadAtTimeTests, HistoryReadEventsTests, HistoryReadProcessedTests)
keep passing via the project reference — using directives updated to reach
the new namespace.

Sidecar deliberately speaks no Core.Abstractions — its surface is the legacy
List<HistorianSample> shape; PR 3.4's .NET 10 client translates to the
Core.Abstractions shapes added in PR 1.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:13:13 -04:00
Joseph Doherty
ef22a61c39 v2 mxgw migration — Phase 1+2+3.1 wiring (7 PRs)
Foundational PRs from lmx_mxgw_impl.md, all green. Bodies only — DI/wiring
deferred to PR 1+2.W (combined wire-up) and PR 3.W.

PR 1.1 — IHistorianDataSource lifted to Core.Abstractions/Historian/
  Reuses existing DataValueSnapshot + HistoricalEvent shapes; sidecar (PR
  3.4) translates byte-quality → uint StatusCode internally.

PR 1.2 — IHistoryRouter + HistoryRouter on the server
  Longest-prefix-match resolution, case-insensitive, ObjectDisposed-guarded,
  swallow-on-shutdown disposal of misbehaving sources.

PR 1.3 — DriverNodeManager.HistoryRead* dispatch through IHistoryRouter
  Per-tag resolution with LegacyDriverHistoryAdapter wrapping
  `_driver as IHistoryProvider` so existing tests + drivers keep working
  until PR 7.2 retires the fallback.

PR 2.1 — AlarmConditionInfo extended with five sub-attribute refs
  InAlarmRef / PriorityRef / DescAttrNameRef / AckedRef / AckMsgWriteRef.
  Optional defaulted parameters preserve all existing 3-arg call sites.

PR 2.2 — AlarmConditionService state machine in Server/Alarms/
  Driver-agnostic port of GalaxyAlarmTracker. Sub-attribute refs come from
  AlarmConditionInfo, values arrive as DataValueSnapshot, ack writes route
  through IAlarmAcknowledger. State machine preserves Active/Acknowledged/
  Inactive transitions, Acked-on-active reset, post-disposal silence.

PR 2.3 — DriverNodeManager wires AlarmConditionService
  MarkAsAlarmCondition registers each alarm-bearing variable with the
  service; DriverWritableAcknowledger routes ack-message writes through
  the driver's IWritable + CapabilityInvoker. Service-raised transitions
  route via OnAlarmServiceTransition → matching ConditionSink. Legacy
  IAlarmSource path unchanged for null service.

PR 3.1 — Driver.Historian.Wonderware shell project (net48 x86)
  Console host shell + smoke test; SDK references + code lift come in
  PR 3.2.

Tests: 9 (PR 1.1) + 5 (PR 2.1) + 10 (PR 1.2) + 19 (PR 2.2) + 1 (PR 3.1)
all pass. Existing AlarmSubscribeIntegrationTests + HistoryReadIntegrationTests
unchanged.

Plan + audit docs (lmx_backend.md, lmx_mxgw.md, lmx_mxgw_impl.md)
included so parallel subagent worktrees can read them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:03:36 -04:00
Joseph Doherty
012c42a846 Task #156 — TagsTab: per-tag advanced Modbus fields (Deadband, UnitId, CoalesceProhibited)
#155 wired the basic tag form (Name / Driver / Equipment / DataType / Access /
WriteIdempotent + ModbusAddressEditor for the address). The per-tag knobs added
across #141 / #142 / #143 still required operators to hand-edit TagConfig JSON.
This commit exposes them through an "Advanced" expander.

UI changes (TagsTab.razor):

- Collapsible "▶ Advanced (Deadband / UnitId override / CoalesceProhibited)"
  button below the address editor, visible only when the selected driver is
  Modbus. Collapsed by default — basic form covers the typical edit workflow.
- Three numeric / checkbox inputs with inline help text explaining each knob's
  purpose and when to use it.
- _showAdvanced auto-opens on Edit when any of the advanced fields are present
  in the existing TagConfig — operators see immediately what's been configured.

Save-side serialization:

- New RefreshTagConfigJson serializes the address + advanced fields into a
  structured JSON object using a Dictionary<string, object?>. Fields with
  default / empty values are omitted to keep diffs in the existing draft-diff
  viewer minimal — a tag with only an address still produces
  `{"addressString":"40001:F"}` and not a full superset object with nulls.
- OnAddressChanged + OnAdvancedChanged both delegate to RefreshTagConfigJson
  so any input change keeps TagConfig in sync.
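
The omit-defaults serialization can be sketched as follows. This is a Python illustration of the idea, not the Razor code: the JSON field names come from the test description in this commit, while the function name and the choice of which values count as "default" (None, or False for the checkbox) are assumptions.

```python
import json


def build_tag_config(address, deadband=None, unit_id=None, coalesce_prohibited=False):
    """Serialize only the fields that carry a non-default value."""
    config = {}
    if address:
        config["addressString"] = address
    if deadband is not None:
        config["deadband"] = deadband
    if unit_id is not None:
        config["unitId"] = unit_id
    if coalesce_prohibited:
        config["coalesceProhibited"] = True
    # Compact separators keep the output free of incidental whitespace.
    return json.dumps(config, separators=(",", ":"))
```

A tag with only an address yields `{"addressString":"40001:F"}`, so the draft diff shows nothing for knobs the operator never touched.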

Read-side hydration:

- New HydrateModbusFromTagConfig parses an existing TagConfig JSON and
  populates _modbusAddress + the three advanced fields. Falls back to empty
  defaults on malformed JSON. ResetAdvanced is called before hydration on
  every form open so leftover state from a previous edit doesn't leak.

ResetAdvanced helper introduced + called from StartAdd so a fresh "New tag"
form starts with everything cleared.

Tests (1 new in TagServiceTests):
- TagConfig_With_Advanced_Modbus_Fields_RoundTrips_Through_Factory — creates a
  tag whose TagConfig carries addressString + deadband + unitId +
  coalesceProhibited, persists via TagService, reloads, asserts every field
  survives. Then constructs a wrapping driver-config JSON and feeds it to
  ModbusDriverFactoryExtensions.CreateInstance — confirms the field NAMES the
  UI emits match what BuildTag's DTO consumes. If the UI's JSON shape ever
  drifts from the factory's expected DTO, this test catches it before users do.

119 + 1 = 120 Admin tests green. Solution build clean.
2026-04-25 04:22:50 -04:00
Joseph Doherty
ec57df1009 Task #155 — TagService + TagsTab CRUD UI for Modbus tags
Closes the remaining loop on user-visible Modbus tag editing. Pre-#155 tags
arrived only via SQL seeding or runtime ITagDiscovery; the Admin UI had no
interactive surface for creating / editing / deleting tag rows.

Changes:

- TagService.cs (Admin/Services/) — CRUD wrapper around OtOpcUaConfigDbContext.Tags.
  ListAsync supports optional driver / equipment filters; CreateAsync auto-derives
  TagId; UpdateAsync persists editable fields; DeleteAsync removes the row. Mirrors
  the EquipmentService shape.
- TagsTab.razor (Components/Pages/Clusters/) — list + filter + add/edit/remove form.
  The address/config editor is conditional: when the selected DriverInstance is
  Modbus, ModbusAddressEditor (#145) renders with live-parse preview; otherwise a
  generic JSON textarea (matches the DriversTab pattern from #147). Save-side
  serializes the address-string into TagConfig as `{"addressString":"..."}` JSON.
- ClusterDetail.razor — new "Tags" tab in the cluster-detail nav strip + the routing
  switch.
- Program.cs — TagService registered as a scoped DI service.

Drive-by fix: ModbusDriverFactoryExtensions.CreateInstance promoted from internal
to public — Admin.Tests was using it via reflection-friendly internal access that
broke under the #153 logger overload addition. Public is the right access modifier
anyway since the Server-side bootstrapper calls it from a different assembly.

Drive-by fix #2: ModbusDriverConfigDto was missing MaxReadGap (#143) — surfaced by
the #147 round-trip test that flips MaxReadGap=12 in the view model and asserts
it lands on the resolved options. Added the field + binding line. Confirms #143's
DriverConfig JSON binding was incomplete since the original commit; no production
deployment configured this knob through JSON until now so the gap stayed hidden.

Tests (4 new TagServiceTests):
- Create_And_List_Surfaces_The_Tag — CreateAsync auto-assigns TagId; list returns
  the row.
- List_Filters_By_DriverInstance — driver-scoped filter works.
- Update_Persists_Editable_Fields — Name / DataType / AccessLevel / TagConfig all
  persist through Update.
- Delete_Removes_The_Row — basic delete verification.

113 + 4 (TagService) + 2 (DriversTab round-trip restored after compile fix) = 119
Admin tests green. Solution build clean.

Caveat: bUnit-style render tests for TagsTab still aren't included — Admin.Tests
doesn't have bUnit set up. The TagService logic is fully covered; the razor
component's parser/save glue is exercised by hand at runtime for now.
2026-04-25 01:51:02 -04:00
Joseph Doherty
802366c2c6 Task #154 — driver-diagnostics RPC: HTTP endpoint + Admin client
Foundation for surfacing per-driver runtime state from the Server process to
the Admin UI. #152 shipped GetAutoProhibitedRanges() as an in-process
accessor; #154 makes it reachable across processes.

Server side (HealthEndpointsHost):
- New URL family: /diagnostics/drivers/{driverInstanceId}/{driverType}/{topic}
- First wired topic: /diagnostics/drivers/{id}/modbus/auto-prohibited
- Driver-agnostic at the URL level — future driver types add their own
  segments[3] cases (e.g. /diagnostics/drivers/{id}/s7/dropped-pdus).
- 404 when the driver instance doesn't exist; 400 when the driver exists
  but isn't a Modbus driver (the per-type endpoint is wrong for this row).
- Response shape is flat JSON (unitId / region / startAddress / endAddress /
  lastProbedUtc / bisectionPending) so consumers don't have to reference the
  Driver.Modbus assembly's ModbusAutoProhibition record.
- Re-uses the existing HttpListener bound to localhost:4841 — same auth /
  reachability story as /healthz and /readyz.
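
The URL dispatch and its 404 / 400 split can be sketched as below. This Python illustration models only the status-code decision: the function name and the `drivers` dict (instance id to driver type) are hypothetical, and returning 404 for an unrecognized topic is an assumption the commit message does not state.

```python
def route_diagnostics(path: str, drivers: dict) -> int:
    """Return the HTTP status for /diagnostics/drivers/{id}/{type}/{topic}.

    `drivers` maps driver instance id -> driver type name (e.g. "modbus").
    """
    segments = path.strip("/").split("/")
    if len(segments) != 5 or segments[:2] != ["diagnostics", "drivers"]:
        return 404
    driver_id, driver_type, topic = segments[2], segments[3], segments[4]
    if driver_id not in drivers:
        return 404  # no such driver instance
    if drivers[driver_id] != driver_type:
        return 400  # instance exists, but the per-type endpoint does not fit it
    if (driver_type, topic) != ("modbus", "auto-prohibited"):
        return 404  # assumed: unknown topics are treated as not found
    return 200
```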

Admin side:
- DriverDiagnosticsClient (Services/) — HttpClient wrapper that fetches the
  per-driver Modbus prohibition list. Returns null on 404/400 (driver
  missing or wrong type); throws on transport failures.
- ModbusAutoProhibitionsResponse + ModbusAutoProhibitionRow flat DTOs —
  client doesn't take a dep on Driver.Modbus.
- ModbusDiagnostics.razor at /modbus/diagnostics/{driverInstanceId} —
  table view with BISECTING (warning yellow) / ISOLATED (danger red)
  badges, relative timestamps (e.g. "5m ago"), Refresh button. Errors
  surface inline rather than swallowing.
- HttpClient registration in Program.cs reads
  DriverDiagnostics:ServerBaseUrl from appsettings.json (default
  http://localhost:4841/ for same-host deployments).

Tests (3 new in HealthEndpointsHostTests):
- Diagnostics_ReturnsModbusAutoProhibitions_ForLiveDriver — registers a
  Modbus driver with a programmable transport that protects register 102,
  records the prohibition via a coalesced ReadAsync, hits the endpoint,
  asserts the returned JSON matches (unitId / region / start / end / pending).
- Diagnostics_404_When_Driver_Not_Found
- Diagnostics_400_When_Driver_Is_Wrong_Type

Architecture note: the Admin-side bUnit-style component test isn't included
because Admin.Tests doesn't have bUnit set up. The DriverDiagnosticsClient
is unit-testable on its own with a mock HandlerStub if needed — left as a
follow-up alongside the broader bUnit setup task.

The diagnostic page is now reachable at /modbus/diagnostics/{driverId} from
any Admin instance pointing at a Server endpoint URL. Future driver types
(S7, AbCip) plug into the same channel by adding their own URL segments
in HealthEndpointsHost.WriteDriverDiagnosticsAsync.
2026-04-25 01:32:21 -04:00
Joseph Doherty
8004394892 Task #153 — ModbusDriver: inject ILogger so prohibition events reach a sink
#152 left a hook for structured logging when an auto-prohibition first
fires; this commit completes the wiring.

Changes:
- ModbusDriver constructor takes an optional ILogger<ModbusDriver> (defaults
  to NullLogger). Existing standalone callers stay compile-clean.
- RecordAutoProhibition logs LogWarning on first-fire only (re-fires of the
  same range stay quiet via the existing isNew de-dupe). Format includes
  DriverInstanceId, UnitId, Region, Start, End, Span — log aggregators can
  filter / count by any field.
- New LogProhibitionCleared helper called by both StraightReprobeAsync (when
  the re-probe succeeds on a single-register range) and BisectAndReprobeAsync
  (per-half clearing + a single combined line when both halves succeed).
- ModbusDriverFactoryExtensions.Register accepts an optional ILoggerFactory.
  Captured at registration time and used in the factory closure to construct
  a per-driver logger. Server bootstrap code that already has an ILoggerFactory
  in DI threads it through with a single argument addition; old call sites
  (Register(registry)) keep working with a null logger.

Tests (2 new ModbusLoggerInjectionTests):
- First_Failure_Emits_Single_Warning_Subsequent_Refire_Stays_Quiet — pins
  the de-dupe behaviour. First scan logs one warning with the expected
  structured fields; second scan with the same prohibition stays silent.
- Reprobe_Clearing_Prohibition_Emits_Information_Log — protected register
  unlocked between record and re-probe; re-probe success emits an info log
  containing "cleared".

CapturingLogger test harness is purpose-built (xUnit doesn't ship a logger
mock by default and adding Moq is overkill for two tests).

240 + 2 = 242 unit tests green.
2026-04-25 01:26:20 -04:00
Joseph Doherty
b8df230eb8 Task #152 — Modbus coalescing: surface auto-prohibitions through diagnostics
Auto-prohibited ranges (#148) were previously visible only through an
internal AutoProhibitedRangeCount accessor used by tests. Production
operators had no way to see what the planner had learned without pulling
logs or inspecting driver state.

Changes:

- New public record `ModbusAutoProhibition(UnitId, Region, StartAddress,
  EndAddress, LastProbedUtc, BisectionPending)` — operator-facing snapshot
  shape. Lives in the addressing assembly's logical namespace alongside
  the other public types.
- `ModbusDriver.GetAutoProhibitedRanges()` returns
  `IReadOnlyList<ModbusAutoProhibition>` — a copy of the live prohibition
  map. Lock-protected snapshot so consumers don't race with the re-probe
  loop.
- RecordAutoProhibition tracks first-fire vs re-fire via the dictionary
  insert path, leaving a hook to add structured logging once an ILogger
  is plumbed through (currently elided to keep the constructor minimal
  for testability — a future change can wire ILogger and emit a single
  warning per first-fire).
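
A minimal sketch of the lock-protected snapshot and the first-fire detection described above, written in Python for illustration. `ProhibitionMap`, `record`, and `snapshot` are hypothetical names; refreshing the timestamp on a re-fire is an assumption, and `bisection_pending = end > start` follows the SplitPending rule from #150.

```python
import threading
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AutoProhibition:
    unit_id: int
    region: str
    start_address: int
    end_address: int
    last_probed_utc: datetime
    bisection_pending: bool


class ProhibitionMap:
    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}  # (unit, region, start, end) -> last probed time

    def record(self, unit, region, start, end):
        """Insert a failed range; return True only when the key is new (first-fire)."""
        key = (unit, region, start, end)
        with self._lock:
            is_new = key not in self._entries
            self._entries[key] = datetime.now(timezone.utc)
            return is_new

    def snapshot(self):
        """Copy the live map under the lock so readers never see a partial update."""
        with self._lock:
            return [
                AutoProhibition(u, r, s, e, ts, e > s)
                for (u, r, s, e), ts in self._entries.items()
            ]
```

Because `snapshot` returns a new list of immutable records, a consumer can iterate it freely while a background loop keeps mutating the live map.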

Tests (1 new, additive to the 6 in ModbusCoalescingAutoRecoveryTests):
- GetAutoProhibitedRanges_Surfaces_Operator_Visible_Snapshot — confirms
  the snapshot shape: empty before any failure, populated with correct
  UnitId/Region/Start/End/BisectionPending after a failed coalesced read,
  LastProbedUtc within the recent past.

Docs:
- docs/v2/modbus-addressing.md — new "Coalescing auto-recovery" subsection
  consolidates the #148/#150/#151/#152 surface in one place. Documents
  the diagnostic accessor + flags the in-process consumption pattern
  (Server health endpoints today; Admin UI when an RPC channel exists).

239 + 1 = 240 unit tests green.

Caveat: the Admin UI surfacing (table render, "clear all prohibitions"
button) is intentionally NOT shipped here. Admin can't reach a live
ModbusDriver instance without a driver-diagnostics RPC channel that
doesn't exist yet — that's a larger architectural piece. For now the
data is queryable in-process by the Server's health endpoints; once an
RPC channel lands, Admin can wire the existing GetAutoProhibitedRanges
into a Blazor table without further driver changes.
2026-04-25 01:19:10 -04:00
Joseph Doherty
f823c81c96 Task #150 — Modbus coalescing: bisection-style range narrowing
Pre-#150 a coalesced read failure recorded the FULL failed range as
permanently prohibited. Healthy registers around the actual protected
register stayed in per-tag mode forever (until ReinitializeAsync). The
re-probe loop shipped in #151 retried the whole range as a single block,
which would either succeed (clearing everything) or fail (changing
nothing).

Post-#150 the re-probe loop bisects multi-register prohibitions:

- _autoProhibited refactored from Dictionary<key, DateTime> to
  Dictionary<key, ProhibitionState> where ProhibitionState carries
  LastProbedUtc + SplitPending. Multi-register prohibitions enter with
  SplitPending=true; single-register prohibitions enter with
  SplitPending=false (already minimal).
- ReprobeLoopAsync delegates the per-pass work to
  RunReprobeOnceForTestAsync (also exposed for synchronous test driving).
  Each entry routes to BisectAndReprobeAsync (split-pending + multi-reg)
  or StraightReprobeAsync (single-reg / non-split-pending).
- Bisection: split (start, end) at mid = (start+end)/2. Try (start, mid)
  and (mid+1, end) as separate coalesced reads. Each FAILED half re-enters
  the prohibition map with SplitPending = (its end > its start). SUCCEEDED
  halves vanish, freeing the planner to coalesce across them on the next
  scan.
- Convergence: about ceil(log2(span)) re-probe ticks pin the prohibition to
  the actual single offending register(s). For a 100-register block with one
  protected address that's at most 7 ticks (ceil(log2(100)) = 7).
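
The bisection step above can be sketched as one re-probe pass over the prohibition map. This is a Python illustration, not the driver's C# code; `bisect_pass` and the `read_ok` callback are hypothetical names, and the map is reduced to `(start, end) -> split_pending` for brevity.

```python
def bisect_pass(prohibited, read_ok):
    """Run one re-probe pass and return the new prohibition map.

    prohibited: dict mapping (start, end) -> split_pending flag.
    read_ok(start, end): True when a coalesced read of that range succeeds.
    """
    result = {}
    for (start, end), split_pending in prohibited.items():
        if split_pending and end > start:
            mid = (start + end) // 2
            for lo, hi in ((start, mid), (mid + 1, end)):
                if not read_ok(lo, hi):
                    # A failed half stays prohibited; it can be split again
                    # only if it still spans more than one register.
                    result[(lo, hi)] = hi > lo
            # Halves that succeed are simply not carried over.
        elif not read_ok(start, end):
            result[(start, end)] = split_pending
    return result
```

With a single protected register at 105 inside 100..110, successive passes yield (100..105), then (103..105), then (105..105), after which the entry is re-probed as a plain single-register range.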

Tests (3 new ModbusCoalescingBisectionTests):
- Bisection_Narrows_Multi_Register_Prohibition_Per_Reprobe — 11 tags
  100..110 with protected address 105. After 4 re-probe passes the
  prohibition collapses from (100..110) → (100..105) → (103..105) →
  (105..105).
- Bisection_Clears_When_Both_Halves_Are_Healthy — transient failure
  scenario; protection lifted before re-probe; both bisection halves
  succeed and the parent vanishes entirely.
- Bisection_Splits_Into_Two_When_Both_Halves_Still_Fail — TwoHoleTransport
  with protected addresses 102 + 108 in the same coalesced range. After
  bisection both halves still fail (each contains one of the protected
  addresses); the prohibition map grows to 2 entries.

236 + 3 = 239 unit tests green. Solution build clean.
2026-04-25 01:16:09 -04:00
Joseph Doherty
9e4aae350b Task #151 — Modbus coalescing: periodic re-probe of auto-prohibitions
#148 introduced auto-prohibited coalesced ranges that persist for the
driver lifetime. Long-running deployments with transient PLC permission
changes (firmware update unlocking a previously-protected register,
operator reconfiguring the device) had no recovery short of operator
restart.

Adds an opt-in background loop that re-probes each prohibition periodically:

- ModbusDriverOptions.AutoProhibitReprobeInterval (TimeSpan?, default null
  = disabled). Set to e.g. TimeSpan.FromHours(1) to opt in.
- _autoProhibited refactored from HashSet<key> to Dictionary<key, DateTime>
  so each entry tracks its last failure / last re-probe timestamp.
- ReprobeLoopAsync runs on the same Task.Run pattern as ProbeLoopAsync;
  cancelled by ShutdownAsync. Each tick snapshots the prohibition set
  and issues a one-shot coalesced read per range. Successful re-probes
  drop the prohibition; failed ones bump the timestamp + leave the
  prohibition in place.
- Communication failures during re-probe (transport-level) are treated
  the same as PLC-exception failures — the prohibition stays, but isn't
  upgraded to "permanent" since transports recover. The driver-instance
  health surface picks up the failure separately.
- ShutdownAsync explicitly clears the prohibition set so a manual restart
  via ReinitializeAsync starts with a clean slate (matches the old
  "restart to clear" semantics).
- Factory DTO + JSON binding extended with AutoProhibitReprobeMs field.

Tests (2 new, additive to the 3 in ModbusCoalescingAutoRecoveryTests):
- Reprobe_Clears_Prohibition_When_Range_Becomes_Healthy — protected
  register at 102 records prohibition; clearing the simulated protection
  + invoking the re-probe drops the prohibition.
- Reprobe_Leaves_Prohibition_When_Range_Is_Still_Bad — re-probe on a
  still-failing range keeps the prohibition in place.

Tests use a new internal RunReprobeOnceForTestAsync helper to fire one
re-probe pass synchronously, so the suite doesn't have to wait on the
background timer (the loop's timer behaviour is exercised implicitly via
the InitializeAsync wire-up + the synchronous helper sharing the actual
re-probe code path).

234 + 2 = 236 unit tests green.
2026-04-25 01:12:48 -04:00
Joseph Doherty
8de152df4f Task #149 — Modbus address-preview page + ImportEquipment help
The original task scope assumed a per-tag editor lived in EquipmentTab.razor
or a similar surface. Reading the codebase confirmed that's not the case:
tags are seeded via SQL (scripts/smoke/*) or arrive at runtime through
ITagDiscovery; the Admin UI has no per-tag CRUD page today. Equipment
import is for equipment metadata (Name / MachineCode / ZTag / SAPID /
Identification) — not tag rows.

Adjusted scope:

1. ModbusAddressPreview.razor — new standalone page at /modbus/address-preview.
   Hosts the ModbusAddressEditor component shipped in #145 + the family
   selector + a copy-pasteable grammar reference. Operators can sanity-check
   address-string syntax (40001:F:CDAB / HR1:I / V2000:F / D100:I etc.)
   without committing it to a config row first.

2. ImportEquipment.razor — appended a secondary alert banner clarifying
   that Modbus per-tag addressing isn't part of equipment import; points
   users at the Drivers tab + the new preview tool.

Builds clean against the existing Admin app. The actual per-tag CRUD UI is
still a separate piece of work — when it ships, it can drop in
ModbusAddressEditor directly. The preview page acts as the canonical
demonstration of how to use the component.

Razor caveat: the grammar reference uses literal `<...>` syntax tokens
that the Razor parser interprets as malformed elements when inlined in a
<pre> block. Held as a string field (_grammarReference) and rendered
through @ binding to sidestep the parser conflict.
2026-04-25 01:09:24 -04:00
Joseph Doherty
3b0e093002 Task #148 — Modbus block-coalescing: auto-recover from protected register holes
Pre-#148 behaviour: a coalesced FC03/FC04 read that crossed a write-only or
PLC-fault register marked every member tag Bad until the operator manually
flagged the offending tag with CoalesceProhibited. Healthy tags around the
hole stayed broken indefinitely.

Post-#148: two-stage recovery, no operator intervention needed.

1. Same-scan fallback: when a coalesced read fails with a Modbus exception
   (IllegalDataAddress, SlaveDeviceFailure, etc.), the planner does NOT
   mark members handled. The per-tag fallback in the same scan reads each
   member individually — non-protected members surface Good values
   immediately, and only the actual protected register stays Bad.

2. Cross-scan prohibition: the failed range (Unit, Region, Start, End) is
   recorded in a per-driver `_autoProhibited` set. On subsequent scans the
   planner checks each candidate merge against the set and refuses to
   re-form any block that overlaps a known-bad range. Net effect: after one
   scan with a failure, the protected range goes "per-tag mode" indefinitely
   while ranges around it keep coalescing normally.

Communication failures (timeouts, socket drops) are NOT auto-prohibited —
they're transport-level, not structural. The same coalesced read can succeed
once the transport recovers; recording it as "permanently bad" would defeat
coalescing for the whole driver instance.

Auto-prohibition state lives for the driver lifetime and clears on
ReinitializeAsync (operator restart). A periodic re-probe is a follow-up if
deployments need it without a restart.

Implementation:
- Added `_autoProhibited` HashSet<(byte, ModbusRegion, ushort, ushort)> +
  `_autoProhibitedLock` on ModbusDriver.
- `RangeIsAutoProhibited(unit, region, start, end)` overlap check called
  from the planner when forming blocks.
- `RecordAutoProhibition(...)` called from the catch (ModbusException)
  branch.
- The catch (Exception) branch (non-Modbus failures) keeps the pre-#148
  "mark all Bad in this scan, don't auto-prohibit" behaviour.
- Internal `AutoProhibitedRangeCount` accessor for tests.
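
The overlap check the planner performs can be sketched as follows. This is a Python illustration with hypothetical function names; it assumes Start and End are both inclusive register addresses, which the commit does not spell out.

```python
def ranges_overlap(a_start, a_end, b_start, b_end):
    """True when two inclusive address ranges share at least one register."""
    return a_start <= b_end and b_start <= a_end


def range_is_auto_prohibited(prohibited, unit, region, start, end):
    """Check a candidate block against a set of (unit, region, start, end) tuples."""
    return any(
        u == unit and r == region and ranges_overlap(start, end, s, e)
        for (u, r, s, e) in prohibited
    )
```

Matching on unit and region first keeps a prohibition recorded for one slave or one register table from blocking merges anywhere else.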

Tests (3 new ModbusCoalescingAutoRecoveryTests):
- First_Failure_Falls_Back_To_PerTag_Same_Scan — three tags around a
  protected register at 102: T100 + T104 surface Good values via the
  per-tag fallback in the SAME scan; T102 surfaces the exception.
- Second_Scan_Skips_Coalesced_Read_Of_Prohibited_Range — confirms scan 2
  doesn't re-attempt the failed merge (no FC03 with quantity > 1 at the
  prohibited start).
- Tags_Outside_Prohibited_Range_Still_Coalesce — separate cluster at HR
  200..202 keeps coalescing normally even after the 100..104 cluster is
  prohibited.

234/234 unit tests green.

Follow-ups intentionally NOT shipped (smaller, independent changes):
- Bisection-style range narrowing — currently the prohibition range is the
  full failed block; the planner doesn't try to find the exact protected
  register. Operator-visible diagnostic + prohibition stays correct.
- Periodic re-probe to clear stale prohibitions.
- Surface auto-prohibited ranges through GetHostStatuses or a new
  diagnostic so the Admin UI can show what's been auto-isolated.
2026-04-25 01:01:42 -04:00
Joseph Doherty
0b7653d3b2 Task #147 — wire ModbusOptionsEditor into DriversTab
Branches the DriversTab driver-add form on driver type:
- For DriverType=Modbus, render the typed <ModbusOptionsEditor> component
  shipped in #145 instead of the generic JSON textarea.
- For other driver types, the existing textarea stays (other drivers ship
  their own typed editors per decision #94).

On Save, when type is Modbus, the form serialises ModbusOptionsViewModel
into the JSON DTO shape ModbusDriverFactoryExtensions consumes (host /
port / unitId / family / keepAlive / reconnect / max*** / writeOnChangeOnly
/ etc.). Other types still pass the textarea contents verbatim.

Drive-by fix: the DriverType dropdown listed "ModbusTcp" but the actual
factory-registered name is "Modbus" — DriverInstanceBootstrapper would
silently skip a row created with the old label because the factory lookup
would miss. Renamed to match.

Tests (2 new in ModbusOptionsViewModelTests):
- DriversTab_Serialized_Defaults_RoundTrip_Through_Factory — unedited
  view-model serializes to a JSON the factory accepts; resulting
  ModbusDriverOptions matches the form defaults bit-for-bit.
- DriversTab_Serializes_Edited_Values_Correctly — flipping Host / Port /
  UnitId / Family / MaxReadGap / WriteOnChangeOnly in the view model
  surfaces in the constructed driver's options.

The serializer in the test mirrors DriversTab.razor's SerializeModbusOptions
helper. If the form's serialization shape drifts, both must be updated
together; that's the cost of testing through the JSON DTO without bUnit.

Follow-up still open: the per-tag editor (ModbusAddressEditor wiring into
EquipmentTab.razor + the bulk-import help-text update) — that's a separate
surface that touches the equipment-row CRUD flow; covered as a follow-up
when the equipment tag editor surface is next touched.
2026-04-25 00:58:03 -04:00
Joseph Doherty
dfd027ebca Task #146 — Modbus addressing: align type codes with Wonderware DASMBTCP + Ignition
Web verification (2026-04-25) against current vendor docs surfaced concrete
grammar conflicts in the v1 suffix grammar shipped in #137. Hard cutover
before the Admin UI rolls out widely so users don't paste `:I` from a
Wonderware spreadsheet and silently get wrong-typed reads.

Sources:
- Wonderware DASMBTCP user guide
  https://cdn.logic-control.com/media/DASMBTCP.pdf
- Ignition Modbus addressing (8.1)
  https://www.docs.inductiveautomation.com/docs/8.1/ignition-modules/opc-ua/opc-ua-drivers/modbus/modbus-addressing

Type-code changes:

| Code      | Pre-#146 | Post-#146  | Vendor reference                |
|-----------|----------|------------|---------------------------------|
| `:S`      | (n/a)    | Int16      | Wonderware DASMBTCP `S`         |
| `:US`     | (n/a)    | UInt16     | Ignition `HRUS`                 |
| `:I`      | Int16    | **Int32**  | Wonderware `I` + Ignition `HRI` |
| `:UI`     | UInt16   | **UInt32** | Ignition `HRUI`                 |
| `:I_64`   | (n/a)    | Int64      | Ignition `HRI_64`               |
| `:UI_64`  | (n/a)    | UInt64     | Ignition `HRUI_64`              |
| `:BCD_32` | (n/a)    | BCD32      | Ignition `HRBCD_32`             |

Codes REMOVED (no clear vendor precedent + conflict with the new mapping):
`:DI`, `:L`, `:UDI`, `:UL`, `:LI`, `:ULI`, `:LBCD`. Pre-#146 configs that
use them get an "Unknown type code" diagnostic at parse time so users get
a fast surface-level error rather than silent wrong-typed reads.

Codes UNCHANGED (already vendor-aligned): `:BOOL`, `:F`, `:D`, `:BCD`,
`:STR<n>`. Modicon 5/6-digit + mnemonic regions (HR/IR/C/DI) + bit suffix
`.N` are also unchanged.

Defaults:
- Coils / DiscreteInputs → `BOOL` (unchanged)
- HoldingRegisters / InputRegisters with no explicit type → Int16 (matches
  Ignition's bare `HR` default)
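
As a rough illustration of the post-#146 mapping, the sketch below resolves a type code to a data type name, applies the region defaults, and rejects the removed aliases. It is Python rather than the project's C#; `resolve_type` is a hypothetical helper, and `:D`, `:BCD`, and `:STR<n>` are left out because this commit does not spell out their target types. The exact error text beyond "Unknown type code" is also an assumption.

```python
# Codes and targets taken from the table above, plus BOOL and F (Float32).
TYPE_CODES = {
    "BOOL": "Bool",
    "S": "Int16",
    "US": "UInt16",
    "I": "Int32",
    "UI": "UInt32",
    "I_64": "Int64",
    "UI_64": "UInt64",
    "BCD_32": "BCD32",
    "F": "Float32",
}

# Region mnemonics: C = coils, DI = discrete inputs, HR / IR = registers.
BIT_REGIONS = {"C", "DI"}


def resolve_type(code, region):
    """Map an optional type code to a data type name for the given region."""
    if code is None:
        return "Bool" if region in BIT_REGIONS else "Int16"
    if code not in TYPE_CODES:
        raise ValueError(f"Unknown type code '{code}'")
    return TYPE_CODES[code]
```

Failing at parse time is the point of the hard cutover: a config that still says `:DI` stops with an error instead of silently reading the wrong width.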

Byte-order mnemonics (`:ABCD` / `:CDAB` / `:BADC` / `:DCBA`) are kept but
documented as OtOpcUa-specific — they aren't in any major vendor's per-tag
address string. Ignition uses a `-R` suffix per prefix; Wonderware
configures word-order at the topic level.

Tests:
- 12 Type_Codes_Parse rows updated to assert the new mappings.
- New Removed_Aliases_Are_Rejected (×7) confirms each pre-#146 alias now
  fails fast with "Unknown type code".
- Worked_Example_Int16_Array uses the new `:S` code.
- New Worked_Example_Int32_Array_Via_I_Code documents the `:I = Int32`
  vendor-alignment intent so a future "fix" doesn't accidentally regress.
- Unknown_Type_Code_Rejected_With_Catalog updated to match the new error
  message ("Valid: BOOL, S, US, I, ...").

Docs:
- docs/v2/modbus-addressing.md — table replaced with the post-#146 codes,
  each row cites its Wonderware / Ignition reference. New "Codes removed
  in #146" subsection documents the cutover.
- docs/Driver.Modbus.Cli.md — example grammar list updated; explicit
  type-code reminder appended.

114 addressing tests + 231 driver tests still green. Solution build clean.
2026-04-25 00:51:50 -04:00
Joseph Doherty
5ea57d2d70 Task #138 — Modbus addressing grammar docs + e2e
Closes the docs/e2e end of the Modbus addressing line shipped across
#136-#145.

Docs:

- docs/v2/modbus-addressing.md (new) — full grammar reference.
  Region+offset (Modicon 5-digit / 6-digit / mnemonic), bit suffix,
  type codes (BOOL / I / UI / DI / UDI / LI / ULI / F / D / BCD / LBCD /
  STR<n>), all four byte-order mnemonics (ABCD / CDAB / BADC / DCBA),
  array-count semantics, family-native syntax (DL205 V/Y/C/X/SP and
  MELSEC D/M/X/Y with hex-vs-octal sub-family selection), driver-instance
  options (KeepAlive / Reconnect / IdleDisconnect, MaxCoilsPerRead and
  FC15/16 forcing, Deadband + WriteOnChangeOnly, MaxReadGap +
  CoalesceProhibited, multi-unit IPerCallHostResolver). Includes a worked
  JSON DTO example mixing AddressString + structured tag forms.

- docs/Driver.Modbus.Cli.md — appended a "v2 addressing grammar" section
  pointing users at the full reference, with quick-reference examples.

- Vendor-compatibility caveat documented: type codes and byte-order
  mnemonics were synthesised from training-era vendor docs (Wonderware
  DASMBTCP, Kepware KEPServerEX, Ignition, Matrikon, OAS) and should be
  verified against current vendor manuals before locking for production.

E2E tests (4 new AddressingGrammarTests in IntegrationTests):
- Modicon 5-digit and 6-digit forms map to identical wire offsets.
- Float32 + WordSwap (CDAB) round-trips end-to-end through the
  pymodbus simulator.
- Int16[5] array round-trips as a typed short[] surface.
- Block-read coalescing produces a wire-acceptable PDU when MaxReadGap=5
  bridges three nearby tags.
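
The first test above relies on the conventional Modicon numbering, where the leading digit selects the region and the remaining digits are a 1-based register number. A minimal Python sketch of that convention follows; `parse_modicon` is a hypothetical helper, and the project's parser may accept more forms than this.

```python
# Leading digit selects the Modbus table in the conventional Modicon scheme.
MODICON_REGIONS = {
    "0": "Coils",
    "1": "DiscreteInputs",
    "3": "InputRegisters",
    "4": "HoldingRegisters",
}


def parse_modicon(text):
    """Parse a 5- or 6-digit Modicon address into (region, zero-based offset)."""
    if not text.isdigit() or len(text) not in (5, 6):
        raise ValueError(f"not a 5- or 6-digit Modicon address: {text!r}")
    if text[0] not in MODICON_REGIONS:
        raise ValueError(f"unknown region digit in {text!r}")
    number = int(text[1:])
    if number < 1:
        raise ValueError("Modicon register numbers are 1-based")
    return MODICON_REGIONS[text[0]], number - 1
```

Under this scheme `40001` and `400001` both resolve to holding register offset 0, which is why the two forms produce the same bytes on the wire.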

All tests skip gracefully when the pymodbus simulator at localhost:5020
is unreachable (matches the existing ModbusSimulatorFixture pattern).

Final test count across the Modbus addressing surface:
- 107 ModbusAddressing.Tests (parser + family + Modicon)
- 231 Driver.Modbus.Tests (driver, byte order, array, multi-unit, coalescing,
  protocol, subscribe, connection options)
- 110 Admin.Tests (incl. ModbusOptionsViewModel defaults pinning)
- 4 new AddressingGrammar integration tests (skip when sim down)
2026-04-25 00:32:27 -04:00
Joseph Doherty
858f300a61 Task #145 — Admin UI: expose new Modbus driver config
Two new Blazor components surface every Modbus knob added by #136-#144 so
users can configure the driver without hand-editing DriverConfig JSON.

ModbusAddressEditor.razor (live address-string parser preview):
- Bound to a string AddressString + a Family / MelsecSubFamily hint.
- On every input keystroke, runs ModbusAddressParser.TryParse and surfaces
  the resolved breakdown (Region, Offset, DataType, Bit, ByteOrder,
  ArrayCount, StringLength) inline as a green badge.
- On parse error, shows the parser's diagnostic in red.
- Re-uses the SAME parser the wire driver uses — grammar drift is
  impossible by construction.

ModbusOptionsEditor.razor (driver-instance options panel):
- Connection group (Host / Port / UnitId).
- Family group (#144) with conditional MelsecSubFamily dropdown.
- Keep-alive group (#139): Enabled / Time / Interval / RetryCount.
- Reconnect group (#139): InitialDelay / MaxDelay / BackoffMultiplier.
- Protocol group (#140): MaxRegistersPerRead / Write / Coils / ReadGap.
- Behaviour toggles (#140 + #141): UseFC15 / UseFC16 / WriteOnChangeOnly.
- Bound to ModbusOptionsViewModel — defaults match ModbusDriverOptions
  defaults so unedited rows produce the historical wire output verbatim.

Architecture:
- Admin project gains a ProjectReference to Driver.Modbus.Addressing
  (the shared parser assembly extracted in #136). Admin does NOT take a
  dep on Driver.Modbus itself — the addressing concerns are cleanly
  separated from the wire driver.
- Same-namespace shared assembly means components reference
  ModbusAddressParser / ModbusFamily / etc. without prefix gymnastics.

Tests:
- ModbusOptionsViewModelTests (1 test) — pins every default in the view
  model against the corresponding ModbusDriverOptions default. A
  regression that flips an unedited row to a non-default value gets
  caught here. (Test references both Admin and Driver.Modbus to make the
  cross-assembly comparison.)
- Live Blazor component testing requires bUnit, which isn't currently
  in the test setup; the parser logic the component wraps is fully
  covered by the 91 ModbusAddressParser tests in the addressing project,
  so the glue layer's behaviour is verifiable end-to-end already.

Caveat: the wiring into the existing DriverInstance edit page lives in
DriversTab.razor — that integration is left as a follow-up because it
touches the cluster-edit workflow specifically and the components in
this commit are framework-agnostic enough to drop in. The components
build clean against the existing Admin project; no behavioural change
to other tabs.
2026-04-25 00:26:43 -04:00
Joseph Doherty
366212417c Task #143 — Modbus block-read coalescing (with max-gap knob)
Adds a coalescing read planner that merges nearby tags into single FC03/FC04
PDUs, opt-in via ModbusDriverOptions.MaxReadGap. Default 0 = no coalescing
(every tag gets its own PDU — preserves pre-#143 wire output).

Worked example with MaxReadGap=10:
  T1 @ HR 100 (Int16, 1 reg)
  T2 @ HR 102 (Int16, 1 reg, gap 1 → joins block)
  T3 @ HR 110 (Float32, 2 regs, gap 7 → joins block)
  T4 @ HR 200 (Int16, 1 reg, gap 88 → splits, separate read)
  → 2 PDUs total: FC03 start=100 quantity=12 + FC03 start=200 quantity=1.

Planner:
- Eligible tags: known + register region (HR/IR) + scalar + not String /
  BitInRegister / array + not CoalesceProhibited.
- Groups by (UnitId, Region) — never coalesces across slaves or regions.
- Sorts by start address; merges when (next.start - last.end - 1) ≤ MaxReadGap
  AND the resulting span ≤ MaxRegistersPerRead. Otherwise opens a new block.
- Single-tag blocks are deferred to the per-tag path so WriteOnChange cache
  semantics stay correct without duplication.
- Per-block failure marks every member tag Bad and degrades health — same
  semantics the per-tag path has, but at the block granularity.
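
The grouping and merge rule above can be sketched as a small planner. This is a Python illustration, not the driver's code: `plan_blocks` is a hypothetical name, tags are reduced to `(unit, region, start, length)` tuples, the default cap of 125 is the Modbus FC03 ceiling used here as an assumption, and `max_read_gap == 0` is treated as "coalescing disabled" to match the stated default. Deferring single-tag blocks to the per-tag path is not modelled.

```python
def plan_blocks(tags, max_read_gap, max_registers_per_read=125):
    """Merge nearby tags into read blocks of (unit, region, start, quantity)."""
    groups = {}
    for unit, region, start, length in tags:
        # Never coalesce across slaves or regions.
        groups.setdefault((unit, region), []).append((start, start + length - 1))

    blocks = []
    for (unit, region), spans in sorted(groups.items()):
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            gap = start - cur_end - 1
            new_end = max(cur_end, end)
            fits = new_end - cur_start + 1 <= max_registers_per_read
            if max_read_gap > 0 and gap <= max_read_gap and fits:
                cur_end = new_end  # extend the current block
            else:
                blocks.append((unit, region, cur_start, cur_end - cur_start + 1))
                cur_start, cur_end = start, end
        blocks.append((unit, region, cur_start, cur_end - cur_start + 1))
    return blocks
```

Feeding it the four tags from the worked example with a gap of 10 yields one block starting at 100 with quantity 12 and a second block at 200 with quantity 1.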

Per-tag escape hatch ModbusTagDefinition.CoalesceProhibited (bool, default
false) — when true, the tag is read in isolation regardless of MaxReadGap.
For PLCs with protected register holes between adjacent tags.

Tests (7 new ModbusCoalescingTests):
- MaxReadGap=0 keeps the per-tag behavior (2 reads for 2 tags).
- MaxReadGap=2 merges 3 tags within 5 registers into 1 read of qty=5.
- MaxReadGap=10 splits T1+T2 from T3 when the gap exceeds the threshold.
- CoalesceProhibited tag reads alone even when neighbours are eligible.
- Coalescing never crosses UnitId boundaries (multi-slave gateway safety).
- MaxRegistersPerRead caps a would-be block; planner falls back to separate
  reads when the merged span would exceed the cap.
- Per-tag values surface independently after coalescing (slice-math sanity).

Existing 220 unit tests still green; total 224 pass with the new file (tests
are additive, no regressions).

Follow-up: auto-split-on-protected-hole isn't shipped — a coalesced read
that hits an Illegal Data Address right now marks every member Bad until
the operator sets CoalesceProhibited on the offending tag. Tracked
implicitly by #138's e2e drill against a pymodbus profile with a protected
hole mid-block.
2026-04-25 00:21:18 -04:00
Joseph Doherty
ad7d811f69 Task #142 — Modbus multi-unit-ID per TCP connection (gateway support)
Lifts the previous "one driver = one slave" assumption so a single Modbus
driver instance can front N RTU slaves behind one Ethernet gateway (Anybus,
ProSoft, Lantronix style). Each tag carries an optional UnitId that drives
the MBAP unit-id byte per-PDU, and the IPerCallHostResolver contract surfaces
per-slave host strings so per-PLC circuit breakers fire per-slave (matches
the AB CIP template documented in docs/v2/multi-host-dispatch.md).

Changes:

- ModbusTagDefinition gains optional UnitId (byte?). Null = use driver-level
  ModbusDriverOptions.UnitId (preserves single-slave deployments verbatim).
- ResolveUnitId(tag) helper computed once per ReadOneAsync / WriteOneAsync
  call; passed through ReadRegisterBlockAsync / ReadBitBlockAsync /
  ReadRegisterBlockChunkedAsync / ReadBitBlockChunkedAsync explicitly. The
  probe loop continues using driver-level UnitId (the probe is a
  connection-health check, not slave-specific).
- ModbusDriver implements IPerCallHostResolver. ResolveHost(fullReference)
  returns "host:port/unitN" — distinct strings per slave so the resilience
  pipeline keys breakers on the right granularity. Unknown references fall
  back to the bare HostName (single-slave behaviour).
- BitInRegister RMW path also threads the per-tag UnitId through both the
  read and write halves so a multi-slave deployment stays correct under bit-
  level writes.
- Factory DTO + JSON binding extended with the per-tag UnitId field.
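
A minimal sketch of the unit-id fallback and the per-slave host string described above, in Python for illustration. The function names and the `tags` dictionary (full reference to per-tag unit id, or `None` for no override) are hypothetical stand-ins for the driver's tag table.

```python
def resolve_unit_id(tag_unit_id, driver_unit_id):
    """Per-tag UnitId wins; None falls back to the driver-level UnitId."""
    return driver_unit_id if tag_unit_id is None else tag_unit_id


def resolve_host(host, port, driver_unit_id, tags, full_reference):
    """Return "host:port/unitN" for known tags, the bare host name otherwise."""
    if full_reference not in tags:
        return host  # unknown reference: single-slave behaviour
    unit = resolve_unit_id(tags[full_reference], driver_unit_id)
    return f"{host}:{port}/unit{unit}"
```

The fallback tests `is None` rather than truthiness, so a tag that explicitly sets unit id 0 is not mistaken for one without an override.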

Tests (4 new ModbusMultiUnitTests):
- Per-tag UnitId routes to the correct slave in the MBAP header (driver-level
  UnitId=99 must NOT appear when both tags override).
- Tag without override falls back to driver-level UnitId.
- IPerCallHostResolver returns distinct "host:port/unitN" strings per slave.
- Unknown reference returns the bare HostName fallback.

Existing 220 unit tests + 107 addressing tests still green. Per-PLC breaker
isolation under simulated dead slaves is verifiable via the existing AB CIP
test infra; live coverage lands as an integration test in the #138 docs/e2e
refresh.
2026-04-25 00:16:41 -04:00
Joseph Doherty
4cf0b4eb73 Task #144 — Modbus family-native parser branch (DL205 / MELSEC)
Promotes DirectLogicAddress + MelsecAddress from "utility helpers an engineer
calls manually" to "first-class branch of ModbusAddressParser." Users can now
paste DL205-native (V2000, Y0, C100, X17, SP10) and MELSEC-native (D100, M50,
X20 hex/octal, Y0) addresses directly into TagConfig and the parser handles
the PLC-native → Modbus PDU translation.

Changes:

- Both helper files moved into the shared Driver.Modbus.Addressing assembly
  (same namespace, zero-churn for callers). Required because the parser
  needs to call them and the dependency direction is parser→helpers, not
  the other way.
- New ModbusFamily enum (Generic / DL205 / MELSEC) on
  ModbusDriverOptions.Family. Generic preserves pre-#144 behaviour exactly.
- ModbusDriverOptions.MelsecSubFamily picks the X/Y notation (Q_L_iQR hex
  vs F_iQF octal). Default Q_L_iQR.
- ModbusAddressParser.Parse now takes optional family + sub-family hints.
  When non-Generic, family-native parsing runs FIRST; on miss falls back to
  Modicon / mnemonic. Cross-family ambiguity (C100 = Modicon coil under
  Generic, DL205 control relay under DL205) is unambiguous within one
  driver instance.
- Suffix grammar composes with native addresses: V2000:F:CDAB:5 parses
  end-to-end as DL205 V-memory at PDU 1024 + Float32 + word-swap + array of 5.
- Bit suffix composes too: V2000.7 parses as bit 7 of HR[1024].
- Factory DTO fields Family / MelsecSubFamily flow through to BuildTag so
  the JSON binding can drive everything per-driver.
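
The V2000 example works because DL205 V-memory addresses are octal, so V2000 is 1024 in decimal. The sketch below, in Python for illustration, shows only that translation and how the suffix and bit parts split off; `parse_dl205_v` is a hypothetical helper that covers the V prefix alone and does not interpret the suffixes.

```python
def parse_dl205_v(text):
    """Split a DL205 V-memory address into PDU offset, bit index, and suffixes."""
    head, *suffixes = text.split(":")
    address, _, bit = head.partition(".")
    if not address.upper().startswith("V"):
        raise ValueError(f"not a V-memory address: {text!r}")
    return {
        # int(..., 8) rejects the digits 8 and 9, as an octal address should.
        "offset": int(address[1:], 8),
        "bit": int(bit) if bit else None,
        "suffixes": suffixes,
    }
```

Here `V2000:F:CDAB:5` yields offset 1024 with the suffixes `F`, `CDAB`, and `5` left for the shared suffix grammar, and `V2000.7` yields offset 1024 with bit 7.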

Tests: 16 new ModbusFamilyParserTests covering DL205 V/Y/C/X/SP, MELSEC
D/M/X/Y, sub-family hex-vs-octal disambiguation, cross-family C100 ambiguity,
fallback to Modicon when native misses, and grammar composition with bit/
byte-order/array modifiers. Existing 91 parser tests still green; 220 driver
tests still green.

Caveat: bank-base offsets for MELSEC X/Y/M default to 0 in the grammar
string. Sites with non-zero "Modbus Device Assignment Parameter" bases must
use the structured tag form to override — addressed in the docs refresh
(#138).
2026-04-25 00:10:43 -04:00
Joseph Doherty
4bffe879c5 Task #141 — Modbus subscribe-side knobs (deadband + write-on-change)
Two driver-side filters that ≥5 of 6 surveyed vendors expose:

1. Per-tag Deadband (double?, on ModbusTagDefinition) — when set, the
   PollGroupEngine onChange callback suppresses publishes whose distance
   from the last-published value is below the threshold. Reduces wire
   traffic to OPC UA clients on noisy analog signals (flow meters,
   temperatures). Numeric scalar types only — Bool / BitInRegister / String
   / array tags publish unconditionally.

2. WriteOnChangeOnly (bool, on ModbusDriverOptions) — when true, the driver
   short-circuits writes whose value matches the most recent successful
   write to that tag. Saves PLC bandwidth on clients that re-publish the
   same setpoint every scan. Cache invalidates on any read that returns a
   different value, so HMI-side changes don't get masked.

Both default off so existing deployments see no behaviour change.

Implementation:
- ShouldPublish guard wraps the existing OnDataChange invocation. First sample
  always passes through (no baseline); subsequent samples compare via
  Convert.ToDouble for the cross-numeric-type math.
- IsRedundantWrite check at the top of WriteAsync; on success the cache is
  populated. Object.Equals handles boxed-numeric equality; arrays are
  excluded (reference-equality would never match anyway).
- ReadAsync invalidates the WriteOnChangeOnly cache when the new value
  differs from the cached last-written value.
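
The deadband rule above can be sketched as follows. This is an illustrative
Python model, not the driver's C# ShouldPublish; the class name
`DeadbandFilter` is hypothetical. The first sample always passes, and later
samples are compared against the last published value (not the last sample).

```python
from typing import Optional


class DeadbandFilter:
    """Suppress samples closer than `deadband` to the last published value."""

    def __init__(self, deadband: Optional[float]):
        self.deadband = deadband
        self.last_published: Optional[float] = None

    def should_publish(self, value) -> bool:
        sample = float(value)  # cross-numeric-type comparison via a common type
        if self.deadband is None or self.last_published is None:
            # No deadband configured, or no baseline yet: always publish.
            self.last_published = sample
            return True
        if abs(sample - self.last_published) < self.deadband:
            return False  # below threshold: keep the old baseline
        self.last_published = sample
        return True


flt = DeadbandFilter(5.0)
print([v for v in (100, 102, 106, 107) if flt.should_publish(v)])  # [100, 106]
```

Because the baseline only moves on a publish, slow drift still gets reported
once it accumulates past the threshold.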

Tests (5 new ModbusSubscribeOptionsTests):
- Deadband suppresses sub-threshold changes (100 → 102 → 106 → 107 with
  deadband=5 publishes 100 and 106 only).
- Deadband=null still publishes every change.
- WriteOnChangeOnly suppresses 3 identical 42 writes (only first hits wire).
- WriteOnChangeOnly default false hits the wire every time.
- Read-divergence cache invalidation: external panel write to 99, our
  client's re-write of 42 must NOT be suppressed.
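
The write-suppression and read-divergence cases above can be modelled with a
small cache. This is a hedged Python sketch of the idea, not the driver's C#
code; `WriteOnChangeCache` and the `wire` list are hypothetical stand-ins for
the real write path.

```python
class WriteOnChangeCache:
    """Remember the last successful write per tag and skip identical re-writes."""

    def __init__(self, enabled: bool):
        self.enabled = enabled
        self._last_written = {}

    def is_redundant(self, tag: str, value) -> bool:
        # Arrays are never treated as redundant, mirroring the exclusion above.
        if not self.enabled or isinstance(value, (list, tuple)):
            return False
        return tag in self._last_written and self._last_written[tag] == value

    def record_write(self, tag: str, value) -> None:
        if self.enabled and not isinstance(value, (list, tuple)):
            self._last_written[tag] = value

    def observe_read(self, tag: str, value) -> None:
        # A read that disagrees with the cached write means someone else changed it.
        if tag in self._last_written and self._last_written[tag] != value:
            del self._last_written[tag]


def write(cache: WriteOnChangeCache, wire: list, tag: str, value) -> None:
    if cache.is_redundant(tag, value):
        return
    wire.append((tag, value))  # stands in for the actual Modbus write
    cache.record_write(tag, value)


cache, wire = WriteOnChangeCache(enabled=True), []
for _ in range(3):
    write(cache, wire, "setpoint", 42)
cache.observe_read("setpoint", 99)  # external panel write observed on poll
write(cache, wire, "setpoint", 42)
print(len(wire))  # 2
```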

220/220 unit tests green; existing ProtocolOptions tests hardened against
probe-loop noise by disabling the probe in their fixtures.
2026-04-25 00:05:25 -04:00
Joseph Doherty
55f4044a69 Task #140 — Modbus protocol-behavior knobs
Adds ModbusDriverOptions knobs that ≥4 of 6 surveyed vendors expose:

1. MaxCoilsPerRead (ushort, default 2000) — separate from MaxRegistersPerRead
   because coil packing (1 bit per coil) and register packing (16 bits each)
   have different spec ceilings. Coil-array reads above the cap auto-chunk
   the same way register reads have always done. New ReadBitBlockChunkedAsync
   re-assembles per-chunk LSB-first bitmaps into one logical bitmap.

2. UseFC15ForSingleCoilWrites (default false) — forces FC15 (Write Multiple
   Coils with quantity=1) for single-coil writes instead of the default FC05
   (Write Single Coil). Safety / audit PLCs that only accept the multi-write
   codes need this.

3. UseFC16ForSingleRegisterWrites (default false) — same idea for FC16 vs
   FC06 on single holding-register writes.

4. DisableFC23 (default false) — placeholder no-op for the future block-read
   coalescing (#143) work that may opt into FC23 (Read/Write Multiple
   Registers). Lets deployments pre-disable FC23 for PLCs that won't accept
   it, before we ship the optimisation that emits it.

Defaults preserve the historical wire output bit-for-bit (FC05/FC06 for
singles, no chunking under 2000 coils, no FC23). Factory DTO + JSON-binding
extended with parallel fields.
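
The MaxCoilsPerRead chunking and the LSB-first bitmap re-assembly can be
sketched like this. It is an illustrative Python model under the assumption
that each chunk response packs its coils LSB-first starting at bit 0 of its
first byte; the function names are hypothetical and are not the driver's
ReadBitBlockChunkedAsync.

```python
def plan_chunks(start: int, quantity: int, max_per_read: int):
    """Split one logical coil read into (address, count) requests under the cap."""
    if max_per_read <= 0:
        raise ValueError("max_per_read must be positive")
    chunks, offset = [], 0
    while offset < quantity:
        count = min(max_per_read, quantity - offset)
        chunks.append((start + offset, count))
        offset += count
    return chunks


def unpack_bits(payload: bytes, count: int):
    """Expand an LSB-first packed bitmap into a list of booleans."""
    return [bool(payload[i // 8] >> (i % 8) & 1) for i in range(count)]


def pack_bits(bits) -> bytes:
    """Pack booleans back into an LSB-first bitmap."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def reassemble(payloads, chunks) -> bytes:
    """Concatenate per-chunk bitmaps bit-by-bit, so unaligned chunks still line up."""
    bits = []
    for payload, (_, count) in zip(payloads, chunks):
        bits.extend(unpack_bits(payload, count))
    return pack_bits(bits)


print(plan_chunks(0, 2500, 2000))  # [(0, 2000), (2000, 500)]
```

Going through individual bits keeps the merge correct even when a cap that is
not a multiple of 8 leaves padding bits at the end of a chunk.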

6 new ModbusProtocolOptionsTests covering: defaults, FC05→FC15 forcing,
FC06→FC16 forcing, MaxCoilsPerRead chunking math (2500 coils / 2000 cap →
2 reads of 2000 + 500). Existing 209 unit tests still green.
2026-04-24 23:59:04 -04:00
Joseph Doherty
6cf20131fe Task #139 — Modbus connection-layer config knobs (keep-alive / idle / reconnect)
Promotes the previously hardcoded transport-layer settings to ModbusDriverOptions
so users can tune them through DriverConfig JSON without recompiling.

Three new option groups:

1. KeepAlive (ModbusKeepAliveOptions): Enabled / Time / Interval / RetryCount.
   Defaults preserve the historical PR 53 wire output exactly (Enabled=true,
   Time=30s, Interval=10s, RetryCount=3). Set Enabled=false for PLCs that
   reject SO_KEEPALIVE.

2. IdleDisconnectTimeout (TimeSpan?): when set, the transport tracks last-PDU-
   success and proactively closes + reconnects on the next request after the
   threshold. Defends against silent NAT / firewall socket reaping. Default
   null = disabled (no behaviour change).

3. Reconnect (ModbusReconnectOptions): InitialDelay / MaxDelay /
   BackoffMultiplier for the post-drop reconnect loop. Defaults
   (InitialDelay=0, MaxDelay=30s, Multiplier=2.0) preserve the historical
   immediate-retry behaviour for the first attempt and add geometric backoff
   only if the reconnect itself fails. Capped at 10 attempts before propagating the failure.

ModbusTcpTransport ctor extended with optional keepAlive / idleDisconnect /
reconnect parameters; existing 4-arg call sites continue to compile. Factory
DTO gains parallel KeepAlive / IdleDisconnectMs / Reconnect fields with
default-aware binding.

5 new ModbusConnectionOptionsTests covering the default-preservation contract
(every default field matches pre-#139) and the JSON-binding round-trip for
each knob group. Existing 204 unit tests still green.
2026-04-24 23:53:26 -04:00
Joseph Doherty
850b816873 Task #137 — Modbus per-tag suffix grammar (type / bit / byte-order / array)
Adds the full Wonderware/Kepware/Ignition-style address suffix grammar so
users paste tag spreadsheets without per-tag manual translation:

  <region><offset>[.<bit>][:<type>[<len>]][:<order>][:<count>]

Examples that now parse end-to-end:
  40001                          HoldingRegisters[0], Int16
  400001                         same, 6-digit form
  40001.5                        bit 5 of HR[0]
  40001:F                        Float32 (HR[0..1])
  40001:F:CDAB                   word-swapped Float32
  40001:STR20                    20-char ASCII string
  HR1:DI                         Int32 via mnemonic region
  C100                           Coils[99] (mnemonic)
  40001:F:5                      Float32[5] array (3-field shorthand)
  40001:I:CDAB:10                Int16[10] word-swapped (4-field strict)
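
A minimal sketch of how the suffix fields in the examples above might be split
apart. This is illustrative Python, not the shared ModbusAddressParser; it
leaves the region/offset head unresolved, recognises only the four byte-order
mnemonics named in this commit, and the names `ParsedSuffix` and
`split_address` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

BYTE_ORDERS = {"ABCD", "CDAB", "BADC", "DCBA"}
_HEAD = re.compile(r"^(?P<addr>[A-Za-z]*\d+)(?:\.(?P<bit>\d+))?$")
_TYPE = re.compile(r"^(?P<code>[A-Za-z]+)(?P<len>\d+)?$")


@dataclass
class ParsedSuffix:
    address: str                      # region + offset, left unresolved here
    bit: Optional[int] = None
    type_code: Optional[str] = None
    length: Optional[int] = None      # e.g. 20 for STR20
    order: Optional[str] = None
    count: Optional[int] = None


def split_address(text: str) -> ParsedSuffix:
    fields = text.strip().split(":")
    if len(fields) > 4:
        raise ValueError(f"too many fields: {text!r}")
    head = _HEAD.match(fields[0])
    if head is None:
        raise ValueError(f"bad address head: {fields[0]!r}")
    bit = head["bit"]
    parsed = ParsedSuffix(address=head["addr"], bit=int(bit) if bit is not None else None)
    if len(fields) > 1:
        kind = _TYPE.match(fields[1])
        if kind is None:
            raise ValueError(f"bad type field: {fields[1]!r}")
        parsed.type_code = kind["code"].upper()
        parsed.length = int(kind["len"]) if kind["len"] is not None else None
    for field in fields[2:]:
        token = field.upper()
        if token in BYTE_ORDERS and parsed.order is None and parsed.count is None:
            parsed.order = token          # byte order must come before the count
        elif token.isdigit() and parsed.count is None:
            parsed.count = int(token)     # 3-field shorthand lands here directly
        else:
            raise ValueError(f"unexpected field: {field!r}")
    return parsed


print(split_address("40001:I:CDAB:10"))
```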

Driver-side plumbing:
- ModbusAddressParser + ParsedModbusAddress in the shared Addressing
  assembly. 91 parser tests (every grammar variant + malformed shapes).
- ModbusDataType / ModbusByteOrder moved to shared (with the same namespace
  so callers compile unchanged). ModbusByteOrder gains ByteSwap (BADC) and
  FullReverse (DCBA) alongside the existing BigEndian (ABCD) and WordSwap
  (CDAB).
- NormalizeWordOrder extended to honor all four orders for both 4-byte and
  8-byte values. Old WordSwap behavior preserved bit-for-bit.
- ModbusTagDefinition gains optional ArrayCount.
- ReadOneAsync / WriteOneAsync handle array fan-out: one FC03/04 read covers
  N consecutive register-typed elements, decoded into a typed array (short[],
  float[], etc.). Coil arrays use FC01 reads + FC15 writes (FakeTransport
  in tests gains FC15 support to match).
- DriverAttributeInfo IsArray / ArrayDim flow from ArrayCount so the OPC UA
  address space surfaces ValueRank=1 + ArrayDimensions to clients.
- ModbusDriverFactoryExtensions gains AddressString DTO field. When
  present, the parser drives Region/Address/DataType/ByteOrder/Bit/
  StringLength/ArrayCount; structured fields (Writable, WriteIdempotent,
  StringByteOrder) still come from the DTO. Existing structured tag rows
  keep working unchanged.
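
The four byte orders can be normalised with one rule: the mnemonic names which
canonical byte sits at each wire position. The sketch below is a Python
illustration of that idea for 4-byte values only; it is not the driver's
NormalizeWordOrder, and the function names are hypothetical.

```python
import struct


def normalize_word_order(wire: bytes, order: str) -> bytes:
    """Reorder 4 wire bytes into big-endian (ABCD) according to the mnemonic."""
    order = order.upper()
    if len(wire) != 4 or sorted(order) != ["A", "B", "C", "D"]:
        raise ValueError("expected 4 bytes and a permutation of ABCD")
    canonical = bytearray(4)
    for position, letter in enumerate(order):
        # Letter 'A' is the most significant byte, 'D' the least significant.
        canonical[ord(letter) - ord("A")] = wire[position]
    return bytes(canonical)


def decode_float32(wire: bytes, order: str) -> float:
    return struct.unpack(">f", normalize_word_order(wire, order))[0]


# 1.0 is 3F 80 00 00 in big-endian IEEE 754; CDAB swaps the two 16-bit words.
print(decode_float32(bytes([0x00, 0x00, 0x3F, 0x80]), "CDAB"))  # 1.0
```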

Tests: 91 parser unit tests (Driver.Modbus.Addressing.Tests, all green) +
204 driver tests including new ModbusByteOrderTests (BADC/DCBA roundtrips
across Int32/Float32/Float64) and ModbusArrayTests (Int16[5], Float32[3]
CDAB, Coil[10], length-mismatch error, IsArray/ArrayDim discovery).
Solution-wide build clean.

Caveat: grammar names (type codes, byte-order mnemonics, the :count
shorthand) were synthesized from training-era vendor docs. Verify against
current Kepware Modbus Ethernet Driver Help and Ignition Modbus Addressing
manuals before freezing for production deployments — naming may need a
back-compat layer if vendor wording has shifted.
2026-04-24 23:49:22 -04:00
Joseph Doherty
501d8f494b Task #136 — Modicon address-string parser (5/6-digit) + shared addressing assembly
Foundation for the Modbus addressing-grammar work tracked in #137-#145. Adds
ModbusModiconAddress.Parse / TryParse that turns classic Modicon strings
(40001 / 400001 / 30001 / 00001 / 10001) into (Region, ushort PduOffset).

Also extracts ModbusRegion to a new Driver.Modbus.Addressing assembly so the
Admin UI (#145) can reference the addressing surface without taking a dep on
the wire driver. The new assembly intentionally extends the same
ZB.MOM.WW.OtOpcUa.Driver.Modbus namespace as the driver — callers see the
type as if it lived in one place; only the project layout changes. No
existing call site needed editing (zero-churn move).

Behaviour:
- Single leading digit selects region (0=Coils, 1=DiscreteInputs,
  3=InputRegisters, 4=HoldingRegisters).
- 5-digit form: trailing 4 digits are 1-based register, supports 1..9999.
- 6-digit form: trailing 5 digits are 1-based register, supports 1..65536
  (full PDU address space).
- Strict 5-or-6 length check; whitespace trimmed; clear FormatException
  diagnostics for every malformed shape (wrong length, non-digit body,
  illegal leading digit, register zero, register overflow).
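
The rules above translate into a short parser. This is a hedged Python sketch
of the described behaviour, not the C# ModbusModiconAddress.Parse; it raises
ValueError where the driver raises FormatException, and `parse_modicon` is a
hypothetical name.

```python
REGIONS = {
    "0": "Coils",
    "1": "DiscreteInputs",
    "3": "InputRegisters",
    "4": "HoldingRegisters",
}


def parse_modicon(text: str):
    """Parse a 5- or 6-digit Modicon address into (region, 0-based PDU offset)."""
    s = text.strip()
    if len(s) not in (5, 6):
        raise ValueError(f"expected 5 or 6 digits, got {len(s)}: {text!r}")
    if not (s.isascii() and s.isdigit()):
        raise ValueError(f"non-digit characters in {text!r}")
    region = REGIONS.get(s[0])
    if region is None:
        raise ValueError(f"illegal leading digit {s[0]!r}")
    register = int(s[1:])  # 1-based register number
    if register == 0:
        raise ValueError("register numbers start at 1")
    if register > 65536:
        raise ValueError("register exceeds the 16-bit PDU address space")
    return region, register - 1


print(parse_modicon("40001"))   # ('HoldingRegisters', 0)
print(parse_modicon("400001"))  # ('HoldingRegisters', 0)
```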

29/29 new unit tests pass. Full Driver.Modbus suite (182 tests) and the
solution-wide build still green after the ModbusRegion move.
2026-04-24 23:34:18 -04:00
Joseph Doherty
fb760bc465 Task #135 — update integration-test NodeIds for path-based scheme
7 integration tests in Server.Tests were left behind by the path-based
NodeId rename (#134). Each was constructing test NodeIds in the old
"FullReference" shape ("TestFolder.Var1", "raw.var", "AlphaFolder.Var1",
"plcaddr-temperature"), which the node manager no longer mints — the new
shape is `{driverId}/{folder-path}/{browseName}` per OPC UA Part 3 §5.2.2
NodeId immutability.

Fixed by re-deriving each test NodeId from the actual browse path the test
fixture's driver registers:

- OpcUaServerIntegrationTests: "TestFolder.Var1" → "fake/TestFolder/Var1"
- HistoryReadIntegrationTests (4 tests): "raw.var" → "history-driver/raw",
  "proc.var" → "history-driver/proc" (×2), "atTime.var" → "history-driver/atTime"
- MultipleDriverInstancesIntegrationTests: "AlphaFolder.Var1" →
  "alpha/AlphaFolder/Var1"; "BetaFolder.Var1" → "beta/BetaFolder/Var1"
- OpcUaEquipmentWalkerIntegrationTests: "plcaddr-temperature" →
  "galaxy-prod/warsaw/line-a/oven-3/Temperature" (the walker uses Tag.Name
  as the browseName; the FullReference lives in TagConfig but no longer
  surfaces in the NodeId path)

Server.Tests now 277/277 green excluding LiveLdap. Clears the regression
flagged during the #124 verification run.
2026-04-24 22:03:03 -04:00
Joseph Doherty
75c07149d4 Task #124 — Phase 6.2 multi-user authz interop matrix + close LdapGroups gap
The Phase 6.2 evaluator was wired but received no input in production:
RoleBasedIdentity (the IUserIdentity our LDAP path produces) implemented
IRoleBearer but not ILdapGroupsBearer, so AuthorizationGate.BuildSessionState
always returned null and the gate lax-mode-allowed every request. UserAuthResult
also never carried the resolved LDAP groups, only the role-mapped strings.

Closing the gap so the evaluator gets real data:

- UserAuthResult adds Groups alongside Roles. LdapUserAuthenticator now
  surfaces the raw RDN values (ReadOnly / WriteOperate / ...) it already
  collected during the directory query. Roles stay separate per decision #150
  (control-plane Admin role mapping vs data-plane NodeAcl key).
- RoleBasedIdentity implements ILdapGroupsBearer so AuthorizationGate sees
  the groups via the same seam unit tests already use.

ThreeUserInteropMatrixTests drives the closure end-to-end against the live
GLAuth dev directory:

- 5 distinct group memberships (readonly / writeop / writetune /
  writeconfig / alarmack) plus the multi-group admin user
- Each is bound through the real LdapUserAuthenticator
- Resolved groups feed an LdapBoundIdentity that goes through the strict-mode
  AuthorizationGate against a seeded TriePermissionEvaluator
- 31 InlineData rows assert the role × operation matrix; failures pinpoint
  the exact (user, op) cell

The remaining wire-level leg of #124 — a real OPC UA client driving UserName
tokens through an encrypted endpoint policy — still needs a deployment knob
and stays a manual cross-vendor smoke test (#119 / #124 manual scope). The doc
audit note in admin-ui-phase-6-status.md is updated to reflect what's now
automated vs what stays manual.

33/33 new tests pass against live GLAuth; existing 270 non-LiveLdap tests
in Server.Tests still pass; Core.Tests 205/205, Admin.Tests 109/109. The 7
integration-test failures observed during this run pre-exist this commit
(NodeId-scheme regression from #134) and are tracked separately as #135.
2026-04-24 20:40:07 -04:00
Joseph Doherty
d11d160395 Admin UI Phase 6 audit — close #128–#131 as already-shipped
Task-by-task audit of the Admin UI quartet shows every page listed in
the task descriptions is already built, routed, DI-wired, SignalR-live,
and covered by Admin.Tests (112/112 green):

- #128 /hosts — Hosts.razor 233 LOC with ConsecutiveFailures +
  LastCircuitBreakerOpenUtc + Stale/Faulted/Running cards
- #129 RoleGrants + AclsTab + Probe — RoleGrants.razor (192 LOC),
  AclsTab.razor (279 LOC) with the embedded Probe form at line 38
- #130 RedundancyTab — RedundancyTab.razor 175 LOC with peer
  reachability / ServiceLevel / apply-lease / failover button
- #131 Draft/Publish/Diff/Identification — DraftEditor (105 LOC) +
  Generations (73 LOC) + DiffViewer (87 LOC) + IdentificationFields
  (49 LOC), all wired to GenerationService / DraftValidationService

Shipping docs/v2/implementation/admin-ui-phase-6-status.md as the
canonical reference. Each task's required features are listed with the
exact file / LOC / routing + DI injection so future auditors don't
need to re-derive the status.

No code change in this commit — doc-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 19:07:05 -04:00
Joseph Doherty
e5d1c9c9b9 Phase 6.1 multi-host dispatch — document shipped contract + per-driver status
Task #127 / decision #144. The resilience infrastructure for per-PLC
circuit breakers is shipped and fully tested — the task description's
"current pipeline keys on DriverInstanceId only" was stale. The actual
state:

- `DriverResiliencePipelineBuilder` keys on
  `(DriverInstanceId, HostName, DriverCapability)`.
- `CapabilityInvoker.ExecuteAsync` takes `hostName` per call.
- `IPerCallHostResolver` is the driver-side hook; AB CIP implements it.
- `PerCallHostResolverDispatchTests.DeadPlc_DoesNotOpenBreaker_For_HealthyPlc_With_Resolver`
  proves the end-to-end isolation.

Remaining work is per-driver adoption, not shared infrastructure:
- AB CIP: live + tested
- Galaxy / FOCAS / OPC UA Client / AB Legacy: 1 device per instance by
  design, trivially isolated
- Modbus / S7 / TwinCAT: single-device today; multi-device refactor is
  per-driver surgery (Device row + options + resolver + transport
  fan-out), not a shared-infra change

Shipping docs/v2/multi-host-dispatch.md as the canonical reference:
contract + driver-author checklist + current fleet-wide status table.
Future driver authors follow the AB CIP template.

No code change in this commit — doc-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 19:01:47 -04:00
Joseph Doherty
bd6568bcbd Phase 6.1 Stream B.4 — wire ScheduledRecycleHostedService into bootstrap
Task #125 / #137. The hosted service + scheduler classes already shipped;
this commit connects them to the published-generation driver list so a
Tier C driver with `RecycleIntervalSeconds` in its `ResilienceConfig`
actually gets an armed scheduler at bootstrap.

Wiring:

- `DriverFactoryRegistry.Register` gains an optional `DriverTier`
  parameter (default Tier.A). Existing call sites unchanged —
  `GalaxyProxyDriverFactoryExtensions.Register` explicitly passes
  Tier.C so the bootstrapper can identify out-of-process drivers
  without a per-driver-type allow-list.
- `DriverResilienceOptions` + parser grow `RecycleIntervalSeconds`.
  Tier A/B values are rejected with a diagnostic (decision #74 —
  recycling an in-process driver would kill every OPC UA session).
  Non-positive values are rejected the same way.
- `DriverInstanceBootstrapper` auto-arms a `ScheduledRecycleScheduler`
  after a successful driver register when: (1) the registered tier is
  C, (2) the row's ResilienceConfig carries a positive recycle interval,
  (3) DI has an `IDriverSupervisor` keyed by that `DriverInstanceId`.
  Missing supervisor → warn + skip (no crash). That keeps the wiring
  harmless by default: no driver ships a supervisor today, so the
  hosted service runs with zero schedulers out of the box.
- `Program.cs` registers `ScheduledRecycleHostedService` as singleton
  (shared with `DriverInstanceBootstrapper`) + hosted service (drives
  the tick loop). Constructor changes on the bootstrapper ripple into
  DI resolution automatically.
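
The validation and auto-arm conditions above reduce to two small predicates.
This is an illustrative Python sketch with hypothetical names; tiers are plain
strings here, and the parser's "rejected with a diagnostic" is modelled as a
ValueError.

```python
from typing import Optional


def validate_recycle_interval(tier: str, seconds: Optional[int]) -> Optional[int]:
    """Return the interval if acceptable; raise for Tier A/B or non-positive values."""
    if seconds is None:
        return None  # knob not set: nothing to validate
    if tier != "C":
        raise ValueError("RecycleIntervalSeconds is only valid for Tier C drivers")
    if seconds <= 0:
        raise ValueError("RecycleIntervalSeconds must be positive")
    return seconds


def should_arm_scheduler(tier: str, seconds: Optional[int], has_supervisor: bool) -> bool:
    """All three conditions must hold before a recycle scheduler is armed."""
    return tier == "C" and seconds is not None and seconds > 0 and has_supervisor


# Tier C with an interval but no supervisor registered: warn + skip, no scheduler.
print(should_arm_scheduler("C", 3600, has_supervisor=False))  # False
```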

Tests: 4 new parser tests covering RecycleIntervalSeconds on Tier C
happy path, null default, Tier A/B rejection, non-positive rejection.
Existing 283 Server.Tests + 200 Core.Tests all still green.

No behavioural change for existing deployments: Galaxy driver + any
future Tier C driver gain the opt-in automatically; Tier A/B drivers
(FOCAS, Modbus, S7, AB CIP, AB Legacy, TwinCAT) are structurally
excluded.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 18:58:13 -04:00
Joseph Doherty
a52086efc5 Refresh phase-7-e2e-smoke.md to match current wiring
The runbook shipped at phase-7 close (2026-04-20) described the original
`Doubled = Source × 2` virtual tag, Float64 seed, and flat TagId-shaped
NodeIds. Four commits later the wiring has moved:

- Seed now targets `TestMachine_001.TestHistoryValue` (Int32, writable,
  historized) — no placeholder to fill in for the dev box.
- VirtualTag is `MachineStatus` (Boolean, `Source > 0`, historized).
- NodeIds are path-based per OPC UA Part 3 §5.2.2
  (`{driverId}/{folder-path}/{browseName}`).
- Seed inserts the ClusterNodeCredential row — without it the Server
  bootstrap fails `Unauthorized: caller X is not bound to NodeId`.

Changes:

1. Step 3 — replace "edit the placeholder" instructions with the ZB
   Galaxy-Repository query that finds writable historized attributes
   (dpc CTE + HistoryExtension EXISTS + `security_classification > 0`).
2. New step 4a — LDAP + `SecurityProfile = Basic256Sha256-Sign` recipe
   for the reverse-bridge + alarm-fires stages. Anonymous sessions are
   denied writes against `Operate`-classified attributes (PR 26 gate);
   `writeop / writeop123` against the dev-box GLAuth clears it.
3. Step 6 validation commands updated to the new NodeIds + reference
   the path-based scheme's Part-3 rationale.
4. Drive-the-alarm snippet now calls `otopcua-cli write … -U writeop`
   so operators see the explicit auth step.
5. Acceptance checklist updated for the new tag names + the
   test-galaxy.ps1 `-Username` invocation.
6. Added a 2026-04-24 second-run evidence section alongside the original
   — documents the 3/7 anonymous ceiling and what's needed to reach 7/7.

No code or seed changes in this commit — doc-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 18:13:27 -04:00
Joseph Doherty
ec1a5905bf Galaxy E2E — point at live writable historized attribute + MachineStatus
Pick a Galaxy attribute that actually exercises the full driver stack:
TestMachine_001.TestHistoryValue. Verified against the live dev-box ZB:
it's Int32, writable (security_classification = Operate), and historized
(HistoryExtension primitive). The query lives in
`gr/queries/attributes_extended.sql` — swap to any other writable
historized attribute via the same shape
(`WHERE is_historized = 1 AND security_classification > 0`).

Seed changes:
- Tag row: FullName = TestMachine_001.TestHistoryValue (Int32 / ReadWrite)
- VirtualTag renamed: `Doubled` → `MachineStatus` (Boolean), script returns
  `Source > 0`. Historized, so the write/subscribe exercise doubles as a
  historian-sink check once the alarm/write stages are enabled.
- Scripted alarm predicate reads the same Source and fires on `> 50`.
- Added ClusterNodeCredential(sa → p7-smoke-node) row so
  sp_GetCurrentGenerationForCluster's caller-binding check passes. Without
  this the server bootstrap fails with
  `Unauthorized: caller sa is not bound to NodeId p7-smoke-node`.

E2E script:
- Path-based NodeId defaults updated to match the new MachineStatus
  virtual tag.
- Added optional `-Username / -Password` parameters. Anonymous sessions
  still get denied against Operate-classified attributes (PR 26 /
  docs/Security.md); supplying `-Username writeop -Password writeop123`
  against the dev-box GLAuth exercises the reverse-bridge stage.
- Wired those credentials into every Invoke-Cli / Start-Process CLI
  invocation the script drives.

Anonymous smoke remains 3/7 pass (probe + source read + reverse-bridge
marked acl-expected INFO). A fuller run with
`-Username writeop -Password writeop123` requires also enabling LDAP +
a SecurityProfile that carries a UserName UserTokenPolicy — separate
config step tracked alongside #124 (3-user authz matrix).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 18:04:39 -04:00
Joseph Doherty
69e1d320ac Cold-start guard for script engines — skip evaluation with empty upstream
Both VirtualTagEngine and ScriptedAlarmEngine share a pattern: the
BuildReadCache helper iterates the script's declared input set, reading
from _valueCache with a fallback to _upstream.ReadTag. When an upstream
tag hasn't yet delivered its first subscription push, ReadTag returns a
DataValueSnapshot with a null Value and BadNotConnected quality. User
scripts then cast `(double)ctx.GetTag(path).Value` unconditionally and
throw NullReferenceException — once per evaluation tick until the cache
fills, spamming the log with identical stack traces. The existing catch
block recovered (kept the prior state) but didn't silence the churn.

Add AreInputsReady(cache) to both engines: return true only when every
entry has a non-null Value and a non-Bad StatusCode (Good + Uncertain
are both considered ready). Skip script evaluation when the check
returns false — the engine holds the prior state (alarm) or the prior
snapshot (virtual tag) until upstream delivers. Eliminates the cold-
start NRE spam at root without changing the script-engine contract.
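
A minimal sketch of the readiness check described above, in Python rather than
the engines' C#. It assumes the OPC UA convention that a set top bit in the
32-bit status code means Bad severity (Good and Uncertain leave it clear); the
`Snapshot` type and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict

BAD_SEVERITY_BIT = 0x80000000  # top bit set => Bad severity
GOOD = 0x00000000
UNCERTAIN = 0x40000000
BAD = 0x80000000


@dataclass
class Snapshot:
    value: Any
    status_code: int


def are_inputs_ready(cache: Dict[str, Snapshot]) -> bool:
    """True only when every input has a value and a non-Bad status."""
    return all(
        snap.value is not None and not (snap.status_code & BAD_SEVERITY_BIT)
        for snap in cache.values()
    )


cold = {"Source": Snapshot(None, BAD)}
warm = {"Source": Snapshot(12.5, UNCERTAIN)}
print(are_inputs_ready(cold), are_inputs_ready(warm))  # False True
```

An engine using this guard would skip the script and keep its prior state
whenever the check returns False.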

Also: fix $changeLines.Count in test-galaxy.ps1 — PowerShell's
Set-StrictMode -Version 3.0 errors on .Count when Where-Object returns
0 or 1 items. Wrap in `@(...)` to force an array; same pattern the
sibling _common.ps1 already uses in Write-Summary for the same reason.

Task #112 — the Galaxy live E2E now passes 3/7 stages (probe + source
read + reverse-bridge-ACL). The remaining 4 stages (virtual-tag,
subscribe-sees-change, alarm-fires, history-read) are deployment-
specific: MoveInBatchID is idle in this Galaxy + its AccessLevel blocks
writes + it's not historized. Cold-start behaviour is now correct, so
once the seed points at a live attribute those stages should light up.

Tests: 36/36 VirtualTags.Tests + 47/47 ScriptedAlarms.Tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 17:43:48 -04:00
Joseph Doherty
8be82e02c2 Path-based NodeIds — decouple client contract from driver address
The pre-refactor design minted OPC UA NodeIds directly from the driver's
FullReference (the native-address string). That had three long-term
problems:

1. OPC UA Part 3 §5.2.2 requires NodeIds to be immutable across a node's
   lifetime. A rename of the underlying device address — Galaxy attribute,
   S7 tag, Modbus register alias — changed the NodeId and broke every
   client that had pinned the previous identifier.
2. Two drivers with coincidentally-matching native addresses (e.g. `temp`
   in Modbus and `temp` in S7 under different Equipment rows) collided on
   the NodeId identifier.
3. TagConfig was being placed verbatim on the wire; for drivers whose
   TagConfig is JSON (every driver shipped today, per the
   CK_Tag_TagConfig_IsJson check constraint), clients saw the raw JSON
   blob as the NodeId string.

Refactor:

* DriverNodeManager.Variable now mints a stable path-based NodeId
  `{driverId}/{folder-path}/{browseName}` and records the driver-side
  FullReference in a new _fullRefByNodeId map. OnReadValue / OnWriteValue
  / ResolveFullRef look the FullReference up via that map instead of
  casting NodeId.Identifier. The old cast path is preserved as a
  fallback so any test fixture that still registers variables with
  FullRef-shaped NodeIds keeps working.

* EquipmentNodeWalker.AddTagVariable now extracts the cross-driver
  `FullName` field from Tag.TagConfig before handing the address to
  DriverAttributeInfo. Every shipped driver stores the wire reference in
  TagConfig[FullName]; falling back to the raw string covers any future
  driver that wants an opaque non-JSON address. ExtractFullName is
  exposed internal for unit coverage.

* scripts/e2e/test-galaxy.ps1 defaults updated to the new path-based
  NodeIds. Verified live against p7-smoke-galaxy on the dev box:
  `ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/Source` reads
  return Status=0x00000000 with a real Galaxy byte-array value.
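
The decoupling described above can be sketched as a map from minted path to
driver address. This is illustrative Python, not DriverNodeManager; the class
name and the example FullReference strings are hypothetical.

```python
class NodeIdMap:
    """Mint path-based NodeIds and keep the driver-side address in a side map."""

    def __init__(self):
        self._full_ref_by_node_id = {}

    def mint(self, driver_id, folder_path, browse_name, full_reference):
        node_id = "/".join([driver_id, *folder_path, browse_name])
        self._full_ref_by_node_id[node_id] = full_reference
        return node_id

    def resolve(self, node_id):
        # Fallback: treat an unknown identifier as an old FullRef-shaped NodeId.
        return self._full_ref_by_node_id.get(node_id, node_id)


nodes = NodeIdMap()
path = ["lab-floor", "galaxy-line", "reactor-1"]
before = nodes.mint("p7-smoke-galaxy", path, "Source", "Reactor1.Temp")      # hypothetical address
after = nodes.mint("p7-smoke-galaxy", path, "Source", "Reactor1.TempNew")    # device address renamed
print(before == after, nodes.resolve(after))  # True Reactor1.TempNew
```

The client-visible identifier depends only on the browse path, so renaming the
device address changes what `resolve` returns without changing the NodeId.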

Test suite: 195/195 Core.Tests + 283/283 Server.Tests green. Five new
ExtractFullName / FullName-passthrough tests added.

Task #112 GA-3 — golden-path read verified end-to-end; remaining E2E
script stages still blocked on pre-existing issues (ScriptedAlarm
predicate NRE on empty upstream cache, PowerShell $changeLines.Count
guard), tracked separately.
Task #134 — complete.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 16:57:20 -04:00
Joseph Doherty
d11dd0520b Galaxy IPC unblock — live dev-box E2E path
Four root-cause fixes to get an elevated dev-box shell past session open
through to real MXAccess reads:

1. PipeAcl — drop BUILTIN\Administrators deny ACE. UAC's filtered token
   carries the Admins SID as deny-only, so the deny fired even from
   non-elevated admin-account shells. The per-connection SID check in
   PipeServer.VerifyCaller remains the real authorization boundary.

2. PipeServer — swap the Hello-read / VerifyCaller order. ImpersonateNamedPipeClient
   returns ERROR_CANNOT_IMPERSONATE until at least one frame has been read
   from the pipe; reading Hello first satisfies that rule. Previously the
   ACL deny-first path masked this race — removing the deny ACE exposed it.

3. GalaxyIpcClient — add a background reader + single pending-response
   slot. A RuntimeStatusChange event between OpenSessionRequest and
   OpenSessionResponse used to satisfy the caller's single ReadFrameAsync
   and fail CallAsync with "Expected OpenSessionResponse, got
   RuntimeStatusChange". The reader now routes response kinds (and
   ErrorResponse) to the pending TCS and everything else to a handler the
   driver registers in InitializeAsync. The Proxy was already set up to
   raise managed events from RaiseDataChange / RaiseAlarmEvent /
   OnHostConnectivityUpdate — those helpers had no caller until now.

4. RedundancyPublisherHostedService — swallow BadServerHalted while
   polling host.Server.CurrentInstance. StandardServer throws that code
   during startup rather than returning null, so the first poll attempt
   crashed the BackgroundService (and the host) before OnServerStarted
   ran. This race was latent behind the Galaxy init failure above.
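
The routing idea in fix 3 can be sketched synchronously. This is a simplified
Python model, not GalaxyIpcClient: it matches on the one expected response
kind plus ErrorResponse, whereas the commit describes routing all response
kinds to the pending slot. The names `FrameRouter` and `PendingCall` are
hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class PendingCall:
    expected_kind: str
    done: bool = False
    kind: Optional[str] = None
    payload: Any = None


class FrameRouter:
    """Single pending-response slot; every other frame goes to the event handler."""

    def __init__(self, on_event: Callable[[str, Any], None]):
        self._on_event = on_event
        self._pending: Optional[PendingCall] = None

    def begin_call(self, expected_kind: str) -> PendingCall:
        if self._pending is not None:
            raise RuntimeError("a call is already in flight")
        self._pending = PendingCall(expected_kind)
        return self._pending

    def on_frame(self, kind: str, payload: Any) -> None:
        pending = self._pending
        if pending is not None and kind in (pending.expected_kind, "ErrorResponse"):
            pending.kind, pending.payload, pending.done = kind, payload, True
            self._pending = None  # slot is free for the next call
        else:
            self._on_event(kind, payload)  # unsolicited event, not a response


events = []
router = FrameRouter(lambda kind, payload: events.append(kind))
call = router.begin_call("OpenSessionResponse")
router.on_frame("RuntimeStatusChange", {"host": "up"})   # arrives mid-call
router.on_frame("OpenSessionResponse", {"session": 1})
print(call.done, call.kind, events)  # True OpenSessionResponse ['RuntimeStatusChange']
```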

Updates docs that described the Admins deny ACE + mandatory non-elevated
shells, and drops the admin-skip guards from every Galaxy integration +
E2E fixture that had them (IpcHandshakeIntegrationTests, EndToEndIpcTests,
ParityFixture, LiveStackFixture, HostSubprocessParityTests).

Adds GalaxyIpcClientRoutingTests covering the router's
request/response match, ErrorResponse, event-between-call, idle event,
and peer-close paths.

Verified live on the dev box against the p7-smoke cluster (gen 6):
driver registered=1 failedInit=0, Phase 7 bridge subscribed, OPC UA
server up on 4840, MXAccess read round-trip returns real data with
Status=0x00000000.

Task #112 — partial: Galaxy live stack is functional end-to-end. The
supplied test-galaxy.ps1 script still fails because the UNS walker
encodes TagConfig JSON as the tag's NodeId instead of the seeded TagId
(pre-existing; separate issue from this commit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 16:30:16 -04:00
Joseph Doherty
fb6dd3478d Phase 6.2 Stream C wiring — AuthorizationBootstrap + OpcUaApplicationHost.SetAuthorization
Closes task #133 — the "authz gate is inert in production" blocker
surfaced during task #123. Before this commit, every ACL check on the
six dispatch surfaces (Read, Write, HistoryRead, Browse,
CreateMonitoredItems, Call) short-circuited to allow because Program.cs
constructed OpcUaApplicationHost without passing authzGate or
scopeResolver.

New pieces:

- `AuthorizationOptions` — bound to `Node:Authorization` in
  appsettings.json. `Enabled` (default false) is the master switch;
  `StrictMode` (default false) controls the anonymous / no-LDAP-groups
  fallback behaviour.
- `AuthorizationBootstrap` — singleton service that loads `NodeAcl`
  rows for the published generation, builds a `PermissionTrieCache` +
  `AuthorizationGate`, merges every registered driver's
  `EquipmentNamespaceContent` through `ScopePathIndexBuilder` into one
  full-path `NodeScopeResolver`. Returns `(null, null)` when disabled
  or when no generation is Published yet.
- `DriverEquipmentContentRegistry.Snapshot()` — new method returning a
  defensive copy of the driver → content map so the bootstrap can
  iterate without holding the lock.
- `OpcUaApplicationHost.SetAuthorization(gate, resolver)` — late-bind
  method matching the existing `SetPhase7Sources` pattern. Must run
  before `StartAsync`; rejects post-start rebinding with
  InvalidOperationException.
- `OpcUaServerService.ExecuteAsync` calls `AuthorizationBootstrap.BuildAsync`
  after `PopulateEquipmentContentAsync` and before `applicationHost.StartAsync`,
  in the same window that `SetPhase7Sources` runs.

Behaviour change
- Default (Enabled=false): no behaviour change — the gate stays null,
  all six dispatch surfaces run unchanged. Safe for any existing
  deployment on upgrade.
- Enabled=true with StrictMode=false: identities carrying LDAP groups
  are evaluated against the trie; anonymous / no-groups identities
  pass through (v1 legacy-client compatibility).
- Enabled=true with StrictMode=true: everything evaluates. Anonymous
  or no-groups identities are denied.
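
The three modes above amount to a small decision function. This is a hedged
Python sketch; `authorize` and the `trie_allows` callback are hypothetical
stand-ins for the gate and the permission-trie evaluation.

```python
from typing import Callable, Sequence


def authorize(
    enabled: bool,
    strict_mode: bool,
    ldap_groups: Sequence[str],
    trie_allows: Callable[[Sequence[str]], bool],
) -> bool:
    """Model the Enabled / StrictMode combinations described above."""
    if not enabled:
        return True             # gate is null: dispatch runs unchanged
    if not ldap_groups:
        return not strict_mode  # anonymous / no-groups: lax passes, strict denies
    return trie_allows(ldap_groups)


allow_writers = lambda groups: "WriteOperate" in groups
print(authorize(True, False, [], allow_writers))            # True
print(authorize(True, True, [], allow_writers))             # False
print(authorize(True, True, ["ReadOnly"], allow_writers))   # False
```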

Follow-up not covered here: rebind the gate+resolver on generation
refresh (the `GenerationRefreshHostedService` that shipped earlier in
this session). Today the gate only reflects the bootstrap generation
— operators publishing new ACL changes need a process restart to see
them. Matches the current driver-hot-reload limitation and is tracked
in the existing 6.3 follow-up bullet.

Docs: v2-release-readiness.md Phase 6.2 Stream C.12 bullet flipped to
Closed with operator-facing config pointer (`Node:Authorization:Enabled`).

All 283/283 Server.Tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:35:46 -04:00
Joseph Doherty
1be0fb5a29 Phase 6.2 Stream C.12 — lock in ScopePathIndexBuilder semantics with tests
Closes task #123 (partial — builder semantics unit-tested; production
wiring is the new task #133).

ScopePathIndexBuilder + NodeScopeResolver indexed mode already exist —
they produce a full Cluster → Namespace → UnsArea → UnsLine → Equipment
→ Tag scope from the published generation's config rows. What was
missing: unit coverage of the Build semantics (the only consumers were
compile-time references) + explicit acknowledgement in the readiness
doc that the gate/resolver aren't yet wired into Program.cs.

Tests — 6 cases in ScopePathIndexBuilderTests.cs:
- Well-formed content emits full hierarchy.
- Tags with null EquipmentId skipped (SystemPlatform-namespace fallback).
- Tags with broken Equipment FK skipped (publish-time validation
  should have caught; builder is defensive).
- Equipment with broken Line FK skipped.
- Duplicate TagConfig throws InvalidOperationException.
- Resolver with index returns full-path scope; un-indexed ref falls
  through to cluster-only scope (pre-ADR-001 behaviour preserved).

Server.Tests 277 → 283.

Critical follow-up (task #133): Program.cs still constructs
OpcUaApplicationHost WITHOUT authzGate or scopeResolver, so all six
dispatch-layer gates (Read, Write, HistoryRead, Browse,
CreateMonitoredItems, Call) are currently inert in production. Wiring
them up — load NodeAcl + EquipmentNamespaceContent at bootstrap,
construct gate + resolver, pass into OpcUaApplicationHost, rebind on
generation refresh — is the last Phase 6.2 GA blocker.

Docs: v2-release-readiness.md Phase 6.2 Stream C hardening list marks
the scope-resolution bullet struck-through with a close-out note that
calls out the gate-inert-in-production gap + task #133.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:28:19 -04:00
Joseph Doherty
ded292ecd7 Phase 6.2 Stream C — Call + Alarm Acknowledge/Confirm gating
Closes task #122 (Acknowledge + Confirm + generic Call — Shelve stays as
a follow-up pending per-instance method-NodeId resolution).

Before this commit any session with a connected channel could invoke
method nodes on driver-materialized equipment, including alarm
Acknowledge / Confirm. With the Browse + CreateMonitoredItems gates
that landed earlier in Stream C already in place, this was the last
service-layer entry point where a session could still affect state
without passing the authz trie.

Implementation on DriverNodeManager:
- `Call` override — pre-iterates methodsToCall, gates each through
  AuthorizationGate with the operation kind returned by
  MapCallOperation. Denied calls get errors[i] = BadUserAccessDenied
  before delegating to base.Call.
- `MapCallOperation(NodeId methodId)` — maps well-known Part 9 method
  NodeIds to dedicated operation kinds:
    MethodIds.AcknowledgeableConditionType_Acknowledge →
        OpcUaOperation.AlarmAcknowledge
    MethodIds.AcknowledgeableConditionType_Confirm →
        OpcUaOperation.AlarmConfirm
    everything else → OpcUaOperation.Call
  Lets the ACL distinguish "can acknowledge alarms" from "can invoke
  arbitrary methods" without conflating the two roles.
- Shelve dispatch routes through per-instance ShelvedStateMachine methods
  with dynamic NodeIds that can't be constant-matched, so it falls
  through to generic Call. Fine-grained OpcUaOperation.AlarmShelve is a
  follow-up for when the method-invocation path grows a "method-role"
  annotation.
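
Illustrative sketch only (Python, not the C# override): the mapping and
per-call gating described above. The string method ids, the `Op` enum
and the `is_allowed(identity, operation, object_id)` callback are
hypothetical stand-ins for MethodIds, OpcUaOperation and
AuthorizationGate; gating at the object's scope is an assumption.

```python
from enum import Enum


class Op(Enum):
    CALL = "Call"
    ALARM_ACKNOWLEDGE = "AlarmAcknowledge"
    ALARM_CONFIRM = "AlarmConfirm"


# Placeholder ids standing in for the well-known Part 9 method NodeIds.
ACKNOWLEDGE_METHOD = "AcknowledgeableConditionType_Acknowledge"
CONFIRM_METHOD = "AcknowledgeableConditionType_Confirm"
BAD_USER_ACCESS_DENIED = "BadUserAccessDenied"


def map_call_operation(method_id):
    """Well-known alarm methods get their own operation kind."""
    if method_id == ACKNOWLEDGE_METHOD:
        return Op.ALARM_ACKNOWLEDGE
    if method_id == CONFIRM_METHOD:
        return Op.ALARM_CONFIRM
    return Op.CALL  # everything else, including Shelve for now


def gate_call_requests(calls, errors, identity, is_allowed):
    """calls: list of (object_id, method_id); errors: parallel list,
    None meaning 'no error yet'. Mutates errors in place."""
    if is_allowed is None:
        return  # no gate configured: no-op
    for i, (object_id, method_id) in enumerate(calls):
        if errors[i] is not None:
            continue  # keep a pre-populated error
        operation = map_call_operation(method_id)
        if not is_allowed(identity, operation, object_id):
            errors[i] = BAD_USER_ACCESS_DENIED
```

With this split, an ACL that grants only AlarmAcknowledge lets an
operator acknowledge alarms while a generic method call in the same
batch is still denied.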

Extracted GateCallMethodRequests + MapCallOperation as static internal
for unit-testability. 8 new tests (MapCallOperation Acknowledge /
Confirm / generic; gate-null no-op, denied-Acknowledge, allowed-
Acknowledge, mixed-batch, pre-populated-error-preserved).
Server.Tests 269 → 277.

Known follow-ups:
- Shelve per-operation gating (see above).
- TranslateBrowsePathsToNodeIds gating (Browse follow-up from #120).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:22:19 -04:00
Joseph Doherty
6a6b0f56f2 Phase 6.2 Stream C — CreateMonitoredItems per-item gating
Closes task #121 (partial — creation-time gate; decision #153 per-item
revocation stamp is a follow-up).

Before this commit a session could subscribe to any node via
CreateMonitoredItems, even nodes where Read was denied. The
subscription would surface BadUserAccessDenied on each data-change
read, but the client still saw a successful CreateMonitoredItems
response and held the subscription open, wasting resources and leaking
the address-space shape through the item metadata.

New override on DriverNodeManager.CreateMonitoredItems:
- Pre-iterates itemsToCreate, gates each through AuthorizationGate with
  OpcUaOperation.CreateMonitoredItems at the target node's scope.
- For denied slots: sets errors[i] = new ServiceResult(
  StatusCodes.BadUserAccessDenied). The OPC Foundation base stack
  honours pre-populated non-success errors and skips item creation for
  those slots — the subscription never holds a handle to a denied
  node.
- Preserves prior errors (e.g. BadNodeIdUnknown) — first diagnosis wins.
- Non-string-identifier references (stack-synthesized numeric ids)
  bypass the gate.

Extracted the pure filter logic into
GateMonitoredItemCreateRequests(items, errors, identity, gate,
scopeResolver) — static internal, unit-testable without the OPC UA
server stack.
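
Illustrative sketch only (Python, not the C# filter): the pre-populate
pattern in miniature. `gate_monitored_item_requests`, `create_items`
and the `is_allowed(identity, node_id)` callback are hypothetical;
`create_items` merely imitates the base-stack behaviour described
above (slots with an error already set are skipped).

```python
BAD_USER_ACCESS_DENIED = "BadUserAccessDenied"


def gate_monitored_item_requests(node_ids, errors, identity, is_allowed):
    """errors is a parallel list; None means 'no error yet'."""
    if is_allowed is None:
        return  # no gate configured: no-op
    for i, node_id in enumerate(node_ids):
        if errors[i] is not None:
            continue  # first diagnosis wins (e.g. BadNodeIdUnknown)
        if not isinstance(node_id, str):
            continue  # numeric (stack-synthesized) ids bypass the gate
        if not is_allowed(identity, node_id):
            errors[i] = BAD_USER_ACCESS_DENIED


def create_items(node_ids, errors):
    """Stand-in for the base stack: only error-free slots get an item."""
    return [n for n, e in zip(node_ids, errors) if e is None]
```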

Tests — 6 new in MonitoredItemGatingTests.cs (gate-null no-op,
denied-gets-BadUserAccessDenied, allowed-passes, mixed-batch-denies-
per-item, pre-populated-error-preserved, numeric-id-bypass). Server.Tests
263 → 269.

Known follow-ups:
- Per-item (AuthGenerationId, MembershipVersion) stamp (decision #153)
  for detecting revocation mid-subscription — needs subscription-layer
  plumbing.
- TransferSubscriptions not yet wired (same pattern, smaller scope).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:17:40 -04:00
Joseph Doherty
e8b8541554 Phase 6.2 Stream C — Browse gating on DriverNodeManager
Closes task #120 (partial — strict point-check; ancestor-visibility
implication is a follow-up).

Before this commit DriverNodeManager exposed every materialized node to
every browsing session regardless of the user's ACL. Read + Write +
HistoryRead were already gated through AuthorizationGate in Phase 6.2
Stream C core; Browse was the one surface where the session could still
enumerate nodes it had no permission to touch, discovering structure
even when reads failed with BadUserAccessDenied.

Implementation
- New `Browse` override on DriverNodeManager that calls base.Browse
  first (lets the stack populate the reference list normally), then
  post-filters the IList<ReferenceDescription> so denied nodes are
  removed silently. OPC UA convention: Browse filtering is invisible to
  the client; no BadUserAccessDenied surfaces.
- Extracted the filter loop into the static internal
  `FilterBrowseReferences(references, userIdentity, gate, scopeResolver)`
  so the policy is unit-testable without standing up the full OPC UA
  server stack.
- Non-string NodeId identifiers (stack-synthesized standard-type
  references with numeric identifiers) bypass the gate — only driver-
  materialized nodes key into the authz trie.
- When AuthorizationGate or NodeScopeResolver is null, the filter is a
  no-op — preserves the pre-Phase-6.2 dispatch path for integration
  tests that construct DriverNodeManager without authz.
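
Illustrative sketch only (Python, not the C# filter): a post-filter
with the same four rules. `filter_browse_references` and the
`is_allowed(identity, node_id)` callback are hypothetical, and the
in-place backwards removal is an assumption about how a mutable
reference list could be trimmed safely.

```python
def filter_browse_references(references, identity, is_allowed):
    """references: mutable list of target node ids (str or int).
    Denied entries are removed silently; nothing is reported back."""
    if is_allowed is None:
        return  # no gate/resolver: pre-Phase-6.2 behaviour
    # Walk backwards so deleting an entry never shifts an unvisited index.
    for i in range(len(references) - 1, -1, -1):
        node_id = references[i]
        if not isinstance(node_id, str):
            continue  # numeric standard-type ids bypass the gate
        if not is_allowed(identity, node_id):
            del references[i]
```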

Tests — 6 new in BrowseGatingTests.cs (gate-null no-op, empty-list
no-op, denied-removed, allowed-passes-through, numeric-id bypass,
lax-mode null-identity keeps references). Server.Tests 257 → 263.

Known follow-up (tracked implicitly under #120 re-scope):
- Ancestor-visibility implication (acl-design.md §Browse line 111): a
  user with Read at `Line/Tag` should be able to Browse `Line` even
  without an explicit Browse grant. Current filter does a strict
  point-check. Proper fix needs TriePermissionEvaluator to expose a
  "subtree-has-any-grant" query.
- TranslateBrowsePathsToNodeIds not yet filtered (same extension
  pattern; small follow-up).

Docs: v2-release-readiness.md Phase 6.2 Stream C hardening list marks
the Browse bullet struck-through with "Partial" close-out note.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:11:19 -04:00
Joseph Doherty
a23de2a7e4 Phase 6.3 A.2 + D.1 — GenerationRefreshHostedService: poll + lease-wrap apply
Closes tasks #132 + #118 (GA hardening backlog).

Before this commit, the Server only observed the generation in force at
process start (SealedBootstrap). Peer-published generations accumulated
in the shared config DB while the running node kept serving the
generation it had sealed on boot. Two consequences:

1. Operator role-swaps required a process restart — Admin publishes a
   new generation, but the Server's RedundancyCoordinator never re-read
   the topology.
2. ApplyLeaseRegistry had no apply to wrap. ServiceLevelBand sat at
   PrimaryHealthy (255) during every publish because nothing opened a
   lease; PrimaryMidApply (200) was effectively dead code.

New GenerationRefreshHostedService (src/.../Server/Hosting/):
- Polls sp_GetCurrentGenerationForCluster every 5 s (tunable).
- On change: opens leases.BeginApplyLease(newGenerationId, Guid.NewGuid()),
  calls coordinator.RefreshAsync inside the `await using`, releases on
  scope exit (success / exception / cancellation via IAsyncDisposable).
- Diagnostic properties: LastAppliedGenerationId, TickCount, RefreshCount.
- Delegate-injected currentGenerationQuery for test drive-through; real
  path is the private static DefaultQueryCurrentGenerationAsync.
- Registered as HostedService in Program.cs alongside the Phase 6.3
  redundancy / peer-probe stack.
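
Illustrative sketch only (Python asyncio, not the C# service): a single
poll tick with the lease wrapped around the refresh. `LeaseRegistry`,
`GenerationRefresher` and the injected callables are hypothetical
stand-ins; leaving `last_applied_generation_id` unchanged when the
refresh raises is an assumption, not something this commit states.

```python
import asyncio
from contextlib import asynccontextmanager


class LeaseRegistry:
    def __init__(self):
        self.open_leases = set()

    @asynccontextmanager
    async def begin_apply_lease(self, generation_id):
        self.open_leases.add(generation_id)
        try:
            yield
        finally:
            # Released on success, exception, or cancellation.
            self.open_leases.discard(generation_id)


class GenerationRefresher:
    def __init__(self, query_current_generation, refresh, leases):
        self._query = query_current_generation
        self._refresh = refresh
        self._leases = leases
        self.last_applied_generation_id = None
        self.tick_count = 0
        self.refresh_count = 0

    async def tick(self):
        self.tick_count += 1
        current = await self._query()
        if current is None or current == self.last_applied_generation_id:
            return  # nothing published yet, or nothing new
        async with self._leases.begin_apply_lease(current):
            await self._refresh(current)
        self.last_applied_generation_id = current
        self.refresh_count += 1
```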

Scope intentionally narrow: only the coordinator refreshes today. Driver
re-init, virtual-tag re-bind, script-engine reload remain as follow-up
wiring. The lease wrap is the right seam for those subscribers to hook
once they grow hot-reload support — the doc comments say so.

Tests
- 5 new unit tests in GenerationRefreshHostedServiceTests (first-apply,
  identity no-op, change-triggers-refresh, null-generation-is-no-op,
  lease-is-released-on-exit). Stub generation-query delegate; real
  coordinator backed by EF InMemory DB.
- Server.Tests total 252 → 257.

Docs
- v2-release-readiness.md Phase 6.3 follow-ups list marks the
  sp_PublishGeneration lease wrap bullet struck-through with close-out
  note.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:02:33 -04:00
Joseph Doherty
de77d42eab Phase 6.3 Stream B — peer-probe HostedServices populating PeerReachabilityTracker
Closes task #116 (GA hardening backlog). Before this commit the
RedundancyStatePublisher saw PeerReachability.Unknown for every peer
because the tracker had no writers — every healthy peer got
degraded to the Isolated-Primary band (230) even when fully reachable.
Not release-blocking (safe default), but not the full non-transparent-
redundancy UX either.

Two-layer probe model per docs/v2/implementation/phase-6-3-redundancy-runtime.md
§Stream B:

- PeerHttpProbeLoop (Stream B.1): fast-fail layer, 2 s interval / 1 s timeout.
  Hits each peer's http://{Host}:{DashboardPort}/healthz via an injected
  IHttpClientFactory. Writes the HTTP bit of PeerReachability while
  preserving the UA bit from the last UA probe so a transient HTTP blip
  doesn't clobber the authoritative UA reading.

- PeerUaProbeLoop (Stream B.2): authoritative layer, 10 s interval /
  5 s timeout. Calls DiscoveryClient.GetEndpoints against opc.tcp://{Host}:
  {OpcUaPort} — cheap compared to a full Session.Create, no cert trust
  required. Short-circuits when the HTTP probe last reported the peer
  unhealthy (no wasted handshakes on a known-dead endpoint), clearing
  the stale UaHealthy bit in that case.
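
Illustrative sketch only (Python, not the C# loops): how the two layers
could share one tracker entry without clobbering each other's bit.
`PeerReachability` here is a simplified two-flag record (the real type
also has an Unknown state), and `Tracker`, `http_probe_tick`,
`ua_probe_tick` and the probe callables are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PeerReachability:
    http_healthy: bool = False
    ua_healthy: bool = False


class Tracker:
    def __init__(self):
        self._state = {}

    def get(self, peer):
        return self._state.get(peer, PeerReachability())

    def update(self, peer, reachability):
        self._state[peer] = reachability


def http_probe_tick(tracker, peer, probe_http):
    """Fast layer: write only the HTTP bit, keep the last UA reading."""
    current = tracker.get(peer)
    tracker.update(peer, replace(current, http_healthy=probe_http(peer)))


def ua_probe_tick(tracker, peer, probe_ua):
    """Authoritative layer: skip the handshake if HTTP last failed."""
    current = tracker.get(peer)
    if not current.http_healthy:
        # Known-dead endpoint: clear the stale UA bit, do not probe.
        tracker.update(peer, replace(current, ua_healthy=False))
        return
    tracker.update(peer, replace(current, ua_healthy=probe_ua(peer)))
```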

Both inherit from BackgroundService, follow the tick/delay/catch pattern
RedundancyPublisherHostedService + ResilienceStatusPublisherHostedService
established, and expose TickAsync() as internal for test drive-through.

New PeerProbeOptions class carries the four intervals/timeouts so
operators can tune cadence per site. Registered as singleton in Program.cs;
HTTP client registered by name so the OtOpcUa handler chain
(Serilog enrichers, potential future OpenTelemetry instrumentation) isn't
bypassed.

Tests — 9 new unit tests across PeerHttpProbeLoopTests (5) and
PeerUaProbeLoopTests (4). All pass. Server.Tests total 243 → 252.
Full solution build clean.

Docs: v2-release-readiness.md Phase 6.3 follow-ups list marks the
peer-probe bullet struck-through with a close-out note.

Still deferred in Phase 6.3:
  - OPC UA variable-node binding (task #117 — ServiceLevel + ServerUriArray)
  - sp_PublishGeneration lease wrap (task #118)
  - Client interop matrix (task #119)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:53:38 -04:00