Compare commits

...

126 Commits

Author SHA1 Message Date
Joseph Doherty
db56a95819 Phase 3 PR 68 -- OPC UA Client ITagDiscovery via recursive browse (Full strategy). Adds ITagDiscovery to OpcUaClientDriver. DiscoverAsync opens a single Remote folder on the IAddressSpaceBuilder and recursively browses from the configured root (default: ObjectsFolder i=85; override via OpcUaClientDriverOptions.BrowseRoot for scoped discovery). Browse uses non-obsolete Session.BrowseAsync(RequestHeader, ViewDescription, uint maxReferences, BrowseDescriptionCollection, ct) with HierarchicalReferences forward, subtypes included, NodeClassMask Object+Variable, ResultMask pulling BrowseName + DisplayName + NodeClass + TypeDefinition. Objects become sub-folders via builder.Folder; Variables become builder.Variable entries with FullName set to the NodeId.ToString() serialization so IReadable/IWritable can round-trip without re-resolving. Three safety caps added to OpcUaClientDriverOptions to bound runaway discovery: (1) MaxBrowseDepth default 10 -- deep enough for realistic OPC UA information models, shallow enough that cyclic graphs can't spin the browse forever. (2) MaxDiscoveredNodes default 10_000 -- caps memory on pathological remote servers. Once the cap is hit, recursion short-circuits and the partially-discovered tree is still projected into the local address space (graceful degradation rather than all-or-nothing). (3) BrowseRoot as an opt-in scope restriction string per driver-specs.md \u00A78 -- defaults to ObjectsFolder but operators with 100k-node servers can point it at a single subtree. Visited-set tracks NodeIds already visited to prevent infinite cycles on graphs with non-strict hierarchy (OPC UA models can have back-references). Transient browse failures on a subtree are swallowed -- the sub-branch stops but the rest of discovery continues, matching the Modbus driver's 'transient poll errors don't kill the loop' pattern. The driver's health surface reflects the network-level cascade via the probe loop (PR 69). Deferred to a follow-up PR: DataType resolution via a batch Session.ReadAsync(Attributes.DataType) after the browse so DriverAttributeInfo.DriverDataType is accurate instead of the current conservative DriverDataType.Int32 default; AccessLevel-derived SecurityClass instead of the current ViewOnly default; array-type detection via Attributes.ValueRank + ArrayDimensions. These need an extra wire round-trip per batch of variables + a NodeId -> DriverDataType mapping table; out of scope for PR 68 to keep browse path landable. Unit tests (OpcUaClientDiscoveryTests, 3 facts): DiscoverAsync_without_initialize_throws_InvalidOperationException (pre-init hits RequireSession); DiscoverAsync_rejects_null_builder (ArgumentNullException); Discovery_caps_are_sensible_defaults (asserts 10000 / 10 / null defaults documented above). NullAddressSpaceBuilder stub implements the full IAddressSpaceBuilder shape including IVariableHandle.MarkAsAlarmCondition (throws NotSupportedException since this PR doesn't wire alarms). Live-browse coverage against a real remote server is deferred to the in-process-server-fixture PR. 10/10 OpcUaClient.Tests pass. dotnet build clean. 2026-04-19 01:17:21 -04:00
89bd726fa8 Merge pull request 'Phase 3 PR 67 -- OPC UA Client IReadable + IWritable' (#66) from phase-3-pr67-opcua-client-read-write into v2 2026-04-19 01:15:42 -04:00
Joseph Doherty
238748bc98 Phase 3 PR 67 -- OPC UA Client IReadable + IWritable via Session.ReadAsync/WriteAsync. Adds IReadable + IWritable capabilities to OpcUaClientDriver, routing reads/writes through the session's non-obsolete ReadAsync(RequestHeader, maxAge, TimestampsToReturn, ReadValueIdCollection, ct) and WriteAsync(RequestHeader, WriteValueCollection, ct) overloads (the sync and BeginXxx/EndXxx patterns are all [Obsolete] in SDK 1.5.378). Serializes on the shared Gate from PR 66 so reads + writes + future subscribe + probe don't race on the single session. NodeId parsing: fullReferences use OPC UA's standard serialized NodeId form -- ns=2;s=Demo.Counter, i=2253, ns=4;g=... for GUID, ns=3;b=... for opaque. TryParseNodeId calls NodeId.Parse with the session's MessageContext which honours the server-negotiated namespace URI table. Malformed input surfaces as BadNodeIdInvalid (0x80330000) WITHOUT a wire round-trip -- saves a request for a fault the driver can detect locally. Cascading-quality implementation per driver-specs.md \u00A78: upstream StatusCode, SourceTimestamp, and ServerTimestamp pass through VERBATIM. Bad codes from the remote server stay as the same Bad code (not translated to generic BadInternalError) so downstream clients can distinguish 'upstream value unavailable' from 'local driver bug'. SourceTimestamp is preserved verbatim (null on MinValue guard) so staleness is visible; ServerTimestamp falls back to DateTime.UtcNow if the upstream omitted it, never overwriting a non-zero value. Wire-level exceptions in the Read batch -- transport / timeout / session-dropped -- fan out BadCommunicationError (0x80050000) across every tag in the batch, not BadInternalError, so operators distinguish network reachability from driver faults. Write-side same pattern: successful WriteAsync maps each upstream StatusCode.Code verbatim into the local WriteResult.StatusCode; transport-layer failure fans out BadCommunicationError across the whole batch. WriteValue carries AttributeId=Value + DataValue wrapping Variant(writeValue) -- the SDK handles the type-to-Variant mapping for common CLR types (bool, int, float, string, etc.) so the driver doesn't need a per-type switch. Name disambiguation: the SDK has its own Opc.Ua.WriteRequest type which collides with ZB.MOM.WW.OtOpcUa.Core.Abstractions.WriteRequest; method signature uses the fully-qualified Core.Abstractions.WriteRequest. Unit tests (OpcUaClientReadWriteTests, 2 facts): ReadAsync_without_initialize_throws_InvalidOperationException + WriteAsync_without_initialize_throws_InvalidOperationException -- pre-init calls hit RequireSession and fail uniformly. Wire-level round-trip coverage against a live remote server lands in a follow-up PR once we scaffold an in-process OPC UA server fixture (the existing Server project in the solution is a candidate host). 7/7 OpcUaClient.Tests pass (5 scaffold + 2 read/write). dotnet build clean. Scope: ITagDiscovery (browse) + ISubscribable + IHostConnectivityProbe remain deferred to PRs 68-69 which also need namespace-index remapping and reference-counted MonitoredItem forwarding per driver-specs.md \u00A78. 2026-04-19 01:13:34 -04:00
b21d550836 Merge pull request 'Phase 3 PR 66 -- OPC UA Client (gateway) driver scaffold' (#65) from phase-3-pr66-opcua-client-scaffold into v2 2026-04-19 01:10:07 -04:00
Joseph Doherty
91eaf534c8 Phase 3 PR 66 -- OPC UA Client (gateway) driver project scaffold + IDriver session lifecycle. First driver that CONSUMES OPC UA rather than PUBLISHES it -- connects to a remote server and re-exposes its address space through the local OtOpcUa server per driver-specs.md \u00A78. Uses the same OPCFoundation.NetStandard.Opc.Ua.Client package the existing Client.Shared ships (bumped to 1.5.378.106 to match). Builds its own ApplicationConfiguration (cert stores under %LocalAppData%/OtOpcUa/pki so multiple driver instances in one OtOpcUa server process share a trust anchor) rather than reusing Client.Shared -- Client.Shared is oriented at the interactive CLI with different session-lifetime needs (this driver is always-on, needs keep-alive + session transfer on reconnect + multi-year uptime). Navigated the post-refactor 1.5.378 SDK surface: every Session.Create* static is now [Obsolete] in favour of DefaultSessionFactory; CoreClientUtils.SelectEndpoint got the sync overloads deprecated in favour of SelectEndpointAsync with a required ITelemetryContext parameter. Driver passes telemetry: null! to both SelectEndpointAsync + new DefaultSessionFactory(telemetry: null!) -- the SDK's internal default sink handles null gracefully and plumbing a telemetry context through the driver options surface is out of scope (the driver emits its own logs via the DriverHealth surface anyway). ApplicationInstance default ctor is also obsolete; wrapped in #pragma warning disable CS0618 rather than migrate to the ITelemetryContext overload for the same reason. OpcUaClientDriverOptions models driver-specs.md \u00A78 settings: EndpointUrl (default opc.tcp://localhost:4840 IANA-assigned port), SecurityPolicy/SecurityMode/AuthType enums, Username/Password, SessionTimeout=120s + KeepAliveInterval=5s + ReconnectPeriod=5s (defaults from spec), AutoAcceptCertificates=false (production default; dev turns on for self-signed servers), ApplicationUri + SessionName knobs for certificate SAN matching and remote-server session-list identification. OpcUaClientDriver : IDriver: InitializeAsync builds the ApplicationConfiguration, resolves + creates cert if missing via app.CheckApplicationInstanceCertificatesAsync, selects endpoint via CoreClientUtils.SelectEndpointAsync, builds UserIdentity (Anonymous or Username with UTF-8-encoded password bytes -- the legacy string-password ctor went away; Certificate auth deferred), creates session via DefaultSessionFactory.CreateAsync. Health transitions Unknown -> Initializing -> Healthy on success or -> Faulted on failure with best-effort Session.CloseAsync cleanup. ShutdownAsync (async now, not Task.CompletedTask) closes the session + disposes. Internal Session + Gate expose to the test project via InternalsVisibleTo so PRs 67-69 can stack read/write/discovery/subscribe on the same serialization. Scaffold tests (OpcUaClientDriverScaffoldTests, 5 facts): Default_options_target_standard_opcua_port_and_anonymous_auth (4840 + None mode + Anonymous + AutoAccept=false production default), Default_timeouts_match_driver_specs_section_8 (120s/5s/5s), Driver_reports_type_and_id_before_connect (DriverType=OpcUaClient, DriverInstanceId round-trip, pre-init Unknown health), Initialize_against_unreachable_endpoint_transitions_to_Faulted_and_throws, Reinitialize_against_unreachable_endpoint_re_throws. Uses opc.tcp://127.0.0.1:1 as the 'guaranteed-unreachable' target -- RFC 5737 reserved IPs get black-holed and time out only after the SDK's internal retry/backoff fully elapses (~60s), while port 1 on loopback refuses immediately with TCP RST which keeps the test suite snappy (5 tests / 8s). 5/5 pass. dotnet build clean. Scope boundary: ITagDiscovery / IReadable / IWritable / ISubscribable / IHostConnectivityProbe deliberately NOT in this PR -- they need browse + namespace remapping + reference-counted MonitoredItem forwarding + keep-alive probing and land in PRs 67-69. 2026-04-19 01:07:57 -04:00
d33e38e059 Merge pull request 'Phase 3 PR 65 -- S7 ITagDiscovery + ISubscribable + IHostConnectivityProbe' (#64) from phase-3-pr65-s7-discovery-subscribe-probe into v2 2026-04-19 00:18:17 -04:00
Joseph Doherty
d8ef35d5bd Phase 3 PR 65 -- S7 ITagDiscovery + ISubscribable polling overlay + IHostConnectivityProbe. Three more capability interfaces on S7Driver, matching the Modbus driver's capability coverage. ITagDiscovery: DiscoverAsync streams every configured tag into IAddressSpaceBuilder under a single 'S7' folder; builder.Variable gets a DriverAttributeInfo carrying DriverDataType (MapDataType: Bool->Boolean, Byte/Int/UInt sizes->Int32 (until Core.Abstractions adds widths), Float32/Float64 direct, String + DateTime direct), SecurityClass (Operate if tag.Writable else ViewOnly -- matches the Modbus pattern so DriverNodeManager's ACL layer can gate writes per role without S7-specific logic), IsHistorized=false (S7 has no native historian surface), IsAlarm=false (S7 alarms land through TIA Portal's alarm-in-DB pattern which is per-site and out of scope for PR 65). ISubscribable polling overlay: same pattern Modbus established in PR 22. SubscribeAsync spawns a Task.Run loop that polls every tag, diffs against LastValues, raises OnDataChange on changes plus a force-raise on initial-data push per OPC UA Part 4 convention. Interval floored at 100ms -- S7 CPUs scan 2-10ms but process the comms mailbox at most once per scan, so sub-scan polling just queues wire-side with worse latency per S7netplus documented pattern. Poll errors tolerated: first-read fault doesn't kill the loop (caller can't receive initial values but subsequent polls try again); transient poll errors also swallowed so the loop survives a power-cycle + reconnect through the health surface. UnsubscribeAsync cancels the CTS + removes the subscription -- unknown handle is a no-op, not a throw, because the caller's race with server-side cleanup shouldn't crash either side. Shutdown tears down every subscription before disposing the Plc. IHostConnectivityProbe: HostName surfaced as host:port to match Modbus driver convention (Admin /hosts dashboard renders both families uniformly). GetHostStatuses returns one row (single-endpoint driver). ProbeLoopAsync serializes on the shared Gate + calls Plc.ReadStatusAsync (cheap Get-CPU-Status PDU that doubles as an 'is PLC up' check) every Probe.Interval with a Probe.Timeout cap, transitions HostState Unknown/Stopped -> Running on success and -> Stopped on any failure, raises OnHostStatusChanged only on actual transitions (no noise for steady-state probes). Probe loop starts at end of InitializeAsync when Probe.Enabled=true (default); Shutdown cancels the probe CTS. Initial state stays Unknown until first successful probe -- avoids broadcasting a premature Running before any PDU round-trip has happened. Unit tests (S7DiscoveryAndSubscribeTests, 4 facts): DiscoverAsync_projects_every_tag_into_the_address_space (3 tags + mixed writable/read-only -> Operate vs ViewOnly asserted), GetHostStatuses_returns_one_row_with_host_port_identity_pre_init, SubscribeAsync_returns_unique_handles_and_UnsubscribeAsync_accepts_them (diagnosticId uniqueness + idempotent double-unsubscribe), Subscribe_publishing_interval_is_floored_at_100ms (accepts 50ms request without throwing -- floor is applied internally). Uses a RecordingAddressSpaceBuilder stub that implements IVariableHandle.FullReference + MarkAsAlarmCondition (throws NotImplementedException since the S7 driver never calls it -- alarms out of scope). 57/57 S7 unit tests pass. dotnet build clean. All 5 capability interfaces (IDriver/ITagDiscovery/IReadable/IWritable/ISubscribable/IHostConnectivityProbe) now implemented -- the S7 driver surface is on par with the Modbus driver, minus the extended data types (Int64/UInt64/Float64/String/DateTime deferred per PR 64). 2026-04-19 00:16:10 -04:00
5e318a1ab6 Merge pull request 'Phase 3 PR 64 -- S7 IReadable + IWritable via S7.Net' (#63) from phase-3-pr64-s7-read-write into v2 2026-04-19 00:12:59 -04:00
Joseph Doherty
394d126b2e Phase 3 PR 64 -- S7 IReadable + IWritable via S7.Net string-based Plc.ReadAsync/WriteAsync. Adds IReadable + IWritable capability interfaces to S7Driver, routing reads/writes through S7netplus's string-address API (Plc.ReadAsync(string, ct) / Plc.WriteAsync(string, object, ct)). All operations serialize on the class's SemaphoreSlim Gate because S7netplus mandates one Plc connection per PLC with client-side serialization -- parallel reads against a single S7 CPU queue wire-side anyway and just eat connection-resource budget. Supported data types in this PR: Bool, Byte, Int16, UInt16, Int32, UInt32, Float32. S7.Net's string-based read returns UNSIGNED boxed values (DBX=bool, DBB=byte, DBW=ushort, DBD=uint); the driver reinterprets them into the requested S7DataType via the (DataType, Size, raw) switch: unchecked short-cast for Int16, unchecked int-cast for Int32, BitConverter.UInt32BitsToSingle for Float32. Writes inverse the conversion -- Int16 -> unchecked ushort cast, Int32 -> unchecked uint cast, Float32 -> BitConverter.SingleToUInt32Bits -- before handing to S7.Net's WriteAsync. This avoids a second PLC round-trip that a typed ReadAsync(DataType, db, offset, VarType, ...) overload would need. Int64, UInt64, Float64, String, DateTime throw NotSupportedException (-> BadNotSupported StatusCode); S7 STRING has non-trivial header semantics + LReal/DateTime need typed S7.Net API paths, both land in a follow-up PR when scope demands. InitializeAsync now parses every tag's Address string via S7AddressParser at init time. Bad addresses throw FormatException and flip health to Faulted -- callers can't register a broken driver. The parsed form goes into _parsedByName so Read/Write can consult Size/BitOffset without re-parsing per operation. StatusCode mapping in catch chain: unknown tag name -> BadNodeIdUnknown (0x80340000), unsupported data type -> BadNotSupported (0x803D0000), read-only tag write attempt -> BadNotWritable (0x803B0000), S7.Net PlcException (carries PUT/GET-disabled signal on S7-1200/1500) -> BadDeviceFailure (0x80550000) so operators see a TIA-Portal config problem rather than a transient-fault false flag per driver-specs.md \u00A75, any other runtime exception on read -> BadCommunicationError (0x80050000) to distinguish socket/timeout from tag-level faults. Write generic-exception path stays BadInternalError because write failures can legitimately be driver-side value-range problems. Unit tests (S7DriverReadWriteTests, 3 facts): Initialize_rejects_invalid_tag_address_and_fails_fast -- Tags with a malformed address must throw at InitializeAsync rather than producing a half-healthy driver; ReadAsync_without_initialize_throws_InvalidOperationException + WriteAsync_without_initialize_throws_InvalidOperationException -- pre-init calls hit RequirePlc and throw the uniform 'not initialized' message. Wire-level round-trip coverage (integration test against a live S7-1500 or a mock S7 server) is deferred -- S7.Net doesn't ship an in-process fake and a conformant mock is non-trivial. 53/53 Modbus.Driver.S7.Tests pass (50 parser + 3 read/write). dotnet build clean. 2026-04-19 00:10:41 -04:00
0eab1271be Merge pull request 'Phase 3 PR 63 -- S7AddressParser (DB/M/I/Q/T/C grammar)' (#62) from phase-3-pr63-s7-address-parser into v2 2026-04-19 00:08:27 -04:00
Joseph Doherty
d5034c40f7 Phase 3 PR 63 -- S7AddressParser for DB/M/I/Q/T/C address strings. Adds S7AddressParser + S7ParsedAddress + S7Area + S7Size to the Driver.S7 project. Grammar follows driver-specs.md \u00A75 + Siemens TIA Portal / STEP 7 Classic convention: (1) Data blocks: DB{n}.DB{X|B|W|D}{offset}[.bit] where X=bit (requires .bit suffix 0-7), B=byte, W=word (16-bit), D=dword (32-bit). (2) Merkers: MB{n}, MW{n}, MD{n}, or M{n}.{bit} for bit access. (3) Inputs + Outputs: same {B|W|D} prefix or {n}.{bit} pattern as M. (4) Timers: T{n}. (5) Counters: C{n}. Output is an immutable S7ParsedAddress record struct with Area (DataBlock / Memory / Input / Output / Timer / Counter), DbNumber (only meaningful for DataBlock), Size (Bit / Byte / Word / DWord), ByteOffset (also timer/counter number when Area is Timer/Counter), BitOffset (0-7 for Size=Bit; 0 otherwise). Case-insensitive via ToUpperInvariant, whitespace trimmed on entry. Parse throws FormatException with the offending input echoed in the message; TryParse returns bool for config-validation callers that can't afford exceptions (e.g. Admin UI tag-editor live validation). Strict rejection policy -- 16 garbage cases covered in the theory test: empty/whitespace input, unknown area letter (Z0), DB without number/tail, DB bit size without .bit suffix, bit offset 8+, word/dword with .bit suffix, DB number 0 (must be >=1), non-numeric DB number, unknown size letter (Q), M without offset, M bit access without .bit, bit 8, negative offset, non-digit offset, non-numeric timer. Strict rejection surfaces config errors at driver-init time rather than as BadInternalError on every Read against the bad tag. No driver code wires through yet -- PR 64 is where IReadable/IWritable consume S7ParsedAddress and translate into S7netplus Plc.ReadAsync calls (the S7.Net address grammar is a strict subset of what we accept, and the parser's S7ParsedAddress is the bridge). Unit tests (S7AddressParserTests, 50 facts): parse-valid theories for DB/M/I/Q/T/C covering all size variants + edge bit offsets 0 and 7; case-insensitive + whitespace-trim theory; reject-invalid theory with 16 garbage cases; TryParse round-trip for valid and invalid inputs. 50/50 pass, dotnet build clean. 2026-04-19 00:06:24 -04:00
5e67c49f7c Merge pull request 'Phase 3 PR 62 -- Siemens S7 native driver project scaffold' (#61) from phase-3-pr62-s7-driver-scaffold into v2 2026-04-19 00:05:17 -04:00
Joseph Doherty
0575280a3b Phase 3 PR 62 -- Siemens S7 native driver project scaffold (S7comm via S7netplus). First non-Modbus in-process driver. Creates src/ZB.MOM.WW.OtOpcUa.Driver.S7 (.NET 10, x64 -- S7netplus is managed, no bitness constraint like MXAccess) + tests/ZB.MOM.WW.OtOpcUa.Driver.S7.Tests + slnx entries. Depends on S7netplus 0.20.0 which is the latest version on NuGet resolvable in this cache (0.21.0 per driver-specs.md is not yet published; 0.20.0 covers the same Plc+CpuType+ReadAsync surface). S7DriverOptions captures the connection settings documented in driver-specs.md \u00A75: Host, Port (default 102 ISO-on-TCP), CpuType (default S71500 per most-common deployment), Rack=0, Slot=0 (S7-1200/1500 onboard PN convention; S7-300/400 operators must override to slot 2 or 3), Timeout=5s, Tags list + Probe settings with default MW0 probe address. S7TagDefinition uses S7.Net-style address strings (DB1.DBW0, M0.0, I0.0, QD4) with an S7DataType enum (Bool, Byte, Int16, UInt16, Int32, UInt32, Int64, UInt64, Float32, Float64, String, DateTime -- the full type matrix from the spec); StringLength defaults to 254 (S7 STRING max). S7Driver implements the IDriver-only subset per the PR plan: InitializeAsync opens a managed Plc with the configured CpuType + Host + Rack + Slot, pins WriteTimeout / ReadTimeout on the underlying TcpClient, awaits Plc.OpenAsync with a linked CTS bounded by Options.Timeout so the ISO handshake itself respects the configured bound; health transitions Unknown -> Initializing -> Healthy on success or Unknown -> Initializing -> Faulted on handshake failure, with a best-effort Plc.Close() on the faulted path so retries don't leak the TcpClient. ShutdownAsync closes the Plc and flips health back to Unknown. DisposeAsync routes through ShutdownAsync + disposes the SemaphoreSlim. Internal Gate + Plc accessors are exposed to the test project (InternalsVisibleTo) so PRs 63-65 can stack read/write/subscribe on the same serialization semaphore per the S7netplus documented 'one Plc per PLC, SemaphoreSlim-serialized' pattern. ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe are all deliberately omitted from this PR -- they depend on the S7AddressParser (PR 63) and land sequenced in PRs 64-65. Unit tests (S7DriverScaffoldTests, 5 facts): default options target S7-1500 / port 102 / slot 0, default probe interval 5s, tag defaults to writable with StringLength 254, driver reports DriverType=S7 + Unknown health pre-init, Initialize against RFC-5737 reserved IP 192.0.2.1 with 250ms timeout transitions to Faulted and throws (tests the connect-failure path doesn't leave the driver in an ambiguous state). 5/5 pass. dotnet build ZB.MOM.WW.OtOpcUa.slnx: 0 errors. No regression in Modbus / Galaxy suites. PR 63 ships S7AddressParser next, PR 64 wires IReadable/IWritable over S7netplus, PR 65 adds discovery + polling-overlay subscribe + probe. 2026-04-19 00:03:09 -04:00
8150177296 Merge pull request 'Phase 2 PR 61 -- Close V1_ARCHIVE_STATUS.md: Streams D + E done' (#60) from phase-2-pr61-scrub-v1-archive-residue into v2 2026-04-18 23:22:58 -04:00
Joseph Doherty
56d8af8bdb Phase 2 PR 61 -- Close V1_ARCHIVE_STATUS.md; Phase 2 Streams D + E done. Purely a documentation-closure PR. The v1 archive deletion itself happened across earlier PRs: PR 2 on phase-2-stream-d archive-marked the four v1 projects (IsTestProject=false so dotnet test slnx bypassed them); Phase 3 PR 18 deleted the archived project source trees. What remained on disk was stale bin/obj residue from pre-deletion builds -- git never tracked those, so removing them from the working tree is cosmetic only (no source-file diff in this PR). What this PR actually changes: V1_ARCHIVE_STATUS.md is rewritten from 'Deletion plan (Phase 2 PR 3)' pre-work prose to a CLOSED retrospective that (a) lists all five v1 directories as deleted with check-marks (src/OtOpcUa.Host, src/Historian.Aveva, tests/Historian.Aveva.Tests, tests/Tests.v1Archive, tests/IntegrationTests), (b) names the parity-bar tests that now fill the role the 494 v1 tests originally held (Driver.Galaxy.E2E cross-FX subprocess parity + stability-findings regression, per-component *.Tests projects, Driver.Modbus.IntegrationTests, LiveStack/ smoke tests), and (c) gives the closure timeline connecting PR 2 -> Phase 3 PR 18 -> this PR 61. Also added the Modbus TCP driver family as parity coverage that didn't exist in v1 (DL205 + S7-1500 + Mitsubishi MELSEC via pymodbus sim). Stream D (retire legacy Host) has been effectively done since Phase 3 PR 18; Stream E (parity validation) is done since PR 2 landed the Driver.Galaxy.E2E project with HostSubprocessParityTests + HierarchyParityTests + StabilityFindingsRegressionTests. This PR exists to definitively close the two pending Phase 2 tasks on the task list and give future-me (or anyone picking up Phase 2 retrospectives) a single 'what actually happened' doc instead of a 'what we plan to do' prose that didn't match reality. dotnet build ZB.MOM.WW.OtOpcUa.slnx: 0 errors, 200 warnings (all xunit1051 cancellation-token analyzer advisories, unchanged from v2 tip). No test regressions -- no source code changed. 2026-04-18 23:20:54 -04:00
be8261a4ac Merge pull request 'Phase 3 PR 60 -- Mitsubishi MELSEC quirk integration tests' (#59) from phase-3-pr60-mitsubishi-quirk-tests into v2 2026-04-18 23:10:36 -04:00
65de2b4a09 Merge pull request 'Phase 3 PR 59 -- MelsecAddress helper with family selector (hex vs octal X/Y)' (#58) from phase-3-pr59-melsec-address-helper into v2 2026-04-18 23:10:29 -04:00
fccb566a30 Merge pull request 'Phase 3 PR 58 -- Mitsubishi MELSEC pymodbus profile + smoke' (#57) from phase-3-pr58-mitsubishi-sim-profile into v2 2026-04-18 23:10:21 -04:00
9ccc7338b8 Merge pull request 'Phase 3 PR 57 -- S7 byte-order + fingerprint integration tests' (#56) from phase-3-pr57-s7-quirk-tests into v2 2026-04-18 23:10:14 -04:00
e33783e042 Merge pull request 'Phase 3 PR 56 -- Siemens S7-1500 pymodbus profile + smoke' (#55) from phase-3-pr56-s7-sim-profile into v2 2026-04-18 23:10:07 -04:00
Joseph Doherty
a44fc7a610 Phase 3 PR 60 -- Mitsubishi MELSEC quirk integration tests against mitsubishi pymodbus profile. Seven facts in MitsubishiQuirkTests covering the quirks documented in docs/v2/mitsubishi.md that are testable end-to-end via pymodbus: (1) Mitsubishi_D0_fingerprint_reads_0x1234 -- MELSEC operators reserve D0 as a fingerprint word so Modbus clients can verify they're hitting the right Device Assignment block; test reads HR[0]=0x1234 via DRegisterToHolding('D0') helper. (2) Mitsubishi_Float32_CDAB_decodes_1_5f_from_D100 -- reads HR[100..101] with WordSwap AND BigEndian; asserts WordSwap==1.5f AND BigEndian!=1.5f, proving (a) MELSEC uses CDAB default same as DL260, (b) opposite of S7 ABCD, (c) driver flag is not a no-op. (3) Mitsubishi_D10_is_binary_not_BCD -- reads HR[10]=0x04D2 as Int16 and asserts value 1234 (binary decode), contrasting with DL205's BCD-by-default convention. (4) Mitsubishi_D10_as_BCD_throws_because_nibble_is_non_decimal -- reads same HR[10] as Bcd16 and asserts StatusCode != 0 because nibble 0xD fails BCD validation; proves the BCD decoder fails loud when the tag config is wrong rather than silently returning garbage. (5) Mitsubishi_QLiQR_X210_hex_maps_to_DI_528_reads_ON -- reads FC02 at the MelsecAddress.XInputToDiscrete('X210', Q_L_iQR)-resolved address (=528 decimal) and asserts ON; proves the hex-parsing path end-to-end. (6) Mitsubishi_family_trap_X20_differs_on_Q_vs_FX -- unit-level proof in the integration file so the headline family trap is visible to anyone filtering by Device=Mitsubishi. (7) Mitsubishi_M512_maps_to_coil_512_reads_ON -- reads FC01 at MRelayToCoil('M512')=512 (decimal) and asserts ON; proves the decimal M-relay path. Test fixture pattern: single MitsubishiQuirkTests class with a shared ShouldRun + NewDriverAsync helper rather than per-quirk classes (contrast with DL205's per-quirk splits). MELSEC per-model differentiation is handled by MelsecFamily enum on the helper rather than per-PR -- so one quirk file + one family enum covers Q/L/iQ-R/FX/iQ-F, and a new PLC family just adds an enum case instead of a new test class. 8/8 Mitsubishi integration tests pass (1 smoke + 7 quirk). 176/176 Modbus.Tests unit suite still green. S7 + DL205 integration tests can be run against their respective profiles by swapping MODBUS_SIM_PROFILE and restarting the pymodbus sim -- each family gates on its profile env var so no cross-family test pollution. 2026-04-18 23:07:00 -04:00
Joseph Doherty
d4c1873998 Phase 3 PR 59 -- MelsecAddress helper for MELSEC X/Y hex-vs-octal family trap + D/M bank bases. Adds MelsecAddress static class with XInputToDiscrete, YOutputToCoil, MRelayToCoil, DRegisterToHolding helpers and a MelsecFamily enum {Q_L_iQR, F_iQF} that drives whether X/Y addresses are parsed as hex (Q-series convention) or octal (FX-series convention). This is the #1 MELSEC driver bug source per docs/v2/mitsubishi.md: the string 'X20' on a MELSEC-Q means DI 32 (hex 0x20) while the same string on an FX3U means DI 16 (octal 0o20). The helper forces the caller to name the family explicitly; no 'sensible default' because wrong defaults just move the bug. Key design decisions: (1) Family is an enum argument, not a helper-level static-selector, because real deployments have BOTH Q-series and FX-series PLCs on the same gateway -- one driver instance per device means family must be per-tag, not per-driver. (2) Bank base is a ushort argument defaulting to 0. Real QJ71MT91/LJ71MT91 assignment blocks commonly place X at DI 8192+, Y at coil 8192+, etc. to leave the low-address range for D-registers; the helper takes the site's configured base as runtime config rather than a compile-time constant. Matches the 'driver opt-in per tag' pattern DirectLogicAddress established for DL260. (3) M-relay and D-register are DECIMAL on every MELSEC family -- docs explicitly; the MELSEC confusion is only about X/Y, not about data registers or internal relays. Helpers reject non-numeric M/D addresses and honor bank bases the same way. (4) Parser walks digits manually for both hex and octal (instead of int.Parse with NumberStyles) so non-hex / non-octal characters give a clear ArgumentException with the offending char + family name. Prevents a subtle class of bugs where int.Parse('X20', Hex) silently returns 32 even for F_iQF callers. Unit tests (MelsecAddressTests, 34 facts): XInputToDiscrete_QLiQR_parses_hex theory (X0, X9, XA, XF, X10, X20, X1FF + lowercase); XInputToDiscrete_FiQF_parses_octal theory (X0, X7, X10, X20, X777); YOutputToCoil equivalents; Same_address_string_decodes_differently_between_families (the headline trap, X20 => 32 on Q vs 16 on FX); reject-non-octal / reject-non-hex / reject-empty / overflow facts; honors-bank-base for X and M and D. 176/176 Modbus.Tests pass (143 prior + 34 new Melsec). No driver core changes -- this is purely a new helper class in the Driver.Modbus project. PR 60 wires it into integration tests against the mitsubishi pymodbus profile. 2026-04-18 23:04:52 -04:00
Joseph Doherty
f52b7d8979 Phase 3 PR 58 -- Mitsubishi MELSEC pymodbus profile + smoke integration test. Adds tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests/Pymodbus/mitsubishi.json modelling a representative MELSEC Modbus Device Assignment block: D0..D1023 -> HR[0..1023], M-relay marker at coil 512 (cell 32) and X-input marker at DI 528 (cell 33). Covers the canonical MELSEC quirks from docs/v2/mitsubishi.md: D0 fingerprint at HR[0]=0x1234 so clients can verify the assignment parameter block is in effect, scratch HR 200..209 mirroring dl205/s7_1500/standard scratch range for uniform smoke tests, Float32 1.5f at HR[100..101] in CDAB word order (HR[100]=0, HR[101]=0x3FC0) -- same as DL260, OPPOSITE of S7 ABCD, confirms MELSEC-family driver profile default must be ByteOrder.WordSwap. Int32 0x12345678 CDAB at HR[300..301]. D10 = binary 1234 (0x04D2) proves MELSEC is BINARY-by-default (opposite of DL205 BCD-by-default quirk) -- reading D10 with Bcd16 data type would throw InvalidDataException on nibble 0xD. M-relay marker cell moved to address 32 (coil 512) to avoid shared-block collision with D0 uint16 marker at cell 0; pymodbus shared-blocks=true semantics allow only one type per cell index, so Modbus-coil-0 can't coexist with Modbus-HR-0 on the same sim. Same pattern we applied to dl205 profile (X-input bank at cell 1, not cell 0, to coexist with V0 marker). Adds Mitsubishi/ test directory with MitsubishiProfile.cs (SmokeHoldingRegister=200, SmokeHoldingValue=7890, BuildOptions with probe-disabled + 2s timeout) and MitsubishiSmokeTests.cs (Mitsubishi_roundtrip_write_then_read_of_holding_register single fact that writes 7890 at HR[200] then reads back, gated on MODBUS_SIM_PROFILE=mitsubishi). csproj copies Mitsubishi/** as PreserveNewest. Per-model differences (FX5U firmware gate, QJ71MT91 FC22/23 absence, FX/iQ-F octal vs Q/L/iQ-R hex X-addressing) are handled in the MelsecAddress helper (PR 59) + per-model test classes (PR 60). Verified: smoke 1/1 passes against live mitsubishi sim. Prior S7 tests 4/4 still green when swapped back. Modbus.Tests unit suite 143/143. 2026-04-18 23:02:29 -04:00
Joseph Doherty
b54724a812 Phase 3 PR 57 -- S7 byte-order + fingerprint integration tests against s7_1500 pymodbus profile. Three facts in new S7_ByteOrderTests class: (1) S7_Float32_ABCD_decodes_1_5f_from_HR100 reads HR[100..101] with ModbusByteOrder.BigEndian AND with WordSwap on the same wire bytes; asserts BigEndian==1.5f AND WordSwap!=1.5f -- proving both that Siemens S7 stores Float32 in ABCD word order (opposite of DL260 CDAB) and that the ByteOrder flag is not a no-op on the same wire buffer. (2) S7_Int32_ABCD_decodes_0x12345678_from_HR300 reads HR[300]=0x1234 + HR[301]=0x5678 with BigEndian and asserts the reassembled Int32 = 0x12345678; documents the contrast with DL260 CDAB Int32 encoding. (3) S7_DB1_fingerprint_marker_at_HR0_reads_0xABCD reads HR[0]=0xABCD -- real MB_SERVER deployments reserve DB1.DBW0 as a fingerprint so clients can verify they're pointing at the right DB, protecting against typos in the MB_SERVER.MB_HOLD_REG.DB_number parameter. No driver code changes -- the ByteOrder.BigEndian path has existed since PR 24; this PR exists to lock in the S7-specific semantics at the integration level so future refactors of NormalizeWordOrder can't silently break S7. All 3 tests gate on MODBUS_SIM_PROFILE=s7_1500 so they skip cleanly against dl205 or standard profiles. Verified end-to-end: 4/4 S7 integration tests pass (1 smoke from PR 56 + 3 new). No regression in driver unit tests. Per the per-quirk-PR plan: the S7 quirks NOT testable via pymodbus sim (MB_SERVER STATUS 0x8383 optimized-DB behavior, port-per-connection semantics, CP 343-1 Lean license rejection, STOP-mode non-determinism) remain in docs/v2/s7.md as design guidance for driver users rather than automated tests -- they're TIA-Portal-side or CP-hardware-side behaviors that pymodbus cannot reproduce without custom Python actions. 2026-04-18 22:58:44 -04:00
Joseph Doherty
10c724b5b6 Phase 3 PR 56 -- Siemens S7-1500 pymodbus profile + smoke integration test. Adds tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests/Pymodbus/s7_1500.json modelling the SIMATIC S7-1500 + MB_SERVER default deployment documented in docs/v2/s7.md: DB1.DBW0 = 0xABCD fingerprint marker (operators reserve this so clients can verify they're talking to the right DB), scratch HR range 200..209 for write-roundtrip tests mirroring dl205.json + standard.json, Float32 1.5f at HR[100..101] in ABCD word order (high word first -- OPPOSITE of DL260 CDAB), Int32 0x12345678 at HR[300..301] in ABCD. Also seeds a coil at bit-addr 400 (= cell 25 bit 0) and a discrete input at bit-addr 500 (= cell 31 bit 0) so future S7-specific tests for FC01/FC02 have stable markers. shared blocks=true to match the proven dl205.json pattern (pymodbus's bits/uint16 cells coexist cleanly when addresses don't collide). Write list references cells (0, 25, 100-101, 200-209, 300-301), not bit addresses -- pymodbus's write-range entries are cell-indexed, not bit-indexed. Adds tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests/S7/ directory with S7_1500Profile.cs (mirrors DL205Profile pattern: SmokeHoldingRegister=200, SmokeHoldingValue=4321, BuildOptions tags + probe-disabled + 2s timeout) and S7_1500SmokeTests.cs (single fact S7_1500_roundtrip_write_then_read_of_holding_register that writes SmokeHoldingValue then reads it back, asserting both write status 0 and read status 0 + value equality). Gates on MODBUS_SIM_PROFILE=s7_1500 so the test skips cleanly against other profiles. csproj updated to copy S7/** to test output as PreserveNewest (pattern matching DL205/**). Pymodbus/serve.ps1 ValidateSet extended from {standard,dl205} to {standard,dl205,s7_1500,mitsubishi} -- mitsubishi.json lands in PR 58 but the validator slot is claimed now so the serve.ps1 diff is one line in this PR and zero lines in future PRs. Verified end-to-end: smoke test 1/1 passes against the running pymodbus s7_1500 profile (localhost:5020 FC06 write of 4321 at HR[200] + FC03 read back). 143/143 Modbus.Tests pass, no regression in driver code because this PR is purely test-asset. Per-quirk S7 integration tests (ABCD word order default, FC23 IllegalFunction, MB_SERVER STATUS 0x8383 behaviour, port-per-connection semantics) land in PR 57+. 2026-04-18 22:57:03 -04:00
8c89d603e8 Merge pull request 'Phase 3 PR 55 -- Mitsubishi MELSEC Modbus TCP quirks research doc' (#54) from phase-3-pr55-mitsubishi-research-doc into v2 2026-04-18 22:54:09 -04:00
299bd4a932 Merge pull request 'Phase 3 PR 54 -- Siemens S7 Modbus TCP quirks research doc' (#53) from phase-3-pr54-s7-research-doc into v2 2026-04-18 22:54:02 -04:00
Joseph Doherty
c506ea298a Phase 3 PR 55 -- Mitsubishi MELSEC Modbus TCP quirks research document. 451-line doc at docs/v2/mitsubishi.md mirroring the docs/v2/dl205.md template for the MELSEC family (Q-series + QJ71MT91, L-series + LJ71MT91, iQ-R + RJ71EN71, iQ-R built-in Ethernet, iQ-F FX5U built-in, FX3U + FX3U-ENET / FX3U-ENET-P502, FX3GE built-in). Like Siemens S7, MELSEC Modbus is a patchwork of per-site-configured add-on modules rather than a fixed firmware stack, but the MELSEC-specific traps are different enough to warrant their own document. Key findings worth flagging for the PR 58+ implementation track: (1) MODULE NAMING TRAP -- QJ71MB91 is SERIAL RTU, not TCP. The Q-series TCP module is QJ71MT91. Driver docs + config UI should surface this clearly because the confusion costs operators hours when they try to connect to an RS-232 module via Ethernet. (2) NO CANONICAL MAPPING -- every MELSEC Modbus site has a unique 'Modbus Device Assignment Parameter' block of up to 16 assignments (each binding a MELSEC device range like D0..D1023 to a Modbus-address range); the driver must treat the mapping as runtime config, not device-family profile. (3) X/Y BASE DEPENDS ON FAMILY -- Q/L/iQ-R use HEX notation for X/Y (X20 = decimal 32), FX/iQ-F use OCTAL (X20 = decimal 16, same as DL260); iQ-F has a GX Works3 project toggle that can flip this. Single biggest off-by-N source in MELSEC driver code -- driver address helper must take a family selector. (4) Word order CDAB across Q/L/iQ-R/iQ-F by default (CPU-level, not module-level) -- no user-configurable swap on the server side. FX5U's SWAP instruction is for CLIENT mode only. Driver Mitsubishi profile default must be ByteOrder.WordSwap, matching DL260 but OPPOSITE of Siemens S7. (5) D-registers are BINARY by default (opposite of DL205's BCD-by-default). FNC 18 BCD / FNC 19 BIN instructions confirm binary-by-default in the ladder. Caller must explicitly opt-in to Bcd16/Bcd32 tags when the ladder stores BCD, same pattern as DL205 but the default is inverted. (6) FX5U FIRMWARE GATE -- needs firmware >= 1.060 for native Modbus TCP server; older firmware is client-only. Surface a clear capability error on connect. (7) FX3U PORT 502 SPLIT -- the standard FX3U-ENET cannot bind port 502 (lower port range restricted on the firmware); only FX3U-ENET-P502 can. FX3U-ENET-ADP has no Modbus at all and is a common operator mis-purchase -- driver should surface 'module does not support Modbus' as a distinct error, not 'connection refused'. (8) QJ71MT91 does NOT support FC22 (Mask Write) or FC23 (Read-Write Multiple). iQ-R and iQ-F do. Driver bulk-read optimization must gate on module capability. (9) MAX CONNECTIONS -- 16 simultaneous on Q/L/iQ-R, 8 on FX5U and FX3U-ENET. (10) STOP-mode writes -- configurable on Q/L/iQ-R/iQ-F (default = accept writes even in STOP), always rejected with exception 04 on FX3U-ENET. Per-model test differentiation section names the tests Mitsubishi_QJ71MT91_*, Mitsubishi_FX5U_*, Mitsubishi_FX3U_ENET_*, with a shared Mitsubishi_Common_* fixture for CDAB-word-order + binary-not-BCD + standard-exception-codes tests. 17 cited references including primary Mitsubishi manuals (SH-080446 for QJ71MT91, JY997D56101 for FX5, SH-081259 for iQ-R Ethernet, JY997D18101 for FX3U-ENET) plus Ignition / Kepware / Fernhill / HMS third-party driver release notes. Three unconfirmed rumours flagged explicitly: iQ-R RJ71EN71 early firmware rumoured ABCD word order (no primary source), QJ71MT91 firmware < 2010-05 FC15 odd-byte-count truncation (forum report only), FX3U-ENET firmware < 1.14 out-of-order TxId echoes under load (unreproducible on bench). Pure documentation PR -- no code, no tests. Per-quirk implementation lands in PRs 58+. Research conducted 2026-04-18. 2026-04-18 22:51:28 -04:00
Joseph Doherty
9e2b5b330f Phase 3 PR 54 -- Siemens S7 Modbus TCP quirks research document. 485-line doc at docs/v2/s7.md mirroring the docs/v2/dl205.md template for the Siemens SIMATIC S7 family (S7-1200 / S7-1500 / S7-300 / S7-400 / ET 200SP / CP 343-1 / CP 443-1 / CP 343-1 Lean / MODBUSPN). Siemens S7 is fundamentally different from DL260: there is no fixed Modbus memory map baked into firmware -- every deployment runs MB_SERVER (S7-1200/1500/ET 200SP), MODBUSCP (S7-300/400 + CP), or MODBUSPN (S7-300/400 PN) library blocks wired up to user DBs via the MB_HOLD_REG / ADDR parameters. The driver's job is therefore to handle per-site CONFIG rather than per-family QUIRKS, and the doc makes that explicit. Key findings worth flagging for the PR 56+ implementation track: (1) S7 has no fixed memory map -- must accept per-site DriverConfig, cannot assume vendor-standard layout. (2) MB_SERVER requires NON-optimized DBs in TIA Portal; optimized DBs cause the library to return STATUS 0x8383 on every access -- the single most common S7 Modbus deployment bug in the field. (3) Word order is ABCD by default (big-endian bytes + big-endian words) across all Siemens S7 Modbus paths, which is the OPPOSITE of DL260 CDAB -- the Modbus driver's S7 profile default must be ByteOrder.BigEndian, not WordSwap. (4) MB_SERVER listens on ONE port per FB instance; multi-client support requires running MB_SERVER on 502 / 503 / 504 / ... simultaneously -- most clients assume port 502 multiplexes, which is wrong on S7. (5) CP 343-1 Lean is SERVER-ONLY and requires the separate 2XV9450-1MB00 MODBUS TCP CP library license; client mode calls return immediate error on Lean. (6) MB_SERVER does NOT filter Unit ID, accepts any value. Means the driver can't use Unit ID to detect 'direct vs gateway' topology. (7) FC23 Read-Write Multiple, FC22 Mask Write, FC20/21 File Records, FC43 Device Identification all return exception 01 Illegal Function on every S7 variant -- the driver MUST NOT attempt bulk-read optimisation via FC23 when talking to S7. (8) STOP-mode read/write behaviour is non-deterministic across firmware bands: reads may return cached data (library internal buffer), writes may succeed-silently or return exception 04 depending on CPU firmware version -- flagged as 'driver treats both as unavailable, do not distinguish'. Unconfirmed rumours flagged separately: 'V2.0+ reverses float byte order' claim (cited but not reproduced), STOP-mode caching location (folklore, no primary source). Per-model test differentiation section names the tests as S7_<model>_<behavior> matching the DL205 template convention (e.g. S7_1200_MB_SERVER_requires_non_optimized_DB, S7_343_1_Lean_rejects_client_mode, S7_FC23_returns_IllegalFunction). 31 cited references across the Siemens Industry Online Support entry-ID system (68011496 for MB_SERVER FAQ, etc.), TIA Portal library manuals, and three third-party driver vendor release notes (Kepware, Ignition, FactoryTalk). This is a pure documentation PR -- no code, no tests, no csproj changes. Per-quirk implementation lands in PRs 56+. Research conducted 2026-04-18 against latest publicly-available Siemens documentation; STOP-mode behaviour and MB_SERVER versioning specifically cross-checked against Siemens forum answers from 2024-2025. 2026-04-18 22:50:51 -04:00
d5c6280333 Merge pull request 'Phase 3 PR 53 -- Transport reconnect-on-drop + SO_KEEPALIVE (DL260 no-keepalive quirk)' (#52) from phase-3-pr53-dl205-reconnect into v2 2026-04-18 22:35:40 -04:00
476ce9b7c5 Merge pull request 'Phase 3 PR 52 -- Modbus exception-code -> OPC UA StatusCode translation' (#51) from phase-3-pr52-dl205-exception-codes into v2 2026-04-18 22:35:33 -04:00
954bf55d28 Merge pull request 'Phase 3 PR 51 -- DL260 X-input FC02 discrete-input mapping end-to-end test' (#50) from phase-3-pr51-dl205-xinput into v2 2026-04-18 22:35:25 -04:00
9fb3cf7512 Merge pull request 'Phase 3 PR 50 -- DL260 bit-memory helpers (Y/C/X/SP) + coil integration tests' (#49) from phase-3-pr50-dl205-coil-mapping into v2 2026-04-18 22:35:18 -04:00
Joseph Doherty
793c787315 Phase 3 PR 53 -- Transport reconnect-on-drop + SO_KEEPALIVE for DL205 no-keepalive quirk. AutomationDirect H2-ECOM100 does NOT send TCP keepalives per docs/v2/dl205.md behavioral-oddities section -- any NAT/firewall device between the gateway and the PLC can silently close an idle socket after 2-5 minutes of inactivity. The PLC itself never notices and the first SendAsync after the drop would previously surface as IOException / EndOfStreamException / SocketException to the caller even though the PLC is perfectly healthy. PR 53 makes ModbusTcpTransport survive mid-session socket drops: SendAsync wraps the previous body as SendOnceAsync; on the first attempt, if the failure is a socket-layer error (IOException, SocketException, EndOfStreamException, ObjectDisposedException) AND autoReconnect is enabled (default true), the transport tears down the dead socket, calls ConnectAsync to re-establish, and resends the PDU exactly once. Deliberately single-retry -- further failures propagate so the driver health surface reflects the real state, no masking a dead PLC. Protocol-layer failures (e.g. ModbusException with exception code 02) are specifically NOT caught by the reconnect path -- they would just come back with the same exception code after the reconnect, so retrying is wasted wire time. Socket-level vs protocol-level is a discriminator inside IsSocketLevelFailure. Also enables SO_KEEPALIVE on the TcpClient with aggressive timing: TcpKeepAliveTime=30s, TcpKeepAliveInterval=10s, TcpKeepAliveRetryCount=3. Total time-to-detect-dead-socket = 30 + 10*3 = 60s, vs the Windows default 2-hour idle + 9 retries = 2h40min. Best-effort: older OSes that don't expose the fine-grained keepalive knobs silently skip them (catch {}). New ModbusDriverOptions.AutoReconnect bool (default true) threads through to the default transport factory in ModbusDriver -- callers wanting the old 'fail loud on drop' behavior can set AutoReconnect=false, or use a custom transportFactory that ignores the option. Unit tests: ModbusTcpReconnectTests boots a FlakeyModbusServer in-process (real TcpListener on loopback) that serves one valid FC03 response then forcibly shuts down the socket. Transport_recovers_from_mid_session_drop_and_retries_successfully issues two consecutive SendAsync calls and asserts both return valid PDUs -- the second must trigger the reconnect path transparently. Transport_without_AutoReconnect_propagates_drop_to_caller asserts the legacy behavior when the opt-out is taken. Validates real socket semantics rather than mocked exceptions. 142/142 Modbus.Tests pass (113 prior + 2 mapper + 2 reconnect + 25 accumulated across PRs 45-52); 11/11 DL205 integration tests still pass with MODBUS_SIM_PROFILE=dl205 -- no regression from the transport change. 2026-04-18 22:32:13 -04:00
Joseph Doherty
cde018aec1 Phase 3 PR 52 -- Modbus exception-code -> OPC UA StatusCode translation. Before this PR every server-side Modbus exception AND every transport-layer failure collapsed to BadInternalError (0x80020000) in the driver's Read/Write results, making field diagnosis 'is this a tag misconfig or a driver bug?' impossible from the OPC UA client side. PR 52 adds a MapModbusExceptionToStatus helper that translates per spec: 01 Illegal Function -> BadNotSupported (0x803D0000); 02 Illegal Data Address -> BadOutOfRange (0x803C0000); 03 Illegal Data Value -> BadOutOfRange; 04 Server Failure -> BadDeviceFailure (0x80550000); 05/06 Acknowledge/Busy -> BadDeviceFailure; 0A/0B Gateway -> BadCommunicationError (0x80050000); unknown -> BadInternalError fallback. Non-Modbus failures (socket drop, timeout, malformed frame) in ReadAsync are now distinguished from tag-level faults: they map to BadCommunicationError so operators check network/PLC reachability rather than tag definitions. Why per-DL205: docs/v2/dl205.md documents DL205/DL260 returning only codes 01-04 with specific triggers -- exception 04 specifically means 'CPU in PROGRAM mode during a protected write', which is operator-recoverable by switching the CPU to RUN; surfacing it as BadDeviceFailure (not BadInternalError) makes the fix obvious. Changes in ModbusDriver: Read catch-chain now ModbusException first (-> mapper), generic Exception second (-> BadCommunicationError); Write catch-chain same pattern but generic Exception stays BadInternalError because write failures can legitimately come from EncodeRegister (out-of-range value) which is a driver-layer fault. Unit tests: MapModbusExceptionToStatus theory exercising every code in the table including the 0xFF fallback; Read_surface_exception_02_as_BadOutOfRange with an ExceptionRaisingTransport that forces code 02; Write_surface_exception_04_as_BadDeviceFailure for CPU-mode faults; Read_non_modbus_failure_maps_to_BadCommunicationError with a NonModbusFailureTransport that raises EndOfStreamException. 115/115 Modbus.Tests pass. Integration test: DL205ExceptionCodeTests.DL205_FC03_at_unmapped_register_returns_BadOutOfRange reads HR[16383] which is beyond the seeded uint16 cells on the dl205.json profile; pymodbus returns exception 02 and the driver surfaces BadOutOfRange. 11/11 DL205 integration tests pass with MODBUS_SIM_PROFILE=dl205. 2026-04-18 22:28:37 -04:00
Joseph Doherty
9892a0253d Phase 3 PR 51 -- DL260 X-input FC02 discrete-input mapping end-to-end test. Integration test DL205XInputTests reads FC02 at the DirectLogicAddress.XInputToDiscrete-resolved address and asserts two behaviors against the dl205.json pymodbus profile: (1) X20 octal (=decimal 16 = Modbus DI 16) reads ON, proving the helper correctly octal-parses the trailing number and adds it to the 0 base; (2) X21 octal reads OFF (not exception) -- per docs/v2/dl205.md §I/O-mapping, 'reading a non-populated X input returns zero, not an exception' on DL260, because the CPU sizes the discrete-input table to the configured I/O not the installed hardware. Pymodbus models this by returning the default 0 value for any DI bit in the configured 'di size' range that wasn't explicitly seeded, matching real DL260 behaviour. Test uses X20 rather than X0 to sidestep a shared-blocks conflict: pymodbus places FC01/FC02 bit-address 0..15 into cell 0, but cell 0 is already uint16-typed (V0 marker = 0xCAFE) per the register-zero quirk test, and shared-blocks semantics allow only one type per cell. X20 octal = DI 16 lands in cell 1 which is free, so both the V0 quirk AND the X-input quirk can coexist in one profile. dl205.json: bits cell 1 seeded value=9 (bits 0 and 3 set -> X20, X23 octal = ON), write-range extended to include cell 1 (though X-inputs are read-only; the write-range entry is required by pymodbus for ANY cell referenced in a bits section even if only reads are expected -- pymodbus validates write-access uniformly). 10/10 DL205 integration tests pass with MODBUS_SIM_PROFILE=dl205. No driver code changes -- the XInputToDiscrete helper + FC02 read path already landed in PRs 50 and 21 respectively. This PR closes the integration-test gap that docs/v2/dl205.md called out under test name DL205_Xinput_unpopulated_reads_as_zero. 2026-04-18 22:25:13 -04:00
Joseph Doherty
b5464f11ee Phase 3 PR 50 -- DL260 bit-memory address helpers (Y/C/X/SP) + live coil integration tests. Adds four new static helpers to DirectLogicAddress covering every discrete-memory bank on the DL260: YOutputToCoil (Y0=coil 2048), CRelayToCoil (C0=coil 3072), XInputToDiscrete (X0=DI 0), SpecialToDiscrete (SP0=DI 1024). Each helper takes the DirectLOGIC ladder-logic address (e.g. 'Y0', 'Y17', 'C1777') and adds the octal-decoded offset to the bank's Modbus base per the DL260 user manual's I/O-configuration chapter table. Uses the same 'octal-walk + reject 8/9' pattern as UserVMemoryToPdu so misaligned addresses fail loudly with a clear ArgumentException rather than silently hitting the wrong coil. Fixes a pymodbus-config bug surfaced during integration-test validation: dl205.json had bits entries at cell indices 2048 / 3072 / 4000, but pymodbus's ModbusSimulatorContext.validate divides bit addresses by 16 before indexing into the shared cell array -- so Modbus coil 2048 reads cell 128, not cell 2048. The sim was returning Illegal Data Address (exception 02) for every bit read in the Y/C/scratch range. Moved bits entries to cells 128 (Y bank marker = 0b101 for Y0=ON, Y1=OFF, Y2=ON), 192 (C bank marker = 0b101 for C0/C1/C2), 250 (scratch cell covering coils 4000..4015). write list updated to the correct cell addresses. Unit tests: YOutputToCoil theory sweep (Y0->2048, Y1->2049, Y7->2055, Y10->2056 octal-to-decimal, Y17->2063, Y777->2559 top of DL260 Y range), CRelayToCoil theory (C0->3072 through C1777->4095), XInputToDiscrete theory, SpecialToDiscrete theory (with case-insensitive 'SP' prefix). Bit_address_rejects_non_octal_digits (Y8/C9/X18), Bit_address_rejects_empty, accepts_lowercase_prefix, accepts_bare_octal_without_prefix. 48/48 Modbus.Tests pass. Integration tests: DL205CoilMappingTests with three facts -- DL260_Y0_maps_to_coil_2048 (FC01 at Y0 returns ON), DL260_C0_maps_to_coil_3072 (FC01 at C0 returns ON), DL260_scratch_Crelay_supports_write_then_read (FC05 write + FC01 read round-trip at coil 4000 proves the DL-mapped coil bank is fully read/write capable end-to-end). 9/9 DL205 integration tests pass against the pymodbus dl205 profile with MODBUS_SIM_PROFILE=dl205. Caller opts into the helpers per tag the same way as PR 47's V-memory helper -- pass DirectLogicAddress.YOutputToCoil("Y0") as the ModbusTagDefinition Address; no driver-wide DL-family flag. PR 51 adds the X-input read-side integration test (there's nothing to write since X-inputs are FC02 discrete inputs, read-only); PR 52 exception-code translation; PR 53 transport reconnect-on-drop since DL260 doesn't send TCP keepalives. 2026-04-18 22:22:42 -04:00
dae29f14c8 Merge pull request 'Phase 3 PR 49 -- Per-device FC03/FC16 register caps with auto-chunking' (#48) from phase-3-pr49-dl205-fc-caps into v2 2026-04-18 22:13:46 -04:00
f306793e36 Merge pull request 'Phase 3 PR 48 -- DL205 CDAB float word order end-to-end test' (#47) from phase-3-pr48-dl205-cdab-float into v2 2026-04-18 22:13:39 -04:00
9e61873cc0 Merge pull request 'Phase 3 PR 47 -- DL205 V-memory octal-address helper' (#46) from phase-3-pr47-dl205-vmemory into v2 2026-04-18 22:13:32 -04:00
1a60470d4a Merge pull request 'Phase 3 PR 46 -- DL205 BCD decoder' (#45) from phase-3-pr46-dl205-bcd into v2 2026-04-18 22:13:24 -04:00
635f67bb02 Merge pull request 'Phase 3 PR 45 -- DL205 string byte-order quirk' (#44) from phase-3-pr45-dl205-string-byte-order into v2 2026-04-18 22:12:15 -04:00
Joseph Doherty
a3f2f95344 Phase 3 PR 49 -- Per-device FC03/FC16 register caps with auto-chunking. Adds MaxRegistersPerRead (default 125, spec max) + MaxRegistersPerWrite (default 123, spec max) to ModbusDriverOptions. Reads that exceed the cap automatically split into consecutive FC03 requests: the driver dispatches chunks of [cap] regs at incrementing addresses, copies each response into an assembled byte[] buffer, and hands the full payload to DecodeRegister. From the caller's view a 240-char string read against a cap-100 device is still one Read() call returning one string -- the chunking is invisible, the wire shows N requests of cap-sized quantity plus one tail chunk. Writes are NOT auto-chunked. Splitting an FC16 across two transactions would lose atomicity -- mid-split crash leaves half the value written, which is strictly worse than rejecting upfront. Instead, writes exceeding MaxRegistersPerWrite throw InvalidOperationException with a message naming the tag + cap + the caller's escape hatch (shorten StringLength or split into multiple tags). The driver catches the exception internally and surfaces it to IWritable as BadInternalError so the caller pattern stays symmetric with other failure modes. Per-family cap cheat-sheet (documented in xml-doc on the option): Modbus-TCP spec = 125 read / 123 write, AutomationDirect DL205/DL260 = 128 read / 100 write (128 exceeds spec byte-count capacity so in practice 125 is the working ceiling), Mitsubishi Q/FX3U = 64 / 64, Omron CJ/CS = 125 / 123. Not all PLCs reject over-cap requests cleanly -- some drop the connection silently -- so having the cap enforced client-side prevents the hard-to-diagnose 'driver just stopped' failure mode. Unit tests: Read_within_cap_issues_single_FC03_request (control: no unnecessary chunking), Read_above_cap_splits_into_two_FC03_requests (120 regs / cap 100 -> 100+20, asserts exact per-chunk (Address,Quantity) and end-to-end payload continuity starting with register[100] high byte = 'A'), Read_cap_honors_Mitsubishi_lower_cap_of_64 (100 regs / cap 64 -> 64+36), Write_exceeding_cap_throws_instead_of_splitting (110 regs / cap 100 -> status != 0 AND Fc16Requests.Count == 0 to prove nothing was sent), Write_within_cap_proceeds_normally (control: cap honored on short writes too). Tests use a new RecordingTransport that captures the (Address, Quantity) tuple of every FC03/FC16 request so the chunk layout is directly assertable -- the existing FakeTransport does not expose request history. 103/103 Modbus.Tests pass; 6/6 DL205 integration tests still pass against the live pymodbus dl205 profile with MODBUS_SIM_PROFILE=dl205. 2026-04-18 21:58:49 -04:00
Joseph Doherty
463c5a4320 Phase 3 PR 48 -- DL205 CDAB word order for Float32 end-to-end test. The driver has supported ModbusByteOrder.WordSwap (CDAB) since PR 24 for all multi-register types -- the underlying word-swap code path was already there. PR 48 closes the loop with an integration test that validates it end-to-end against the dl205 pymodbus profile: HR[1056..1057] stores IEEE-754 1.5f with the low word at the lower address (0x0000 at HR[1056], 0x3FC0 at HR[1057]). Reading with WordSwap returns 1.5f; reading with BigEndian returns a tiny denormal (~5.74e-41) -- a silent "value is 0" bug that typically surfaces in the field only when an operator notices a setpoint readout stuck at 0 while the PLC display shows the real value. Test asserts both: WordSwap==1.5f AND BigEndian!=1.5f, proving the flag is not a no-op. No driver code changes -- the word-swap normalization at NormalizeWordOrder() has handled Float32/Int32/UInt32 correctly since PR 24 and the unit test suite already covers it (Int32_WordSwap_decodes_CDAB_layout + Float32 equivalent). This PR exists primarily to lock in the integration-level validation so future refactors of the codec don't silently break DL205/DL260 floats. 6/6 DL205 integration tests pass with MODBUS_SIM_PROFILE=dl205. 2026-04-18 21:51:15 -04:00
Joseph Doherty
2b5222f5db Phase 3 PR 47 -- DL205 V-memory octal-address helper. Adds DirectLogicAddress static class with two entry points: UserVMemoryToPdu(string) parses a DirectLOGIC V-address (V-prefixed or bare, whitespace tolerated) as OCTAL and returns the 0-based Modbus PDU address. V2000 octal = decimal 1024 = PDU 0x0400, which is the canonical start of the user V-memory bank on DL205/DL260. SystemVMemoryBasePdu + SystemVMemoryToPdu(ushort offset) handle the system bank (V40400 and up) which does NOT follow the simple octal-to-decimal formula -- the CPU relocates the system bank to PDU 0x2100 in H2-ECOM100 absolute mode. A naive caller converting 40400 octal would land at PDU 0x4100 (decimal 16640) and miss the system registers entirely; the helper routes the correct 0x2100 base. Why this matters: DirectLOGIC operators think in OCTAL (the ladder-logic editor, the Productivity/Do-more UI, every AutomationDirect manual addresses V-memory octally) while the Modbus wire is DECIMAL. Integrators routinely copy V-addresses from the PLC documentation into client configs and read garbage because they treated V2000 as decimal 2000 (HR[2000] = 0 in the dl205 sim, zero in most PLCs). The helper makes the translation explicit per the D2-USER-M appendix + H2-ECOM-M \u00A76.5 references cited in docs/v2/dl205.md. Unit tests: UserVMemoryToPdu_converts_octal_V_prefix (V0, V1, V7, V10, V2000, V7777, V10000, V17777 -- the exact sweep documented in dl205.md), UserVMemoryToPdu_accepts_bare_or_prefixed_or_padded (case + whitespace tolerance), UserVMemoryToPdu_rejects_non_octal_digits (V8/V19/V2009 must throw ArgumentException with 'octal' in the message -- .NET has no base-8 int.Parse so we hand-walk digits to catch 8/9 instead of silently accepting them), UserVMemoryToPdu_rejects_empty_input, UserVMemoryToPdu_overflow_rejected (200000 octal = 0x10000 overflows ushort), SystemVMemoryBasePdu_is_0x2100_for_V40400, SystemVMemoryToPdu_offsets_within_bank, SystemVMemoryToPdu_rejects_overflow. 23/23 Modbus.Tests pass. Integration tests against dl205.json pymodbus profile: DL205_V2000_user_memory_resolves_to_PDU_0x0400_marker (reads HR[0x0400]=0x2000), DL205_V40400_system_memory_resolves_to_PDU_0x2100_marker (reads HR[0x2100]=0x4040). 5/5 DL205 integration tests pass. Caller opts into the helper per tag by calling DirectLogicAddress.UserVMemoryToPdu("V2000") as the ModbusTagDefinition Address -- no driver-wide "DL205 mode" flag needed, because users mix DL and non-DL tags in a single driver instance all the time. 2026-04-18 21:49:58 -04:00
Joseph Doherty
8248b126ce Phase 3 PR 46 -- DL205 BCD decoder (binary-coded-decimal numeric encoding). Adds ModbusDataType.Bcd16 and Bcd32 to the driver. Bcd16 is 1 register wide, Bcd32 is 2 registers wide; Bcd32 respects ModbusByteOrder (BigEndian/WordSwap) the same way Int32 does so the CDAB-style families (including DL205/DL260 themselves) can be configured. DecodeRegister uses the new internal DecodeBcd helper: walks each nibble from MSB to LSB, multiplies the running result by 10, adds the nibble as a decimal digit. Explicitly rejects nibbles > 9 with InvalidDataException -- hardware sometimes produces garbage during write-in-progress transitions and silently returning wrong numeric values would quietly corrupt the caller's data. EncodeRegister's new EncodeBcd inverts the operation (mod/div by 10 nibble-by-nibble) with an up-front overflow check against 10^nibbles-1. Why this matters for DL205/DL260: AutomationDirect DirectLOGIC uses BCD as the default numeric encoding for timers, counters, and operator-display numerics (not binary). A plain Int16 read of register 0x1234 returns 4660; the BCD path returns 1234. The two differ enough that silently defaulting to Int16 would give wildly wrong HMI values -- the caller must opt in to Bcd16/Bcd32 per tag. Unit tests: DecodeBcd (theory: 0,1,9,10,1234,9999), DecodeBcd_rejects_nibbles_above_nine, EncodeBcd (theory), Bcd16_decodes_DL205_register_1234_as_decimal_1234 (control: same bytes as Int16 decode to 4660), Bcd16_encode_round_trips_with_decode, Bcd16_encode_rejects_out_of_range_values, Bcd32_decodes_8_digits_big_endian, Bcd32_word_swap_handles_CDAB_layout, Bcd32_encode_round_trips_with_decode, Bcd_RegisterCount_matches_underlying_width. 66/66 Modbus.Tests pass. Integration test: DL205BcdQuirkTests.DL205_BCD16_decodes_HR1072_as_decimal_1234 against dl205.json pymodbus profile (HR[1072]=0x1234). Asserts Bcd16 decode=1234 AND Int16 decode=0x1234 on the same wire bytes to prove the paths are distinct. 3/3 DL205 integration tests pass with MODBUS_SIM_PROFILE=dl205. 2026-04-18 21:46:25 -04:00
Joseph Doherty
cd19022d19 Phase 3 PR 45 -- DL205 string byte-order quirk (low-byte-first ASCII packing). Adds ModbusStringByteOrder enum {HighByteFirst, LowByteFirst} + StringByteOrder field on ModbusTagDefinition (default HighByteFirst, the standard Modbus convention). DecodeRegister + EncodeRegister String branches now respect per-tag byte order. Under LowByteFirst each register packs the first char in the low byte instead of the high byte -- the AutomationDirect DirectLOGIC DL205/DL260/DL350 family's headline string quirk. Without the flag the driver decodes 'eHllo' garbage from HR[1040..1042] even though wire bytes are identical. Unit tests: String_LowByteFirst_decodes_DL205_packed_Hello (5 chars across 3 regs with nul pad), String_LowByteFirst_decode_truncates_at_first_nul, String_LowByteFirst_encode_round_trips_with_decode (asserts exact DL205-documented byte sequence {0x65,0x48,0x6C,0x6C,0x00,0x6F} + symmetric encode->decode), String_HighByteFirst_and_LowByteFirst_differ_on_same_wire (control: same wire, different flag => different decode). 56/56 Modbus.Tests pass. Integration test: DL205StringQuirkTests.DL205_string_low_byte_first_decodes_Hello_from_HR1040 against the dl205.json pymodbus profile; reads HR[1040..1042] with both flags on the same tag map and asserts LowByteFirst='Hello' + HighByteFirst!='Hello'. Gated on MODBUS_SIM_PROFILE=dl205 since the standard profile doesn't seed HR[1040..1042]. Verified 2/2 integration tests pass against running pymodbus dl205 simulator. Baseline for PR 46 (BCD decoder), PR 47 (V-memory octal helper), PR 48 (CDAB float order), PR 49 (FC03/FC16 per-device caps) -- each lands its own DL205_<behavior> test class in tests/.../DL205/. 2026-04-18 21:43:32 -04:00
5ee9acb255 Merge pull request 'Phase 3 PR 44 -- pymodbus validation + IPv4-explicit transport bugfix' (#43) from phase-3-pr44-pymodbus-validation-fixes into v2 2026-04-18 21:39:24 -04:00
Joseph Doherty
02fccbc762 Phase 3 PR 43 — followup commit: validate pymodbus simulator end-to-end + fix three real bugs surfaced by running it. winget-installed Python 3.12.10 + pip-installed pymodbus[simulator]==3.13.0 on the dev box; both profiles boot cleanly, the integration-suite smoke test passes against either profile.
Three substantive issues caught + fixed during the validation pass:
1. pymodbus rejects unknown keys at device-list / setup level. My PR 43 commit had `_layout_note`, `_uint16_layout`, `_bits_layout`, `_write_note` device-level JSON-comment fields that crashed pymodbus startup with `INVALID key in setup`. Removed all device-level _* fields. Inline `_quirk` keys WITHIN individual register entries are tolerated by pymodbus 3.13.0 — kept those in dl205.json since they document the byte math per quirk and the README + git history aren't enough context for a hand-author reading raw integer values. Documented the constraint in the top-level _comment of each profile.
2. pymodbus rejects sweeping `write` ranges that include any cell not assigned a type. My initial standard.json had `write: [[0, 2047]]` but only seeded HR[0..31] + HR[100] + HR[200..209] + bits[1024..1109] — pymodbus blew up on cell 32 (gap between HR[31] and HR[100]). Fixed by listing per-block write ranges that exactly mirror the seeded ranges. Same fix in dl205.json (was `[[0, 16383]]`).
3. pymodbus simulator stores all 4 standard Modbus tables in ONE underlying cell array — each cell can only be typed once (BITS or UINT16, not both). My initial standard.json had `bits[0..31]` AND `uint16[0..31]` overlapping at the same addresses; pymodbus crashed with `ERROR "uint16" <Cell> used`. Fixed by relocating coils to address 1024+, well clear of the uint16 entries at 0..209. Documented the layout constraint in the standard.json top-level _comment.
Substantive driver bug fixed: ModbusTcpTransport.ConnectAsync was using `new TcpClient()` (default constructor — dual-stack, IPv6 first) then `ConnectAsync(host, port)` with the user's hostname. .NET's TcpClient default-resolves "localhost" to ::1 first, fails to connect to pymodbus (which binds 0.0.0.0 IPv4-only), and only then retries IPv4 — the failure surfaces as the entire ConnectAsync timeout (2s by default) before the IPv4 attempt even starts. PR 30's smoke test silently SKIPPED because the fixture's TCP probe hit the same dual-stack ordering and timed out. Both fixed: ModbusSimulatorFixture probe now resolves Dns.GetHostAddresses, prefers AddressFamily.InterNetwork, dials IPv4 explicitly. ModbusTcpTransport does the same — resolves first, prefers IPv4, falls back to whatever Dns returns (handles IPv6-only hosts in the future). This is a real production-readiness fix because most Modbus PLCs are IPv4-only — a generic dual-stack TcpClient would burn the entire connect timeout against any IPv4-only PLC, masquerading as a connection failure when the PLC is actually fine.
Smoke-test address shifted HR[100] -> HR[200]. Standard.json's HR[100] is the auto-incrementing register that drives subscribe-and-receive tests, so write-then-read against it would race the increment. HR[200] is the first cell of a writable scratch range present in BOTH simulator profiles. DL205Profile.cs xml-doc updated to explain the shift; tag name "DL205_Smoke_HReg100" -> "Smoke_HReg200" + smoke test references updated. dl205.json gains a matching scratch HR[200..209] range so the smoke test runs identically against either profile.
Validation matrix:
- standard.json boot: clean (TCP 5020 listening within ~3s of pymodbus.simulator launch).
- dl205.json boot: clean.
- pymodbus client direct FC06 to HR[200]=1234 + FC03 read: round-trip OK.
- raw-bytes PowerShell TcpClient FC06 + 12-byte response: matches FC06 spec (echo of address + value).
- DL205SmokeTest against standard.json: 1/1 pass (was failing as 'BadInternalError' due to the dual-stack timeout + tag-name typo — both fixed).
- DL205SmokeTest against dl205.json: 1/1 pass.
- Modbus.Tests Unit suite: 52/52 pass — dual-stack transport fix is non-breaking.
- Solution build clean.
Memory + future-PR setup: pymodbus install + activation pattern is now bullet-pointed at the top of Pymodbus/README.md so future PRs (the per-quirk DL205_<behavior> tests in PR 44+) don't have to repeat the trial-and-error of getting the simulator + integration tests cooperating. The three bugs above are documented inline in the JSON profiles + ModbusTcpTransport so they don't bite again.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 21:14:02 -04:00
faeab34541 Merge pull request 'Phase 3 PR 43 — Swap ModbusPal to pymodbus for the integration-test simulator' (#42) from phase-3-pr43-pymodbus-swap into v2 2026-04-18 20:52:46 -04:00
Joseph Doherty
a05b84858d Phase 3 PR 43 — Swap ModbusPal to pymodbus for the integration-test simulator. Replaces the .xmpp profiles shipped in PR 42 with pymodbus 3.13.0 ModbusSimulatorServer JSON configs in tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests/Pymodbus/. Substantive reasons for the swap (rationale block in the test-plan doc): ModbusPal 1.6b is abandoned (last release ~2019), Java GUI-only with no headless mode in the official JAR, and only exposes 2 of the 4 standard Modbus tables (holding_registers + coils — no input_registers, no discrete_inputs). pymodbus is current stable, pure Python CLI (pip install pymodbus[simulator]==3.13.0), exposes all four tables, has built-in declarative actions (increment / random / timestamp / uptime) for dynamic registers, supports custom Python actions for anything more complex, and ships an optional aiohttp-based web UI / REST API for live inspection. Pip-installable on Windows; sidesteps the privileged-port admin requirement by defaulting to TCP 5020.
ModbusSimulatorFixture default port bumped from 502 to 5020 to match the pymodbus convention. Override via MODBUS_SIM_ENDPOINT for a real PLC on its native 502. Skip-message updated to point at the new Pymodbus\serve.ps1 wrapper instead of 'start ModbusPal'. csproj <None Update> rule swapped from ModbusPal/** to Pymodbus/** so the new JSON profiles + serve.ps1 + README copy to test-output as PreserveNewest.
standard.json — generic Modbus TCP server, slave id 1, port 5020, shared blocks=false (independent coils + HR address spaces, more textbook-PLC-like). HR[0..31] seeded with address-as-value via per-register uint16 entries, HR[100] auto-increments via the built-in increment action with parameters minval=0/maxval=65535 (drives subscribe-and-receive integration tests so they have a register that ticks without a write — pymodbus's increment ticks per-access not wall-clock, which is good enough for a 250ms-poll test), HR[200..209] scratch range left at 0 for write tests, coils 0..31 alternating, coils 100..109 scratch. write list covers 0..1023 so any test address is mutable.
dl205.json — AutomationDirect DirectLOGIC DL205/DL260 quirk simulator, slave id 1, port 5020, shared blocks=true (matches DL series memory model where coils/DI/HR overlay the same word address space). Each quirky register seeded with the pre-computed raw uint16 value documented in docs/v2/dl205.md, with an inline _quirk JSON-comment naming the behavior so future-me reading the file knows why HR[1040]=25928 means 'H' lo / 'e' hi (the user's headline string-byte-order finding). Encoded quirks: V0 marker at HR[0]=0xCAFE; V2000 at HR[1024]=0x2000; V40400 at HR[8448]=0x4040; 'Hello' string at HR[1040..1042] first-char-low-byte; Float32 1.5f at HR[1056..1057] in CDAB word order (low word first); BCD register at HR[1072]=0x1234; FC03-128-cap block at HR[1280..1407]; Y0/C0 coil markers at 2048/3072; scratch C-relays at 4000..4007.
serve.ps1 wrapper — pwsh script with a -Profile {standard|dl205} parameter switch. Validates pymodbus.simulator is on PATH (clearer message than the raw CommandNotFoundException), validates the profile JSON exists, builds the right --modbus_server/--modbus_device/--json_file/--http_port arg list, and execs pymodbus.simulator in the foreground. -HttpPort 0 disables the web UI. Foreground exec lets the operator Ctrl+C to stop without an extra control script.
README.md fully rewritten for pymodbus: install command (pip install 'pymodbus[simulator]==3.13.0' — pinned for reproducibility, [simulator] extra pulls aiohttp), per-profile reference tables, the same DL205 quirk → register table from PR 42 but adjusted for pymodbus paths, what's-NEW-vs-ModbusPal section (all four tables, raw uint16 seeding, declarative actions, custom Python action modules, headless, web UI, maintained), trade-offs section (float32-as-two-uint16s for explicit CDAB control, increment ticks per-access not wall-clock, shared-blocks mode for DL205 vs separate for Standard), file-format quick reference for hand-authoring more profiles. References pinned to the pymodbus readthedocs simulator/config + REST API pages.
docs/v2/modbus-test-plan.md harness section rewritten with the swap rationale; PR-history list updated to mark PR 42 SUPERSEDED by PR 43 and call out PR 44+ as the per-quirk implementation track. Test-conventions bullet about 'don't depend on ModbusPal state between tests' generalized to 'don't depend on simulator state' and a note added that pymodbus's REST API can reset state between facts if a test ever needs it.
DL205Profile.cs and DL205SmokeTests.cs xml-doc updated to reference pymodbus / dl205.json instead of ModbusPal / DL205.xmpp.
Functional validation deferred — Python isn't installed on this dev box (winget search returned no matches for Python.Python.3 exact). JSON parses structurally (PowerShell ConvertFrom-Json clean on both files), build clean, .json + serve.ps1 + README all copy to test-output as expected. User installs pymodbus when they want to actually run the simulator end-to-end; if pymodbus rejects the config the README's reference link to pymodbus's simulator/config schema doc is the right next stop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 20:35:26 -04:00
c59ac9e52d Merge pull request 'Phase 3 PR 42 — ModbusPal simulator profiles for Standard + DL205/DL260' (#41) from phase-3-pr42-modbuspal-profiles into v2 2026-04-18 20:12:39 -04:00
Joseph Doherty
02a0e8efd1 Phase 3 PR 42 — ModbusPal simulator profiles for Standard Modbus + DL205/DL260 quirks. Two hand-authored .xmpp profiles in tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests/ModbusPal/ that integration tests load via the GUI to drive the suite without a real PLC. Both well-formed XML (verified via PowerShell [xml] cast); both copied to test-output as PreserveNewest content per the existing csproj rule.
Standard.xmpp — generic Modbus TCP server on port 502, slave id 1. HR[0..31] seeded with address-as-value (HR[5]=5 — easy mental map for diagnostics), HR[100] auto-incrementing via a 1Hz LinearGenerator binding (drives subscribe-and-receive integration tests so they have a register that actually changes without a write), HR[200..209] scratch range for write-roundtrip tests, coils 0..31 alternating on/off, coils 100..109 scratch. The Tick automation runs 0..65535 over 60s looping; bound to HR[100] via Binding_SINT16 — slow enough that a 250ms-poll integration test sees discrete jumps, fast enough that a 5s subscribe test sees several change notifications.
DL205.xmpp — AutomationDirect DirectLOGIC DL205/DL260 quirk simulator on port 502, slave id 1, modeling the behaviors documented in docs/v2/dl205.md as concrete register values so DL205 integration tests can assert each quirk WITHOUT a live PLC. Per-quirk encoding: V0 marker at HR[0]=0xCAFE proves register 0 is valid (rejects-register-0 rumour disproved); V2000 marker at HR[1024]=0x2000 proves V-memory octal-to-decimal mapping; V40400 marker at HR[8448]=0x4040 proves V40400→PDU 0x2100 (NOT register 0, contrary to the widespread shorthand); 'Hello' string at HR[1040..1042] packed first-char-low-byte (HR[1040]=0x6548 = 'H' lo + 'e' hi, HR[1041]=0x6C6C, HR[1042]=0x006F) — the headline string-byte-order quirk the user flagged; Float32 1.5f at HR[1056..1057] in CDAB word order (low word first: 0, then 0x3FC0); BCD register at HR[1072]=0x1234 representing decimal 1234 in BCD nibbles (NOT binary 0x04D2); 128-register block at HR[1280..1407] for FC03-128-cap testing; Y0 marker at coil 2048, C0 marker at coil 3072, scratch C-coils at 4000..4007 for write tests.
Critical limitation flagged inline + in README: ModbusPal 1.6b CANNOT represent the DL205 quirks semantically — it has no string binding, no BCD binding, no arbitrary-byte-layout binding (only SINT16/SINT32/FLOAT32 with word-order). So every DL205 quirk is encoded as a pre-computed raw 16-bit integer with the math worked out in inline comments above each register. Becomes unreadable past ~50 quirky registers; the README's 'alternatives' section recommends switching to pymodbus when that threshold approaches (pymodbus's ModbusSimulatorServer has first-class headless + scriptable callbacks for byte-level layouts).
Other ModbusPal 1.6b limitations called out in README: only holding_registers + coils sections in the official build (no input_registers / discrete_inputs — DL260 X-input markers can't be encoded faithfully here, FC02/FC04 tests wait for a fork or pymodbus); abandoned project (last release 1.6b, active forks at SCADA-LTS/ModbusPal, ControlThings-io/modbuspal, mrhenrike/ModbusPalEnhanced); no headless mode in the official JAR (-loadFile / -hide flags only in source-built forks); CVE-2018-10832 XXE on .xmpp import (don't import untrusted profiles — the in-repo ones are author-controlled).
README.md updated with: per-profile description tables, getting-started (download jar + java -jar + GUI File>Load>Run), MODBUS_SIM_ENDPOINT env-var override doc, two reference tables documenting which HR / coil address encodes which DL205 quirk + which test name asserts it (the same DL205_<behavior> naming convention from docs/v2/modbus-test-plan.md), 4-row alternatives comparison (pymodbus / diagslave / ModbusMechanic / ModRSsim2) for when ModbusPal can no longer carry the load, and a quick-reference XML format table at the bottom for future-me hand-authoring more profiles.
Pure documentation + test-asset PR — no code changes. The integration tests that consume these profiles (the actual DL205_<behavior> facts) land one at a time in PR 43+ as user validates each quirk via ModbusPal on the bench.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 20:05:20 -04:00
7009483d16 Merge pull request 'Phase 3 PR 41 — Document AutomationDirect DL205 / DL260 Modbus quirks' (#40) from phase-3-pr41-dl205-quirks-doc into v2 2026-04-18 19:52:20 -04:00
Joseph Doherty
9de96554dc Phase 3 PR 41 — Document AutomationDirect DL205 / DL260 Modbus quirks. Adds docs/v2/dl205.md (~300 lines, 8 H2 sections, primary-source citations) covering every place the DL205/DL260 family diverges from textbook Modbus or has non-obvious behavior a generic client gets wrong. Replaces the placeholder _pending_ list in modbus-test-plan.md with a confirmed-behaviors table that doubles as the integration-test roadmap.
The user explicitly flagged that DL205/DL260 strings don't follow Modbus convention; research turned up that and a lot more. Headline findings:
String packing — TWO chars per V-memory register but the FIRST char is in the LOW byte (opposite of the big-endian Modbus convention generic drivers default to). 'Hello' in V2000 reads back as 'eHll o\0' on a textbook decoder. Kepware's DirectLogic driver exposes a per-tag 'String Byte Order = Low/High' toggle specifically for this; we'll need the same. Null-terminated, no length prefix, no dedicated KSTR address space — strings live wherever ladder allocates them in V-memory.
V-memory addressing — DirectLOGIC's native V-memory is OCTAL (V2000, V40400) but Modbus is decimal. The CPU translates: V2000 octal = decimal 1024 = Modbus PDU 0x0400. The widespread 'V40400 = register 0' shorthand is wrong on modern firmware (that was DL05/DL06 relative mode); on H2-ECOM100 absolute mode (factory default) V40400 = PDU 0x2100. We'd surface this with an address-format helper in the device profile so operators write V2000 instead of computing 1024 by hand.
Word order CDAB for all 32-bit values — DL205 and DL260 agree, ECOM modules don't re-swap. Already supported via ModbusByteOrder.WordSwap; just needs to be the default in the DL205 profile.
BCD-as-default numeric storage — bit one I didn't expect. DirectLOGIC stores 'V2000 = 1234' as 0x1234 on the wire (BCD nibbles), not as 0x04D2 (decimal 1234). IEEE 754 Float32 only works when ladder used the explicit R type (LDR/OUTR instructions). We need a new decoder mode for BCD-encoded registers — current code assumes binary integers.
FC quantity caps — FC03/04 cap at 128 (above spec's 125 — Bonus territory, current code already respects 125), FC16 caps at 100 (BELOW spec's 123 — important bulk-write batching gotcha). Quantity overrun returns exception 03 IllegalDataValue.
Coil/discrete mappings — DL260: X0->discrete input 0, Y0->coil 2048, C0->coil 3072. SP specials at discrete input 1024-1535 RO. These are CPU-wired constants and cannot be remapped; need to be hardcoded in the DL205/DL260 device profile.
Register 0 — accepted on DL205/DL260 with ECOM in absolute mode, contrary to the widespread internet claim that 'DirectLOGIC rejects register 0'. That rumour was an older DL05/DL06 relative-mode artefact. Our ModbusProbeOptions.ProbeAddress default of 0 is therefore safe for DL205/DL260.
Exception codes — only the standard 01-04. Write-to-protected-bit returns 02 on newer firmware, 04 on older (firmware-transition revision unconfirmed); driver should map both to BadNotWritable. No proprietary exception codes.
Behavioral oddities — H2-ECOM100 accepts MAX 4 simultaneous TCP connections (5th refused at TCP accept). No TCP keepalive (intermediate NAT/firewall drops idle sockets after 2-5 min — periodic probe required). No mid-stream resync on malformed MBAP — driver must reconnect + replay. TxId-drop-under-load forum rumour is unconfirmed; our single-flight + TxId-match guard handles it either way.
Each H2 section ends with the integration-test names we'd ship per the modbus-test-plan.md DL205_<behavior> convention — twelve named test slots ready for PR 42+ to fill in one at a time. References (8) cited inline, primarily D2-USER-M, HA-ECOM-M, and the Kepware DirectLogic Ethernet driver manual which documents these vendor quirks explicitly because they have to cope with them.
modbus-test-plan.md DL205 section rewritten as a priority-ordered table with three columns (quirk / driver impact / test name), pointing the reader at dl205.md for the full reference. Operator-reported items separated into a tail subsection so future-me knows which behaviors are documented vs reproduced-on-hardware.
Pure documentation PR — no code changes. The actual driver work (string-byte-order option, BCD decoder mode, V-memory address helper, FC16 cap-per-device-family, multi-client TCP handling) lands one PR per quirk in PR 42+ as ModbusPal validation completes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 19:49:35 -04:00
af35fac0ef Merge pull request 'Phase 3 PR 40 — LiveStack write + subscribe tests against TestMachine_001' (#39) from phase-3-pr40-livestack-write-subscribe into v2 2026-04-18 19:41:55 -04:00
Joseph Doherty
aa8834a231 Phase 3 PR 40 — LiveStackSmokeTests: write-roundtrip + subscribe-receives-OnDataChange against the live Galaxy. Finishes LMX #5 by exercising the IWritable + ISubscribable capability paths end-to-end through the Proxy → OtOpcUaGalaxyHost service → MXAccess → real Galaxy.
Two new facts target DelmiaReceiver_001.TestAttribute — the writable Boolean UDA on the TestMachine_001 hierarchy in this dev Galaxy. The user nominated TestMachine_001 (the deployed test-target object) as a scratch surface for live testing; ZB query showed DelmiaReceiver_001 carries one dynamic_attribute named TestAttribute (mx_data_type=1=Boolean, lock_type=0=writable, security_classification=1=Operate). Naming makes the intent obvious — the attribute exists for exactly this kind of integration testing — and Boolean keeps the assertions simple (invert, write, read back).
Write_then_read_roundtrips_a_writable_Boolean_attribute_on_TestMachine_001: reads the current value as the baseline (Galaxy may return Uncertain quality until the Engine has scanned the attribute at least once — we don't read into a typed bool until Status is Good), inverts it, writes via IWritable, then polls reads in a 5s loop until either the new value comes back or the budget expires. The scan-window poll (rather than a single read after a fixed delay) accommodates Galaxy's variable scan latency on a fresh service start. Restore-on-finally writes the original value back so re-running the test doesn't accumulate a flipped TestAttribute on the dev box (Galaxy holds UDA values across runs since they're deployed). Best-effort restore — swallows exceptions so a failure in restore doesn't mask the primary assertion.
Subscribe_fires_OnDataChange_with_initial_value_then_again_after_a_write: subscribes to the same attribute with a 250ms publishing interval, captures every OnDataChange notification onto a thread-safe ConcurrentQueue (MXAccess advisory fires on its own thread per Galaxy's COM apartment model — must not block it), waits up to 5s for the initial-value callback (per ISubscribable's contract: 'driver MAY fire OnDataChange immediately with the current value'), records the queue depth as a baseline, writes the toggled value, waits up to 8s for at least one MORE notification, then searches the queue tail for the notification carrying the toggled value (initial value may appear multiple times before the write commits — looking at the tail finds the post-write delta even if the queue grew during the wait window). Unsubscribes on finally + restores baseline.
Both tests use Convert.ToBoolean(value ?? false) to defensively handle the Boxed-vs-typed quirk in MessagePack-deserialized Galaxy values — depending on the wire encoding the Boolean might come back as System.Boolean or System.Object boxing one. Convert.ToBoolean handles both. Same pattern in OnReadValue's existing usage.
WaitForAsync helper does the loop+budget pattern shared by both tests.
PR 40 is the code side of LMX #5's final two deferred facts. To actually run them green requires re-executing from a normal (non-admin) PowerShell — the elevated-shell skip from PR 39 fires correctly under bash + sc.exe-context (verified). lmx-followups.md #5 updated to note the new facts + the run command + the one remaining genuine follow-up (alarm-condition fact when an alarm-flagged attribute is deployed on TestMachine_001).
Test posture from elevated bash: 7 LiveStackSmokeTests facts discovered (was 5; +2 new), all skip cleanly with the elevation message. Build clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 19:38:34 -04:00
976e73e051 Merge pull request 'Phase 3 PR 39 — LiveStackFixture skip-with-reason for elevated shells' (#38) from phase-3-pr39-elevated-shell-skip into v2 2026-04-18 19:31:30 -04:00
Joseph Doherty
8fb3dbe53b Phase 3 PR 39 — LiveStackFixture pre-flight detect for elevated shell. The OtOpcUaGalaxyHost named-pipe ACL allows the configured SID but explicitly DENIES Administrators per decision #76 / PipeAcl.cs (production-hardening — keeps an admin shell on a deployed box from connecting to the IPC channel without going through the configured service principal). A test process running with a high-integrity elevated token carries the Administrators group in its security context regardless of whose user it 'is', so the deny rule trumps the user's allow and the pipe connect returns UnauthorizedAccessException at the prerequisite-probe stage. Functionally correct but operationally confusing — when this hit during the PR 38 install workflow it took five steps to diagnose ('the user IS in the allow list, why is the pipe denying access?'). The pre-existing ParityFixture (PR 18) already documents this with an explicit early-skip; LiveStackFixture (PR 37) didn't.
PR 39 closes the gap. New IsElevatedAdministratorOnWindows static helper (Windows-only via RuntimeInformation.IsOSPlatform; non-Windows hosts return false and let the prerequisite probe own the skip-with-reason path) checks WindowsPrincipal.IsInRole(WindowsBuiltInRole.Administrator) on the current process token. When true, InitializeAsync short-circuits to a SkipReason that names the cause directly: 'elevated token's Admins group membership trumps the allow rule — re-run from a NORMAL (non-admin) PowerShell window'. Catches and swallows any probe-side exception so a Win32 oddity can't crash the test fixture; failed probe falls through to the regular prerequisite path.
The check fires BEFORE AvevaPrerequisites.CheckAllAsync runs because the prereq probe's own pipe connect hits the same admin-deny and surfaces UnauthorizedAccessException with no context. Short-circuiting earlier saves the 10-second probe + produces a single actionable line.
Tests — verified manually from an elevated bash session against the just-installed OtOpcUaGalaxyHost service: skip message reads 'Test host is running with elevated (Administrators) privileges, but the OtOpcUaGalaxyHost named-pipe ACL explicitly denies Administrators per the IPC security design (decision #76 / PipeAcl.cs). Re-run from a NORMAL (non-admin) PowerShell window — even when your user is already in the pipe's allow list, the elevated token's Admins group membership trumps the allow rule.' Proxy.Tests Unit: 17 pass / 0 fail (unchanged — fixture change is non-breaking; existing tests don't run as admin in normal CI flow). Build clean.
Bonus: gitignored .local/ directory (a previous direct commit on local v2 that I'm now landing here) so per-install secrets like the Galaxy.Host shared-secret file don't leak into the repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 19:17:43 -04:00
Joseph Doherty
a61e637411 Gitignore .local/ directory for dev-only secrets like the Galaxy.Host shared secret. Created during the PR 38 / install-services workflow to keep per-install secrets out of the repo. 2026-04-18 19:15:13 -04:00
e4885aadd0 Merge pull request 'Phase 3 PR 38 — DriverNodeManager HistoryRead override (LMX #1 finish)' (#37) from phase-3-pr38-historyread-servicehandler into v2 2026-04-18 17:53:24 -04:00
Joseph Doherty
52a29100b1 Phase 3 PR 38 — DriverNodeManager HistoryRead override (LMX #1 finish). Wires the OPC UA HistoryRead service through CustomNodeManager2's four protected per-kind hooks — HistoryReadRawModified / HistoryReadProcessed / HistoryReadAtTime / HistoryReadEvents — each dispatching to the driver's IHistoryProvider capability (PR 35 for ReadAtTime + ReadEvents on top of PR 19-era ReadRaw + ReadProcessed). Was the last missing piece of the end-to-end HistoryRead path: PR 10 + PR 11 shipped the Galaxy.Host IPC contracts, PR 35 surfaced them on IHistoryProvider + GalaxyProxyDriver, but no server-side handler bridged OPC UA HistoryRead service requests onto the capability interface. Now it does.
Per-kind override shape: each hook receives the pre-filtered nodesToProcess list (NodeHandles for nodes this manager claimed), iterates them, resolves handle.NodeId.Identifier to the driver-side full reference string, and dispatches to the right IHistoryProvider method. Write back into the outer results + errors slots at handle.Index (not the local loop counter — nodesToProcess is a filtered subset of nodesToRead, so indexing by the loop counter lands in the wrong slot for mixed-manager batches). WriteResult helper sets both results[i] AND errors[i]; this matters because MasterNodeManager merges them and leaving errors[i] at its default (BadHistoryOperationUnsupported) overrides a Good result with Unsupported on the wire — this was the subtle failure mode that masked a correctly-constructed HistoryData response during debugging. Failure-isolation per node: NotSupportedException from a driver that doesn't implement a particular HistoryProvider method translates to BadHistoryOperationUnsupported in that slot; generic exceptions log and surface BadInternalError; unresolvable NodeIds get BadNodeIdUnknown. The batch continues unconditionally.
Aggregate mapping: MapAggregate translates ObjectIds.AggregateFunction_Average / Minimum / Maximum / Total / Count to the driver's HistoryAggregateType enum. Null for anything else (e.g. TimeAverage, Interpolative) so the handler surfaces BadAggregateNotSupported at the batch level — per Part 13, one unsupported aggregate means the whole request fails since ReadProcessedDetails carries one aggregate list for all nodes. BuildHistoryData wraps driver DataValueSnapshots as Opc.Ua.HistoryData in an ExtensionObject; BuildHistoryEvent wraps HistoricalEvents as Opc.Ua.HistoryEvent with the canonical BaseEventType field list (EventId, SourceName, Message, Severity, Time, ReceiveTime — the order OPC UA clients that didn't customize the SelectClause expect). ToDataValue preserves null SourceTimestamp (Galaxy historian rows often carry only ServerTimestamp) — synthesizing a SourceTimestamp would lie about actual sample time.
Two address-space changes were required to make the stack dispatch reach the per-kind hooks at all: (1) historized variables get AccessLevels.HistoryRead added to their AccessLevel byte — the base's early-gate check on (variable.AccessLevel & HistoryRead != 0) was rejecting requests before our override ever ran; (2) the driver-root folder gets EventNotifiers.HistoryRead | SubscribeToEvents so HistoryReadEvents can target it (the conventional pattern for alarm-history browse against a driver-owned object). Document the 'set both bits' requirement inline since it's not obvious from the surface API.
OpcHistoryReadResult alias: Opc.Ua.HistoryReadResult (service-layer per-node result) collides with Core.Abstractions.HistoryReadResult (driver-side samples + continuation point) by type name; the alias 'using OpcHistoryReadResult = Opc.Ua.HistoryReadResult' keeps the override signatures unambiguous and the test project applies the mirror pattern for its stub driver impl.
Tests — DriverNodeManagerHistoryMappingTests (12 new Category=Unit cases): MapAggregate translates each supported aggregate NodeId via reflection-backed theory (guards against the stack renaming AggregateFunction_* constants); returns null for unsupported NodeIds (TimeAverage) and null input; BuildHistoryData wraps samples with correct DataValues + SourceTimestamp preservation; BuildHistoryEvent emits the 6-element BaseEventType field list in canonical order (regression guard for a future 'respect the client's SelectClauses' change); null SourceName / Message translate to empty-string Variants (nullable-Variant refactor trap); ToDataValue preserves StatusCode + both timestamps; ToDataValue leaves SourceTimestamp at default when the snapshot omits it. HistoryReadIntegrationTests (5 new Category=Integration): drives a real OPC UA client Session.HistoryRead against a fake HistoryDriver through the running server. Covers raw round-trip (verifies per-node DataValue ordering + values); processed with Average aggregate (captures the driver's received aggregate + interval, asserting MapAggregate routed correctly); unsupported aggregate (TimeAverage → BadAggregateNotSupported); at-time (forwards the per-timestamp list); events (BaseEventType field list shape, SelectClauses populated to satisfy the stack's filter validator). Server.Tests Unit: 55 pass / 0 fail (43 prior + 12 new mapping). Server.Tests Integration: 14 pass / 0 fail (9 prior + 5 new history). Full solution build clean, 0 errors.
lmx-followups.md #1 updated to 'DONE (PRs 35 + 38)' with two explicit deferred items: continuation-point plumbing (driver returns null today so pass-through is fine) and per-SelectClause evaluation in HistoryReadEvents (clients with custom field selections get the canonical BaseEventType layout today).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:50:23 -04:00
19bcf20fbe Merge pull request 'Phase 3 PR 37 — End-to-end live-stack Galaxy smoke test' (#36) from phase-3-pr37-live-stack-smoke into v2 2026-04-18 16:56:50 -04:00
Joseph Doherty
8adc8f5ab8 Phase 3 PR 37 — End-to-end live-stack Galaxy smoke test. Closes the code side of LMX follow-up #5; once OtOpcUaGalaxyHost is installed + started on the dev box, the suite exercises the full topology GalaxyProxyDriver in-process → named-pipe IPC → running OtOpcUaGalaxyHost Windows service → MxAccessGalaxyBackend → live MXAccess runtime → real deployed Galaxy objects. Never spawns the Host process itself — connects to the already-running service per project_galaxy_host_service.md, which is the only way to exercise the production COM-apartment + service-account + pipe-ACL configuration.
LiveStackConfig resolves the pipe name + per-install shared secret from two sources in order: OTOPCUA_GALAXY_PIPE + OTOPCUA_GALAXY_SECRET env vars first (for CI / benchwork overrides), then the service's per-process Environment registry values under HKLM\SYSTEM\CurrentControlSet\Services\OtOpcUaGalaxyHost (what Install-Services.ps1 writes at install time). Registry read requires the test host to run elevated on most boxes — the skip message says so explicitly so operators see the right remediation. Hard-coded secrets are deliberately avoided: the installer generates 32 fresh random bytes per install, a committed secret would diverge from production the moment the service is re-installed.
LiveStackFixture is an IAsyncLifetime that (1) runs AvevaPrerequisites.CheckAllAsync with CheckGalaxyHostPipe=true + CheckHistorian=false — produces a structured PrerequisiteReport whose SkipReason is the exact operator-facing 'here's what you need to fix' text, (2) resolves LiveStackConfig and surfaces a clear skip when the secret isn't discoverable, (3) instantiates GalaxyProxyDriver + calls InitializeAsync (the IPC handshake), capturing a skip with the exception detail + common-cause hints (secret mismatch, SID not in pipe ACL, Host's backend couldn't connect to ZB) rather than letting a NullRef cascade through every subsequent test. SkipIfUnavailable() translates the captured SkipReason into Assert.Skip at the top of every fact so tests read as cleanly-skipped with a visible reason, not silently-passed or crashed.
LiveStackSmokeTests (5 facts, Collection=LiveStack, Category=LiveGalaxy): Fixture_initialized_successfully (cheapest possible end-to-end assertion — if this passes, the IPC handshake worked); Driver_reports_Healthy_after_IPC_handshake (DriverHealth.State post-connect); DiscoverAsync_returns_at_least_one_variable_from_live_galaxy (captures every Variable() call from DiscoverAsync via CapturingAddressSpaceBuilder and asserts > 0 — zero here usually means the Host couldn't read ZB, the skip message names OTOPCUA_GALAXY_ZB_CONN to check); GetHostStatuses_reports_at_least_one_platform (IHostConnectivityProbe surface — zero means the probe loop hasn't fired or no Platform is deployed locally); Can_read_a_discovered_variable_from_live_galaxy (reads the first discovered attribute's full reference, asserts status != BadInternalError — Galaxy's Uncertain-quality-until-first-Engine-scan is intentionally NOT treated as failure since it depends on runtime state that varies across test runs). Read-only by design; writes need an agreed scratch tag to avoid mutating a process-critical attribute — deferred to a follow-up PR that reuses this fixture.
CapturingAddressSpaceBuilder is a minimal IAddressSpaceBuilder that flattens every Variable() call into a list so tests can inspect what discovery produced without booting the full OPC UA node-manager stack; alarm annotation + property calls are no-ops. Scoped private to the test class.
Galaxy.Proxy.Tests csproj gains a ProjectReference to Driver.Galaxy.TestSupport (PR 36) for AvevaPrerequisites. The NU1702 warning about the Host project being net48-referenced-by-net10 is pre-existing from the HostSubprocessParityTests — Proxy.Tests only needs the Host EXE path for that parity scenario, not type surface.
Test run on THIS machine (OtOpcUaGalaxyHost not yet installed): Skipped! Failed 0, Passed 0, Skipped 5 — each skip message includes the full prerequisites report pointing at the missing service. Once the service is installed + started (scripts\install\Install-Services.ps1), the 5 facts will execute against live Galaxy. Proxy.Tests Unit: 17 pass / 0 fail (unchanged — new tests are Category=LiveGalaxy, separate suite). Full Proxy build clean. Memory already captures the 'live tests run via already-running service, don't spawn' convention (project_galaxy_host_service.md).
lmx-followups.md #5 updated: status is 'IN PROGRESS' across PRs 36 + 37 with the explicit remaining work (install + start services, subscribe-and-receive, write round-trip).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:49:51 -04:00
261869d84e Merge pull request 'Phase 3 PR 36 — AVEVA prerequisites test-support library' (#35) from phase-3-pr36-aveva-prerequisites into v2 2026-04-18 16:44:41 -04:00
Joseph Doherty
08c90d19fd Phase 3 PR 36 — AVEVA prerequisites test-support library. New tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport multi-targeted class library (net10.0 + net48 so both the modern and the MXAccess-COM x86 test projects can consume it) that probes every piece of the AVEVA System Platform + OtOpcUa stack a live-Galaxy test depends on and returns a structured PrerequisiteReport. Closes the gap where live-smoke tests silently returned 'unreachable' without telling operators which specific piece failed.
AvevaPrerequisites.CheckAllAsync walks eight probe categories producing PrerequisiteCheck rows each with Name (e.g. 'service:aaBootstrap', 'sql:ZB', 'com:LMXProxy', 'registry:ArchestrA.Framework'), Category (AvevaCoreService / AvevaSoftService / AvevaInstall / MxAccessCom / GalaxyRepository / AvevaHistorian / OtOpcUaService / Environment), Status (Pass / Warn / Fail / Skip), and operator-facing Detail message. Report aggregates them: IsLivetestReady (no Fails anywhere) and IsAvevaSideReady (AVEVA-side categories pass, our v2 services can be absent while still considering the environment AVEVA-ready) so different test tiers can use the right threshold.
Individual probes: ServiceProbe.Check queries the Windows Service Control Manager via System.ServiceProcess.ServiceController — treats DemandStart+Stopped as Warn (NmxSvc is DemandStart by design; master pulls it up) but AutoStart+Stopped as Fail; not-installed is Fail for hard-required services, Warn for soft ones; non-Windows hosts get Skip; transitional states like StartPending get Warn with a 'try again' hint. RegistryProbe reads HKLM\SOFTWARE\WOW6432Node\ArchestrA\{Framework,Framework\Platform,MSIInstall} — Framework key presence + populated InstallPath/RootPath values mean System Platform installed; PfeConfigOptions in the Platform subkey (format 'PlatformId=N,EngineId=N,...') indicates a Platform has been deployed from the IDE (PlatformId=0 means never deployed — MXAccess will connect but every subscription will be Bad quality); RebootRequired='True' under MSIInstall surfaces as a loud warn since post-patch behavior is undefined. MxAccessComProbe resolves the LMXProxy.LMXProxyServer ProgID → CLSID → HKLM\SOFTWARE\Classes\WOW6432Node\CLSID\{guid}\InprocServer32, verifying the registered file exists on disk (catches the orphan-registry case where a previous uninstall left the ProgID registered but the DLL is gone — distinguishes it from the 'totally not installed' case by message); also emits a Warn when the test process is 64-bit (MXAccess COM activation fails with REGDB_E_CLASSNOTREG 0x80040154 regardless of registration, so seeing this warning tells operators why the activation would fail even on a fully-installed machine). SqlProbe tests Galaxy Repository via Microsoft.Data.SqlClient using the Windows-auth localhost connection string the repo code defaults to — distinguishes 'SQL Server unreachable' (connection fails) from 'ZB database does not exist' (SELECT DB_ID('ZB') returns null) because they have different remediation paths (sc.exe start MSSQLSERVER vs. restore from .cab backup); a secondary CheckDeployedObjectCountAsync query on 'gobject WHERE deployed_version > 0' warns when the count is zero because discovery smoke tests will return empty hierarchies. NamedPipeProbe opens a 2s NamedPipeClientStream against OtOpcUaGalaxyHost's pipe ('OtOpcUaGalaxy' per the installer default) — pipe accepting a connection proves the Host service is listening; disconnects immediately so we don't consume a session slot.
Service lists kept as internal static data so tests can inspect + override: CoreServices (aaBootstrap + aaGR + NmxSvc + MSSQLSERVER — hard fail if missing), SoftServices (aaLogger + aaUserValidator + aaGlobalDataCacheMonitorSvr — warn only; stack runs without them but diagnostics/auth are degraded), HistorianServices (aahClientAccessPoint + aahGateway — opt-in via Options.CheckHistorian, only matters for HistoryRead IPC paths), OtOpcUaServices (our OtOpcUaGalaxyHost hard-required for end-to-end live tests + OtOpcUa warn + GLAuth warn). Narrower entry points CheckRepositoryOnlyAsync and CheckGalaxyHostPipeOnlyAsync for tests that only care about specific subsystems — avoid paying the full probe cost on every GalaxyRepositoryLiveSmokeTests fact.
Multi-targeting mechanics: System.ServiceProcess.ServiceController + Microsoft.Win32.Registry are NuGet packages on net10 but in-box BCL references on net48; csproj conditions Package vs Reference by TargetFramework. Microsoft.Data.SqlClient v6 supports both frameworks so single PackageReference. Net48Polyfills.cs provides IsExternalInit shim (records/init-only setters) and SupportedOSPlatformAttribute stub so the same Probe sources compile on both frameworks without per-callsite preprocessor guards — lets Roslyn's platform-compatibility analyzer stay useful on net10 without breaking net48 builds.
Existing GalaxyRepositoryLiveSmokeTests updated to delegate its skip decision to AvevaPrerequisites.CheckRepositoryOnlyAsync (legacy ZbReachableAsync kept as a compatibility adapter so the in-test 'if (!await ZbReachableAsync()) return;' pattern keeps working while the surrounding fixtures gradually migrate to Assert.Skip-with-reason). Slnx file registers the new project.
Tests — AvevaPrerequisitesLiveTests (8 new Integration cases, Category=LiveGalaxy): the helper correctly reports Framework install (registry pass), aaBootstrap Running (service pass), aaGR Running (service pass), MxAccess COM registered (com pass), ZB database reachable (sql pass), deployed-object count > 0 (warn-upgraded-to-pass because this box has 49 objects deployed), the AVEVA side is ready even when our own services (OtOpcUaGalaxyHost) aren't installed yet (IsAvevaSideReady=true), and the helper emits rows for OtOpcUaGalaxyHost + OtOpcUa + GLAuth even when not installed (regression guard — nobody can accidentally ship a check that omits our own services). Full Galaxy.Host.Tests Category=LiveGalaxy suite: 13 pass (5 prior smoke + 8 new prerequisites). Full solution build clean, 0 errors.
What's NOT in this PR: end-to-end Galaxy stack smoke (Proxy → Host pipe → MXAccess → real Galaxy tag). That's the next PR — this one is the gate the end-to-end smoke will call first to produce actionable skip messages instead of silent returns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:36:13 -04:00
5cc120d836 Merge pull request 'Phase 3 PR 35 — IHistoryProvider gains ReadAtTime + ReadEvents; Proxy implements both' (#34) from phase-3-pr35-history-readtime-readevents into v2 2026-04-18 16:12:43 -04:00
Joseph Doherty
bf329b05d8 Phase 3 PR 35 — IHistoryProvider gains ReadAtTimeAsync + ReadEventsAsync; GalaxyProxyDriver implements both. Extends Core.Abstractions.IHistoryProvider with two new methods that round out the OPC UA Part 11 HistoryRead surface (HistoryReadAtTime + HistoryReadEvents are the last two modes not covered by the PR 19-era ReadRawAsync + ReadProcessedAsync) and wires GalaxyProxyDriver to call the existing PR-10/PR-11 IPC contracts the Host already implements.
Interface additions use C# default interface implementations that throw NotSupportedException — existing IHistoryProvider implementations keep compiling, only drivers whose backend carries the relevant capability override. This matches the 'capabilities are optional per driver' design already used by IHistoryProvider.ReadProcessedAsync's docs (Modbus / OPC UA Client drivers never had an event historian and the default-throw path lets callers see BadHistoryOperationUnsupported naturally). New HistoricalEvent record models one historian row (EventId, SourceName, EventTimeUtc + ReceivedTimeUtc — process vs historian-persist timestamps, Message, Severity mapped to OPC UA's 1-1000 range); HistoricalEventsResult pairs the event list with a continuation-point token for future batching. Both live in Core.Abstractions so downstream (Proxy, Host, Server) reference a single domain shape — no Shared-contract leak into the driver-facing interface.
GalaxyProxyDriver.ReadAtTimeAsync maps the domain DateTime[] to Unix-ms longs, calls CallAsync on the existing MessageKind.HistoryReadAtTimeRequest, and trusts the Host's one-sample-per-requested-timestamp contract (the Host pads with bad-quality snapshots for timestamps it can't interpolate; re-aligning on the Proxy side would duplicate the Host's interpolation policy logic). ReadEventsAsync does the same for HistoryReadEventsRequest; ToHistoricalEvent translates GalaxyHistoricalEvent (MessagePack-annotated, Unix-ms) to the domain record, explicitly tagging DateTimeKind.Utc on both timestamp fields so downstream serializers (JSON, OPC UA types) don't apply an unexpected local-time offset.
Tests — HistoricalEventMappingTests (3 new Proxy.Tests unit cases): every field maps correctly from wire to domain; null SourceName and null DisplayText preserve through the mapping (system events without a source come out with null so callers can distinguish them from alarm events); both timestamps come out as DateTimeKind.Utc (regression guard against a future refactor using DateTime.FromFileTimeUtc or similar that defaults to Unspecified). Driver.Galaxy.Proxy.Tests Unit suite: 17 pass / 0 fail (14 prior + 3 new). Full solution build clean, 0 errors.
Scope exclusions — DriverNodeManager HistoryRead service-handler wiring (on the OPC UA Server side, where HistoryReadAtTime and HistoryReadEvents service requests land) and the full-loop integration test (OPC UA client → server → IPC → Host → HistorianDataSource → back) are deferred to a focused follow-up PR. The capability surface is the load-bearing change; wiring the service handlers is mechanical in comparison and worth its own PR for reviewability. docs/v2/lmx-followups.md #1 updated with the split.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:08:27 -04:00
2584379e75 Merge pull request 'Phase 3 PR 34 — Host-status publisher (Server) + /hosts drill-down page (Admin)' (#33) from phase-3-pr34-host-status-publisher-page into v2 2026-04-18 16:04:20 -04:00
Joseph Doherty
ef2a810b2d Phase 3 PR 34 — Host-status publisher (Server) + /hosts drill-down page (Admin). Closes LMX follow-up #7 by wiring together the data layer from PR 33. Server.HostStatusPublisher is a BackgroundService that walks every driver registered in DriverHost every 10 seconds, skips drivers that don't implement IHostConnectivityProbe, calls GetHostStatuses() on each probe-capable driver, and upserts one DriverHostStatus row per (NodeId, DriverInstanceId, HostName) into the central config DB. Upsert path: SingleOrDefaultAsync on the composite PK; if no row exists, Add a new one; if a row exists, LastSeenUtc advances unconditionally (heartbeat) and State + StateChangedUtc update only on transitions so Admin UI can distinguish 'still reporting, still Running' from 'freshly transitioned to Running'. MapState translates Core.Abstractions.HostState to Configuration.Enums.DriverHostState (intentional duplicate enum — Configuration project stays free of driver-runtime deps per PR 33's choice). If a driver's GetHostStatuses throws, log warning and skip that driver this tick — never take down the Server on a publisher failure. If the DB is unreachable, log warning + retry next heartbeat (no buffering — next tick's current-state snapshot is more useful than replaying stale transitions after a long outage). 2-second startup delay so NodeBootstrap's RegisterAsync calls land before the first publish tick, then tick runs immediately so a freshly-started Server surfaces its host topology in the Admin UI without waiting a full interval.
Polling chosen over event-driven for initial scope: simpler, matches Admin UI consumer cadence, avoids DriverHost lifecycle-event plumbing that doesn't exist today. Event-driven push for sub-heartbeat latency is a straightforward follow-up.
Admin.Services.HostStatusService left-joins DriverHostStatus against ClusterNode on NodeId so rows persist even when the ClusterNode entry doesn't exist yet (first-boot bootstrap case). StaleThreshold = 30s — covers one missed publisher heartbeat plus a generous buffer for clock skew and GC pauses. Admin Components/Pages/Hosts.razor — FleetAdmin-visible page grouped by cluster (handles the '(unassigned)' case for rows without a matching ClusterNode). Four summary cards (Hosts / Running / Stale / Faulted); per-cluster table with Node / Driver / Host / State + Stale-badge / Last-transition / Last-seen / Detail columns; 10s auto-refresh via IServiceScopeFactory timer pattern matching FleetStatusPoller + Fleet dashboard (PR 27). Row-class highlighting: Faulted → table-danger, Stale → table-warning, else default. State badge maps DriverHostState enum to bootstrap color classes. Sidebar link added between 'Fleet status' and 'Clusters'.
Server csproj adds Microsoft.EntityFrameworkCore.SqlServer 10.0.0 + registers OtOpcUaConfigDbContext in Program.cs scoped via NodeOptions.ConfigDbConnectionString (no Admin-style manual SQL raw — the DbContext is the only access path, keeps migrations owner-of-record).
Tests — HostStatusPublisherTests (4 new Integration cases, uses per-run throwaway DB matching the FleetStatusPollerTests pattern): publisher upserts one row per host from each probe-capable driver and skips non-probe drivers; second tick advances LastSeenUtc without creating duplicate rows (upsert pattern verified end-to-end); state change between ticks updates State AND StateChangedUtc (datetime2(3) rounds to millisecond precision so comparison uses 1ms tolerance — documented inline); MapState translates every HostState enum member. Server.Tests Integration: 4 new tests pass. Admin build clean, Admin.Tests Unit still 23 / 0. docs/v2/lmx-followups.md item #7 marked DONE with three explicit deferred items (event-driven push, failure-count column, SignalR fan-out).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 15:51:55 -04:00
a7764e50f3 Merge pull request 'Phase 3 PR 33 — DriverHostStatus entity + migration (LMX #7 data layer)' (#32) from phase-3-pr33-driverhoststatus-entity into v2 2026-04-18 15:43:37 -04:00
Joseph Doherty
8464e3f376 Phase 3 PR 33 — DriverHostStatus entity + EF migration (data-layer for LMX #7). New DriverHostStatus entity with composite key (NodeId, DriverInstanceId, HostName) persists each server node's per-host connectivity view — one row per (server node, driver instance, probe-reported host), which means a redundant 2-node cluster with one Galaxy driver reporting 3 platforms produces 6 rows because each server node owns its own runtime view of the shared host topology, not 3. Fields: NodeId (64), DriverInstanceId (64), HostName (256 — fits Galaxy FQDNs and Modbus host:port strings), State (DriverHostState enum — Unknown/Running/Stopped/Faulted, persisted as nvarchar(16) via HasConversion<string> so DBAs inspecting the table see readable state names not ordinals), StateChangedUtc + LastSeenUtc (datetime2(3) — StateChangedUtc tracks actual transitions while LastSeenUtc advances on every publisher heartbeat so the Admin UI can flag stale rows from a crashed Server independent of State), Detail (nullable 1024 — exception message from the driver's probe when Faulted, null otherwise).
DriverHostState enum lives in Configuration.Enums/ rather than reusing Core.Abstractions.HostState so the Configuration project stays free of driver-runtime dependencies (it's referenced by both the Admin process and the Server process, so pulling in the driver-abstractions assembly to every Admin build would be unnecessary weight). The server-side publisher hosted service (follow-up PR 34) will translate HostStatusChangedEventArgs.NewState to this enum on every transition.
No foreign key to ClusterNode — a Server may start reporting host status before its ClusterNode row exists (first-boot bootstrap), and we'd rather keep the status row than drop it. The Admin-side service that renders the dashboard will left-join on NodeId when presenting. Two indexes declared: IX_DriverHostStatus_Node drives the per-cluster drill-down (Admin UI joins ClusterNode on ClusterId to pick which NodeIds to fetch), IX_DriverHostStatus_LastSeen drives the stale-row query (now - LastSeen > threshold).
EF migration AddDriverHostStatus creates the table + PK + both indexes. Model snapshot updated. SchemaComplianceTests expected-tables list extended. DriverHostStatusTests (3 new cases, category SchemaCompliance, uses the shared fixture DB): composite key allows same (host, driver) across different nodes AND same (node, host) across different drivers — both real-world cases the publisher needs to support; upsert-in-place pattern (fetch-by-composite-PK, mutate, save) produces one row not two — the pattern the publisher will use; State enum persists as string not int — reading the DB via ADO.NET returns 'Faulted' not '3'.
Configuration.Tests SchemaCompliance suite: 10 pass / 0 fail (7 prior + 3 new). Configuration build clean. No Server or Admin code changes yet — publisher + /hosts page are PR 34.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 15:38:41 -04:00
a9357600e7 Merge pull request 'Phase 3 PR 32 — Multi-driver integration test' (#31) from phase-3-pr32-multi-driver-integration into v2 2026-04-18 15:34:16 -04:00
Joseph Doherty
2f00c74bbb Phase 3 PR 32 — Multi-driver integration test. Closes LMX follow-up #6 with Server.Tests/MultipleDriverInstancesIntegrationTests.cs: registers two StubDriver instances (alpha + beta) with distinct DriverInstanceIds on one DriverHost, boots the full OpcUaApplicationHost, and exercises three behaviors end-to-end via a real OPC UA client session. (1) Each driver's namespace URI resolves to a distinct index in the client's NamespaceUris (alpha → urn:OtOpcUa:alpha, beta → urn:OtOpcUa:beta) — proves DriverNodeManager's namespaceUris-per-driver base-ctor wiring actually lands two separate INodeManager registrations. (2) Browsing one subtree returns only that driver's folder; the other driver's folder does NOT leak into the wrong subtree. This is the test that catches a cross-driver routing regression the v1 single-driver code path couldn't surface — if a future refactor flattens both drivers into a shared namespace, the 'shouldNotContain' assertion fails cleanly. (3) Reads route to the owning driver by namespace — alpha's ReadAsync returns 42 while beta's returns 99; a misroute would surface as 99 showing up on an alpha node id or vice versa. StubDriver is parameterized on (DriverInstanceId, folderName, readValue) so the same class constructs both instances without copy-paste.
No production code changes — pure additive test. Server.Tests Integration: 3 new tests pass; existing OpcUaServerIntegrationTests stays green (single-driver case still exercised there). Full Server.Tests Unit still 43 / 0. Deferred: multi-driver alarm-event case (two drivers each raising a GalaxyAlarmEvent, assert each condition lands on its owning instance's condition node) — needs a stub IAlarmSource and is worth its own focused PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 15:29:49 -04:00
5d5e1f9650 Merge pull request 'Phase 3 PR 31 — Live-LDAP integration test + Active Directory compatibility' (#30) from phase-3-pr31-live-ldap-ad-compat into v2 2026-04-18 15:27:54 -04:00
Joseph Doherty
4886a5783f Phase 3 PR 31 — Live-LDAP integration test + Active Directory compatibility. Closes LMX follow-up #4 with 6 live-bind tests in Server.Tests/LdapUserAuthenticatorLiveTests.cs against the dev GLAuth instance at localhost:3893 (skipped cleanly when unreachable via Assert.Skip + a clear SkipReason — matches the GalaxyRepositoryLiveSmokeTests pattern). Coverage: valid credentials bind + surface DisplayName; wrong password fails; unknown user fails; empty credentials fail pre-flight without touching the directory; writeop user's memberOf maps through GroupToRole to WriteOperate (the exact string WriteAuthzPolicy.IsAllowed expects); admin user surfaces all four mapped roles (WriteOperate + WriteTune + WriteConfigure + AlarmAck) proving memberOf parsing doesn't stop after the first match. While wiring this up, the authenticator's hard-coded user-lookup filter 'uid=<name>' didn't match GLAuth (which keys users by cn and doesn't populate uid) — AND it doesn't match Active Directory either, which uses sAMAccountName. Added UserNameAttribute to LdapOptions (default 'uid' for RFC 2307 backcompat) so deployments override to 'cn' / 'sAMAccountName' / 'userPrincipalName' as the directory requires; authenticator filter now interpolates the configured attribute. The default stays 'uid' so existing test fixtures and OpenLDAP installs keep working without a config change — a regression guard in LdapUserAuthenticatorAdCompatTests.LdapOptions_default_UserNameAttribute_is_uid_for_rfc2307_compat pins this so a future 'helpful' default change can't silently break anyone.
Active Directory compatibility. LdapOptions xml-doc expanded with a cheat-sheet covering Server (DC FQDN), Port 389 vs 636, UseTls=true under AD LDAP-signing enforcement, dedicated read-only service account DN, sAMAccountName vs userPrincipalName vs cn trade-offs, memberOf DN shape (CN=Group,OU=...,DC=... with the CN= RDN stripped to become the GroupToRole key), and the explicit 'nested groups NOT expanded' call-out (LDAP_MATCHING_RULE_IN_CHAIN / tokenGroups is a future authenticator enhancement, not a config change). docs/security.md §'Active Directory configuration' adds a complete appsettings.json snippet with realistic AD group names (OPCUA-Operators → WriteOperate, OPCUA-Engineers → WriteConfigure, OPCUA-AlarmAck → AlarmAck, OPCUA-Tuners → WriteTune), LDAPS port 636, TLS on, insecure-LDAP off, and operator-facing notes on each field. LdapUserAuthenticatorAdCompatTests (5 unit guards): ExtractFirstRdnValue parses AD-style 'CN=OPCUA-Operators,OU=...,DC=...' DNs correctly (case-preserving — operators' GroupToRole keys stay readable); also handles mixed case and spaces in group names ('Domain Users'); also works against the OpenLDAP ou=<group>,ou=groups shape (GLAuth) so one extractor tolerates both memberOf formats common in the field; EscapeLdapFilter escapes the RFC 4515 injection set (\, *, (, ), \0) so a malicious login like 'admin)(cn=*' can't break out of the filter; default UserNameAttribute regression guard.
Test posture — Server.Tests Unit: 43 pass / 0 fail (38 prior + 5 new AD-compat guards). Server.Tests LiveLdap category: 6 pass / 0 fail against running GLAuth (would skip cleanly without). Server build clean, 0 errors, 0 warnings.
Deferred: the session-identity end-to-end check (drive a full OPC UA UserName session, then read a 'whoami' node to verify the role landed on RoleBasedIdentity). That needs a test-only address-space node and is scoped for a separate PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 15:23:22 -04:00
d70a2e0077 Merge pull request 'Phase 3 PR 30 — Modbus integration-test project scaffold + DL205 smoke test' (#29) from phase-3-pr30-modbus-integration-scaffold into v2 2026-04-18 15:08:45 -04:00
Joseph Doherty
cb7b81a87a Phase 3 PR 30 — Modbus integration-test project scaffold. New tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests project is the harness modbus-test-plan.md called for: a skip-when-unreachable fixture that TCP-probes a Modbus simulator endpoint (MODBUS_SIM_ENDPOINT, default localhost:502) once per test session, a DL205 device profile stub (single writable holding register at address 100, probe disabled to avoid racing with assertions), and one happy-path smoke test that initializes the real ModbusDriver + real ModbusTcpTransport, writes a known Int16 value, reads it back, and asserts status=0 + value round-trip. No DL205 quirk assertions yet — those land one-per-PR as the user validates each behavior in ModbusPal (word order for 32-bit, register-zero access, coil addressing base, max registers per FC03, response framing under load, exception code on protected-bit coil write).
ModbusSimulatorFixture is a collection fixture so the 2s TCP probe runs once per run, not per test; SkipReason gets a clear operator-facing message ('start ModbusPal or override MODBUS_SIM_ENDPOINT'). Tests call Assert.Skip(sim.SkipReason) rather than silently returning — matches the test-plan convention and reads cleanly in CI logs. DL205Profile.BuildOptions deliberately disables the background probe loop since integration tests drive reads explicitly and the probe would race with assertions. Tag naming uses the DL205_ prefix so filter 'DisplayName~DL205' surfaces device-specific failures at a glance.
Project references: xunit.v3 + Shouldly + Microsoft.NET.Test.Sdk + xunit.runner.visualstudio (matches the existing Driver.Modbus.Tests unit project), project ref to src/Driver.Modbus. Registered in ZB.MOM.WW.OtOpcUa.slnx under tests/. ModbusPal/README.md documents the dev loop (install ModbusPal jar, load profile, start simulator, dotnet test), explains MODBUS_SIM_ENDPOINT override for real-PLC benchwork, and flags DL205.xmpp as the first profile to add in a follow-up PR.
dotnet test run against the scaffold (no simulator running) skips cleanly: 0 failed, 0 passed, 1 skipped, with the SkipReason surfaced. dotnet build clean (0 warnings, 0 errors). Updated docs/v2/modbus-test-plan.md to mark the scaffold PR done and renumbered future PRs from 'PR 27+' to 'PR 31+' to stay in sync with the actual PR chain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 15:02:39 -04:00
901d2b8019 Merge pull request 'Phase 3 PR 29 — Account/session page with roles + capabilities' (#28) from phase-3-pr29-account-page into v2 2026-04-18 14:46:45 -04:00
Joseph Doherty
d5fa1f450e Phase 3 PR 29 — Account/session page expanding the minimal sidebar role display into a dedicated /account route. Shows the authenticated operator's identity (username from ClaimTypes.NameIdentifier, display name from ClaimTypes.Name), their Admin roles as badges (from ClaimTypes.Role), the raw LDAP groups that mapped to those roles (from the 'ldap_group' claim added by Login.razor at sign-in), and a capability table listing each Admin capability with its required role and a Yes/No badge showing whether this session has it. Capability list mirrors the Program.cs authorization policies + each page's [Authorize] attribute so operators can self-service check whether their session has access without trial-and-error navigation — capabilities covered: view clusters + fleet status (all roles), edit configuration drafts (ConfigEditor or FleetAdmin per CanEdit policy), publish generations (FleetAdmin per CanPublish policy), manage certificate trust (FleetAdmin per PR 28 Certificates page attribute), manage external-ID reservations (ConfigEditor or FleetAdmin per Reservations page attribute).
Sidebar's 'Signed in as' line now wraps the display name in a link to /account so the existing sidebar-compact view becomes the entry point for the fuller page — keeps the sign-out button where it was for muscle memory, just adds the detail page one click away. Page is gated with [Authorize] (any authenticated admin) rather than a specific role — the capability table deliberately works for every signed-in user so they can see what they DON'T have access to, which helps them file the right ticket with their LDAP admin instead of getting a plain Access Denied when navigating blindly.
Capability → required-role table is defined as a private readonly record list in the page rather than pulled from a service because it's a UI-presentation concern, not runtime policy state — the runtime policy IS Program.cs's AddAuthorizationBuilder + each page's [Authorize] attribute, and this table just mirrors it for operator readability. Comment on the list reminds future-me to extend it when a new policy or [Authorize] page lands. No behavior change if roles are empty, but the page surfaces a hint ('Sign-in would have been blocked, so if you're seeing this, the session claim is likely stale') that nudges the operator toward signing out + back in.
No new tests added — the page is pure display over claims; its only logic is the 'has-capability' Any-overlap check which is exactly what ASP.NET's [Authorize(Roles=...)] does in-framework, and duplicating that in a unit test would test the framework rather than our code. Admin.Tests Unit stays 23 pass / 0 fail. Admin build clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 14:43:35 -04:00
6fdaee3a71 Merge pull request 'Phase 3 PR 28 — Admin UI cert-trust management page' (#27) from phase-3-pr28-cert-trust into v2 2026-04-18 14:42:52 -04:00
Joseph Doherty
ed88835d34 Phase 3 PR 28 — Admin UI cert-trust management page. New /certificates route (FleetAdmin-only) surfaces the OPC UA server's PKI store rejected + trusted certs and gives operators Trust / Delete / Revoke actions so rejected client certs can be promoted without touching disk. CertTrustService reads $PkiStoreRoot/{rejected,trusted}/certs/*.der files directly via X509CertificateLoader — no Opc.Ua dependency in the Admin project, which keeps the Admin host runnable on a machine that doesn't have the full Server install locally (only needs the shared PKI directory reachable; typical deployment has Admin + Server side-by-side on the same box and PkiStoreRoot defaults match so a plain-vanilla install needs no override). CertTrustOptions bound from the Admin's 'CertTrust:PkiStoreRoot' section, default %ProgramData%\OtOpcUa\pki (matches OpcUaServerOptions.PkiStoreRoot default). Trust action moves the .der from rejected/certs/ to trusted/certs/ via File.Move(overwrite:true) — idempotent, tolerates a concurrent operator doing the same move. Delete wipes the file. Revoke removes from trusted/certs/ (Opc.Ua re-reads the Directory store on each new client handshake, so no explicit reload signal is needed; operators retry the rejected connection after trusting). Thumbprint matching is case-insensitive because X509Certificate2.Thumbprint is upper-case hex but operators copy-paste from logs that sometimes lowercase it. Malformed files in the store are logged + skipped — a single bad .der can't take the whole management page offline. Missing store directories produce empty lists rather than exceptions so a pristine install (Server never run yet, no rejected/trusted dirs yet) doesn't crash the page.
Razor page layout: two tables (Rejected / Trusted) with Subject / Issuer / Thumbprint / Valid-window / Actions columns, status banner after each action with success or warning kind ('file missing' = another admin handled it), FleetAdmin-only via [Authorize(Roles=AdminRoles.FleetAdmin)]. Each action invokes LogActionAsync which Serilog-logs the authenticated admin user + thumbprint + action for an audit trail — DB-level ConfigAuditLog persistence is deferred because its schema is cluster-scoped and cert actions are cluster-agnostic; Serilog + CertTrustService's filesystem-op info logs give the forensic trail in the meantime. Sidebar link added to MainLayout between Reservations and the future Account page.
Tests — CertTrustServiceTests (9 new unit cases): ListRejected parses Subject + Thumbprint + store kind from a self-signed test cert written into rejected/certs/; rejected and trusted stores are kept separate; TrustRejected moves the file and the Rejected list is empty afterwards; TrustRejected with a thumbprint not in rejected returns false without touching trusted; DeleteRejected removes the file; UntrustCert removes from trusted only; thumbprint match is case-insensitive (operator UX); missing store directories produce empty lists instead of throwing DirectoryNotFoundException (pristine-install tolerance); a junk .der in the store is logged + skipped and the valid certs still surface (one bad file doesn't break the page). Full Admin.Tests Unit suite: 23 pass / 0 fail (14 prior + 9 new). Full Admin build clean — 0 errors, 0 warnings.
lmx-followups.md #3 marked DONE with a cross-reference to this PR and a note that flipping AutoAcceptUntrustedClientCertificates to false as the production default is a deployment-config follow-up, not a code gap — the Admin UI is now ready to be the trust gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 14:37:55 -04:00
5389d4d22d Phase 3 PR 27 � Fleet status dashboard page (#26) 2026-04-18 14:07:16 -04:00
Joseph Doherty
b5f8661e98 Phase 3 PR 27 — Fleet status dashboard page. New /fleet route shows per-node apply state (ClusterNodeGenerationState joined with ClusterNode for the ClusterId) in a sortable table with summary cards for Total / Applied / Stale / Failed node counts. Stale detection: LastSeenAt older than 30s triggers a table-warning row class + yellow count card. Failed rows get table-danger + red card. Badge classes per LastAppliedStatus: Applied=bg-success, Failed=bg-danger, Applying=bg-info, unknown=bg-secondary. Timestamps rendered as relative-age strings ('42s ago', '15m ago', '3h ago', then absolute date for >24h). Error column is truncated to 320px with the full message in a tooltip so the table stays readable on wide fleets. Initial data load on OnInitializedAsync; auto-refresh every 5s via a Timer that calls InvokeAsync(RefreshAsync) — matches the FleetStatusPoller's 5s cadence so the dashboard sees the most recent state without polling ahead of the broadcaster. A Refresh button also kicks a manual reload; _refreshing gate prevents double-runs when the timer fires during an in-flight query. IServiceScopeFactory (matches FleetStatusPoller's pattern) creates a fresh DI scope per refresh so the per-page DbContext can't race the timer with the render thread; no new DI registrations needed. Live SignalR hub push is deliberately deferred to a follow-up PR — the existing FleetStatusHub + NodeStateChangedMessage already works for external JavaScript clients; wiring an in-process Blazor Server consumer adds HubConnectionBuilder plumbing that's worth its own focused change. Sidebar link added to MainLayout between Overview and Clusters. Full Admin.Tests Unit suite 14 pass / 0 fail — unchanged, no tests regressed. Full Admin build clean (0 errors, 0 warnings). Closes the 'no per-driver dashboard' gap from lmx-followups item #7 at the fleet level; per-host (platform/engine/Modbus PLC) granularity still needs a dedicated page that consumes IHostConnectivityProbe.GetHostStatuses from the Server process — that's the live-SignalR follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:12:31 -04:00
4058b88784 Merge pull request 'Phase 3 PR 26 — server-layer write authorization by role' (#25) from phase-3-pr26-server-write-authz into v2 2026-04-18 13:04:35 -04:00
Joseph Doherty
6b04a85f86 Phase 3 PR 26 — server-layer write authorization gating by role. Per the user's ACL-at-server-layer directive (saved as feedback_acl_at_server_layer.md in memory), write authorization is enforced in DriverNodeManager.OnWriteValue and never delegated to the driver or to driver-specific auth (the v1 Galaxy-provided security path is explicitly not part of v2 — drivers report SecurityClassification as discovery metadata only). New WriteAuthzPolicy static class in Server/Security/ maps SecurityClassification → required role per the table documented in docs/Configuration.md: FreeAccess = no role required (anonymous sessions can write), Operate + SecuredWrite = WriteOperate, Tune = WriteTune, VerifiedWrite + Configure = WriteConfigure, ViewOnly = deny regardless of roles. Role matching is case-insensitive and role requirements do NOT cascade — a session with WriteConfigure can write Configure attributes but needs WriteOperate separately to write Operate attributes; this is deliberate so escalation is an explicit LDAP group assignment, not a hierarchy the policy silently grants. DriverNodeManager gains a _securityByFullRef Dictionary populated during Variable() registration (parallel to the existing _variablesByFullRef) so OnWriteValue can look up the classification in O(1) on the hot path. OnWriteValue casts the session's context.UserIdentity to the new IRoleBearer interface (implemented by OtOpcUaServer.RoleBasedIdentity from PR 19) — empty Roles collection when the session is anonymous; the same WriteAuthzPolicy.IsAllowed check then either short-circuits true (FreeAccess), false (ViewOnly), or walks the roles list looking for the required one. On deny, OnWriteValue logs 'Write denied for {FullRef}: classification=X userRoles=[...]' at Information level (readable trail for operator complaints) and returns BadUserAccessDenied without touching IWritable.WriteAsync — drivers never see a request we'd have refused. IRoleBearer kept as a minimal server-side interface rather than reusing some abstraction from Core.Abstractions because the concept is OPC-UA-session-scoped and doesn't generalize (the driver side has no notion of a user session). Tests — WriteAuthzPolicyTests (17 new cases): FreeAccess allows write with empty role set + arbitrary roles; ViewOnly denies write even with every role; Operate requires WriteOperate; role match is case-insensitive; Operate denies empty role set + wrong role; SecuredWrite shares Operate's requirement; Tune requires WriteTune; Tune denies WriteOperate-only (asserts roles don't cascade — this is the test that catches a future regression where someone 'helpfully' adds a role-escalation table); Configure requires WriteConfigure; VerifiedWrite shares Configure's requirement; multi-role session allowed when any role matches; unrelated roles denied; RequiredRole theory covering all 5 classified-and-mapped rows + null for FreeAccess/ViewOnly special cases. lmx-followups.md follow-up #2 marked DONE with a back-reference to this PR and the memory note. Full Server.Tests Unit suite: 38 pass / 0 fail (17 new WriteAuthz + 14 SecurityConfiguration from PR 19 + 2 NodeBootstrap + 5 others). Server.Tests Integration (Category=Integration) 2 pass — existing PR 17 anonymous-endpoint smoke tests stay green since the read path doesn't hit OnWriteValue.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:01:01 -04:00
cd8691280a Merge pull request 'Phase 3 PR 25 — Modbus test plan + DL205 quirk catalog' (#24) from phase-3-pr25-modbus-test-plan into v2 2026-04-18 12:49:19 -04:00
Joseph Doherty
77d09bf64e Phase 3 PR 25 — modbus-test-plan.md: integration-test playbook with per-device quirk catalog. ModbusPal is the chosen simulator; AutomationDirect DL205 is the first target device class with 6 pending quirks to document and cover with named tests (word order for 32-bit values, register-zero access policy, coil addressing base, maximum registers per FC03, response framing under sustained load, exception code on protected-bit coil write). Each quirk placeholder has a proposed test name so the user's validation work translates directly into integration tests. Test conventions section codifies the named-per-quirk pattern, skip-when-unreachable guard, real ModbusTcpTransport usage, and inter-test isolation. Sets up the harness-and-catalog structure future device families (Allen-Bradley Micrologix, Siemens S7-1200 Modbus gateway, Schneider M340, whatever the user hits) will slot into — same per-device catalog shape, cross-device patterns section for recurring quirks that can get promoted into driver defaults. Next concrete PRs proposed: PR 26 for the integration test project scaffold + DL205 profile + fixture with skip-guard + one smoke test, PR 27+ for the individual confirmed quirks one-per-PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:45:21 -04:00
163c821e74 Merge pull request 'Phase 3 PR 24 — Modbus PLC data type extensions' (#23) from phase-3-pr24-modbus-types into v2 2026-04-18 12:32:55 -04:00
Joseph Doherty
eea31dcc4e Phase 3 PR 24 — Modbus PLC data type extensions. Extends ModbusDataType beyond the textbook Int16/UInt16/Int32/UInt32/Float32 set with Int64/UInt64/Float64 (4-register types), BitInRegister (single bit within a holding register, BitIndex 0-15 LSB-first), and String (ASCII packed 2 chars per register with StringLength-driven sizing). Adds ModbusByteOrder enum on ModbusTagDefinition covering the two word-orderings that matter in the real PLC population: BigEndian (ABCD — Modbus TCP standard, Schneider PLCs that follow it strictly) and WordSwap (CDAB — Siemens S7 family, several Allen-Bradley series, some Modicon families). NormalizeWordOrder helper reverses word pairs in-place for 32-bit values and reverses all four words for 64-bit values (keeps bytes big-endian within each register, which is universal; swaps only the word positions). Internal codec surface switched from (bytes, ModbusDataType) pairs to (bytes, ModbusTagDefinition) because the tag carries the ByteOrder + BitIndex + StringLength context the codec needs; RegisterCount similarly takes the tag so strings can compute ceil(StringLength/2). DriverDataType mapping in MapDataType extended to cover the new logical types — Int64/UInt64 widen to Int32 (PR 25 follow-up: extend DriverDataType enum with Int64 to avoid precision loss), Float64 maps to DriverDataType.Float64, String maps to DriverDataType.String, BitInRegister surfaces as Boolean, all other mappings preserved. BitInRegister writes throw a deliberate InvalidOperationException with a 'read-modify-write' hint — to atomically flip a single bit the driver needs to FC03 the register, OR/AND in the bit, then FC06 it back; that's a separate PR because the bit-modify atomicity story needs a per-register mutex and optional compare-and-write semantics. Everything else (decoder paths for both byte orders, Int64/UInt64/Float64 encode + decode, bit-index extraction across both register halves, String nul-truncation on decode, String nul-padding on encode) ships here. Tests (21 new ModbusDataTypeTests): RegisterCount_returns_correct_register_count_per_type theory (10 rows covering every numeric type); RegisterCount_for_String_rounds_up_to_register_pair theory (5 rows including the 0-char edge case that returns 0 registers); Int32_BigEndian_decodes_ABCD_layout + Int32_WordSwap_decodes_CDAB_layout + Float32_WordSwap_encode_decode_roundtrips (covers the two most-common 32-bit orderings); Int64_BigEndian_roundtrips + UInt64_WordSwap_reverses_four_words (word-swap on 64-bit reverses the four-word layout explicitly, with the test computing the expected wire shape by hand rather than trusting the implementation) + Float64_roundtrips_under_word_swap (3.14159265358979 survives the round-trip with 1e-12 tolerance); BitInRegister_extracts_bit_at_index theory (6 rows including LSB, MSB, and arbitrary bits in a multi-bit mask); BitInRegister_write_is_not_supported_in_PR24 (asserts the exception message steers the reader to the 'read-modify-write' follow-up); String_decodes_ASCII_packed_two_chars_per_register (decodes 'HELLO!' from 3 packed registers with the 'HELLO!'u8 test-only UTF-8 literal which happens to equal the ASCII bytes for this ASCII input); String_decode_truncates_at_first_nul ('Hi' padded with nuls reads back as 'Hi'); String_encode_nul_pads_remaining_bytes (short input writes remaining bytes as 0). Full solution: 0 errors, 217 unit + integration tests pass (22 + 30 new Modbus = 52 Modbus total, 165 pre-existing). ModbusDriver capability footprint now matches the most common industrial PLC workloads — Siemens S7 + Allen-Bradley + Modicon all supported via ByteOrder config without driver forks.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:27:12 -04:00
8a692d4ba8 Merge pull request 'Phase 3 PR 23 — Modbus IHostConnectivityProbe' (#22) from phase-3-pr23-modbus-probe into v2 2026-04-18 12:23:04 -04:00
Joseph Doherty
268b12edec Phase 3 PR 23 — Modbus IHostConnectivityProbe. ModbusDriver now implements 6 of 8 capability interfaces (adds IHostConnectivityProbe alongside IDriver + ITagDiscovery + IReadable + IWritable + ISubscribable from the earlier PRs). Background probe loop kicks off in InitializeAsync when ModbusProbeOptions.Enabled is true, sends a cheap FC03 read-1-register at ProbeAddress (default 0) every Interval (default 5s) with a per-tick Timeout (default 2s), and tracks the single Modbus endpoint's state in the HostState machine. Initial state = Unknown; first successful probe transitions to Running; any transport/timeout failure transitions to Stopped; recovery transitions back to Running. OnHostStatusChanged fires exactly on transitions (not on repeat successes — prevents event-spam on a healthy connection). HostName format is 'host:port' so the Admin UI can display the endpoint uniformly with Galaxy platforms/engines in the fleet status dashboard. GetHostStatuses returns a single-item list with the current state + last-change timestamp (Modbus driver talks to exactly one endpoint per instance — operators spin up multiple driver instances for multi-PLC deployments). ShutdownAsync cancels the probe CTS before tearing down the transport so the loop can't log a spurious Stopped after intentional shutdown (OperationCanceledException caught separately from the 'real' transport errors). ModbusDriverOptions extended with ModbusProbeOptions sub-record (Enabled default true, Interval 5s, Timeout 2s, ProbeAddress ushort for PLCs that have register-0 policies; most PLCs tolerate an FC03 at 0 but some industrial gateways lock it). Tests (7 new ModbusProbeTests): Initial_state_is_Unknown_before_first_probe_tick (probe disabled, state stays Unknown, HostName formatted); First_successful_probe_transitions_to_Running (enabled, waits for probe count + event queue, asserts Unknown → Running with correct OldState/NewState); Transport_failure_transitions_to_Stopped (flip fake.Reachable = false mid-run, wait for state diff); Recovery_transitions_Stopped_back_to_Running (up → down → up, asserts ≥ 3 transitions); Repeated_successful_probes_do_not_generate_duplicate_Running_events (several hundred ms of stable probes, count stays at 1); Disabled_probe_stays_Unknown_and_fires_no_events (safety guard when operator wants to disable probing); Shutdown_stops_the_probe_loop (probe count captured at shutdown, delay 400ms, assert ≤ 1 extra to tolerate the narrow race where an in-flight tick completes after shutdown — the contract is 'no new ticks scheduled' not 'instantaneous freeze'). FlappyTransport fake exposes a volatile Reachable flag so tests can toggle the PLC availability mid-run, + ProbeCount counter so tests can assert the loop actually issued requests. WaitForStateAsync helper polls GetHostStatuses up to a deadline; tolerates scheduler jitter on slow CI runners. Full solution: 0 errors, 202 unit + integration tests pass (22 Modbus + 180 pre-existing).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:12:00 -04:00
edce1be742 Merge pull request 'Phase 3 PR 22 — Modbus ISubscribable via polling overlay' (#21) from phase-3-pr22-modbus-subscribe into v2 2026-04-18 12:07:51 -04:00
Joseph Doherty
18b3e24710 Phase 3 PR 22 — Modbus ISubscribable via polling overlay. Modbus has no push model at the wire protocol (unlike MXAccess's OnDataChange callback or OPC UA's own Subscription service), so the driver layers a per-subscription polling loop on top of the existing IReadable path: SubscribeAsync returns an opaque ModbusSubscriptionHandle, starts a background Task.Run that sleeps for the requested publishing interval (floored to 100ms so a misconfigured sub-10ms request doesn't hammer the PLC), reads every subscribed tag through the same FC01/03/04 path the one-shot ReadAsync uses, diffs the returned DataValueSnapshot against the last known value per tag, and raises OnDataChange exactly when (a) it's the first poll (initial-data push per OPC UA Part 4 convention) or (b) boxed value changed or (c) StatusCode changed — stable values don't generate event traffic after the initial push, matching the v1 MXAccess OnDataChange shape. SubscriptionState record holds the handle + tag list + interval + per-subscription CancellationTokenSource + ConcurrentDictionary<string,DataValueSnapshot> LastValues; UnsubscribeAsync removes the state from _subscriptions and cancels the CTS, stopping the polling loop cleanly. Multiple overlapping subscriptions each get their own polling Task so a slow PLC on one subscription can't stall the others. ShutdownAsync cancels every active subscription CTS before tearing down the transport so the driver doesn't leave orphaned polling tasks pumping requests against a disposed socket. Transient poll errors are swallowed inside the loop (the loop continues to the next tick) — the driver's health surface reflects the last-known Degraded state from the underlying ReadAsync path. OperationCanceledException is caught separately to exit the loop silently on unsubscribe/shutdown. Tests (6 new ModbusSubscriptionTests): Initial_poll_raises_OnDataChange_for_every_subscribed_tag asserts the initial-data push fires once per tag in the subscribe call (2 tags → 2 events with FullReference='Level' and FullReference='Temp'); Unchanged_values_do_not_raise_after_initial_poll lets the loop run ~5 cycles at 100ms with a stable register value, asserts only the initial push fires (no event spam on stable tags); Value_change_between_polls_raises_OnDataChange mutates the fake register bank between poll ticks and asserts a second event fires with the new value (verified via e.Snapshot.Value.ShouldBe((short)200)); Unsubscribe_stops_the_polling_loop captures the event count right after UnsubscribeAsync, mutates a register that would have triggered a change if polling continued, asserts the count stays the same after 400ms; SubscribeAsync_floors_intervals_below_100ms passes a 10ms interval + asserts only 1 event fires across 300ms (if the floor weren't enforced we'd see 30 — the test asserts the floor semantically by counting events on stable data); Multiple_subscriptions_fire_independently creates two subs on different tags, unsubscribes only one, mutates the other's tag, asserts only the surviving sub emits while the unsubscribed one stays at its pre-unsubscribe count. FakeTransport in this test file is scoped to FC03 only since that's all the subscription path exercises — keeps the test doubles minimal and the failure modes obvious. WaitForCountAsync helper polls a ConcurrentQueue up to a deadline, makes the tests tolerant of scheduler jitter on slow CI runners. Full solution: 0 errors, 195 tests pass (6 new subscription + 9 existing Modbus + 180 pre-existing). ModbusDriver now implements IDriver + ITagDiscovery + IReadable + IWritable + ISubscribable — five of the eight capability interfaces. IAlarmSource + IHistoryProvider remain unimplemented because Modbus has no wire-level alarm or history semantics; IHostConnectivityProbe is a plausible future addition (treat transport disconnect as a Stopped signal) but needs the socket-level connection-state tracking plumbed through IModbusTransport which is its own PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:03:39 -04:00
f6a12dafe9 Merge pull request 'Phase 3 PR 21 — Modbus TCP driver (first native-protocol greenfield)' (#20) from phase-3-pr21-modbus-driver into v2 2026-04-18 11:58:20 -04:00
Joseph Doherty
058c3dddd3 Phase 3 PR 21 — Modbus TCP driver: first native-protocol greenfield for v2. New src/Driver.Modbus project with ModbusDriver implementing IDriver + ITagDiscovery + IReadable + IWritable. Validates the driver-agnostic abstractions (IAddressSpaceBuilder, DriverAttributeInfo, DataValueSnapshot, WriteRequest/WriteResult) generalize beyond Galaxy — nothing Galaxy-specific is used here. ModbusDriverOptions carries Host/Port/UnitId/Timeout + a pre-declared tag list (Modbus has no discovery protocol — tags are configuration). IModbusTransport abstracts the socket layer so tests swap in-memory fakes; concrete ModbusTcpTransport speaks the MBAP ADU (TxId + Protocol=0 + Length + UnitId + PDU) over TcpClient, serializes requests through a semaphore for single-flight in-order responses, validates the response TxId matches, surfaces server exception PDUs as ModbusException with function code + exception code. DiscoverAsync streams one folder per driver with a BaseDataVariable per tag + DriverAttributeInfo that flags writable tags as SecurityClassification.Operate vs ViewOnly for read-only regions. ReadAsync routes per-tag by ModbusRegion: FC01 for Coils, FC02 for DiscreteInputs, FC03 for HoldingRegisters, FC04 for InputRegisters; register values decoded through System.Buffers.Binary.BinaryPrimitives (BigEndian for single-register Int16/UInt16 + two-register Int32/UInt32/Float32 per standard modbus word-swap conventions). WriteAsync uses FC05 (Write Single Coil with 0xFF00/0x0000 encoding) for booleans, FC06 (Write Single Register) for 16-bit types, FC16 (Write Multiple Registers) for 32-bit types. Unknown tag → BadNodeIdUnknown; write to InputRegister or DiscreteInput or Writable=false tag → BadNotWritable; exception during transport → BadInternalError + driver health Degraded. Subscriptions + Historian + Alarms deliberately out of scope — Modbus has no push model (subscribe would be a polling overlay, additive PR) and no history/alarm semantics at the protocol level. Tests (9 new ModbusDriverTests): InitializeAsync connects + populates the tag map + sets health=Healthy; Read Int16 from HoldingRegister returns BigEndian value; Read Float32 spans two registers BigEndian (IEEE 754 single for 25.5f round-trips exactly); Read Coil returns boolean from the bit-packed response; unknown tag name returns BadNodeIdUnknown without an exception; Write UInt16 round-trips via FC06; Write Float32 uses FC16 (two-register write verified by decoding back through the fake register bank); Write to InputRegister returns BadNotWritable; Discover streams one folder + one variable per tag with correct DriverDataType mapping (Int16/Int32→Int32, UInt16/UInt32→Int32, Float32→Float32, Bool→Boolean). FakeTransport simulates a 256-register/256-coil bank + implements the 7 function codes the driver uses. slnx updated with the new src + tests entries. Full solution post-add: 0 errors, 189 tests pass (9 Modbus + 180 pre-existing). IDriver abstraction validated against a fundamentally different protocol — Modbus TCP has no AlarmExtension, no ScanState, no IPC boundary, no historian, no LDAP — and the same builder/reader/writer contract plugged straight in. Future PRs on this driver: ISubscribable via a polling loop, IHostConnectivityProbe for dead-device detection, PLC-specific data-type extensions (Int64/BCD/string-in-registers).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:55:21 -04:00
52791952dd Merge pull request 'Phase 3 PR 20 — lmx-followups.md' (#19) from phase-3-pr20-lmx-followups into v2 2026-04-18 11:50:38 -04:00
Joseph Doherty
860deb8e0d Phase 3 PR 20 — lmx-followups.md: track remaining Galaxy-bridge tasks after PR 19 (HistoryReadAtTime/Events Proxy wiring, write-gating by role, Admin UI cert trust, live-LDAP integration test, full-stack Galaxy smoke, multi-driver test, per-host dashboard). Documents what each item depends on, the shipped surface it builds on, and the minimal to-do so a future session can pick any one off in isolation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:43:15 -04:00
f5e7173de3 Merge pull request 'Phase 3 PR 19 — LDAP user identity + Basic256Sha256 security profile' (#18) from phase-3-pr19-ldap-security into v2 2026-04-18 11:36:18 -04:00
Joseph Doherty
22d3b0d23c Phase 3 PR 19 — LDAP user identity + Basic256Sha256 security profile. Replaces the anonymous-only endpoint with a configurable security profile and an LDAP-backed UserName token validator. New IUserAuthenticator abstraction in Backend/Security/: LdapUserAuthenticator binds to the configured directory (reuses the pattern from Admin.Security.LdapAuthService without the cross-app dependency — Novell.Directory.Ldap.NETStandard 3.6.0 package ref added to Server alongside the existing OPCFoundation packages) and maps group membership to OPC UA roles via LdapOptions.GroupToRole (case-insensitive). DenyAllUserAuthenticator is the default when Ldap.Enabled=false so UserName token attempts return a clean BadUserAccessDenied rather than hanging on a localhost:3893 bind attempt. OpcUaSecurityProfile enum + LdapOptions nested record on OpcUaServerOptions. Profile=None keeps the PR 17 shape (SecurityPolicies.None + Anonymous token only) so existing integration tests stay green; Profile=Basic256Sha256SignAndEncrypt adds a second ServerSecurityPolicy (Basic256Sha256 + SignAndEncrypt) to the collection and, when Ldap.Enabled=true, adds a UserName token policy scoped to SecurityPolicies.Basic256Sha256 only — passwords must ride an encrypted channel, the stack rejects UserName over None. OtOpcUaServer.OnServerStarted hooks SessionManager.ImpersonateUser: AnonymousIdentityToken passes through; UserNameIdentityToken delegates to IUserAuthenticator.AuthenticateAsync — rejected identities throw ServiceResultException(BadUserAccessDenied); accepted identities get a RoleBasedIdentity that carries the resolved roles through session.Identity so future PRs can gate writes by role. OpcUaApplicationHost + OtOpcUaServer constructors take IUserAuthenticator as a dependency. Program.cs binds the new OpcUaServer:Ldap section from appsettings (Enabled defaults false, GroupToRole parsed as Dictionary<string,string>), registers IUserAuthenticator as LdapUserAuthenticator when enabled or DenyAllUserAuthenticator otherwise. PR 17 integration test updated to pass DenyAllUserAuthenticator so it keeps exercising the anonymous-only path unchanged. Tests — SecurityConfigurationTests (new, 13 cases): DenyAllAuthenticator rejects every credential; LdapAuthenticator rejects blank creds without hitting the server; rejects when Enabled=false; rejects plaintext when both UseTls=false AND AllowInsecureLdap=false (safety guard matching the Admin service); EscapeLdapFilter theory (4 rows: plain passthrough, parens/asterisk/backslash → hex escape) — regression guard against LDAP injection; ExtractOuSegment theory (3 rows: finds ou=, returns null when absent, handles multiple ou segments by returning first); ExtractFirstRdnValue theory (3 rows: strips cn= prefix, handles single-segment DN, returns plain string unchanged when no =). OpcUaServerOptions_default_is_anonymous_only asserts the default posture preserves PR 17 behavior. InternalsVisibleTo('ZB.MOM.WW.OtOpcUa.Server.Tests') added to Server csproj so ExtractOuSegment and siblings are reachable from the tests. Full solution: 0 errors, 180 tests pass (8 Core + 14 Proxy + 24 Configuration + 6 Shared + 91 Galaxy.Host + 19 Server (17 unit + 2 integration) + 18 Admin). Live-LDAP integration test (connect via Basic256Sha256 endpoint with a real user from GLAuth, assert the session.Identity carries the mapped role) is deferred to a follow-up — it requires the GLAuth dev instance to be running at localhost:3893 which is dev-machine-specific, and the test harness for that also needs a fresh client-side certificate provisioned by the live server's trusted store.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:49:46 -04:00
55696a8750 Merge pull request 'Phase 3 PR 18 — delete v1 archived projects' (#17) from phase-3-pr18-delete-v1 into v2 2026-04-18 08:41:56 -04:00
Joseph Doherty
dd3a449308 Phase 3 PR 18 — delete v1 archived projects. PR 2 archived via IsTestProject=false + PropertyGroup comment; PR 17 landed the full v2 OPC UA server runtime (ApplicationConfiguration + endpoint + client integration test); every v1 surface is now functionally superseded. This PR removes the archive: 154 files across 5 projects — src/OtOpcUa.Host (v1 server, 158 files), src/Historian.Aveva (v1 historian plugin, 4 files), tests/OtOpcUa.Tests.v1Archive (494 unit tests that were archived in PR 2 with IsTestProject=false), tests/Historian.Aveva.Tests (18 tests against the v1 plugin), tests/OtOpcUa.IntegrationTests (6 tests against the v1 Host). slnx trimmed to reflect the current set (12 src + 12 tests). Verified zero incoming references from live projects before deleting — no live csproj references .Host or .Historian.Aveva since PR 5 ported Historian into Driver.Galaxy.Host/Backend/Historian/ and PR 17 stood up the new OtOpcUa.Server. Full solution post-delete: 0 errors, 165 unit + integration tests pass (8 Core + 14 Proxy + 24 Configuration + 91 Galaxy.Host + 6 Shared + 4 Server + 18 Admin) — no regressions. Recovery path if a future PR needs to resurrect a specific v1 routine: git revert this commit or cherry-pick the specific file from pre-delete history; v1 is preserved in the full branch history, not lost.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:35:22 -04:00
3c1dc334f9 Merge pull request 'Phase 3 PR 17 — complete OPC UA server startup + live integration test' (#16) from phase-3-pr17-server-startup into v2 2026-04-18 08:28:42 -04:00
Joseph Doherty
46834a43bd Phase 3 PR 17 — complete OPC UA server startup end-to-end + integration test. PR 16 shipped the materialization shape (DriverNodeManager / OtOpcUaServer) without the activation glue; this PR finishes the scope so an external OPC UA client can actually connect, browse, and read. New OpcUaServerOptions DTO bound from the OpcUaServer section of appsettings.json (EndpointUrl default opc.tcp://0.0.0.0:4840/OtOpcUa, ApplicationName, ApplicationUri, PkiStoreRoot default %ProgramData%\OtOpcUa\pki, AutoAcceptUntrustedClientCertificates default true for dev — production flips via config). OpcUaApplicationHost wraps Opc.Ua.Configuration.ApplicationInstance: BuildConfiguration constructs the ApplicationConfiguration programmatically (no external XML) with SecurityConfiguration pointing at <PkiStoreRoot>/own, /issuers, /trusted, /rejected directories — stack auto-creates the cert folders on first run and generates a self-signed application certificate via CheckApplicationInstanceCertificate, ServerConfiguration.BaseAddresses set to the endpoint URL + SecurityPolicies just None + UserTokenPolicies just Anonymous with PolicyId='Anonymous' + SecurityPolicyUri=None so the client's UserTokenPolicy lookup succeeds at OpenSession, TransportQuotas.OperationTimeout=15s + MinRequestThreadCount=5 / MaxRequestThreadCount=100 / MaxQueuedRequestCount=200, CertificateValidator auto-accepts untrusted when configured. StartAsync creates the OtOpcUaServer (passes DriverHost + ILoggerFactory so one DriverNodeManager is created per registered driver in CreateMasterNodeManager from PR 16), calls ApplicationInstance.Start(server) to bind the endpoint, then walks each DriverNodeManager and drives a fresh GenericDriverNodeManager.BuildAddressSpaceAsync against it so the driver's discovery streams into the address space that's already serving clients. Per-driver discovery is isolated per decision #12: a discovery exception marks the driver's subtree faulted but the server stays up serving the other drivers' subtrees. DriverHost.GetDriver(instanceId) public accessor added alongside the existing GetHealth so OtOpcUaServer can enumerate drivers during CreateMasterNodeManager. DriverNodeManager.Driver property made public so OpcUaApplicationHost can identify which driver each node manager wraps during the discovery loop. OpcUaServerService constructor takes OpcUaApplicationHost — ExecuteAsync sequence now: bootstrap.LoadCurrentGenerationAsync → applicationHost.StartAsync → infinite Task.Delay until stop. StopAsync disposes the application host (which stops the server via OtOpcUaServer.Stop) before disposing DriverHost. Program.cs binds OpcUaServerOptions from appsettings + registers OpcUaApplicationHost + OpcUaServerOptions as singletons. Integration test (OpcUaServerIntegrationTests, Category=Integration): IAsyncLifetime spins up the server on a random non-default port (48400+random for test isolation) with a per-test-run PKI store root (%temp%/otopcua-test-<guid>) + a FakeDriver registered in DriverHost that has ITagDiscovery + IReadable implementations — DiscoverAsync registers TestFolder>Var1, ReadAsync returns 42. Client_can_connect_and_browse_driver_subtree creates an in-process OPC UA client session via CoreClientUtils.SelectEndpoint (which talks to the running server's GetEndpoints and fetches the live EndpointDescription with the actual PolicyId), browses the fake driver's root, asserts TestFolder appears in the returned references. Client_can_read_a_driver_variable_through_the_node_manager constructs the variable NodeId using the namespace index the server registered (urn:OtOpcUa:fake), calls Session.ReadValue, asserts the DataValue.Value is 42 — the whole pipeline (client → server endpoint → DriverNodeManager.OnReadValue → FakeDriver.ReadAsync → back through the node manager → response to client) round-trips correctly. Dispose tears down the session, server, driver host, and PKI store directory. Full solution: 0 errors, 165 tests pass (8 Core unit + 14 Proxy unit + 24 Configuration unit + 6 Shared unit + 91 Galaxy.Host unit + 4 Server (2 unit NodeBootstrap + 2 new integration) + 18 Admin). End-to-end outcome: PR 14's GalaxyAlarmTracker alarm events now flow through PR 15's GenericDriverNodeManager event forwarder → PR 16's ConditionSink → OPC UA AlarmConditionState.ReportEvent → out to every OPC UA client subscribed to the alarm condition. The full alarm subsystem (driver-side subscription of the Galaxy 4-attribute quartet, Core-side routing by source node id, Server-side AlarmConditionState materialization with ReportEvent dispatch) is now complete and observable through any compliant OPC UA client. LDAP / security-profile wire-up (replacing the anonymous-only endpoint with BasicSignAndEncrypt + user identity mapping to NodePermissions role) is the next layer — it reuses the same ApplicationConfiguration plumbing this PR introduces but needs a deployment-policy source (central config DB) for the cert trust decisions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:18:37 -04:00
7683b94287 Merge pull request 'Phase 3 PR 16 — concrete OPC UA server scaffolding + AlarmConditionState materialization' (#15) from phase-3-pr16-opcua-server into v2 2026-04-18 08:10:44 -04:00
Joseph Doherty
f53c39a598 Phase 3 PR 16 — concrete OPC UA server scaffolding with AlarmConditionState materialization. Introduces the OPCFoundation.NetStandard.Opc.Ua.Server package (v1.5.374.126, same version the v1 stack already uses) and two new server-side classes: DriverNodeManager : CustomNodeManager2 is the concrete realization of PR 15's IAddressSpaceBuilder contract — Folder() creates FolderState nodes under an Organizes hierarchy rooted at ObjectsFolder > DriverInstanceId; Variable() creates BaseDataVariableState with DataType mapped from DriverDataType (Boolean/Int32/Float/Double/String/DateTime) + ValueRank (Scalar or OneDimension) + AccessLevel CurrentReadOrWrite; AddProperty() creates PropertyState with HasProperty reference. Read hook wires OnReadValue per variable to route to IReadable.ReadAsync; Write hook wires OnWriteValue to route to IWritable.WriteAsync and surface per-tag StatusCode. MarkAsAlarmCondition() materializes an OPC UA AlarmConditionState child of the variable, seeded from AlarmConditionInfo (SourceName, InitialSeverity → UA severity via Low=250/Medium=500/High=700/Critical=900, InitialDescription), initial state Enabled + Acknowledged + Inactive + Retain=false. Returns an IAlarmConditionSink whose OnTransition updates alarm.Severity/Time/Message and switches state per AlarmType string ('Active' → SetActiveState(true) + SetAcknowledgedState(false) + Retain=true; 'Acknowledged' → SetAcknowledgedState(true); 'Inactive' → SetActiveState(false) + Retain=false if already Acked) then calls alarm.ReportEvent to emit the OPC UA event to subscribed clients. Galaxy's GalaxyAlarmTracker (PR 14) now lands at a concrete AlarmConditionState node instead of just raising an unobserved C# event. OtOpcUaServer : StandardServer wires one DriverNodeManager per DriverHost.GetDriver during CreateMasterNodeManager — anonymous endpoint, no security profile (minimum-viable; LDAP + security-profile wire-up is the next PR). DriverHost gains public GetDriver(instanceId) so the server can enumerate drivers at startup. NestedBuilder inner class in DriverNodeManager implements IAddressSpaceBuilder by temporarily retargeting the parent's _currentFolder during each call so Folder→Variable→AddProperty land under the correct subtree — not thread-safe if discovery ran concurrently, but GenericDriverNodeManager.BuildAddressSpaceAsync is sequential per driver so this is safe by construction. NuGet audit suppress for GHSA-h958-fxgg-g7w3 (moderate-severity in OPCFoundation.NetStandard.Opc.Ua.Core 1.5.374.126; v1 stack already accepts this risk on the same package version). PR 16 is scoped as scaffolding — the actual server startup (ApplicationInstance, certificate config, endpoint binding, session management wiring into OpcUaServerService.ExecuteAsync) is deferred to a follow-up PR because it needs ApplicationConfiguration XML + optional-cert-store logic that depends on per-deployment policy decisions. The materialization shape is complete: a subsequent PR adds 100 LOC to start the server and all the already-written IAddressSpaceBuilder + alarm-condition + read/write wire-up activates end-to-end. Full solution: 0 errors, 152 unit tests pass (no new tests this PR — DriverNodeManager unit testing needs an IServerInternal mock which is heavyweight; live-endpoint integration tests land alongside the server-startup PR).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:00:36 -04:00
d569c39f30 Merge pull request 'Phase 3 PR 15 — alarm-condition contract in abstract layer' (#14) from phase-3-pr15-alarm-contract into v2 2026-04-18 07:54:30 -04:00
Joseph Doherty
190d09cdeb Phase 3 PR 15 — alarm-condition contract in IAddressSpaceBuilder + wire OnAlarmEvent through GenericDriverNodeManager. IAddressSpaceBuilder.IVariableHandle gains MarkAsAlarmCondition(AlarmConditionInfo) which returns an IAlarmConditionSink. AlarmConditionInfo carries SourceName/InitialSeverity/InitialDescription. Concrete address-space builders (the upcoming PR 16 OPC UA server backend) materialize a sibling AlarmConditionState node on the first call; the sink receives every lifecycle transition the generic node manager forwards. GenericDriverNodeManager gains a CapturingBuilder wrapper that transparently wraps every Folder/Variable call — the wrapper observes MarkAsAlarmCondition calls without participating in materialization, captures the resulting IAlarmConditionSink into an internal source-node-id → sink ConcurrentDictionary keyed by IVariableHandle.FullReference. After DiscoverAsync completes, if the driver implements IAlarmSource the node manager subscribes to OnAlarmEvent and routes every AlarmEventArgs to the sink registered for args.SourceNodeId — unknown source ids are dropped silently (may belong to another driver or to a variable the builder chose not to flag). Dispose unsubscribes the forwarder to prevent dangling invocation-list references across node-manager rebuilds. GalaxyProxyDriver.DiscoverAsync now calls handle.MarkAsAlarmCondition(new AlarmConditionInfo(fullName, AlarmSeverity.Medium, null)) on every attr.IsAlarm=true variable — severity seed is Medium because the live Priority byte arrives through the subsequent GalaxyAlarmEvent stream (which PR 14's GalaxyAlarmTracker now emits); the Admin UI sees the severity update on the first transition. RecordingAddressSpaceBuilder in Driver.Galaxy.E2E gains a RecordedAlarmCondition list + a RecordingSink implementation that captures AlarmEventArgs for test assertion — the E2E parity suite can now verify alarm-condition registration shape in addition to folder/variable shape. Tests (4 new GenericDriverNodeManagerTests): Alarm_events_are_routed_to_the_sink_registered_for_the_matching_source_node_id — 2 alarms registered (Tank.HiHi + Heater.OverTemp), driver raises an event for Tank.HiHi, the Tank.HiHi sink captures the payload, the Heater.OverTemp sink does not (tag-scoped fan-out, not broadcast); Non_alarm_variables_do_not_register_sinks — plain Tank.Level in the same discover is not in TrackedAlarmSources; Unknown_source_node_id_is_dropped_silently — a transition for Unknown.Source doesn't reach any sink + no exception; Dispose_unsubscribes_from_OnAlarmEvent — post-dispose, a transition for a previously-registered tag is no-op because the forwarder detached. InternalsVisibleTo('ZB.MOM.WW.OtOpcUa.Core.Tests') added to Core csproj so TrackedAlarmSources internal property is visible to the test. Full solution: 0 errors, 152 unit tests pass (8 Core + 14 Proxy + 14 Admin + 24 Configuration + 6 Shared + 84 Galaxy.Host + 2 Server). PR 16 will implement the concrete OPC UA address-space builder that materializes AlarmConditionState from this contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:51:35 -04:00
4e0040e670 Merge pull request 'Phase 2 PR 14 — alarm subsystem (subscribe to alarm attribute quartet + raise GalaxyAlarmEvent)' (#13) from phase-2-pr14-alarm-subsystem into v2 2026-04-18 07:37:49 -04:00
91cb2a1355 Merge pull request 'Phase 2 PR 13 — port GalaxyRuntimeProbeManager + per-platform ScanState probing' (#12) from phase-2-pr13-runtime-probe into v2 2026-04-18 07:37:41 -04:00
Joseph Doherty
c14624f012 Phase 2 PR 14 — alarm subsystem wire-up. Per IsAlarm=true attribute (PR 9 added the discovery flag), GalaxyAlarmTracker in Backend/Alarms/ advises the four Galaxy alarm-state attributes: .InAlarm (boolean alarm active), .Priority (int severity), .DescAttrName (human-readable description), .Acked (boolean acknowledged). Runs the OPC UA Part 9 alarm lifecycle state machine simplified for the Galaxy AlarmExtension model and raises AlarmTransition events on transitions operators must react to — Active (InAlarm false→true, default Unacknowledged), Acknowledged (Acked false→true while InAlarm still true), Inactive (InAlarm true→false). MxAccessGalaxyBackend instantiates the tracker in its constructor with delegate-based subscribe/unsubscribe/write pointers to MxAccessClient, hooks TransitionRaised to forward each transition through the existing OnAlarmEvent IPC event that PR 4 ConnectionSink wires into MessageKind.AlarmEvent frames — no new contract messages required since GalaxyAlarmEvent already exists in Shared.Contracts. Field mapping: EventId = fresh Guid.ToString('N') per transition, ObjectTagName = alarm attribute full reference, AlarmName = alarm attribute full reference, Severity = tracked Priority, StateTransition = 'Active'|'Acknowledged'|'Inactive', Message = DescAttrName or tag fallback, UtcUnixMs = transition time. DiscoverAsync caches every IsAlarm=true attribute's full reference (tag.attribute) into _discoveredAlarmTags (ConcurrentBag cleared-then-filled on every re-Discover to track Galaxy redeploys). SubscribeAlarmsAsync iterates the cache and advises each via GalaxyAlarmTracker.TrackAsync; best-effort per-alarm — a subscribe failure on one alarm doesn't abort the whole call since operators prefer partial alarm coverage to none. Tracker is internally idempotent on repeat Track calls (second invocation for same alarm tag is a no-op; already-subscribed check short-circuits before the 4 MXAccess sub calls). Subscribe-failure rollback inside TrackAsync removes the alarm state + unadvises any of the 4 that did succeed so a partial advise can't leak a phantom tracking entry. AcknowledgeAlarmAsync routes to tracker.AcknowledgeAsync which writes the operator comment to <tag>.AckMsg via MxAccessClient.WriteAsync — writes use the existing MXAccess OnWriteComplete TCS-by-handle path (PR 4 Medium 4) so a runtime-refused ack bubbles up as Success=false rather than false-positive. State-machine quirks preserved from v1: (1) initial Acked=true on subscribe does NOT fire Acknowledged (alarm at rest, pre-acknowledged — default state is Acked=true so the first subscribe callback is a no-op transition), (2) Acked false→true only fires Acknowledged when InAlarm is currently true (acking a latched-inactive alarm is not a user-visible transition), (3) Active transition clears the Acked flag in-state so the next Acked callback correctly fires Acknowledged (v1 had this buried in the ConditionState logic; we track it on the AlarmState struct directly). Priority value handled as int/short/long via type pattern match with int.MaxValue guard — Galaxy attribute category returns varying CLR types (Int32 is canonical but some older templates use Int16), and a long overflow cast to int would silently corrupt the severity. Dispose cascade in MxAccessGalaxyBackend.Dispose: alarm-tracker unsubscribe→dispose, probe-manager unsubscribe→dispose, mx.ConnectionStateChanged detach, historian dispose — same discipline PR 6 / PR 8 / PR 13 established so dangling invocation-list refs don't survive a backend recycle. #pragma warning disable CS0067 around OnAlarmEvent removed since the event is now raised. Tests (9 new, GalaxyAlarmTrackerTests): four-attribute subscribe per alarm, idempotent repeat-track, InAlarm false→true fires Active with Priority + Desc, InAlarm true→false fires Inactive, Acked false→true while InAlarm fires Acknowledged, Acked transition while InAlarm=false does not fire, AckMsg write path carries the comment, snapshot reports latest four fields, foreign probe callback for a non-tracked tag is silently dropped. Full Galaxy.Host.Tests Unit suite 84 pass / 0 fail (9 new alarm + 12 PR 13 probe + 21 PR 12 quality + 42 pre-existing). Galaxy.Host builds clean (0/0). Branches off phase-2-pr13-runtime-probe so the MxAccessGalaxyBackend constructor/Dispose chain gets the probe-manager + alarm-tracker wire-up in a coherent order; fast-forwards if PR 13 merges first.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:34:13 -04:00
Joseph Doherty
04d267d1ea Phase 2 PR 13 — port GalaxyRuntimeProbeManager state machine + wire per-platform ScanState probing. PR 8 gave operators the gateway-level transport signal (MxAccessClient.ConnectionStateChanged → OnHostStatusChanged tagged with the Wonderware client identity) — enough to detect when the entire MXAccess COM proxy dies, but silent when a specific or goes down while the gateway stays alive. This PR closes that gap: a pure-logic GalaxyRuntimeProbeManager (ported from v1's 472-LOC GalaxyRuntimeProbeManager.cs, distilled to ~240 LOC of state machine without the OPC UA node-manager entanglement) lives in Backend/Stability/, advises <TagName>.ScanState per WinPlatform + AppEngine gobject discovered during DiscoverAsync, runs the Unknown → Running → Stopped state machine with v1's documented semantics preserved verbatim, and raises a StateChanged event on transitions operators should react to. MxAccessGalaxyBackend instantiates the probe manager in the constructor with SubscribeAsync/UnsubscribeAsync delegate pointers into MxAccessClient, hooks StateChanged to forward each transition through the same OnHostStatusChanged IPC event the gateway signal uses (HostName = platform/engine TagName, RuntimeStatus = 'Running'|'Stopped'|'Unknown', LastObservedUtcUnixMs from the state-change timestamp), so Admin UI gets per-host signals flowing through the existing PR 8 wire with no additional IPC plumbing. State machine rules ported from v1 runtimestatus.md: (a) ScanState is on-change-only — a stably-Running host may go hours without a callback, so Running → Stopped is driven only by explicit ScanState=false, never by starvation; (b) Unknown → Running is a startup transition and does NOT fire StateChanged (would paint every host as 'just recovered' at startup, which is noise and can clear Bad quality set by a concurrently-stopping sibling); (c) Stopped → Running fires StateChanged for the real recovery case; (d) Running → Stopped fires StateChanged; (e) Unknown → Stopped fires StateChanged because that's the first-known-bad signal operators need when a host is down at our startup time. MxAccessGalaxyBackend.DiscoverAsync calls _probeManager.SyncAsync with the runtime-host subset of the hierarchy (CategoryId == 1 WinPlatform or 3 AppEngine) as a best-effort step after building the Discover response — probe failures are swallowed so Discover still returns the hierarchy even if a per-host advise fails; the gateway signal covers the critical rung. SyncAsync is idempotent (second call with the same set is a no-op) and handles the diff on re-Discover for tag rename / host add / host remove. Subscribe failure rolls back the host's state entry under the lock so a later probe callback for a never-advised tag can't transition a phantom state from Unknown to Stopped and fan out a false host-down signal (the same protection v1's GalaxyRuntimeProbeManager had at line 237-243 of v1 with a captured-probe-string comparison under the lock). MxAccessGalaxyBackend.Dispose unsubscribes the StateChanged handler before disposing the probe manager to prevent dangling invocation-list references across reconnects, same discipline as PR 8's ConnectionStateChanged and PR 6's SubscriptionReplayFailed. Tests (12 new GalaxyRuntimeProbeManagerTests): Sync_subscribes_to_ScanState_per_host verifies tag.ScanState subscriptions are advised per Platform/Engine; Sync_is_idempotent_on_repeat_call_with_same_set verifies no duplicate subscribes; Sync_unadvises_removed_hosts verifies the diff unadvises gone hosts; Subscribe_failure_rolls_back_host_entry_so_later_transitions_do_not_fire_stale_events covers the rollback-on-subscribe-fail guard; Unknown_to_Running_does_not_fire_StateChanged preserves the startup-noise rule; Running_to_Stopped_fires_StateChanged_with_both_states asserts OldState and NewState are both captured in the transition record; Stopped_to_Running_fires_StateChanged_for_recovery verifies the recovery case; Unknown_to_Stopped_fires_StateChanged_for_first_known_bad_signal preserves the first-bad rule; Repeated_Good_Running_callbacks_do_not_fire_duplicate_events verifies the state-tracking de-dup; Unknown_callback_for_non_tracked_probe_is_dropped asserts a foreign callback is silently ignored; Snapshot_reports_current_state_for_every_tracked_host covers the dashboard query hook; IsRuntimeHost_recognizes_WinPlatform_and_AppEngine_category_ids asserts the CategoryId filter. Galaxy.Host.Tests Unit suite 75 pass / 0 fail (12 new probe + 63 pre-existing). Galaxy.Host builds clean (0 errors / 0 warnings). Branches off v2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:27:56 -04:00
4448db8207 Merge pull request 'Phase 2 PR 12 � richer historian quality mapping' (#11) from phase-2-pr12-quality-mapper into v2 2026-04-18 07:22:44 -04:00
d96b513bbc Merge pull request 'Phase 2 PR 11 � HistoryReadEvents IPC (alarm history)' (#10) from phase-2-pr11-history-events into v2 2026-04-18 07:22:33 -04:00
053c4e0566 Merge pull request 'Phase 2 PR 10 � HistoryReadAtTime IPC surface' (#9) from phase-2-pr10-history-attime into v2 2026-04-18 07:22:16 -04:00
Joseph Doherty
f24f969a85 Phase 2 PR 12 — richer historian quality mapping. Replace MxAccessGalaxyBackend's inline MapHistorianQualityToOpcUa category-only helper (192+→Good, 64-191→Uncertain, 0-63→Bad) with a new public HistorianQualityMapper.Map utility that preserves specific OPC DA subcodes — BadNotConnected(8)→0x808A0000u instead of generic Bad(0x80000000u), UncertainSubNormal(88)→0x40950000u instead of generic Uncertain, Good_LocalOverride(216)→0x00D80000u instead of generic Good, etc. Mirrors v1 QualityMapper.MapToOpcUaStatusCode byte-for-byte without pulling in OPC UA types — the function returns uint32 literals that are the canonical OPC UA StatusCode wire encoding, surfaced directly as DataValueSnapshot.StatusCode on the Proxy side with no additional translation. Unknown subcodes fall back to the family category (255→Good, 150→Uncertain, 50→Bad) so a future SDK change that adds a quality code we don't map yet still gets a sensible bucket. GalaxyDataValue wire shape unchanged (StatusCode stays uint) — this is a pure fidelity upgrade on the Host side. Downstream callers (Admin UI status dashboard, OPC UA clients receiving historian samples) can now distinguish e.g. a transport outage (BadNotConnected) from a sensor fault (BadSensorFailure) from a warm-up delay (BadWaitingForInitialData) without a second round-trip or dashboard heuristic. 21 new tests (HistorianQualityMapperTests): theory with 15 rows covering every specific mapping from the v1 QualityMapper table, plus 6 fallback tests verifying unknown-subcode codes in each family (Good/Uncertain/Bad) collapse to the family default. Galaxy.Host.Tests Unit suite 56/0 (21 new + 35 existing). Galaxy.Host builds clean (0/0). Branches off v2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:11:02 -04:00
Joseph Doherty
ca025ebe0c Phase 2 PR 11 — HistoryReadEvents IPC (alarm history). New Shared.Contracts messages HistoryReadEventsRequest/Response + GalaxyHistoricalEvent DTO (MessageKind 0x66/0x67). IGalaxyBackend gains HistoryReadEventsAsync, Stub/DbBacked return canonical pending error, MxAccessGalaxyBackend delegates to _historian.ReadEventsAsync (ported in PR 5) and maps HistorianEventDto → GalaxyHistoricalEvent — Guid.ToString() for EventId wire shape, DateTime → Unix ms for both EventTime (when the event fired in the process) and ReceivedTime (when the Historian persisted it), DisplayText + Severity pass through. SourceName is string? — null means 'all sources' (passed straight through to HistorianDataSource.ReadEventsAsync which adds the AddEventFilter('Source', Equal, ...) only when non-null). Distinct from the live GalaxyAlarmEvent type because historical rows carry both timestamps and lack StateTransition (Historian logs instantaneous events, not the OPC UA Part 9 alarm lifecycle; translating to OPC UA event lifecycle is the alarm-subsystem's job). Guards: null historian → Historian-disabled error; SDK exception → Success=false with message chained. Tests (3 new): disabled-error when historian null, maps HistorianEventDto with full field set (Id/Source/EventTime/ReceivedTime/DisplayText/Severity=900) to GalaxyHistoricalEvent, null SourceName passes through unchanged (verifies the 'all sources' contract). Galaxy.Host.Tests Unit suite 34 pass / 0 fail. Galaxy.Host builds clean. Branches off phase-2-pr10-history-attime since both extend the MessageKind enum; fast-forwards if PR 10 merges first.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:08:16 -04:00
Joseph Doherty
d13f919112 Phase 2 PR 10 — HistoryReadAtTime IPC surface. New Shared.Contracts messages HistoryReadAtTimeRequest/Response (MessageKind 0x64/0x65), IGalaxyBackend gains HistoryReadAtTimeAsync, Stub/DbBacked return canonical pending error, MxAccessGalaxyBackend delegates to _historian.ReadAtTimeAsync (ported in PR 5, exposed now) — request timestamp array is flow-encoded as Unix ms to avoid MessagePack DateTime quirks then re-hydrated to DateTime on the Host side. Per-sample mapping uses the same ToWire(HistorianSample) helper as ReadRawAsync so the category→StatusCode mapping stays consistent (Quality byte 192+ → Good 0u, 64-191 → Uncertain, 0-63 → Bad 0x80000000u). Guards: null historian → "Historian disabled" (symmetric with other history paths); empty timestamp array short-circuits to Success=true, Values=[] without an SDK round-trip; SDK exception → Success=false with the message chained. Proxy-side IHistoryProvider.ReadAtTimeAsync capability doesn't exist in Core.Abstractions yet (OPC UA HistoryReadAtTime service is supported but the current IHistoryProvider only has ReadRawAsync + ReadProcessedAsync) — this PR adds the Host-side surface so a future Core.Abstractions extension can wire it through without needing another IPC change. Tests (4 new): disabled-error when historian null, empty-timestamp short-circuit without SDK call, Unix-ms↔DateTime round-trip with Good samples at two distinct timestamps, missing sample (Quality=0) maps to 0x80000000u Bad category. Galaxy.Host.Tests Unit suite: 31 pass / 0 fail (4 new at-time + 27 pre-existing). Galaxy.Host builds clean. Branches off v2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:03:25 -04:00
d2ebb91cb1 Merge pull request 'Phase 2 PR 9 — thread IsAlarm discovery flag end-to-end' (#8) from phase-2-pr9-alarms into v2 2026-04-18 06:59:25 -04:00
90ce0af375 Merge pull request 'Phase 2 PR 8 — gateway-level host-status push from MxAccessGalaxyBackend' (#7) from phase-2-pr8-alarms-hoststatus into v2 2026-04-18 06:59:04 -04:00
e250356e2a Merge pull request 'Phase 2 PR 7 — wire IHistoryProvider.ReadProcessedAsync end-to-end' (#6) from phase-2-pr7-history-processed into v2 2026-04-18 06:59:02 -04:00
067ad78e06 Merge pull request 'Phase 2 PR 6 — close PR 4 monitor-loop low findings (probe leak + replay signal)' (#5) from phase-2-pr6-monitor-findings into v2 2026-04-18 06:57:57 -04:00
6cfa8d326d Merge pull request 'Phase 2 PR 4 — close 4 open MXAccess findings (push frames + reconnect + write-await + read-cancel)' (#3) from phase-2-pr4-findings into v2 2026-04-18 06:57:21 -04:00
Joseph Doherty
70a5d06b37 Phase 2 PR 9 — thread IsAlarm discovery flag end-to-end. GalaxyRepository.GetAttributesAsync has always emitted is_alarm alongside is_historized (CASE WHEN EXISTS with the primitive_definition join on primitive_name='AlarmExtension' per v1's Extended Attributes SQL lifted byte-for-byte into the PR 5 repository port), and GalaxyAttributeRow.IsAlarm has been populated since the port, but the flag was silently dropped at the MapAttribute helper in both MxAccessGalaxyBackend and DbBackedGalaxyBackend because GalaxyAttributeInfo on the IPC side had no field to carry it — every deployed alarm attribute arrived at the Proxy with no signal that it was alarm-bearing. This PR wires the flag through the three translation boundaries: GalaxyAttributeInfo gains [Key(6)] public bool IsAlarm { get; set; } at the end of the message to preserve wire-compat with pre-PR9 payloads that omit the key (MessagePack treats missing keys as default, so a newer Proxy talking to an older Host simply gets IsAlarm=false for every attribute); both backend MapAttribute helpers copy row.IsAlarm into the IPC shape; DriverAttributeInfo in Core.Abstractions gains a new IsAlarm parameter with default value false so the positional record signature change doesn't force every non-Galaxy driver call site to flow a flag they don't produce (the existing generic node-manager and future Modbus/etc. drivers keep compiling without modification); GalaxyProxyDriver.DiscoverAsync passes attr.IsAlarm through to the DriverAttributeInfo positional constructor. This is the discovery-side foundation — the generic node-manager can now enrich alarm-bearing variables with OPC UA AlarmConditionState during address-space build (the existing v1 LmxNodeManager pattern that subscribes to <tag>.InAlarm + .Priority + .DescAttrName + .Acked and merges them into a ConditionState) but this PR deliberately stops at discovery: the full alarm subsystem (subscription management for the 4 alarm-status attributes, state-machine tracking for Active/Unacknowledged/Confirmed/Inactive transitions, OPC UA Part 9 alarm event emission, and the write-to-AckMsg ack path) is a follow-up PR 10+ because it touches the node-manager's address-space build path — orthogonal to the IPC flow this PR covers. Tests — AlarmDiscoveryTests (new, 3 cases): GalaxyAttributeInfo_IsAlarm_round_trips_true_through_MessagePack serializes an IsAlarm=true instance and asserts the decoded flag is true + IsHistorized is true + AttributeName survives unchanged; GalaxyAttributeInfo_IsAlarm_round_trips_false_through_MessagePack covers the default path; Pre_PR9_payload_without_IsAlarm_key_deserializes_with_default_false is the wire-compat regression guard — serializes a stand-in PrePR9Shape class with only keys 0..5 (identical layout to the pre-PR9 GalaxyAttributeInfo) and asserts the newer GalaxyAttributeInfo deserializer produces IsAlarm=false without throwing, so a rolling upgrade where the Proxy ships first can talk to an old Host during the window before the Host upgrades without a MessagePack "missing key" exception. Full solution build: 0 errors, 38 warnings (existing). Galaxy.Host.Tests Unit suite: 27 pass / 0 fail (3 new alarm-discovery + 9 PR5 historian + 15 pre-existing). This PR branches off phase-2-pr5-historian because GalaxyProxyDriver's constructor signature + GalaxyHierarchyRow's IsAlarm init-only property are both ancestor state that the simpler branch bases (phase-2-pr4-findings, master) don't yet include.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 06:28:01 -04:00
Joseph Doherty
3717405aa6 Phase 2 PR 7 — wire IHistoryProvider.ReadProcessedAsync end-to-end. PR 5 ported HistorianDataSource.ReadAggregateAsync into Galaxy.Host but left it internal — GalaxyProxyDriver.ReadProcessedAsync still threw NotSupportedException, so OPC UA clients issuing HistoryReadProcessed requests against the v2 topology got rejected at the driver boundary. This PR closes that gap by adding two new Shared.Contracts messages (HistoryReadProcessedRequest/Response, MessageKind 0x62/0x63), routing them through GalaxyFrameHandler, implementing HistoryReadProcessedAsync on all three IGalaxyBackend implementations (Stub/DbBacked return the canonical "pending" Success=false, MxAccessGalaxyBackend delegates to _historian.ReadAggregateAsync), mapping HistorianAggregateSample → GalaxyDataValue at the IPC boundary (null bucket Value → BadNoData 0x800E0000u, otherwise Good 0u), and flipping GalaxyProxyDriver.ReadProcessedAsync from the NotSupported throw to a real IPC call with OPC UA HistoryAggregateType enum mapped to Wonderware AnalogSummary column name on the Proxy side (Average → "Average", Minimum → "Minimum", Maximum → "Maximum", Count → "ValueCount", Total → NotSupported since there's no direct SDK column for sum). Decision #13 IPC data-shape stays intact — HistoryReadProcessedResponse carries GalaxyDataValue[] with the same MessagePack value + OPC UA StatusCode + timestamps shape as the other history responses, so the Proxy's existing ToSnapshot helper handles the conversion without a new code path. MxAccessGalaxyBackend.HistoryReadProcessedAsync guards: null historian → "Historian disabled" (symmetric with HistoryReadAsync); IntervalMs <= 0 → "HistoryReadProcessed requires IntervalMs > 0" (prevents division-by-zero inside the SDK's Resolution parameter); exception during SDK call → Success=false Values=[] with the message so the Proxy surfaces it as InvalidOperationException with a clean error chain. Tests — HistoryReadProcessedTests (new, 4 cases): disabled-error when historian null, rejects zero interval, maps Good sample with Value=12.34 and the Proxy-supplied AggregateColumn + IntervalMs flow unchanged through to the fake IHistorianDataSource, maps null Value bucket to 0x800E0000u BadNoData with null ValueBytes. AggregateColumnMappingTests (new, 5 cases in Proxy.Tests): theory covers all 4 supported HistoryAggregateType enum values → correct column string, and asserts Total throws NotSupportedException with a message that steers callers to Average/Minimum/Maximum/Count (the SDK's AnalogSummaryQueryResult doesn't expose a sum column — the closest is Average × ValueCount which is the responsibility of a caller-side aggregation rather than an extra IPC round-trip). InternalsVisibleTo added to Galaxy.Proxy csproj so Proxy.Tests can reach the internal MapAggregateToColumn static. Builds — Galaxy.Host (net48 x86) + Galaxy.Proxy (net10) both 0 errors, full solution 201 warnings (pre-existing) / 0 errors. Test counts — Host.Tests Unit suite: 28 pass (4 new processed + 9 PR5 historian + 15 pre-existing); Proxy.Tests Unit suite: 14 pass (5 new column-mapping + 9 pre-existing). Deferred to a later PR — ReadAtTime + ReadEvents + Health IPC surfaces (HistorianDataSource has them ported in PR 5 but they need additional contract messages and would push this PR past a comfortable review size); the alarm subsystem wire-up (OnAlarmEvent raising from MxAccessGalaxyBackend) which overlaps the ReadEventsAsync IPC work since both pull from HistorianAccess.CreateEventQuery on the SDK side; the Proxy-side quality-byte refinement where HistorianDataSource's per-sample raw quality byte gets decoded through the existing QualityMapper instead of the category-only mapping in ToWire(HistorianSample) — doesn't change correctness today since Good/Uncertain/Bad categories are all the Admin UI and OPC UA clients surface, but richer OPC DA status codes (BadNotConnected, UncertainSubNormal, etc.) are available on the wire and the Proxy could promote them before handing DataValueSnapshot to ISubscribable consumers. This PR branches off phase-2-pr5-historian because it directly extends the Historian IPC surface added there; if PR 5 merges first PR 7 fast-forwards, otherwise it needs a rebase after PR 5 lands.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 05:53:01 -04:00
Joseph Doherty
1c2bf74d38 Phase 2 PR 6 — close the 2 low findings carried forward from PR 4. Low finding #1 ($Heartbeat probe handle leak in MonitorLoopAsync): the probe calls _proxy.AddItem(connectionHandle, "$Heartbeat") on every monitor tick that observes the connection is past StaleThreshold, but previously discarded the returned item handle — so every probe (one per MonitorInterval, default 5s) leaked one item handle into the MXAccess proxy's internal handle table. Fix: capture the item handle, call RemoveItem(connectionHandle, probeHandle) in the InvokeAsync's finally block so it runs on the same pump turn as the AddItem, best-effort RemoveItem swallow so a dying proxy doesn't throw secondary exceptions out of the probe path. Probe ok becomes probeHandle > 0 so any AddItem that returns 0 (MXAccess's "could not create") counts as a failed probe, matching v1 behavior. Low finding #2 (subscription replay silently swallowed per-tag failures): after a reconnect, the replay loop iterates the pre-reconnect subscription snapshot and calls SubscribeOnPumpAsync for each; previously those failures went into a bare catch { /* skip */ } so an operator had no signal when specific tags failed to re-subscribe — the first indication downstream was a quality drop on OPC UA clients. Fix: new SubscriptionReplayFailedEventArgs (TagReference + Exception) + SubscriptionReplayFailed event on MxAccessClient that fires once per tag that fails to re-subscribe, Log.Warning per failure with the reconnect counter + tag reference, and a summary log line at the end of the replay loop ("{failed} of {total} failed" or "{total} re-subscribed cleanly"). Serilog using + ILogger Log = Serilog.Log.ForContext<MxAccessClient>() added. Tests — MxAccessClientMonitorLoopTests (new file, 2 cases): Heartbeat_probe_calls_RemoveItem_for_every_AddItem constructs a CountingProxy IMxProxy that tracks AddItem/RemoveItem pair counts scoped to the "$Heartbeat" address, runs the client with MonitorInterval=150ms + StaleThreshold=50ms for 700ms, asserts HeartbeatAddCount > 1, HeartbeatAddCount == HeartbeatRemoveCount, OutstandingHeartbeatHandles == 0 after dispose; SubscriptionReplayFailed_fires_for_each_tag_that_fails_to_replay uses a ReplayFailingProxy that throws on the next $Heartbeat probe (to trigger the reconnect path) and throws on the replay-time AddItem for specified tag names ("BadTag.A", "BadTag.B"), subscribes GoodTag.X + BadTag.A + BadTag.B before triggering probe failure, collects SubscriptionReplayFailed args into a ConcurrentBag, asserts exactly 2 events fired and both bad tags are represented — GoodTag.X replays cleanly so it does not fire. Host.Tests csproj gains a Reference to lib/ArchestrA.MxAccess.dll because IMxProxy's MxDataChangeHandler delegate signature mentions MXSTATUS_PROXY and the compiler resolves all delegate parameter types when a test class implements the interface, even if the test code never names the type. No regressions: full Galaxy.Host.Tests Unit suite 26 pass / 0 fail (2 new monitor-loop tests + 9 PR5 historian + 15 pre-existing PostMortemMmf/RecyclePolicy/StaPump/MemoryWatchdog/EndToEndIpc/Handshake). Galaxy.Host builds clean (0 errors, 0 warnings) — the new Serilog.Log.ForContext usage picks up the existing Serilog package ref that PR 4 pulled in for the monitor-loop infrastructure. Both findings were flagged as non-blocking for PR 4 merge and are now resolved alongside whichever merge order the reviewer picks; this PR branches off phase-2-pr4-findings so it can rebase cleanly if PR 4 lands first or be re-based onto master after PR 4 merges.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 02:06:15 -04:00
312 changed files with 18926 additions and 24869 deletions

1
.gitignore vendored
View File

@@ -29,3 +29,4 @@ packages/
# Claude Code (per-developer settings, runtime lock files, agent transcripts)
.claude/
.local/

View File

@@ -8,8 +8,9 @@
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Historian.Aveva/ZB.MOM.WW.OtOpcUa.Historian.Aveva.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ZB.MOM.WW.OtOpcUa.Driver.Modbus.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.S7/ZB.MOM.WW.OtOpcUa.Driver.S7.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Client.Shared/ZB.MOM.WW.OtOpcUa.Client.Shared.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Client.CLI/ZB.MOM.WW.OtOpcUa.Client.CLI.csproj"/>
<Project Path="src/ZB.MOM.WW.OtOpcUa.Client.UI/ZB.MOM.WW.OtOpcUa.Client.UI.csproj"/>
@@ -22,11 +23,13 @@
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Admin.Tests/ZB.MOM.WW.OtOpcUa.Admin.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/ZB.MOM.WW.OtOpcUa.Tests.v1Archive.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Historian.Aveva.Tests/ZB.MOM.WW.OtOpcUa.Historian.Aveva.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/ZB.MOM.WW.OtOpcUa.IntegrationTests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.S7.Tests/ZB.MOM.WW.OtOpcUa.Driver.S7.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.Tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Client.Shared.Tests/ZB.MOM.WW.OtOpcUa.Client.Shared.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Client.CLI.Tests/ZB.MOM.WW.OtOpcUa.Client.CLI.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Client.UI.Tests/ZB.MOM.WW.OtOpcUa.Client.UI.Tests.csproj"/>

1
_p54.json Normal file
View File

@@ -0,0 +1 @@
{"title":"Phase 3 PR 54 -- Siemens S7 Modbus TCP quirks research doc","body":"## Summary\n\nAdds `docs/v2/s7.md` (485 lines) covering Siemens SIMATIC S7 family Modbus TCP behavior. Mirrors the `docs/v2/dl205.md` template for future per-quirk implementation PRs.\n\n## Key findings for the implementation track\n\n- **No fixed memory map** — every S7 Modbus server is user-wired via `MB_SERVER`/`MODBUSCP`/`MODBUSPN` library blocks. Driver must accept per-site config, not assume a vendor layout.\n- **MB_SERVER requires non-optimized DBs** (STATUS `0x8383` if optimized). Most common field bug.\n- **Word order default = ABCD** (opposite of DL260). Driver's S7 profile default must be `ByteOrder.BigEndian`, not `WordSwap`.\n- **One port per MB_SERVER instance** — multi-client requires parallel FBs on 503/504/… Most clients assume port 502 multiplexes (wrong on S7).\n- **CP 343-1 Lean is server-only**, requires the `2XV9450-1MB00` license.\n- **FC20/21/22/23/43 all return Illegal Function** on every S7 variant — driver must not attempt FC23 bulk-read optimization for S7.\n- **STOP-mode behavior non-deterministic** across firmware bands — treat both read/write STOP-mode responses as unavailable.\n\nTwo items flagged as unconfirmed rumour (V2.0+ float byte-order claim, STOP-mode caching location).\n\nNo code, no tests — implementation lands in PRs 56+.\n\n## Test plan\n- [x] Doc renders as markdown\n- [x] 31 citations present\n- [x] Section structure matches dl205.md template","head":"phase-3-pr54-s7-research-doc","base":"v2"}

1
_p55.json Normal file
View File

@@ -0,0 +1 @@
{"title":"Phase 3 PR 55 -- Mitsubishi MELSEC Modbus TCP quirks research doc","body":"## Summary\n\nAdds `docs/v2/mitsubishi.md` (451 lines) covering MELSEC Q/L/iQ-R/iQ-F/FX3U Modbus TCP behavior. Mirrors `docs/v2/dl205.md` template for per-quirk implementation PRs.\n\n## Key findings for the implementation track\n\n- **Module naming trap** — `QJ71MB91` is SERIAL RTU, not TCP. TCP module is `QJ71MT91`. Surface clearly in driver docs.\n- **No canonical mapping** — per-site 'Modbus Device Assignment Parameter' block (up to 16 entries). Treat mapping as runtime config.\n- **X/Y hex vs octal depends on family** — Q/L/iQ-R use HEX (X20 = decimal 32); FX/iQ-F use OCTAL (X20 = decimal 16). Helper must take a family selector.\n- **Word order CDAB default** across all MELSEC families (opposite of Siemens S7). Driver Mitsubishi profile default: `ByteOrder.WordSwap`.\n- **D-registers binary by default** (opposite of DL205's BCD default). Caller opts in to `Bcd16`/`Bcd32` when ladder uses BCD.\n- **FX5U needs firmware ≥ 1.060** for Modbus TCP server — older is client-only.\n- **FX3U-ENET vs FX3U-ENET-P502 vs FX3U-ENET-ADP** — only the middle one binds port 502; the last has no Modbus at all. Common operator mis-purchase.\n- **QJ71MT91 does NOT support FC22 / FC23** — iQ-R / iQ-F do. Bulk-read optimization must gate on capability.\n- **STOP-mode writes configurable** on Q/L/iQ-R/iQ-F (default accept), always rejected on FX3U-ENET.\n\nThree unconfirmed rumours flagged separately.\n\nNo code, no tests — implementation lands in PRs 58+.\n\n## Test plan\n- [x] Doc renders as markdown\n- [x] 17 citations present\n- [x] Per-model test naming matrix included (`Mitsubishi_QJ71MT91_*`, `Mitsubishi_FX5U_*`, `Mitsubishi_FX3U_ENET_*`, shared `Mitsubishi_Common_*`)","head":"phase-3-pr55-mitsubishi-research-doc","base":"v2"}

View File

@@ -348,6 +348,44 @@ The project uses [GLAuth](https://github.com/glauth/glauth) v2.4.0 as the LDAP s
Enable LDAP in `appsettings.json` under `Authentication.Ldap`. See [Configuration Guide](Configuration.md) for the full property reference.
### Active Directory configuration
Production deployments typically point at Active Directory instead of GLAuth. Only four properties differ from the dev defaults: `Server`, `Port`, `UserNameAttribute`, and `ServiceAccountDn`. The same `GroupToRole` mechanism works — map your AD security groups to OPC UA roles.
```json
{
"OpcUaServer": {
"Ldap": {
"Enabled": true,
"Server": "dc01.corp.example.com",
"Port": 636,
"UseTls": true,
"AllowInsecureLdap": false,
"SearchBase": "DC=corp,DC=example,DC=com",
"ServiceAccountDn": "CN=OpcUaSvc,OU=Service Accounts,DC=corp,DC=example,DC=com",
"ServiceAccountPassword": "<from your secret store>",
"DisplayNameAttribute": "displayName",
"GroupAttribute": "memberOf",
"UserNameAttribute": "sAMAccountName",
"GroupToRole": {
"OPCUA-Operators": "WriteOperate",
"OPCUA-Engineers": "WriteConfigure",
"OPCUA-AlarmAck": "AlarmAck",
"OPCUA-Tuners": "WriteTune"
}
}
}
}
```
Notes:
- `UserNameAttribute: "sAMAccountName"` is the critical AD override — the default `uid` is not populated on AD user entries, so the user-DN lookup returns no results without it. Use `userPrincipalName` instead if operators log in with `user@corp.example.com` form.
- `Port: 636` + `UseTls: true` is required under AD's LDAP-signing enforcement. AD increasingly rejects plain-LDAP bind; set `AllowInsecureLdap: false` to refuse fallback.
- `ServiceAccountDn` should name a dedicated read-only service principal — not a privileged admin. The account needs read access to user and group entries in the search base.
- `memberOf` values come back as full DNs like `CN=OPCUA-Operators,OU=OPC UA Security Groups,OU=Groups,DC=corp,DC=example,DC=com`. The authenticator strips the leading `CN=` RDN value so operators configure `GroupToRole` with readable group common-names.
- Nested group membership is **not** expanded — assign users directly to the role-mapped groups, or pre-flatten membership in AD. `LDAP_MATCHING_RULE_IN_CHAIN` / `tokenGroups` expansion is an authenticator enhancement, not a config change.
### Security Considerations
- LDAP credentials are transmitted in plaintext over the OPC UA channel unless transport security is enabled. Use `Basic256Sha256-SignAndEncrypt` for production deployments.

View File

@@ -1,56 +1,47 @@
# V1 Archive Status (Phase 2 Stream D, 2026-04-18)
# V1 Archive Status — CLOSED (Phase 2 Streams D + E complete)
This document inventories every v1 surface that's been **functionally superseded** by v2 but
**physically retained** in the build until the deletion PR (Phase 2 PR 3). Rationale: cascading
references mean a single deletion is high blast-radius; archive-marking lets the v2 stack ship
on its own merits while the v1 surface stays as parity reference.
> **Status as of 2026-04-18: the v1 archive has been fully removed from the tree.**
> This document is retained as historical record of the Phase 2 Stream D / E closure.
## Archived projects
## Final state
| Path | Status | Replaced by | Build behavior |
|---|---|---|---|
| `src/ZB.MOM.WW.OtOpcUa.Host/` | Archive (executable in build) | `OtOpcUa.Server` + `Driver.Galaxy.Host` + `Driver.Galaxy.Proxy` | Builds; not deployed by v2 install scripts |
| `src/ZB.MOM.WW.OtOpcUa.Historian.Aveva/` | Archive (plugin in build) | TODO: port into `Driver.Galaxy.Host/Backend/Historian/` (Task B.1.h follow-up) | Builds; loaded only by archived Host |
| `tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/` | Archive | `Driver.Galaxy.E2E` + per-component test projects | `<IsTestProject>false</IsTestProject>``dotnet test slnx` skips |
| `tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/` | Archive | `Driver.Galaxy.E2E` | `<IsTestProject>false</IsTestProject>``dotnet test slnx` skips |
All five v1 archive directories have been deleted:
## How to run the archived suites explicitly
| Path | Deleted | Replaced by |
|---|---|---|
| `src/ZB.MOM.WW.OtOpcUa.Host/` | ✅ | `OtOpcUa.Server` + `Driver.Galaxy.Host` + `Driver.Galaxy.Proxy` |
| `src/ZB.MOM.WW.OtOpcUa.Historian.Aveva/` | ✅ | `Driver.Galaxy.Host/Backend/Historian/` (ported in Phase 3 PRs 51-55) |
| `tests/ZB.MOM.WW.OtOpcUa.Historian.Aveva.Tests/` | ✅ | `Driver.Galaxy.Host.Tests/Historian/` |
| `tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/` | ✅ | Per-component `*.Tests` projects + `Driver.Galaxy.E2E` |
| `tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/` | ✅ | `Driver.Galaxy.E2E` + `Driver.Modbus.IntegrationTests` |
```powershell
# v1 unit tests (494):
dotnet test tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive
## Closure timeline
# v1 integration tests (6):
dotnet test tests/ZB.MOM.WW.OtOpcUa.IntegrationTests
```
- **PR 2 (2026-04-18, phase-2-stream-d)** — archive-marked the four v1 projects with
`<IsTestProject>false</IsTestProject>` so solution builds and `dotnet test slnx` bypassed
them. Capture: `docs/v2/implementation/exit-gate-phase-2-final.md`.
- **Phase 3 PR 18 (2026-04-18)** — deleted the archived project source trees. Leftover
`bin/` and `obj/` residue remained on disk from pre-deletion builds.
- **Phase 2 PR 61 (2026-04-18, this closure PR)** — scrubbed the empty residue directories
and confirmed `dotnet build ZB.MOM.WW.OtOpcUa.slnx` clean with 0 errors.
Both still pass on this dev box — they're the parity reference for Phase 2 PR 3's deletion
decision.
## Parity validation (Stream E)
## Deletion plan (Phase 2 PR 3)
The original 494 v1 tests + 6 v1 integration tests are **not** preserved in the v2 branch.
Their parity-bar role is now filled by:
Pre-conditions:
- [ ] `Driver.Galaxy.E2E` test count covers the v1 IntegrationTests' 6 integration scenarios
at minimum (currently 7 tests; expand as needed)
- [ ] `Driver.Galaxy.Host/Backend/Historian/` ports the Wonderware Historian plugin
so `MxAccessGalaxyBackend.HistoryReadAsync` returns real data (Task B.1.h)
- [ ] Operator review on a separate PR — destructive change
Steps:
1. `git rm -r src/ZB.MOM.WW.OtOpcUa.Host/`
2. `git rm -r src/ZB.MOM.WW.OtOpcUa.Historian.Aveva/`
(or move it under Driver.Galaxy.Host first if the lift is part of the same PR)
3. `git rm -r tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/`
4. `git rm -r tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/`
5. Edit `ZB.MOM.WW.OtOpcUa.slnx` — remove the four project lines
6. `dotnet build ZB.MOM.WW.OtOpcUa.slnx` → confirm clean
7. `dotnet test ZB.MOM.WW.OtOpcUa.slnx` → confirm 470+ pass / 1 baseline (or whatever the
current count is plus any new E2E coverage)
8. Commit: "Phase 2 Stream D — delete v1 archive (Host + Historian.Aveva + v1Tests + IntegrationTests)"
9. PR 3 against `v2`, link this doc + exit-gate-phase-2-final.md
10. One reviewer signoff
- `Driver.Galaxy.E2E` — cross-FX subprocess parity (spawns the net48 x86 Galaxy.Host.exe
+ connects via real named pipe, exercises every `IDriver` capability through the
supervisor). Stability-findings regression tests (4 × 2026-04-13 findings) live here.
- Per-component `*.Tests` projects — cover the code that moved out of the monolith into
discrete v2 projects. Running `dotnet test ZB.MOM.WW.OtOpcUa.slnx` executes all of them
as one solution-level gate.
- `Driver.Modbus.IntegrationTests` — adds Modbus TCP driver coverage that didn't exist in
v1 (DL205, S7-1500, Mitsubishi MELSEC via pymodbus sim profiles — PRs 30, 56-60).
- Live-stack smoke tests (`Driver.Galaxy.E2E/LiveStack/`) — optional, gated on presence
of the `OtOpcUaGalaxyHost` service + Galaxy repository on the dev box (PRs 33, 36, 37).
## Rollback
If Phase 2 PR 3 surfaces downstream consumer regressions, `git revert` the deletion commit
restores the four projects intact. The v2 stack continues to ship from the v2 branch.
`git revert` of the deletion commits restores the projects intact. The v2 stack continues
to ship from the `v2` branch regardless.

295
docs/v2/dl205.md Normal file
View File

@@ -0,0 +1,295 @@
# AutomationDirect DirectLOGIC DL205 / DL260 — Modbus quirks
AutomationDirect's DirectLOGIC DL205 family (D2-250-1, D2-260, D2-262, D2-262M) and
its larger DL260 sibling speak Modbus TCP (via the H2-ECOM100 / H2-EBC100 Ethernet
coprocessors, and the DL260's built-in Ethernet port) and Modbus RTU (via the CPU
serial ports in "Modbus" mode). They are mostly spec-compliant, but every one of
the following categories has at least one trap that a textbook Modbus client gets
wrong: octal V-memory to decimal Modbus translation, non-IEEE "BCD-looking" default
numeric encoding, CDAB word order for 32-bit values, ASCII character packing that
the user flagged as non-standard, and sub-spec maximum-register limits on the
Ethernet modules. This document catalogues each quirk, cites primary sources, and
names the ModbusPal integration test we'd write for it (convention from
`docs/v2/modbus-test-plan.md`: `DL205_<behavior>`).
## Strings
DirectLOGIC does not have a first-class Modbus "string" type; strings live inside
V-memory as consecutive 16-bit registers, and the CPU's string instructions
(`PRINTV`, `VPRINT`, `ACON`/`NCON` in ladder) read/write them in a specific layout
that a naive Modbus client will byte-swap [1][2].
- **Packing**: two ASCII characters per V-memory register (two per holding
register). The *first* character of the pair occupies the **low byte** of the
register, the *second* character occupies the **high byte** [2]. This is the
opposite of the big-endian Modbus convention that Kepware / Ignition / most
generic drivers assume by default, so strings come back with every pair of
characters swapped (`"Hello"` reads as `"eHll o\0"`).
- **Termination**: null-terminated (`0x00` in the character byte). There is no
length prefix. Writes must pad the final register's unused byte with `0x00`.
- **Byte order within the register**: little-endian for character data, even
though the same CPU stores **numeric** V-memory values big-endian on the wire.
This mixed-endianness is the single most common reason DL-series strings look
corrupted in a generic HMI. Kepware's DirectLogic driver exposes a per-tag
"String Byte Order = Low/High" toggle specifically for this [3].
- **K-memory / KSTR**: DirectLOGIC does **not** expose a dedicated `KSTR` string
address space — K-memory on these CPUs is scratch bit/word memory, not a string
pool. Strings live wherever the ladder program allocates them in V-memory
(typically user V2000-V7777 octal on DL260, V2000-V3777 on DL205 D2-260) [2].
- **Maximum length**: bounded only by the V-memory region assigned. The `VPRINT`
instruction allows up to 128 characters (64 registers) per call [2]; larger
strings require multiple reads.
- **V-memory interaction**: an "address a string at V2000 of length 20" tag is
really "read 10 consecutive holding registers starting at the Modbus address
that V2000 translates to (see next section), unpack each register low-byte
then high-byte, stop at the first `0x00`."
Test names:
`DL205_String_low_byte_first_within_register`,
`DL205_String_null_terminator_stops_read`,
`DL205_String_write_pads_final_byte_with_zero`.
## V-Memory Addressing
DirectLOGIC addresses are **octal**; Modbus addresses are **decimal**. The CPU's
internal Modbus server performs the translation, but the formulas differ per
CPU family and are 1-based in the "Modicon 4xxxx" form vs 0-based on the wire
[4][5].
Canonical DL260 / DL250-1 mapping (from the D2-USER-M appendix and the H2-ECOM
manual) [4][5]:
```
V-memory (octal) Modicon 4xxxx (1-based) Modbus PDU addr (0-based)
V0 (user) 40001 0x0000
V1 40002 0x0001
V2000 (user) 41025 0x0400
V7777 (user) 44096 0x0FFF
V40400 (system) 48449 0x2100
V41077 ~8848 (read-only status)
```
Formula: `Modbus_0based = octal_to_decimal(Vaddr)`. So `V2000` octal = `1024`
decimal = Modbus PDU address `0x0400`. The "4xxxx" Modicon view just adds 1 and
prefixes the register bank digit.
- **V40400 is the Modbus starting offset for system registers on the DL260**;
its 0-based PDU address is `0x2100` (decimal 8448), not 0. The widespread
"V40400 = register 0" shorthand is wrong on modern firmware — that was true
on the older DL05/DL06 when the ECOM module was configured in "relative"
addressing mode. On the H2-ECOM100 factory default ("absolute" mode), V40400
maps to 0x2100 [5].
- **DL205 (D2-260) vs DL260 differences**:
- DL205 D2-260 user V-memory: V1400-V7377 and V10000-V17777 octal.
- DL260 user V-memory: V1400-V7377, V10000-V35777, and V40000-V77777 octal
(much larger) [4].
- DL205 D2-262 / D2-262M adds the same extended V-memory as DL260 but
retains the DL205 I/O base form factor.
- Neither DL205 sub-model changes the *formula* — only the valid range.
- **Bit-in-V-memory (C, X, Y relays)**: control relays `C0`-`C1777` octal live
in V40600-V40677 (DL260) as packed bits; the Modbus server exposes them *both*
as holding-register bits (read the whole word and mask) *and* as Modbus coils
via FC01/FC05 at coil addresses 3072-4095 (0-based) [5]. `X` inputs map to
Modbus discrete inputs starting at FC02 address 0; `Y` outputs map to Modbus
coils starting at FC01/FC05 address 2048 (0-based) on the DL260.
- **Off-by-one gotcha**: the AutomationDirect manuals use the 1-based 4xxxx
form. Kepware, libmodbus, pymodbus, and the .NET stack all take the 0-based
PDU form. When the manual says "V2000 = 41025" you send `0x0400`, not
`0x0401`.
Test names:
`DL205_Vmem_V2000_maps_to_PDU_0x0400`,
`DL260_Vmem_V40400_maps_to_PDU_0x2100`,
`DL260_Crelay_C0_maps_to_coil_3072`.
## Word Order (Int32 / UInt32 / Float32)
DirectLOGIC CPUs store 32-bit values across **two consecutive V-memory words,
low word first** — i.e., `CDAB` when viewed as a Modbus register pair [1][3].
Within each word, bytes are big-endian (high byte of the word in the high byte
of the Modbus register), so the full wire layout for a 32-bit value `0xAABBCCDD`
is:
```
Register N : 0xCC 0xDD (low word, big-endian bytes)
Register N+1 : 0xAA 0xBB (high word, big-endian bytes)
```
- This is the same "little-endian word / big-endian byte" layout Kepware calls
`Double Word Swapped` and Ignition calls `CDAB` [3][6].
- **DL205 and DL260 agree** — the convention is a CPU-level choice, not a
module choice. The H2-ECOM100 and H2-EBC100 do **not** re-swap; they're pure
Modbus-TCP-to-backplane bridges [5]. The DL260 built-in Ethernet port
behaves identically.
- **Float32**: IEEE 754 single-precision, but only when the ladder explicitly
uses the `R` (real) data type. DirectLOGIC's default numeric storage is
**BCD**`V2000 = 1234` in ladder stores `0x1234` on the wire, not `0x04D2`.
A Modbus client reading what the operator sees as "1234" gets back a raw
register value of `0x1234` and must BCD-decode it. Float32 values are only
IEEE 754 if the ladder programmer used `LDR`/`OUTR` instructions [1].
- **Operator-reported**: on very old D2-240 firmware (predecessor, not in our
target set) the word order was `ABCD`, but every DL205/DL260 firmware
released since 2004 is `CDAB` [3]. _Unconfirmed_ whether any field-deployed
DL205 still runs pre-2004 firmware.
Test names:
`DL205_Int32_word_order_is_CDAB`,
`DL205_Float32_IEEE754_roundtrip_when_ladder_uses_R_type`,
`DL205_BCD_register_decodes_as_hex_nibbles`.
## Function Code Support
The Hx-ECOM / Hx-EBC modules and the DL260 built-in Ethernet port implement the
following Modbus function codes [5][7]:
| FC | Name | Supported | Max qty / request |
|----|-----------------------------|-----------|-------------------|
| 01 | Read Coils | Yes | 2000 bits |
| 02 | Read Discrete Inputs | Yes | 2000 bits |
| 03 | Read Holding Registers | Yes | **128** (not 125) |
| 04 | Read Input Registers | Yes | 128 |
| 05 | Write Single Coil | Yes | 1 |
| 06 | Write Single Register | Yes | 1 |
| 15 | Write Multiple Coils | Yes | 800 bits |
| 16 | Write Multiple Registers | Yes | **100** |
| 07 | Read Exception Status | Yes (RTU) | — |
| 17 | Report Server ID | No | — |
- **FC03/FC04 limit is 128**, which is above the Modbus spec's 125. Requesting
129+ returns exception code `03` (Illegal Data Value) [5].
- **FC16 limit is 100**, below the spec's 123. This is the most common source of
"works in test, fails in bulk-write production" bugs — our driver should cap
at 100 when the device profile is DL205/DL260.
- **No custom function codes** are exposed on the Modbus port. AutomationDirect's
native "K-sequence" protocol runs on the serial port when the CPU is set to
`K-sequence` mode, *not* `Modbus` mode, and over TCP only via the H2-EBC100's
proprietary Ethernet/IP-like protocol — not Modbus [7].
Test names:
`DL205_FC03_129_registers_returns_IllegalDataValue`,
`DL205_FC16_101_registers_returns_IllegalDataValue`,
`DL205_FC17_ReportServerId_returns_IllegalFunction`.
## Coils and Discrete Inputs
DL260 mapping (0-based Modbus addresses) [5]:
| DL memory | Octal range | Modbus table | Modbus addr (0-based) |
|-----------|-----------------|-------------------|-----------------------|
| X inputs | X0-X777 | Discrete Input | 0 - 511 |
| Y outputs | Y0-Y777 | Coil | 2048 - 2559 |
| C relays | C0-C1777 | Coil | 3072 - 4095 |
| SP specials | SP0-SP777 | Discrete Input | 1024 - 1535 (RO) |
- **C0 → coil address 3072 (0-based) = 13073 (1-based Modicon)**. Y0 → coil
2048 = 12049. These offsets are wired into the CPU and cannot be remapped.
- **Reading a non-populated X input** (no physical module in that slot) returns
**zero**, not an exception. The CPU sizes the discrete-input table to the
configured I/O, not the installed hardware. Confirmed in the DL260 user
manual's I/O configuration chapter [4].
- **Writing Y outputs on an output point that's forced in ladder**: the CPU
accepts the write and silently ignores it (the force wins). No exception is
returned. _Operator-reported_, matches Kepware driver release notes [3].
Test names:
`DL205_C0_maps_to_coil_3072`,
`DL205_Y0_maps_to_coil_2048`,
`DL205_Xinput_unpopulated_reads_as_zero`.
## Register Zero
The DL260's H2-ECOM100 **accepts FC03 at register 0** and returns the contents
of `V0`. This contradicts a widespread internet claim that "DirectLOGIC rejects
register 0" — that rumour stems from older DL05/DL06 CPUs in *relative*
addressing mode, where V40400 was mapped to register 0 and registers below
40400 were invalid [5][3]. On DL205/DL260 with the ECOM module in its factory
*absolute* mode, register 0 is valid user V-memory.
- Our driver's `ModbusProbeOptions.ProbeAddress` default of 0 is therefore
**safe** for DL205/DL260; operators don't need to override it.
- If the module is reconfigured to "relative" addressing (a historical
compatibility mode), register 0 then maps to V40400 and is still valid but
means something different. The probe will still succeed.
Test name: `DL205_FC03_register_0_returns_V0_contents`.
## Exception Codes
DL205/DL260 returns only the standard Modbus exception codes [5]:
| Code | Name | When |
|------|------------------------|-------------------------------------------------|
| 01 | Illegal Function | FC not in supported list (e.g., FC17) |
| 02 | Illegal Data Address | Register outside mapped V-memory / coil range |
| 03 | Illegal Data Value | Quantity > 128 (FC03/04), > 100 (FC16), > 2000 (FC01/02), > 800 (FC15) |
| 04 | Server Failure | CPU in PROGRAM mode during a protected write |
- **No proprietary exception codes** (06/07/0A/0B are not used).
- **Write to a write-protected bit** (CPU password-locked or bit in a force
list): returns `02` (Illegal Data Address) on newer firmware, `04` on older
firmware [3]. _Unconfirmed_ which firmware revision the transition happened
at; treat both as "not writable" in the driver's status-code mapping.
- **Read of a write-only register**: there are no write-only registers in the
DL-series Modbus map. Every writable register is also readable.
Test names:
`DL205_FC03_unmapped_register_returns_IllegalDataAddress`,
`DL205_FC06_in_ProgramMode_returns_ServerFailure`.
## Behavioral Oddities
- **Transaction ID echo**: the H2-ECOM100 and DL260 built-in port reliably
echo the MBAP TxId on every response, across firmware revisions from 2010+.
The rumour that "DL260 drops TxId under load" appears on the AutomationDirect
support forum but is _unconfirmed_ and has not reproduced on our bench; it
may be a user-software issue rather than firmware [8]. Our driver's
single-flight + TxId-match guard handles it either way.
- **Concurrency**: the ECOM serializes requests internally. Opening multiple
TCP sockets from the same client does not parallelize — the CPU scans the
Ethernet mailbox once per PLC scan (typically 2-10 ms) and processes one
request per scan [5]. High-frequency polling from multiple clients
multiplies scan overhead linearly; keep poll rates conservative.
- **Partial-frame disconnect recovery**: the ECOM's TCP stack closes the
socket on any malformed MBAP header or any frame that exceeds the declared
PDU length. It does not resynchronize mid-stream. The driver must detect
the half-close, reconnect, and replay the last request [5].
- **Keepalive**: the ECOM does **not** send TCP keepalives. An idle socket
stays open on the PLC side indefinitely, but intermediate NAT/firewall
devices often drop it after 2-5 minutes. Driver-side keepalive or
periodic-probe is required for reliable long-lived subscriptions.
- **Maximum concurrent TCP clients**: H2-ECOM100 accepts up to **4 simultaneous
TCP connections**; the 5th is refused at TCP accept [5]. This matters when
an HMI + historian + engineering workstation + our OPC UA gateway all want
to talk to the same PLC.
Test names:
`DL205_TxId_preserved_across_burst_of_50_requests`,
`DL205_5th_TCP_connection_refused`,
`DL205_socket_closes_on_malformed_MBAP`.
## References
1. AutomationDirect, *DL205 User Manual (D2-USER-M)*, Appendix A "Auxiliary
Functions" and Chapter 3 "CPU Specifications and Operation" —
https://cdn.automationdirect.com/static/manuals/d2userm/d2userm.html
2. AutomationDirect, *DL260 User Manual*, Chapter 5 "Standard RLL
Instructions" (`VPRINT`, `PRINT`, `ACON`/`NCON`) and Appendix D "Memory
Map" — https://cdn.automationdirect.com/static/manuals/d2userm/d2userm.html
3. Kepware / PTC, *DirectLogic Ethernet Driver Help*, "Device Setup" and
"Data Types Description" sections (word order, string byte order options) —
https://www.kepware.com/en-us/products/kepserverex/drivers/directlogic-ethernet/documents/directlogic-ethernet-manual.pdf
4. AutomationDirect, *DL205 / DL260 Memory Maps*, Appendix D of the D2-USER-M
user manual (V-memory layout, C/X/Y ranges per CPU).
5. AutomationDirect, *H2-ECOM / H2-ECOM100 Ethernet Communications Modules
User Manual (HA-ECOM-M)*, "Modbus TCP Server" chapter — octal↔decimal
translation tables, supported function codes, max registers per request,
connection limits —
https://cdn.automationdirect.com/static/manuals/hxecomm/hxecomm.html
6. Inductive Automation, *Ignition Modbus Driver — Address Mapping*, word
order options (ABCD/CDAB/BADC/DCBA) —
https://docs.inductiveautomation.com/docs/8.1/ignition-modules/opc-ua/drivers/modbus-v2
7. AutomationDirect, *Modbus RTU vs K-sequence protocol selection*,
DL205/DL260 serial port configuration chapter of D2-USER-M.
8. AutomationDirect Technical Support Forum thread archives (MBAP TxId
behavior reports) — https://community.automationdirect.com/ (search:
"ECOM100 transaction id"). _Unconfirmed_ operator reports only.

195
docs/v2/lmx-followups.md Normal file
View File

@@ -0,0 +1,195 @@
# LMX Galaxy bridge — remaining follow-ups
State after PR 19: the Galaxy driver is functionally at v1 parity through the
`IDriver` abstraction; the OPC UA server runs with LDAP-authenticated
Basic256Sha256 endpoints and alarms are observable through
`AlarmConditionState.ReportEvent`. The items below are what remains LMX-
specific before the stack can fully replace the v1 deployment, in
rough priority order.
## 1. Proxy-side `IHistoryProvider` for `ReadAtTime` / `ReadEvents` — **DONE (PRs 35 + 38)**
PR 35 extended `IHistoryProvider` with `ReadAtTimeAsync` + `ReadEventsAsync`
(default throwing implementations so existing impls keep compiling), added the
`HistoricalEvent` + `HistoricalEventsResult` records to `Core.Abstractions`,
and implemented both methods in `GalaxyProxyDriver` on top of the PR 10 / PR 11
IPC messages.
PR 38 wired the OPC UA HistoryRead service-handler through
`DriverNodeManager` by overriding `CustomNodeManager2`'s four per-kind hooks —
`HistoryReadRawModified` / `HistoryReadProcessed` / `HistoryReadAtTime` /
`HistoryReadEvents`. Each walks `nodesToProcess`, resolves the driver-side
full reference from `NodeId.Identifier`, dispatches to the right
`IHistoryProvider` method, and populates the paired results + errors lists
(both must be set — the MasterNodeManager merges them and a Good result with
an unset error slot serializes as `BadHistoryOperationUnsupported` on the
wire). Historized variables gain `AccessLevels.HistoryRead` so the stack
dispatches; the driver root folder gains `EventNotifiers.HistoryRead` so
`HistoryReadEvents` can target it.
Aggregate translation uses a small `MapAggregate` helper that handles
`Average` / `Minimum` / `Maximum` / `Total` / `Count` (the enum surface the
driver exposes) and returns null for unsupported aggregates so the handler
can surface `BadAggregateNotSupported`. Raw+Processed+AtTime wrap driver
samples as `HistoryData` in an `ExtensionObject`; Events emits a
`HistoryEvent` with the standard BaseEventType field list (EventId /
SourceName / Message / Severity / Time / ReceiveTime) — custom
`SelectClause` evaluation is an explicit follow-up.
**Tests**:
- `DriverNodeManagerHistoryMappingTests` — 12 unit cases pinning
`MapAggregate`, `BuildHistoryData`, `BuildHistoryEvent`, `ToDataValue`.
- `HistoryReadIntegrationTests` — 5 end-to-end cases drive a real OPC UA
client (`Session.HistoryRead`) against a fake `IHistoryProvider` driver
through the running stack. Covers raw round-trip, processed with Average
aggregate, unsupported aggregate → `BadAggregateNotSupported`, at-time
timestamp forwarding, and events field-list shape.
**Deferred**:
- Continuation-point plumbing via `Session.Save/RestoreHistoryContinuationPoint`.
Driver returns null continuations today so the pass-through is fine.
- Per-`SelectClause` evaluation in HistoryReadEvents — clients that send a
custom field selection currently get the standard BaseEventType layout.
## 2. Write-gating by role — **DONE (PR 26)**
Landed in PR 26. `WriteAuthzPolicy` in `Server/Security/` maps
`SecurityClassification` → required role (`FreeAccess` → no role required,
`Operate`/`SecuredWrite``WriteOperate`, `Tune``WriteTune`,
`Configure`/`VerifiedWrite``WriteConfigure`, `ViewOnly` → deny regardless).
`DriverNodeManager` caches the classification per variable during discovery and
checks the session's roles (via `IRoleBearer`) in `OnWriteValue` before calling
`IWritable.WriteAsync`. Roles do not cascade — a session with `WriteOperate`
can't write a `Tune` attribute unless it also carries `WriteTune`.
See `feedback_acl_at_server_layer.md` in memory for the architectural directive
that authz stays at the server layer and never delegates to driver-specific auth.
## 3. Admin UI client-cert trust management — **DONE (PR 28)**
PR 28 shipped `/certificates` in the Admin UI. `CertTrustService` reads the OPC
UA server's PKI store root (`OpcUaServerOptions.PkiStoreRoot` — default
`%ProgramData%\OtOpcUa\pki`) and lists rejected + trusted certs by parsing the
`.der` files directly, so it has no `Opc.Ua` dependency and runs on any
Admin host that can reach the shared PKI directory.
Operator actions: Trust (moves `rejected/certs/*.der``trusted/certs/*.der`),
Delete rejected, Revoke trust. The OPC UA stack re-reads the trusted store on
each new client handshake, so no explicit reload signal is needed —
operators retry the rejected client's connection after trusting.
Deferred: flipping `AutoAcceptUntrustedClientCertificates` to `false` as the
deployment default. That's a production-hardening config change, not a code
gap — the Admin UI is now ready to be the trust gate.
## 4. Live-LDAP integration test — **DONE (PR 31)**
PR 31 shipped `Server.Tests/LdapUserAuthenticatorLiveTests.cs` — 6 live-bind
tests against the dev GLAuth instance at `localhost:3893`, skipped cleanly
when the port is unreachable. Covers: valid bind, wrong password, unknown
user, empty credentials, single-group → WriteOperate mapping, multi-group
admin user surfacing all mapped roles.
Also added `UserNameAttribute` to `LdapOptions` (default `uid` for RFC 2307
compat) so Active Directory deployments can configure `sAMAccountName` /
`userPrincipalName` without code changes. `LdapUserAuthenticatorAdCompatTests`
(5 unit guards) pins the AD-shape DN parsing + filter escape behaviors. See
`docs/security.md` §"Active Directory configuration" for the AD appsettings
snippet.
Deferred: asserting `session.Identity` end-to-end on the server side (i.e.
drive a full OPC UA session with username/password, then read an
`IHostConnectivityProbe`-style "whoami" node to verify the role surfaced).
That needs a test-only address-space node and is a separate PR.
## 5. Full Galaxy live-service smoke test against the merged v2 stack — **IN PROGRESS (PRs 36 + 37)**
PR 36 shipped the prerequisites helper (`AvevaPrerequisites`) that probes
every dependency a live smoke test needs and produces actionable skip
messages.
PR 37 shipped the live-stack smoke test project structure:
`tests/Driver.Galaxy.Proxy.Tests/LiveStack/` with `LiveStackFixture` (connects
to the *already-running* `OtOpcUaGalaxyHost` Windows service via named pipe;
never spawns the Host process) and `LiveStackSmokeTests` covering:
- Fixture initializes successfully (IPC handshake succeeds end-to-end).
- Driver reports `DriverState.Healthy` post-handshake.
- `DiscoverAsync` returns at least one variable from the live Galaxy.
- `GetHostStatuses` reports at least one Platform/AppEngine host.
- `ReadAsync` on a discovered variable round-trips through
Proxy → Host pipe → MXAccess → back without a BadInternalError.
Shared secret + pipe name resolve from `OTOPCUA_GALAXY_SECRET` /
`OTOPCUA_GALAXY_PIPE` env vars, falling back to reading the service's
registry-stored Environment values (requires elevated test host).
**PR 40** added the write + subscribe facts targeting
`DelmiaReceiver_001.TestAttribute` (the writable Boolean UDA the dev Galaxy
ships under TestMachine_001) — write-then-read with a 5s scan-window poll +
restore-on-finally, and subscribe-then-write asserting both an initial-value
OnDataChange and a post-write OnDataChange. PR 39 added the elevated-shell
short-circuit so a developer running from an admin window gets an actionable
skip instead of `UnauthorizedAccessException`.
**Run the live tests** (from a NORMAL non-admin PowerShell):
```powershell
$env:OTOPCUA_GALAXY_SECRET = Get-Content C:\Users\dohertj2\Desktop\lmxopcua\.local\galaxy-host-secret.txt
cd C:\Users\dohertj2\Desktop\lmxopcua
dotnet test tests\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests --filter "FullyQualifiedName~LiveStackSmokeTests"
```
Expected: 7/7 pass against the running `OtOpcUaGalaxyHost` service.
**Remaining for #5 in production-grade form**:
- Confirm the suite passes from a non-elevated shell (operator action).
- Add similar facts for an alarm-source attribute once `TestMachine_001` (or
a sibling) carries a deployed alarm condition — the current dev Galaxy's
TestAttribute isn't alarm-flagged.
## 6. Second driver instance on the same server — **DONE (PR 32)**
`Server.Tests/MultipleDriverInstancesIntegrationTests.cs` registers two
drivers with distinct `DriverInstanceId`s on one `DriverHost`, spins up the
full OPC UA server, and asserts three behaviors: (1) each driver's namespace
URI (`urn:OtOpcUa:{id}`) resolves to a distinct index in the client's
NamespaceUris, (2) browsing one subtree returns that driver's folder and
does NOT leak the other driver's folder, (3) reads route to the correct
driver — the alpha instance returns 42 while beta returns 99, so a misroute
would surface at the assertion layer.
Deferred: the alarm-event multi-driver parity case (two drivers each raising
a `GalaxyAlarmEvent`, assert each condition lands on its owning instance's
condition node). Alarm tracking already has its own integration test
(`AlarmSubscription*`); the multi-driver alarm case would need a stub
`IAlarmSource` that's worth its own focused PR.
## 7. Host-status per-AppEngine granularity → Admin UI dashboard — **DONE (PRs 33 + 34)**
**PR 33** landed the data layer: `DriverHostStatus` entity + migration with
composite key `(NodeId, DriverInstanceId, HostName)` and two query-supporting
indexes (per-cluster drill-down on `NodeId`, stale-row detection on
`LastSeenUtc`).
**PR 34** wired the publisher + consumer. `HostStatusPublisher` is a
`BackgroundService` in the Server process that walks every registered
`IHostConnectivityProbe`-capable driver every 10s, calls
`GetHostStatuses()`, and upserts rows (`LastSeenUtc` advances each tick;
`State` + `StateChangedUtc` update on transitions). Admin UI `/hosts` page
groups by cluster, shows four summary cards (Hosts / Running / Stale /
Faulted), and flags rows whose `LastSeenUtc` is older than 30s as Stale so
operators see crashed Servers without waiting for a state change.
Deferred as follow-ups:
- Event-driven push (subscribe to `OnHostStatusChanged` per driver for
sub-heartbeat latency). Adds DriverHost lifecycle-event plumbing;
10s polling is fine for operator-scale use.
- Failure-count column — needs the publisher to track a transition history
per host, not just current-state.
- SignalR fan-out to the Admin page (currently the page polls the DB, not
a hub). The DB-polled version is fine at current cadence but a hub push
would eliminate the 10s race where a new row sits in the DB before the
Admin page notices.

451
docs/v2/mitsubishi.md Normal file
View File

@@ -0,0 +1,451 @@
# Mitsubishi Electric MELSEC — Modbus TCP quirks
Mitsubishi's MELSEC family speaks Modbus TCP through a patchwork of add-on modules
and built-in Ethernet ports, not a single unified stack. The module names are
confusingly similar (`QJ71MB91` is *serial* RTU, `QJ71MT91` is the TCP/IP module
[9]; `LJ71MT91` is the L-series equivalent; `RJ71EN71` is the iQ-R Ethernet module
with a MODBUS/TCP *slave* mode bolted on [8]; `FX3U-ENET`, `FX3U-ENET-P502`,
`FX3U-ENET-ADP`, `FX3GE` built-in, and `FX5U` built-in are all different code
paths) — and every one of the categories below has at least one trap a textbook
Modbus client gets wrong: hex-numbered X/Y devices colliding with decimal Modbus
addresses, a user-defined "device assignment" parameter block that means *no two
sites are identical*, CDAB-vs-ABCD word order driven by how the ladder built the
32-bit value, sub-spec FC16 caps on the older QJ71MT91, and an FX3U port-502
licensing split that makes `FX3U-ENET` and `FX3U-ENET-P502` different SKUs.
This document catalogues each quirk, cites primary sources, and names the
ModbusPal integration test we'd write for it (convention from
`docs/v2/modbus-test-plan.md`: `Mitsubishi_<model>_<behavior>`).
## Models and server/client capability
| Model | Family | Modbus TCP server | Modbus TCP client | Source |
|------------------------|----------|-------------------|-------------------|--------|
| `QJ71MT91` | MELSEC-Q | Yes (slave) | Yes (master) | [9] |
| `QJ71MB91` | MELSEC-Q | **Serial only** — RS-232/422/485 RTU, *not TCP* | — | [1][3] |
| `LJ71MT91` | MELSEC-L | Yes (slave) | Yes (master) | [10] |
| `RJ71EN71` / `RnENCPU` | MELSEC iQ-R | Yes (slave) | Yes (master) | [8] |
| `RJ71C24` / `RJ71C24-R2` | MELSEC iQ-R | RTU (serial) | RTU (serial) | [13] |
| iQ-R built-in Ethernet | CPU | Yes (slave) | Yes (master) | [7] |
| iQ-F `FX5U` built-in Ethernet | CPU | Yes, firmware ≥ 1.060 [11] | Yes | [7][11][12] |
| `FX3U-ENET` | FX3U bolt-on | Yes (slave), but **not on port 502** [5] | Yes | [4][5] |
| `FX3U-ENET-P502` | FX3U bolt-on | Yes (slave), port 502 enabled | Yes | [5] |
| `FX3U-ENET-ADP` | FX3U adapter | **No MODBUS** [5] | No MODBUS | [5] |
| `FX3GE` built-in | FX3GE CPU | No MODBUS (needs ENET module) [6] | No | [6] |
| `FX3G` + `FX3U-ENET` | FX3G | Yes via ENET module | Yes | [6] |
- A common integration mistake is to buy `FX3U-ENET-ADP` expecting MODBUS —
that adapter speaks only MC protocol / SLMP. Our driver should surface a clear
capability error, not "connection refused", when the operator's device tag
says `FX3U-ENET-ADP` [5].
- Older forum threads assert the FX5U is "client only" [12] — that was true on
firmware ≤ 1.040. Firmware 1.060 and later ship the parameter-driven MODBUS
TCP server built-in and need no function blocks [11].
## Modbus device assignment (the parameter block)
Unlike a DL260 where the CPU exposes a *fixed* V-memory-to-Modbus mapping, every
MELSEC MODBUS-TCP module exposes a **Modbus Device Assignment Parameter** block
that the engineer configures in GX Works2 / GX Configurator-MB / GX Works3.
Each of the four Modbus tables (Coil, Input, Input Register, Holding Register)
can be split into up to 16 independent "assignment" entries, each binding a
contiguous Modbus address range to a MELSEC device head (`M0`, `D0`, `X0`,
`Y0`, `B0`, `W0`, `SM0`, `SD0`, `R0`, etc.) and a point count [3][7][8][9].
- **There is no canonical "MELSEC Modbus mapping"**. Two sites running the same
QJ71MT91 module can expose completely different Modbus layouts. Our driver
must treat the mapping as site-data (config-file-driven), not as a device
profile constant.
- **Default values do exist** — both GX Configurator-MB (for Q/L series) and
GX Works3 (for iQ-R / iQ-F / FX5) ship a "dedicated pattern" default that is
applied when the engineer does not override the assignment. Per the FX5
MODBUS Communication manual (JY997D56101) and the QJ71MT91 manual, the FX5
dedicated default is [3][7][11]:
| Modbus table | Modbus range (0-based) | MELSEC device | Head |
|--------------------|------------------------|---------------|------|
| Coil (FC01/05/15) | 0 7679 | M | M0 |
| Coil | 8192 8959 | Y | Y0 |
| Input (FC02) | 0 7679 | M | M0 |
| Input | 8192 8959 | X | X0 |
| Input Register (FC04) | 0 6143 | D | D0 |
| Holding Register (FC03/06/16) | 0 6143 | D | D0 |
This matches the widely circulated "FC03 @ 0 = D0" convention that shows up
in Ubidots / Ignition / AdvancedHMI integration guides [6][12].
- **X/Y in the default mapping occupy a second, non-zero Modbus range** (8192+
on FX5; similar on Q/L/iQ-R). Driver users who expect "X0 = coil 0" will be
reading M0 instead. Document this clearly.
- **Assignment-range collisions silently disable the slave.** The QJ71MT91
manual states explicitly that if any two of assignments 1-16 duplicate the
head Modbus device number, the slave function is inactive with no clear
error — the module just won't respond [9]. The driver probe will look like a
simple timeout; the site engineer has to open GX Configurator-MB to diagnose.
Test names:
`Mitsubishi_FX5U_default_mapping_coil_0_is_M0`,
`Mitsubishi_FX5U_default_mapping_holding_0_is_D0`,
`Mitsubishi_QJ71MT91_duplicate_assignment_head_disables_slave`.
## X/Y addressing — hex on MELSEC, decimal on Modbus
**MELSEC X (input) and Y (output) device numbers are hexadecimal on Q / L /
iQ-R** and **octal** on FX / iQ-F (with a GX Works3 toggle) [14][15].
- On a Q CPU, `X20` means decimal **32**, not 20. On an FX5U in default (octal)
mode, `X20` means decimal **16**. GX Works3 exposes a project-level option to
display FX5U X/Y in hex to match Q/L/iQ-R convention — the same physical
input is then called `X10` [14].
- The Modbus Device Assignment Parameter block takes the *head device* as a
MELSEC-native number, which is interpreted in the CPU's native base
(hex for Q/L/iQ-R, octal for FX/iQ-F). After that, **Modbus offsets from
the head are plain decimal** — the module does not apply a second hex
conversion [3][9].
- Example (QJ71MT91 on a Q CPU): assignment "Coil 0 = X0, 512 points" exposes
physical `X0` through `X1FF` (hex) as coils 0-511. A client reading coil 32
gets the bit `X20` (hex) — i.e. the 33rd input, not the value at "input 20"
that the operator wrote on the wiring diagram in decimal.
- **Driver bug source**: if the operator's tag configuration says "read X20" and
the driver helpfully converts "20" to decimal 20 → coil offset 20, the
returned bit is actually `X14` (hex) — off by twelve. Our config layer must
preserve the MELSEC-native base that the site engineer sees in GX Works.
- Timers/counters (`T`, `C`, `ST`) are always decimal in MELSEC notation.
Internal relays (`M`, `B`, `L`), data registers (`D`, `W`, `R`, `ZR`),
and special relays/registers (`SM`, `SD`) also decimal. **Only `X` and `Y`
(and on Q/L/iQ-R, `B` link relays and `W` link registers) use hex**, and
the X/Y decision is itself family-dependent [14][15].
Test names:
`Mitsubishi_Q_X_address_is_hex_X20_equals_coil_offset_32`,
`Mitsubishi_FX5U_X_address_is_octal_X20_equals_coil_offset_16`,
`Mitsubishi_W_link_register_is_hex_W10_equals_holding_offset_16`.
## Word order for 32-bit values
MELSEC stores 32-bit ladder values (`DINT`, `DWORD`, `REAL` / single-precision
float) across **two consecutive D-registers, low word first** — i.e., `CDAB`
when viewed as a Modbus register pair [2][6].
```
D100 (low word) : 0xCC 0xDD (big-endian bytes within the word)
D101 (high word) : 0xAA 0xBB
```
A Modbus master reading D100/D101 as a `float` with default (ABCD) word order
gets garbage. Ignition's built-in Modbus driver notes Mitsubishi as a "CDAB
device" specifically for this reason [2].
- **Q / L / iQ-R / iQ-F all agree** — this is a CPU-level convention, not a
module choice. Both the QJ71MT91 manual and the FX5 MODBUS Communication
manual describe 32-bit access by "reading the lower 16 bits from the start
address and the upper 16 bits from start+1" [6][11].
- **Byte order within each register is big-endian** (Modbus standard). The
module does not byte-swap.
- **Configurable?** The MODBUS modules themselves do **not** expose a word-
order toggle; the behavior is fixed to how the CPU laid out the value in the
two D-registers. If the ladder programmer used an `SWAP` instruction or a
union-style assignment, the word order can be whatever they made it — but
for values produced by the standard `D→DBL` and `FLT`/`FLT2` instructions
it is always CDAB [2].
- **FX5U quirk**: the FX5 MODBUS Communication manual tells the programmer to
use the `SWAP` instruction *if* the remote Modbus peer requires
little-endian *byte* ordering (BADC) [11]. This is only relevant when the
FX5U is the Modbus *client*, but it confirms the FX5U's native wire layout
is big-endian-byte / little-endian-word (CDAB) on the server side too.
- **Rumoured exception**: a handful of MrPLC forum threads report iQ-R
RJ71EN71 firmware < 1.05 returning DWORDs in `ABCD` order when accessed via
the built-in Ethernet port's MODBUS slave [8]. _Unconfirmed_; treat as a
per-site test.
Test names:
`Mitsubishi_Float32_word_order_is_CDAB`,
`Mitsubishi_Int32_word_order_is_CDAB`,
`Mitsubishi_FX5U_SWAP_instruction_changes_byte_order_not_word_order`.
## BCD vs binary encoding
**MELSEC stores integer values in D-registers as plain binary two's-complement**,
not BCD [16]. This is the opposite of AutomationDirect DirectLOGIC, where
V-memory defaults to BCD and the ladder must explicitly request binary.
- A ladder `MOV K1234 D100` stores `0x04D2` (1234 decimal) in D100, not
`0x1234`. The Modbus master reads `0x04D2` and decodes it as an integer
directly — no BCD conversion needed [16].
- **Timer / counter current values** (`T0` current value, `C0` count) are
stored in binary as word devices on Q/L/iQ-R/iQ-F. The ladder preset
(`K...`) is also binary [16][17].
- **Timer / counter preset `K` operand in FX3U / earlier FX**: also binary when
loaded from a D-register or a `K` constant. The older A-series CPUs had BCD
presets on some timer types, but MELSEC-Q, L, iQ-R, iQ-F, and FX3U all use
binary presets by default [17].
- The FX3U programming manual dedicates `FNC 18 BCD` and `FNC 19 BIN` to
explicit conversion — their existence confirms that anything in D-registers
that came from a `BCD` instruction output is BCD, but nothing is BCD by
default [17].
- **7-segment display registers** are a common site-specific exception — many
ladders pack `BCD D100` into a D-register so the operator panel can drive
a display directly. Our driver should not assume; expose a per-tag
"encoding = binary | BCD" knob.
Test names:
`Mitsubishi_D_register_stores_binary_not_BCD`,
`Mitsubishi_FX3U_timer_current_value_is_binary`.
## Max registers per request
From the FX5 MODBUS Communication manual Chapter 11 [11]:
| FC | Name | FX5U (built-in) | QJ71MT91 | iQ-R (RJ71EN71 / built-in) | FX3U-ENET |
|----|----------------------------|-----------------|--------------|-----------------------------|-----------|
| 01 | Read Coils | 1-2000 | 1-2000 [9] | 1-2000 [8] | 1-2000 |
| 02 | Read Discrete Inputs | 1-2000 | 1-2000 | 1-2000 | 1-2000 |
| 03 | Read Holding Registers | **1-125** | 1-125 [9] | 1-125 [8] | 1-125 |
| 04 | Read Input Registers | 1-125 | 1-125 | 1-125 | 1-125 |
| 05 | Write Single Coil | 1 | 1 | 1 | 1 |
| 06 | Write Single Register | 1 | 1 | 1 | 1 |
| 0F | Write Multiple Coils | 1-1968 | 1-1968 | 1-1968 | 1-1968 |
| 10 | Write Multiple Registers | **1-123** | 1-123 | 1-123 | 1-123 |
| 16 | Mask Write Register | 1 | not supported | 1 | not supported |
| 17 | Read/Write Multiple Regs | R:1-125, W:1-121 | not supported | R:1-125, W:1-121 | not supported |
- **The FX5U / iQ-R native-port limits match the Modbus spec**: 125 for FC03/04,
123 for FC16 [11]. No sub-spec caps like DL260's 100-register ceiling.
- **QJ71MT91 does not support FC16 (0x16, Mask Write Register) or FC17
(0x17, Read/Write Multiple)** — requesting them returns exception `01`
Illegal Function [9]. FX5U and iQ-R *do* support both.
- **QJ71MT91 device size**: 64k points (65,536) for each of Coil / Input /
Input Register / Holding Register, plus up to 4086k points for Extended
File Register via a secondary assignment range [9].
- **FX3U-ENET / -P502 function code list is a strict subset** of the common
eight (FC01/02/03/04/05/06/0F/10). FC16 and FC17 not supported [4].
Test names:
`Mitsubishi_FX5U_FC03_126_registers_returns_IllegalDataValue`,
`Mitsubishi_FX5U_FC16_124_registers_returns_IllegalDataValue`,
`Mitsubishi_QJ71MT91_FC16_MaskWrite_returns_IllegalFunction`,
`Mitsubishi_QJ71MT91_FC23_ReadWrite_returns_IllegalFunction`.
## Exception codes
MELSEC MODBUS modules return **only the standard Modbus exception codes 01-04**;
no proprietary exception codes are exposed on the wire [8][9][11]. Module-
internal diagnostics (buffer-memory error codes like `7380H`) are logged but
not returned as Modbus exceptions.
| Code | Name | MELSEC trigger |
|------|----------------------|---------------------------------------------------------|
| 01 | Illegal Function | FC17 or FC16 on QJ71MT91/FX3U; FC08 (Diagnostics); FC43 |
| 02 | Illegal Data Address | Modbus address outside any assignment range |
| 03 | Illegal Data Value | Quantity out of per-FC range (see table above); odd coil-byte count |
| 04 | Server Device Failure | See below |
- **04 (Server Failure) triggers on MELSEC**:
- CPU in STOP or PAUSE during a write to an assignment whose "Access from
External Device" permission is set to "Disabled in STOP" [9][11].
*With the default "always enabled" setting the write succeeds in STOP
mode* — another common trap.
- CPU errors (parameter error, watchdog) during any access.
- Assignment points to a device range that is not configured (e.g. write
to `D16384` when CPU D-device size is 12288).
- **Write to a "System Area" device** (e.g., `SD` special registers that are
CPU-reserved read-only) returns `04`, not `02`, on QJ71MT91 and iQ-R — the
assignment is valid, the device exists, but the CPU rejects the write [8][9].
- **FX3U-ENET / -P502** returns `04` on any write attempt while the CPU is in
STOP, regardless of permission settings — the older firmware does not
implement the "Access from External Device" granularity that Q/L/iQ-R/iQ-F
expose [4].
- **No rumour of proprietary codes 05-0B** from MELSEC; operators sometimes
report "exception 0A" but those traces all came from a third-party gateway
sitting between the master and the MELSEC module.
Test names:
`Mitsubishi_QJ71MT91_STOP_mode_write_with_Disabled_permission_returns_ServerFailure`,
`Mitsubishi_QJ71MT91_STOP_mode_write_with_default_permission_succeeds`,
`Mitsubishi_SD_system_register_write_returns_ServerFailure`,
`Mitsubishi_FX3U_STOP_mode_write_always_returns_ServerFailure`.
## Connection behavior
Max simultaneous Modbus TCP clients, per module [7][8][9][11]:
| Model | Max TCP connections | Port 502 | Keepalive | Source |
|----------------------|---------------------|----------|-----------|--------|
| `QJ71MT91` | 16 (shared with master role) | Yes | No | [9] |
| `LJ71MT91` | 16 | Yes | No | [10] |
| iQ-R built-in / `RJ71EN71` | 16 | Yes | Configurable (KeepAlive = ON in parameter) | [8] |
| iQ-F `FX5U` built-in | 8 | Yes | Configurable | [7][11] |
| `FX3U-ENET` | 8 TCP, but **not port 502** | No (port < 1024 blocked) | No | [4][5] |
| `FX3U-ENET-P502` | 8, port 502 enabled | Yes | No | [5] |
- **QJ71MT91's 16 is total connections shared between slave-listen and
master-initiated sockets** [9]. A site that uses the same module as both
master to downstream VFDs and slave to upstream SCADA splits the 16 pool.
- **FX3U-ENET port-502 gotcha**: if the engineer loads a configuration with
port 502 into a non-P502 ENET module, GX Works shows the download as
successful; on next power cycle the module enters error state and the
MODBUS listener never starts. This is documented on third-party FX3G
integration guides [6].
- **CPU STOP → RUN transition**: does **not** drop Modbus connections on any
MELSEC family. Existing sockets stay open; outstanding requests during the
transition may see exception 04 for a few scans but then resume [8][9].
- **CPU reset (power cycle or `SM1255` forced reset)** drops all Modbus
connections and the module re-listens after typically 5-10 seconds.
- **Idle timeout**: QJ71MT91 and iQ-R have a per-connection "Alive-Check"
(idle timer) parameter, default 0 (disabled). If enabled, default 10 s
probe interval, 3 retries before close [8][9]. FX5U similar defaults.
- **Keep-alive (TCP-level)**: only iQ-R / iQ-F expose a TCP keep-alive option
(parameter "KeepAlive" in the Ethernet settings); QJ71MT91 and FX3U-ENET
do not — so NAT/firewall idle drops require driver-side pinging.
Test names:
`Mitsubishi_QJ71MT91_17th_connection_refused`,
`Mitsubishi_FX5U_9th_connection_refused`,
`Mitsubishi_STOP_to_RUN_transition_preserves_socket`,
`Mitsubishi_CPU_reset_closes_all_sockets`.
## Behavioral oddities
- **Transaction ID echo**: QJ71MT91 and iQ-R reliably echo the MBAP TxId on
every response across firmware revisions; no reports of TxId drops under
load [8][9]. FX3U-ENET has an older, less-tested TCP stack; at least one
MrPLC thread reports out-of-order TxId echoes under heavy polling on
firmware < 1.14 [4]. _Unconfirmed_ on current firmware.
- **Per-connection request serialization**: all MELSEC slaves serialize
requests within a single TCP connection — a new request is not processed
until the prior response has been sent. Pipelining multiple requests on one
socket causes the module to queue them in buffer memory and respond in
order, but **the queue depth is 1** on QJ71MT91 (a second in-flight request
is held on the TCP receive buffer, not queued) [9]. Driver should treat
Mitsubishi slaves as strictly single-flight per socket.
- **Partial-frame handling**: QJ71MT91 and iQ-R close the socket on malformed
MBAP length fields. FX5U resynchronises at the next valid MBAP header
within 100 ms but will emit an error to `SD` diagnostics [11]. Driver must
reconnect on half-close and replay.
- **FX3U UDP vs TCP**: `FX3U-ENET` supports both UDP and TCP MODBUS transports;
UDP is lossy and reorders under load. Default is TCP. Some legacy SCADA
configurations pinned the module to UDP for multicast discovery — do not
select UDP unless the site requires it [4].
- **Known firmware-revision variants**:
- QJ71MT91 ≤ firmware 10052000000 (year-month format): FC15 with coil
count that forces byte-count to an odd value silently truncates the
last coil. Fixed in later revisions [9]. _Operator-reported_.
- FX5U firmware < 1.060: no native MODBUS TCP server — only accessible via
a predefined-protocol function block hack. Firmware ≥ 1.060 ships
parameter-based server. Our capability probe should read `SD203`
(firmware version) and flag < 1.060 as unsupported for server mode [11][12].
- iQ-R RJ71EN71 early firmware: possible ABCD word order (rumoured,
unconfirmed) [8].
- **SD (special-register) reads during assignment-parameter load**: while
the CPU is loading a new MODBUS device assignment parameter (~1-2 s), the
slave returns exception 04 Server Failure on every request. Happens after
a parameter write from GX Configurator-MB [9].
- **iQ-R "Station-based block transfer" collision**: if the RJ71EN71 is also
running CC-Link IE Control on the same module, a MODBUS/TCP request that
arrives during a CCIE cyclic period is delayed to the next scan — visible
as jittery response time, not a failure [8].
Test names:
`Mitsubishi_QJ71MT91_single_flight_per_socket`,
`Mitsubishi_FX5U_malformed_MBAP_resync_within_100ms`,
`Mitsubishi_FX3U_TxId_preserved_across_burst`,
`Mitsubishi_FX5U_firmware_below_1_060_reports_no_server_mode`.
## Model-specific differences for test coverage
Summary of which quirks differ per model, so test-class naming can reflect them:
| Quirk | QJ71MT91 | LJ71MT91 | iQ-R (RJ71EN71 / built-in) | iQ-F (FX5U) | FX3U-ENET(-P502) |
|------------------------------------------|----------|----------|----------------------------|-------------|------------------|
| FC16 Mask-Write supported | No | No | Yes | Yes | No |
| FC17 Read/Write Multiple supported | No | No | Yes | Yes | No |
| Max connections | 16 | 16 | 16 | 8 | 8 |
| X/Y numbering base | hex | hex | hex | octal (default) | octal |
| 32-bit word order | CDAB | CDAB | CDAB (firmware-dependent rumour of ABCD) | CDAB | CDAB |
| Port 502 supported | Yes | Yes | Yes | Yes | P502 only |
| STOP-mode write permission configurable | Yes | Yes | Yes | Yes | No (always blocks) |
| TCP keep-alive parameter | No | No | Yes | Yes | No |
| Modbus device assignment — max entries | 16 | 16 | 16 | 16 | 8 |
| Server via parameter (no FB) | Yes | Yes | Yes | Yes (fw ≥ 1.060) | Yes |
- **Test file layout**: `Mitsubishi_QJ71MT91_*`, `Mitsubishi_LJ71MT91_*`,
`Mitsubishi_iQR_*`, `Mitsubishi_FX5U_*`, `Mitsubishi_FX3U_ENET_*`,
`Mitsubishi_FX3U_ENET_P502_*`. iQ-R built-in Ethernet and the RJ71EN71
behave identically for MODBUS/TCP slave purposes and can share a file
`Mitsubishi_iQR_*`.
- **Cross-model shared tests** (word order CDAB, binary not BCD, standard
exception codes, 125-register FC03 cap) can live in a single
`Mitsubishi_Common_*` fixture.
## References
1. Mitsubishi Electric, *MODBUS Interface Module User's Manual — QJ71MB91*
(SH-080578ENG), RS-232/422/485 MODBUS RTU serial module for MELSEC-Q —
https://dl.mitsubishielectric.com/dl/fa/document/manual/plc/sh080578eng/sh080578engk.pdf
2. Inductive Automation, *Ignition Modbus Driver — Mitsubishi Q / iQ-R word
order*, documents CDAB convention —
https://docs.inductiveautomation.com/docs/8.1/ignition-modules/opc-ua/drivers/modbus-v2
and forum discussion https://forum.inductiveautomation.com/t/modbus-tcp-device-word-byte-order/65984
3. Mitsubishi Electric, *Programmable Controller User's Manual QJ71MB91 MODBUS
Interface Module*, Chapter 7 "Parameter Setting" describing the Modbus
Device Assignment Parameter block (assignments 1-16, head-device
configuration) —
https://www.lcautomation.com/dbdocument/29156/QJ71MB91%20Users%20manual.pdf
4. Mitsubishi Electric, *FX3U-ENET User's Manual* (JY997D18101), Chapter on
MODBUS/TCP communication; function code support and connection limits —
https://dl.mitsubishielectric.com/dl/fa/document/manual/plc_fx/jy997d18101/jy997d18101h.pdf
5. Venus Automation, *Mitsubishi FX3U-ENET-P502 Module — Open Port 502 for
Modbus TCP/IP* —
https://venusautomation.com.au/mitsubishi-fx3u-enet-p502-module-open-port-502-for-modbus-tcp-ip/
and FX3U-ENET-ADP user manual (JY997D45801), which confirms the -ADP
variant does not support MODBUS —
https://dl.mitsubishielectric.com/dl/fa/document/manual/plc_fx/jy997d45801/jy997d45801h.pdf
6. XML Control / Ubidots integration notes, *FX3G Modbus* — port-502 trap,
D-register mapping default, word order reference —
https://sites.google.com/site/xmlcontrol/archive/fx3g-modbus
and https://ubidots.com/blog/mitsubishi-plc-as-modbus-tcp-server/
7. FA Support Me, *Modbus TCP on Built-in Ethernet port in iQ-F and iQ-R*
confirms 16-connection limit on iQ-R, 8 on iQ-F, parameter-driven
configuration via GX Works3 —
https://www.fasupportme.com/portal/en/kb/articles/modbus-tcp-on-build-in-ethernet-port-in-iq-f-and-iq-r-en
8. Mitsubishi Electric, *MELSEC iQ-R Ethernet User's Manual (Application)*
(SH-081259ENG) and *MELSEC iQ-RJ71EN71 User's Manual* Chapter on
"Communications Using Modbus/TCP" —
https://www.allied-automation.com/wp-content/uploads/2015/02/MITSUBISHI_manual_plc_iq-r_ethernet_users.pdf
and https://www.manualslib.com/manual/1533351/Mitsubishi-Electric-Melsec-Iq-Rj71en71.html?page=109
9. Mitsubishi Electric, *MODBUS/TCP Interface Module User's Manual — QJ71MT91*
(SH-080446ENG), exception codes page 248, device assignment parameter
pages 116-124, duplicate-assignment-disables-slave note —
https://dl.mitsubishielectric.com/dl/fa/document/manual/plc/sh080446eng/sh080446engj.pdf
10. Mitsubishi Electric, *MELSEC-L Network Features* — LJ71MT91 documented as
L-series equivalent of QJ71MT91 with identical MODBUS/TCP behavior —
https://us.mitsubishielectric.com/fa/en/products/cnt/programmable-controllers/melsec-l-series/network/features/
11. Mitsubishi Electric, *MELSEC iQ-F FX5 User's Manual (MODBUS Communication)*
(JY997D56101), Chapter 11 "Modbus/TCP Communication Specifications" —
function code max-quantity table, frame specification, device assignment
defaults —
https://dl.mitsubishielectric.com/dl/fa/document/manual/plcf/jy997d56101/jy997d56101h.pdf
12. MrPLC forum, *FX5U Modbus-TCP Server (Slave)*, firmware ≥ 1.60 enables
native server via parameter; earlier firmware required function block —
https://mrplc.com/forums/topic/31883-fx5u-modbus-tcp-server-slave/
and Industrial Monitor Direct's "FX5U MODBUS TCP Server Workaround"
article (reflects older firmware behavior) —
https://industrialmonitordirect.com/blogs/knowledgebase/mitsubishi-fx5u-modbus-tcp-server-configuration-workaround
13. Mitsubishi Electric, *MELSEC iQ-R MODBUS and MODBUS/TCP Reference Manual —
RJ71C24 / RJ71C24-R2* (BCN-P5999-1060) — RJ71C24 is serial RTU only,
not TCP —
https://dl.mitsubishielectric.com/dl/fa/document/manual/plc/bcn-p5999-1060/bcnp59991060b.pdf
14. HMS Industrial Networks, *eWON and Mitsubishi FX5U PLC* (KB-0264-00) —
documents that FX5U X/Y are octal in GX Works3 but hex when viewed as a
Q-series PLC through eWON; the project-level hex/octal toggle —
https://hmsnetworks.blob.core.windows.net/www/docs/librariesprovider10/downloads-monitored/manuals/knowledge-base/kb-0264-00-en-ewon-and-mitsubishi-fx5u-plc.pdf
15. Fernhill Software, *Mitsubishi Melsec PLC Data Address* — documents
hex-vs-octal device numbering split across MELSEC families —
https://www.fernhillsoftware.com/help/drivers/mitsubishi-melsec/data-address-format.html
16. Inductive Automation support, *Understanding Mitsubishi PLCs* — D registers
store signed 16-bit binary, not BCD; DINT combines two consecutive D
registers —
https://support.inductiveautomation.com/hc/en-us/articles/16517576753165-Understanding-Mitsubishi-PLCs
17. Mitsubishi Electric, *FXCPU Structured Programming Manual [Device &
Common]* (JY997D26001) — FNC 18 BCD and FNC 19 BIN explicit-conversion
instructions confirm binary-by-default storage —
https://dl.mitsubishielectric.com/dl/fa/document/manual/plc_fx/jy997d26001/jy997d26001l.pdf

121
docs/v2/modbus-test-plan.md Normal file
View File

@@ -0,0 +1,121 @@
# Modbus driver — test plan + device-quirk catalog
The Modbus TCP driver unit tests (PRs 2124) cover the protocol surface against an
in-memory fake transport. They validate the codec, state machine, and function-code
routing against a textbook Modbus server. That's necessary but not sufficient: real PLC
populations disagree with the spec in small, device-specific ways, and a driver that
passes textbook tests can still misbehave against actual equipment.
This doc is the harness-and-quirks playbook. The project it describes lives at
`tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests/` — scaffolded in PR 30 with
the simulator fixture, DL205 profile stub, and one write/read smoke test. Each
confirmed DL205 quirk lands in a follow-up PR as a named test in that project.
## Harness
**Chosen simulator: pymodbus 3.13.0** (`pip install 'pymodbus[simulator]==3.13.0'`).
Replaced ModbusPal in PR 43 — see `tests/.../Pymodbus/README.md` for the
trade-off rationale. Headline reasons:
- **Headless** pure-Python CLI; no Java GUI, runs cleanly on a CI runner.
- **Maintained** — current stable 3.13.0; ModbusPal 1.6b is abandoned.
- **All four standard tables** (HR, IR, coils, DI) configurable; ModbusPal
1.6b only exposed HR + coils.
- **Built-in actions** (`increment`, `random`, `timestamp`, `uptime`) +
optional custom-Python actions for declarative dynamic behaviors.
- **Per-register raw uint16 seeding** — encoding the DL205 string-byte-order
/ BCD / CDAB-float quirks stays explicit (the quirk math lives in the
`_quirk` JSON-comment fields next to each register).
- Pip-installable on Windows; sidesteps the privileged-port admin
requirement by defaulting to TCP **5020** instead of 502.
**Setup pattern**:
1. `pip install "pymodbus[simulator]==3.13.0"`.
2. Start the simulator with one of the in-repo profiles:
`tests\.../Pymodbus\serve.ps1 -Profile standard` (or `-Profile dl205`).
3. `dotnet test tests\ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests`
tests auto-skip when the endpoint is unreachable. Default endpoint is
`localhost:5020`; override via `MODBUS_SIM_ENDPOINT` for a real PLC on its
native port 502.
## Per-device quirk catalog
### AutomationDirect DL205 / DL260
First known target device family. **Full quirk catalog with primary-source citations
and per-quirk integration-test names lives at [`dl205.md`](dl205.md)** — that doc is
the reference; this section is the testing roadmap.
Confirmed quirks (priority order — top items are highest-impact for our driver
and ship first as PR 41+):
| Quirk | Driver impact | Integration-test name |
|---|---|---|
| **String packing**: 2 chars/register, **first char in low byte** (opposite of generic Modbus) | `ModbusDataType.String` decoder must be configurable per-device family — current code assumes high-byte-first | `DL205_String_low_byte_first_within_register` |
| **Word order CDAB** for Int32/UInt32/Float32 | Already configurable via `ModbusByteOrder.WordSwap`; default per device profile | `DL205_Int32_word_order_is_CDAB` |
| **BCD-as-default** numeric storage (only IEEE 754 when ladder uses `R` type) | New decoder mode — register reads as `0x1234` for ladder value `1234`, not as decimal `4660` | `DL205_BCD_register_decodes_as_hex_nibbles` |
| **FC16 capped at 100 registers** (below the spec's 123) | Bulk-write batching must cap per-device-family | `DL205_FC16_101_registers_returns_IllegalDataValue` |
| **FC03/04 capped at 128** (above the spec's 125) | Less impactful — clients that respect the spec's 125 stay safe | `DL205_FC03_129_registers_returns_IllegalDataValue` |
| **V-memory octal-to-decimal addressing** (V2000 octal → 0x0400 decimal) | New address-format helper in profile config so operators can write `V2000` instead of computing `1024` themselves | `DL205_Vmem_V2000_maps_to_PDU_0x0400` |
| **C-relay → coil 3072 / Y-output → coil 2048** offsets | Hard-coded constants in DL205 device profile | `DL205_C0_maps_to_coil_3072`, `DL205_Y0_maps_to_coil_2048` |
| **Register 0 is valid** (rejects-register-0 rumour was DL05/DL06 relative-mode artefact) | None — current default is safe | `DL205_FC03_register_0_returns_V0_contents` |
| **Max 4 simultaneous TCP clients** on H2-ECOM100 | Connect-time: handle TCP-accept failure with a clearer error message | `DL205_5th_TCP_connection_refused` |
| **No TCP keepalive** | Driver-side periodic-probe (already wired via `IHostConnectivityProbe`) | _Covered by existing `ModbusProbeTests`_ |
| **No mid-stream resync on malformed MBAP** | Already covered — single-flight + reconnect-on-error | _Covered by existing `ModbusDriverTests`_ |
| **Write-protect exception code: `02` newer / `04` older** | Translate either to `BadNotWritable` | `DL205_FC06_in_ProgramMode_returns_ServerFailure` |
_Operator-reported / unconfirmed_ — covered defensively in the driver but no
integration tests until reproduced on hardware:
- TxId drop under load (forum rumour; not reproduced).
- Pre-2004 firmware ABCD word order (every shipped DL205/DL260 since 2004 is CDAB).
### Future devices
One section per device class, same shape as DL205. Quirks that apply across
multiple devices (e.g., "all AB PLCs use CDAB") can be noted in the cross-device
patterns section below once we have enough data points.
## Cross-device patterns
Once multiple device catalogs accumulate, quirks that recur across two or more
vendors get promoted into driver defaults or opt-in options:
- _(empty — filled in as catalogs grow)_
## Test conventions
- **One named test per quirk.** `DL205_word_order_is_CDAB_for_Float32` is easier to
diagnose on failure than a generic `Float32_roundtrip`. The `DL205_` prefix makes
filtering by device class trivial (`--filter "DisplayName~DL205"`).
- **Skip with a clear SkipReason.** Follow the pattern from
`GalaxyRepositoryLiveSmokeTests`: check reachability in the fixture, capture
a `SkipReason` string, and have each test call `Assert.Skip(SkipReason)` when
it's set. Don't throw — skipped tests read cleanly in CI logs.
- **Use the real `ModbusTcpTransport`.** Integration tests exercise the wire
protocol end-to-end. The in-memory `FakeTransport` from the unit test suite is
deliberately not used here — its value is speed + determinism, which doesn't
help reproduce device-specific issues.
- **Don't depend on simulator state between tests.** Each test resets the
simulator's register bank or uses a unique address range. Avoid relying on
"previous test left value at register 10" setups that flake when tests run in
parallel or re-order. Either the test mutates the scratch ranges and restores
on finally, or it uses pymodbus's REST API to reset state between facts.
## Next concrete PRs
- **PR 30 — Integration test project + DL205 profile scaffold** — **DONE**.
Shipped `tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests` with
`ModbusSimulatorFixture` (TCP-probe, skips with a clear `SkipReason` when the
endpoint is unreachable), `DL205/DL205Profile.cs` (tag map stub), and
`DL205/DL205SmokeTests.cs` (write-then-read round-trip).
- **PR 41 — DL205 quirk catalog doc** — **DONE**. `docs/v2/dl205.md`
documents every DL205/DL260 Modbus divergence with primary-source citations.
- **PR 42 — ModbusPal `.xmpp` profiles** — **SUPERSEDED by PR 43**. Replaced
with pymodbus JSON because ModbusPal 1.6b is abandoned, GUI-only, and only
exposes 2 of the 4 standard tables.
- **PR 43 — pymodbus JSON profiles** — **DONE**. `Pymodbus/standard.json` +
`Pymodbus/dl205.json` + `Pymodbus/serve.ps1` runner. Both bind TCP 5020.
- **PR 44+**: one PR per confirmed DL205 quirk, landing the named test + any
driver-side adjustment (string byte order, BCD decoder, V-memory address
helper, FC16 cap-per-device-family) needed to pass it. Each quirk's value
is already pre-encoded in `Pymodbus/dl205.json`.

485
docs/v2/s7.md Normal file
View File

@@ -0,0 +1,485 @@
# Siemens SIMATIC S7 (S7-1200 / S7-1500 / S7-300 / S7-400 / ET 200SP) — Modbus TCP quirks
Siemens S7 PLCs do *not* speak Modbus TCP natively at the OS/firmware level. Every
S7 Modbus-TCP-server deployment is either (a) the **`MB_SERVER`** library block
running on the CPU's PROFINET port (S7-1200 / S7-1500 / CPU 1510SP-series
ET 200SP), or (b) the **`MODBUSCP`** function block running on a separate
communication processor (**CP 343-1 / CP 343-1 Lean** on S7-300, **CP 443-1** on
S7-400), or (c) the **`MODBUSPN`** block on an S7-1500 PN port via a licensed
library. That means the quirks a Modbus client has to cope with are as much
"this is how the user's PLC programmer wired the library block up" as "this is
how the firmware behaves" — the byte-order and coil-mapping rules aren't
hard-wired into silicon like they are on a DL260. This document catalogues the
behaviours a driver has to handle across the supported model/CP variants, cites
primary sources, and names the ModbusPal integration test we'd write for each
(convention from `docs/v2/modbus-test-plan.md`: `S7_<model>_<behavior>`).
## Model / CP Capability Matrix
| PLC family | Modbus TCP server mechanism | Modbus TCP client mechanism | License required? | Typical port 502 source |
|---------------------|------------------------------------|------------------------------------|-----------------------|-----------------------------------------------------------|
| S7-1200 (V4.0+) | `MB_SERVER` on integrated PN port | `MB_CLIENT` | No (in TIA Portal) | CPU's onboard Ethernet [1][2] |
| S7-1500 (all) | `MB_SERVER` on integrated PN port | `MB_CLIENT` | No (in TIA Portal) | CPU's onboard Ethernet [1][3] |
| S7-1500 + CP 1543-1 | `MB_SERVER` on CP's IP | `MB_CLIENT` | No | Separate CP IP address [1] |
| ET 200SP CPU (1510SP, 1512SP) | `MB_SERVER` on PN port | `MB_CLIENT` | No | CPU's onboard Ethernet [3] |
| S7-300 + CP 343-1 / CP 343-1 Lean | `MODBUSCP` (FB `MODBUSCP`, instance DB per connection) | Same FB, client mode | **Yes — 2XV9450-1MB00** per CP | CP's Ethernet port [4][5] |
| S7-400 + CP 443-1 | `MODBUSCP` | `MODBUSCP` client mode | **Yes — 2XV9450-1MB00** per CP | CP's Ethernet port [4] |
| S7-400H + CP 443-1 (redundant H) | `MODBUSCP_REDUNDANT` / paired FBs | Not typical | Yes | Paired CPs in H-system [6] |
| S7-300 / S7-400 CPU PN (e.g. CPU 315-2 PN/DP) | `MODBUSPN` library | `MODBUSPN` client mode | **Yes** — Modbus-TCP PN CPU lib | CPU's PN port [7] |
| "CP 343-1 Lean" | **Server only** (no client mode supported by Lean) | — | Yes, but with restrictions | CP's Ethernet port [4][5] |
- **CP 343-1 Lean is server-only.** It can host `MODBUSCP` in server mode only;
client calls return an immediate error. A surprising number of "Lean + client
doesn't work" forum posts trace back to this [5].
- **Pure OPC UA / PROFINET CPs (CP 1542SP-1, CP 1543-1)** support Modbus TCP on
S7-1500 via the same `MB_SERVER`/`MB_CLIENT` instructions by passing the
CP's `hw_identifier`. There is no separate "Modbus CP" license needed on
S7-1500, unlike S7-300/400 [1].
- **No S7 Modbus server supports function codes 20/21 (file records),
22 (mask write), 23 (read-write multiple), or 43 (device identification).**
Sending any of these returns exception `01` (Illegal Function) on every S7
variant [1][4]. Our driver must not negotiate FC23 as a "bulk-read optimization"
when the profile is S7.
Test names:
`S7_1200_MBSERVER_Loads_OB1_Cyclic`,
`S7_CP343_Lean_Client_Mode_Rejected`,
`S7_All_FC23_Returns_IllegalFunction`.
## Address / DB Mapping
S7 Modbus servers **do not auto-expose PLC memory** — the PLC programmer has to
wire one area per Modbus table to a DB or process-image region. This is the
single biggest difference vs. DL205/Modicon/etc., where the memory map is
fixed at the factory. Our driver must therefore be tolerant of "the same
`40001` means completely different things on two S7-1200s on the same site."
### S7-1200 / S7-1500 `MB_SERVER`
The `MB_SERVER` instance exposes four Modbus tables to each connected client;
each table's backing storage is a per-block parameter [1][8]:
| Modbus table | FCs | Backing parameter | Default / typical backing |
|---------------------|-------------|-----------------------------|-----------------------------|
| Coils (0x) | FC01, FC05, FC15 | *implicit* — Q process image | `%Q0.0``%Q1023.7` (→ coil addresses 08191) [1][9] |
| Discrete Inputs (1x)| FC02 | *implicit* — I process image | `%I0.0``%I1023.7` (→ discrete addresses 08191) [1][9] |
| Input Registers (3x)| FC04 | *implicit* — M memory or DB (version-dependent) | Some firmware routes FC04 through the same MB_HOLD_REG buffer [1][8] |
| Holding Registers (4x)| FC03, FC06, FC16 | `MB_HOLD_REG` pointer | User DB (e.g. `DB10.DBW0`) or `%MW` area [1][2][8] |
- **`MB_HOLD_REG` is a pointer (VARIANT / ANY) into a user-defined DB** whose
first byte is holding-register 0 (`40001` in 1-based Modicon form). Byte
offset 2 is register 1, byte offset 4 is register 2, etc. [1][2].
- **The DB *must* have "Optimized block access" UNCHECKED.** Optimized DBs let
the compiler reorder fields for alignment; Modbus requires fixed byte
offsets. With optimized access on, the compiler accepts the project but
`MB_SERVER` returns STATUS `0x8383` (misaligned access) or silently reads
zeros [8][10][11]. This is the #1 support-forum complaint.
- **FC01/FC02/FC05/FC15 hit the Q and I process images directly — not the
`MB_HOLD_REG` DB.** Coil address 0 = `%Q0.0`, coil 1 = `%Q0.1`, coil 8 =
`%Q1.0`. The S7-1200 system manual publishes this mapping as `00001 → Q0.0`
through `09999 → Q1023.7` and `10001 → I0.0` through `19999 → I1023.7` in
1-based form; on the wire (0-based) that's coils 0-8191 and discrete inputs
0-8191 [9].
- **`%M` markers are NOT automatically exposed.** To expose `%M` over Modbus
the programmer must either (a) copy `%M` to the `MB_HOLD_REG` DB each scan,
or (b) define an Array\[0..n\] of Bool inside that DB and copy bits in/out
of `%M`. Siemens has no "MB_COIL_REG" parameter analogous to
`MB_HOLD_REG` — this confuses users migrating from Schneider [9][12].
- **Bit ordering within a Modbus holding register sourced from an `Array of
Bool`**: S7 stores bool\[0\] at `DBX0.0` which is bit 0 of byte 0 which is
the **low byte, low bit** of Modbus register `40001`. A naive client that
reads register `40001` and masks `0x0001` gets bool\[0\]. A client that
masks `0x8000` gets bool\[15\] because the high byte of the Modbus register
is the *second* byte of the DB. Siemens programmers routinely get this
wrong in the DB-via-DBX form; `Array[0..n] of Bool` is the recommended
layout because it aligns naturally [12][13].
### S7-300/400 + CP 343-1 / CP 443-1 `MODBUSCP`
Different paradigm: per-connection **parameter DB** (template
`MODBUS_PARAM_CP`) declares a table of up to 8 register-area mappings. Each
mapping is a tuple `(data_type, DB#, start_offset, length)` where `data_type`
picks the Modbus table [4]:
- `B#16#1` = Coils
- `B#16#2` = Discrete Inputs
- `B#16#3` = Holding Registers
- `B#16#4` = Input Registers
The `holding_register_start` and analogous `coils_start` parameters declare
**which Modbus address range** the CP will serve, and the DB pointers say
where in S7 memory that range lives [4][14]. Unlike `MB_SERVER`, the CP does
not reach into `%Q`/`%I` directly — *everything* goes through a DB. If an
address outside the declared ranges is requested, the CP returns exception
`02` (Illegal Data Address) [4].
Test names:
`S7_1200_FC03_Reg0_Reads_DB10_DBW0`,
`S7_1200_Optimized_DB_Returns_0x8383_MisalignedAccess`,
`S7_1200_FC01_Coil0_Reads_Q0_0`,
`S7_CP343_FC03_Outside_ParamBlock_Range_Returns_IllegalDataAddress`.
## Data Types and Byte Order
Siemens CPUs store scalars **big-endian** internally ("Motorola format"), which
is the same byte order Modbus specifies inside each register. So for 16-bit
values (`Int`, `Word`, `UInt`) the on-the-wire layout is straightforward
`AB` — high byte of the PLC value in the high byte of the Modbus register
[15][16]. No byte-swap trap for 16-bit types.
The trap is 32-bit types (`DInt`, `DWord`, `Real`). Here's what actually
happens across the S7 family:
### S7-1200 / S7-1500 `MB_SERVER`
- **The backing DB stores 32-bit values in big-endian byte order, high word
first** — i.e. `ABCD` when viewed as two consecutive Modbus registers. A
`Real` at `DB10.DBD0` with value `0x12345678` reads over Modbus as
register 0 = `0x1234`, register 1 = `0x5678` [15][16][17].
- **This is `ABCD`, *not* `CDAB`.** Clients that hard-code CDAB (common default
for meters and VFDs) will get wildly wrong floats. Configure the S7 profile
with `WordOrder = ABCD` (aka "big-endian word + big-endian byte" aka
"high-word first") [15][17].
- **`MB_SERVER` does not swap.** It's a direct memcpy from the DB bytes to
the Modbus payload. Whatever byte order the ladder programmer stored into
the DB is what the client receives [17]. This means a programmer who used
`MOVE_BLK` from two separate `Word`s into `DBD` with the "wrong" order can
produce `CDAB` without realising.
- **`Real` is IEEE 754 single-precision** — unambiguous, no BCD trap like on
DL series [15].
- **Strings**: S7 `String[n]` has a 2-byte header (max length, current length)
*before* the character bytes. A client reading a string over Modbus gets
the header in the first register and then the characters two-per-register
in high-byte-first order. `WString` is UTF-16 and the header is 4 bytes
[18]. Our driver's string decoder must expose the "skip header" option for
S7 profile.
### S7-300/400 `MODBUSCP` (CP 343-1 / CP 443-1)
- The CP writes the exact DB bytes onto the wire — again `ABCD` if the DB
stores `DInt`/`Real` in native Siemens order [4].
- **`MODBUSCP` has no `data_type` byte-swap knob.** (The `data_type` parameter
names the Modbus table, not the byte order — see the Address Mapping
section.) If the other end of the link expects `CDAB`, the programmer has
to swap words in ladder before writing the DB [4][14].
### Operator-reported oddity
- Some S7 drivers (Kepware's "Siemens TCP/IP Ethernet" driver, Ignition's
"Siemens S7" driver) expose a per-tag `Float Byte Order` with options
`ABCD`/`CDAB`/`BADC`/`DCBA` because end-users have encountered every
permutation in the field — not because the PLC natively swaps, but because
ladder programmers have historically stored floats every which way [19].
Our S7 Modbus profile should default to `ABCD` but expose a per-tag
override.
- **Unconfirmed rumour**: that S7-1500 firmware V2.0+ reverses float byte
order for `MB_CLIENT` only. Not reproduced; the Siemens forum thread that
launched it was a user error (the remote server was the swapper, not the
S7) [20]. Treat as false until proven.
Test names:
`S7_1200_Real_WordOrder_ABCD_Default`,
`S7_1200_DInt_HighWord_First_At_DBD0`,
`S7_1200_String_Header_First_Two_Bytes`,
`S7_CP343_No_Internal_ByteSwap`.
## Coil / Discrete Input Mapping
On `MB_SERVER` the mapping from coil address → S7 bit is fixed at the
process-image level [1][9][12]:
| Modbus coil / discrete input addr | S7 address | Notes |
|-----------------------------------|---------------|-------------------------------------|
| Coil 0 (FC01/05/15) | `%Q0.0` | bit 0 of output byte 0 |
| Coil 7 | `%Q0.7` | bit 7 of output byte 0 |
| Coil 8 | `%Q1.0` | bit 0 of output byte 1 |
| Coil 8191 (max) | `%Q1023.7` | highest exposed output bit |
| Discrete input 0 (FC02) | `%I0.0` | bit 0 of input byte 0 |
| Discrete input 8191 | `%I1023.7` | highest exposed input bit |
Formulas:
```
coil_addr = byte_index * 8 + bit_index (e.g. %Q5.3 → coil 43)
discr_addr = byte_index * 8 + bit_index (e.g. %I10.2 → disc 82)
```
- **1-based Modicon form adds 1:** coil 0 (wire) = `00001` (Modicon), etc.
Our driver sends the 0-based PDU form, so `%Q0.0` writes to wire address 0.
- **Writing FC05/FC15 to `%Q` is accepted even while the CPU is in STOP** —
the PLC's process image doesn't care about the user program state. But the
output won't propagate to the physical module until RUN (see STOP section
below) [1][21].
- **`%M` markers require a DB-backed `Array of Bool`** as described in the
Address Mapping section. Our driver can't assume "coil N = MN.0" like it
can on Modicon — on S7 it's always Q/I unless the programmer built a
mapping DB [12].
- **Bit-inside-holding-register**: for `Array of Bool` inside the
`MB_HOLD_REG` DB, bool[0] is bit 0 of byte 0 → **low byte, low bit** of
Modbus register 40001. Most third-party clients probe this in the low
byte, so the common case works; the less-common case (bool[8]) is bit 0 of
byte 1 → **high byte, low bit** of Modbus register 40001. Clients that
test only bool[0] will pass and miss the mis-alignment on bool[8] [12][13].
Test names:
`S7_1200_Coil_0_Is_Q0_0`,
`S7_1200_Coil_8_Is_Q1_0`,
`S7_1200_Discrete_Input_7_Is_I0_7`,
`S7_1200_Coil_Write_In_STOP_Accepted_But_Output_Frozen`.
## Function Code Support & Max Registers Per Request
| FC | Name | S7-1200 / S7-1500 MB_SERVER | CP 343-1 / CP 443-1 MODBUSCP | Max qty per request |
|----|----------------------------|-----------------------------|------------------------------|--------------------------------|
| 01 | Read Coils | Yes | Yes | 2000 bits (spec) |
| 02 | Read Discrete Inputs | Yes | Yes | 2000 bits (spec) |
| 03 | Read Holding Registers | Yes | Yes | **125** (spec max) |
| 04 | Read Input Registers | Yes | Yes | **125** |
| 05 | Write Single Coil | Yes | Yes | 1 |
| 06 | Write Single Register | Yes | Yes | 1 |
| 15 | Write Multiple Coils | Yes | Yes | 1968 bits (spec) — *see note* |
| 16 | Write Multiple Registers | Yes | Yes | **123** (spec max for TCP) |
| 07 | Read Exception Status | No (RTU only) | No | — |
| 17 | Report Server ID | No | No | — |
| 20/21 | Read/Write File Record | No | No | — |
| 22 | Mask Write Register | No | No | — |
| 23 | Read/Write Multiple | No | No | — |
| 43 | Read Device Identification | No | No | — |
- **S7-1200/1500 honour the full spec maxima** for FC03/04 (125) and FC16
(123) [1][22]. No sub-spec cap like DL260's 100-register FC16 limit.
- **FC15 (Write Multiple Coils) on `MB_SERVER`** writes into `%Q`, which maxes
out at 1024 bytes = 8192 bits, but the spec's 1968-bit per-request limit
caps any single call first [1][9].
- **`MB_HOLD_REG` buffer size is bounded by DB size** — max DB size on
S7-1200 is 64 KB, on S7-1500 is much larger (several MB depending on CPU),
so the practical `MB_HOLD_REG` limit is 32767 16-bit registers on S7-1200
and effectively unbounded on S7-1500 [22][23]. The *per-request* limit is
still 125.
- **Read past the end of `MB_HOLD_REG`** returns exception `02` (Illegal
Data Address) at the start of the overflow register, not a partial read
[1][8].
- **Request larger than spec max** (e.g. FC03 quantity 126) returns exception
`03` (Illegal Data Value). Verified on S7-1200 V4.2 [1][24].
- **CP 343-1 `MODBUSCP` per-request maxima are spec** (125/125/123/1968/2000),
matching the standard [4]. The CP's `MODBUS_PARAM_CP` caps the total
*exposed* range, not the per-call quantity.
Test names:
`S7_1200_FC03_126_Registers_Returns_IllegalDataValue`,
`S7_1200_FC16_124_Registers_Returns_IllegalDataValue`,
`S7_1200_FC03_Past_MB_HOLD_REG_End_Returns_IllegalDataAddress`,
`S7_1200_FC17_ReportServerId_Returns_IllegalFunction`.
## Exception Codes
S7 Modbus servers return only the four standard exception codes [1][4]:
| Code | Name | Triggered by |
|------|-----------------------|----------------------------------------------------------------------|
| 01 | Illegal Function | FC not in the supported list (17, 20-23, 43, any undefined FC) |
| 02 | Illegal Data Address | Register outside `MB_HOLD_REG` / outside `MODBUSCP` param-block range |
| 03 | Illegal Data Value | Quantity exceeds spec (FC03/04 > 125, FC16 > 123, FC01/02 > 2000, FC15 > 1968) |
| 04 | Server Failure | Runtime error inside MB_SERVER (DB access fault, corrupt DB header, MB_SERVER disabled mid-request) [1][24] |
- **No proprietary exception codes (05/06/0A/0B) are used** on any S7
Modbus server [1][4]. Our driver's status-code mapper can treat these as
"never observed" on the S7 profile.
- **CPU in STOP → `MB_SERVER` keeps running if it's in OB1 of the firmware's
communication task, but OB1 itself is not scanned.** In practice:
- Holding-register *reads* (FC03) continue to return the last DB values
frozen at the moment the CPU entered STOP. The `MB_SERVER` block is in
OB1 so it isn't re-invoked; however the TCP stack keeps the socket open
and returns cached data on subsequent polls [1][21]. **Unconfirmed**
whether this is cached in the CP or in the CPU's communication processor;
behaviour varies between firmware 4.0 and 4.5 [21].
- Holding-register *writes* (FC06/FC16) during STOP return exception `04`
(Server Failure) on S7-1200 V4.2+, and return success-but-discarded on
older firmware [1][24]. Our driver should treat FC06/FC16 during STOP as
non-deterministic and not rely on the response code.
- Coil *writes* (FC05/FC15) to `%Q` are *accepted* by the process image
during STOP, but the physical output freezes at its last RUN-mode value
(or the configured STOP-mode substitute value) until RUN resumes [1][21].
- **Writing a read-only address via FC06/FC16**: returns `02` (Illegal Data
Address), not `04`. S7 does not have "write-protected" holding registers —
the programmer either exposes a DB for read-write or doesn't expose it at
all [1][12].
STATUS codes (returned in the `STATUS` output of the block, not on the wire):
- `0x0000` — no error.
- `0x7001` — first call, connection being established.
- `0x7002` — subsequent cyclic call, connection in progress.
- `0x8383` — data access error (optimized DB, DB too small, or type mismatch)
[10][24].
- `0x8188` — invalid parameter combination (e.g. MB_MODE out of range) [24].
- `0x80C8` — mismatched UNIT_ID between MB_CLIENT and `MB_SERVER` [25].
Test names:
`S7_1200_FC03_Outside_HoldReg_Returns_IllegalDataAddress`,
`S7_1200_FC16_In_STOP_Returns_ServerFailure`,
`S7_1200_FC03_In_STOP_Returns_Cached_Values`,
`S7_1200_No_Proprietary_ExceptionCodes_0x05_0x06_0x0A_0x0B`.
## Connection Behavior
- **Max simultaneous Modbus TCP connections**:
- **S7-1200**: shares a pool of 8 open-communication connections across
all TCP/UDP/Modbus use. On a CPU 1211C you get 8 total; on 1215C/1217C
still 8 shared among PG/HMI/OUC/Modbus. Each `MB_SERVER` instance
reserves one. A typical site with a PG + 1 HMI + 2 Modbus clients uses
4 of the 8 [1][26].
- **S7-1500**: up to **8 concurrent Modbus TCP server connections** per
`MB_SERVER` port, across multiple `MB_SERVER` instance DBs each with a
unique port. Total open-communication resources depend on CPU (e.g.
CPU 1515-2 PN supports 128 OUC connections total; Modbus is a subset)
[1][27].
- **CP 343-1 Lean**: up to **8** simultaneous Modbus TCP connections on
port 502 [4][5]. Exceeding this refuses at TCP accept.
- **CP 443-1 Advanced**: up to **16** simultaneous Modbus TCP connections
[4].
- **Multi-connection model on `MB_SERVER`**: one instance DB per connection.
An instance DB listening on port 502 serves exactly one connection at a
time; to serve N simultaneous clients you need N instance DBs each with a
unique port (502/503/504...). **This is a real trap** — most users expect
port 502 to multiplex [27][28]. Our driver must not assume port 502 is the
only listener.
- **Keep-alive**: S7-1500's TCP stack does send TCP keepalives (default
every ~30 s) but the interval is not exposed as a configurable. S7-1200 is
the same. CP 343-1 keepalives are configured via HW Config → CP properties
→ Options → "Send keepalive" (default **off** on older firmware, default
**on** on firmware V3.0+) [1][29]. Driver-side keepalive is still
advisable for S7-300/CP 343-1 on old firmware.
- **Idle-timeout close**: `MB_SERVER` does *not* close idle sockets on its
own. However, the TCP stack on S7-1500 will close a socket that fails
three consecutive keepalive probes (~2 minutes). Forum reports describe
`MB_SERVER` connections "dying overnight" on S7-1500 when an HMI stops
polling — the fix is to enable driver-side periodic reads or driver-side
TCP keepalive [29][30].
- **Reconnect after power cycle**: MB_SERVER starts listening ~1-2 seconds
after the CPU reaches RUN. If the client reconnects during STARTUP OB
(OB100), the connection is refused until OB1 runs the block at least once.
Our driver should back off and retry on `ECONNREFUSED` for the first 5
seconds after a power-cycle detection [1][24].
- **Unit Identifier**: `MB_SERVER` accepts **any** Unit ID by default — there
is no configurable filter; the PLC ignores the Unit ID field entirely.
`MB_CLIENT` defaults to Unit ID = 255 as "ignore" [25][31]. Some
third-party Modbus-TCP gateways *require* a specific Unit ID; sending
anything to S7 is safe. **CP 343-1 `MODBUSCP`** also accepts any Unit ID
in server mode, but the parameter DB exposes a `single_write` / `unit_id`
field on newer firmware to allow filtering [4].
Test names:
`S7_1200_9th_TCP_Connection_Refused_On_8_Conn_Pool`,
`S7_1500_Port_503_Required_For_Second_Instance`,
`S7_1200_Reconnect_After_Power_Cycle_Succeeds_Within_5s`,
`S7_1200_Unit_ID_Ignored_Any_Accepted`.
## Behavioral Oddities
- **Transaction ID echo** is reliable on all S7 variants. `MB_SERVER` copies
the MBAP TxId verbatim. No known firmware that drops TxId under load [1][31].
- **Request serialization**: a single `MB_SERVER` instance serializes
requests from its one connected client — the block processes one PDU per
call and calls happen once per OB1 scan. OB1 scan time of 5-50 ms puts an
upper bound on throughput at ~20-200 requests/sec per connection [1][30].
Multiple `MB_SERVER` instances (one per port) run in parallel because OB1
calls them sequentially within the same scan.
- **OB1 scan coupling**: `MB_SERVER` must be called cyclically from OB1 (or
another cyclic OB). If the programmer puts it in a conditional branch
that doesn't fire every scan, requests time out. The STATUS `0x7002`
"in progress" is *expected* between calls, not an error [1][24].
- **Optimized DB backing `MB_HOLD_REG`** — already covered in Address
Mapping; STATUS becomes `0x8383`. This is the most common deployment bug
on S7-1500 projects migrated from older S7-1200 examples [10][11].
- **CPU STOP behaviour** — covered in Exception Codes section. The short
version: reads may return stale data without error; writes return exception
04 on modern firmware.
- **Partial-frame disconnect**: S7-1200/1500 TCP stack closes the socket on
any MBAP header where the `Length` field doesn't match the PDU length.
Driver must detect half-close and reconnect [1][29].
- **MBAP `Protocol ID` must be 0**. Any non-zero value causes the CP/CPU to
drop the frame silently (no response, no RST) on S7-1500 firmware V2.0
through V2.9; firmware V3.0+ sends an RST [1][30]. *Unconfirmed* whether
V3.1 still sends RST or returns to silent drop.
- **FC01/FC02 access outside `%Q`/`%I` range**: on S7-1200, requesting
coil address 8192 (= `%Q1024.0`) returns exception `02` (Illegal Data
Address) [1][9]. The 8192-bit hard cap is a process-image size limit on
the CPU, not a Modbus protocol limit.
- **`MB_CLIENT` UNIT_ID mismatch with remote `MB_SERVER`** produces STATUS
`0x80C8` on the client side, and the server silently discards the frame
(no response on the wire) [25]. This matters for Modbus-TCP-to-RTU
gateway scenarios where the Unit ID picks the RTU slave.
- **Non-IEEE REAL / BCD**: S7 does *not* use BCD like DirectLOGIC. `Real` is
always IEEE 754 single-precision. `LReal` (8-byte double) occupies 4
Modbus registers in `ABCDEFGH` order (big-endian byte, big-endian word)
[15][18].
- **`MODBUSCP` single-write** on CP 343-1: a parameter `single_write` in the
param DB controls whether FC06 on a register in the "holding register"
area triggers a callback to the user program vs. updates the DB directly.
Default is direct update. If a ladder programmer enables the callback
without implementing the callback OB, FC06 writes hang for 5 seconds then
return exception `04` [4].
Test names:
`S7_1200_TxId_Preserved_Across_Burst_Of_50_Requests`,
`S7_1200_MBSERVER_Throughput_Capped_By_OB1_Scan`,
`S7_1200_MBAP_ProtocolID_NonZero_Frame_Dropped`,
`S7_1200_Partial_MBAP_Causes_Half_Close`.
## Model-specific Differences Worth Separate Test Coverage
- **S7-1200 V4.0 vs V4.4+**: Older firmware does not support `WString` over
`MB_HOLD_REG` and returns `0x8383` if the DB contains one [18][24]. Test
both firmware bands separately.
- **S7-1500 vs S7-1200**: S7-1500 supports multiple `MB_SERVER` instances on
the *same* CPU with different ports cleanly; S7-1200 can too but its
8-connection pool is shared tighter [1][27]. Throughput per-connection is
~5× faster on S7-1500 because the comms task runs on a dedicated core.
- **S7-300 + CP 343-1 vs S7-1200/1500**: parameter-block mapping (not
`MB_HOLD_REG` pointer), per-connection license, no `%Q`/`%I` direct
access for coils (everything goes through a DB), different STATUS codes
(`DONE`/`ERROR`/`STATUS` word pairs vs. the single STATUS word) [4][14].
Driver-side it's a different profile.
- **CP 343-1 Lean vs CP 343-1 Advanced**: Lean is server-only; Advanced is
client + server. Lean's max connections = 8; Advanced = 16 [4][5].
- **CP 443-1 in S7-400H**: uses `MODBUSCP_REDUNDANT` which presents two
Ethernet endpoints that fail over. Our driver's redundancy support should
recognize the S7-400H profile as "two IP addresses, same server state,
advertise via `ServerUriArray`" [6].
- **ET 200SP CPU (1510SP / 1512SP)**: behaves as S7-1500 from `MB_SERVER`
perspective. No known deltas [3].
## References
1. Siemens Industry Online Support, *Modbus/TCP Communication between SIMATIC S7-1500 / S7-1200 and Modbus/TCP Controllers with Instructions `MB_CLIENT` and `MB_SERVER`*, Entry ID 102020340, V6 (Feb 2021). https://cache.industry.siemens.com/dl/files/340/102020340/att_118119/v6/net_modbus_tcp_s7-1500_s7-1200_en.pdf
2. Siemens TIA Portal Online Docs, *MB_SERVER instruction*. https://docs.tia.siemens.cloud/r/simatic_s7_1200_manual_collection_eses_20/communication-processor-and-modbus-tcp/modbus-communication/modbus-tcp/modbus-tcp-instructions/mb_server-communicate-using-profinet-as-modbus-tcp-server-instruction
3. Siemens, *SIMATIC S7-1500 Communication Function Manual* (covers ET 200SP CPU). http://public.eandm.com/Public_Docs/s71500_communication_function_manual_en-US_en-US.pdf
4. Siemens Industry Online Support, *SIMATIC Modbus/TCP communication using CP 343-1 and CP 443-1 — Programming Manual*, Entry ID 103447617. https://cache.industry.siemens.com/dl/files/617/103447617/att_106971/v1/simatic_modbus_tcp_cp_en-US_en-US.pdf
5. Siemens Industry Online Support FAQ *"Which technical data applies for the SIMATIC Modbus/TCP software for CP 343-1 / CP 443-1?"*, Entry ID 104946406. https://www.industry-mobile-support.siemens-info.com/en/article/detail/104946406
6. Siemens Industry Online Support, *Redundant Modbus/TCP communication via CP 443-1 in S7-400H systems*, Entry ID 109739212. https://cache.industry.siemens.com/dl/files/212/109739212/att_887886/v1/SIMATIC_modbus_tcp_cp_red_e_en-US.pdf
7. Siemens Industry Online Support, *SIMATIC MODBUS (TCP) PN CPU Library — Programming and Operating Manual 06/2014*, Entry ID 75330636. https://support.industry.siemens.com/cs/attachments/75330636/ModbusTCPPNCPUen.pdf
8. DMC Inc., *Using an S7-1200 PLC as a Modbus TCP Slave*. https://www.dmcinfo.com/blog/27313/using-an-s7-1200-plc-as-a-modbus-tcp-slave/
9. Siemens, *SIMATIC S7-1200 System Manual* (V4.x), "MB_SERVER" pages 736-742. https://www.manualslib.com/manual/1453610/Siemens-S7-1200.html?page=736
10. lamaPLC, *Simatic Modbus S7 error- and statuscodes*. https://www.lamaplc.com/doku.php?id=simatic:errorcodes
11. ScadaProtocols, *How to Configure Modbus TCP on Siemens S7-1200 (TIA Portal Step-by-Step)*. https://scadaprotocols.com/modbus-tcp-siemens-s7-1200-tia-portal/
12. Industrial Monitor Direct, *Reading and Writing Memory Bits via Modbus TCP on S7-1200*. https://industrialmonitordirect.com/blogs/knowledgebase/reading-and-writing-memory-bits-via-modbus-tcp-on-s7-1200
13. PLCtalk forum *"Siemens S7-1200 modbus understanding"*. https://www.plctalk.net/forums/threads/siemens-s7-1200-modbus-understanding.104119/
14. Siemens SIMATIC S7 Manual, "Function block MODBUSCP — Functionality" (ManualsLib p29). https://www.manualslib.com/manual/1580661/Siemens-Simatic-S7.html?page=29
15. Chipkin, *How Real (Floating Point) and 32-bit Data is Encoded in Modbus*. https://store.chipkin.com/articles/how-real-floating-point-and-32-bit-data-is-encoded-in-modbus-rtu-messages
16. Siemens Industry Online Support forum, *MODBUS DATA conversion in S7-1200 CPU*, Entry ID 97287. https://support.industry.siemens.com/forum/WW/en/posts/modbus-data-converson-in-s7-1200-cpu/97287
17. Industrial Monitor Direct, *Siemens S7-1500 MB_SERVER Modbus TCP Configuration Guide*. https://industrialmonitordirect.com/de/blogs/knowledgebase/siemens-s7-1500-mb-server-modbus-tcp-configuration-guide
18. Siemens TIA Portal, *Data types in SIMATIC S7-1200/1500 — String/WString header layout* (system manual, "Elementary Data Types").
19. Kepware / PTC, *Siemens TCP/IP Ethernet Driver Help*, "Byte / Word Order" tag property. https://www.opcturkey.com/uploads/siemens-tcp-ip-ethernet-manual.pdf
20. Siemens SiePortal forum, *Transfer float out of words*, Entry ID 187811. https://sieportal.siemens.com/en-ww/support/forum/posts/transfer-float-out-of-words/187811 _(operator-reported "S7 swaps float" claim — traced to remote-device issue; **unconfirmed**.)_
21. Siemens SiePortal forum, *S7-1200 communication with Modbus TCP*, Entry ID 133086. https://support.industry.siemens.com/forum/WW/en/posts/s7-1200-communication-with-modbus-tcp/133086
22. Siemens SiePortal forum, *S7-1500 MB Server Holding Register Max Word*, Entry ID 224636. https://support.industry.siemens.com/forum/WW/en/posts/s7-1500-mb-server-holding-register-max-word/224636
23. Siemens, *SIMATIC S7-1500 Technical Specifications* — CPU-specific DB size limits in each CPU manual's "Memory" table.
24. Siemens TIA Portal Online Docs, *Error messages (S7-1200, S7-1500) — Modbus instructions*. https://docs.tia.siemens.cloud/r/en-us/v20/modbus-rtu-s7-1200-s7-1500/error-messages-s7-1200-s7-1500
25. Industrial Monitor Direct, *Fix Siemens S7-1500 MB_Client UnitID Error 80C8*. https://industrialmonitordirect.com/blogs/knowledgebase/troubleshooting-mb-client-on-s7-1500-cpu-1515sp-modbus-tcp
26. Siemens SiePortal forum, *How many TCP connections can the S7-1200 make?*, Entry ID 275570. https://support.industry.siemens.com/forum/WW/en/posts/how-many-tcp-connections-can-the-s7-1200-make/275570
27. Siemens SiePortal forum, *Simultaneous connections of Modbus TCP*, Entry ID 189626. https://support.industry.siemens.com/forum/ww/en/posts/simultaneous-connections-of-modbus-tcp/189626
28. Siemens SiePortal forum, *How many Modbus TCP IP clients can read simultaneously from S7-1517*, Entry ID 261569. https://support.industry.siemens.com/forum/WW/en/posts/how-many-modbus-tcp-ip-client-can-read-simultaneously-in-s7-1517/261569
29. Industrial Monitor Direct, *Troubleshooting Intermittent Modbus TCP Connections on S7-1500 PLC*. https://industrialmonitordirect.com/blogs/knowledgebase/troubleshooting-intermittent-modbus-tcp-connections-on-s7-1500-plc
30. PLCtalk forum *"S7-1500 modbus tcp speed?"*. https://www.plctalk.net/forums/threads/s7-1500-modbus-tcp-speed.114046/
31. Siemens SiePortal forum, *MB_Unit_ID parameter in Modbus TCP*, Entry ID 156635. https://support.industry.siemens.com/forum/WW/en/posts/mb-unit-id-parameter-in-modbus-tcp/156635

View File

@@ -5,15 +5,18 @@
<h5 class="mb-4">OtOpcUa Admin</h5>
<ul class="nav flex-column">
<li class="nav-item"><a class="nav-link text-light" href="/">Overview</a></li>
<li class="nav-item"><a class="nav-link text-light" href="/fleet">Fleet status</a></li>
<li class="nav-item"><a class="nav-link text-light" href="/hosts">Host status</a></li>
<li class="nav-item"><a class="nav-link text-light" href="/clusters">Clusters</a></li>
<li class="nav-item"><a class="nav-link text-light" href="/reservations">Reservations</a></li>
<li class="nav-item"><a class="nav-link text-light" href="/certificates">Certificates</a></li>
</ul>
<div class="mt-5">
<AuthorizeView>
<Authorized>
<div class="small text-light">
Signed in as <strong>@context.User.Identity?.Name</strong>
Signed in as <a class="text-light" href="/account"><strong>@context.User.Identity?.Name</strong></a>
</div>
<div class="small text-muted">
@string.Join(", ", context.User.Claims.Where(c => c.Type.EndsWith("/role")).Select(c => c.Value))

View File

@@ -0,0 +1,129 @@
@page "/account"
@attribute [Microsoft.AspNetCore.Authorization.Authorize]
@using System.Security.Claims
@using ZB.MOM.WW.OtOpcUa.Admin.Services
<h1 class="mb-4">My account</h1>
<AuthorizeView>
<Authorized>
@{
var username = context.User.FindFirst(ClaimTypes.NameIdentifier)?.Value ?? "—";
var displayName = context.User.Identity?.Name ?? "—";
var roles = context.User.Claims
.Where(c => c.Type == ClaimTypes.Role).Select(c => c.Value).ToList();
var ldapGroups = context.User.Claims
.Where(c => c.Type == "ldap_group").Select(c => c.Value).ToList();
}
<div class="row g-4">
<div class="col-md-6">
<div class="card">
<div class="card-body">
<h5 class="card-title">Identity</h5>
<dl class="row mb-0">
<dt class="col-sm-4">Username</dt><dd class="col-sm-8"><code>@username</code></dd>
<dt class="col-sm-4">Display name</dt><dd class="col-sm-8">@displayName</dd>
</dl>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card">
<div class="card-body">
<h5 class="card-title">Admin roles</h5>
@if (roles.Count == 0)
{
<p class="text-muted mb-0">No Admin roles mapped — sign-in would have been blocked, so if you're seeing this, the session claim is likely stale.</p>
}
else
{
<div class="mb-2">
@foreach (var r in roles)
{
<span class="badge bg-primary me-1">@r</span>
}
</div>
<small class="text-muted">LDAP groups: @(ldapGroups.Count == 0 ? "(none surfaced)" : string.Join(", ", ldapGroups))</small>
}
</div>
</div>
</div>
<div class="col-12">
<div class="card">
<div class="card-body">
<h5 class="card-title">Capabilities</h5>
<p class="text-muted small">
Each Admin role grants a fixed capability set per <code>admin-ui.md</code> §Admin Roles.
Pages below reflect what this session can access; the route's <code>[Authorize]</code> guard
is the ground truth — this table mirrors it for readability.
</p>
<table class="table table-sm align-middle mb-0">
<thead>
<tr>
<th>Capability</th>
<th>Required role(s)</th>
<th class="text-end">You have it?</th>
</tr>
</thead>
<tbody>
@foreach (var cap in Capabilities)
{
var has = cap.RequiredRoles.Any(r => roles.Contains(r, StringComparer.OrdinalIgnoreCase));
<tr>
<td>@cap.Name<br /><small class="text-muted">@cap.Description</small></td>
<td>@string.Join(" or ", cap.RequiredRoles)</td>
<td class="text-end">
@if (has)
{
<span class="badge bg-success">Yes</span>
}
else
{
<span class="badge bg-secondary">No</span>
}
</td>
</tr>
}
</tbody>
</table>
</div>
</div>
</div>
</div>
<div class="mt-4">
<form method="post" action="/auth/logout">
<button class="btn btn-outline-danger" type="submit">Sign out</button>
</form>
</div>
</Authorized>
</AuthorizeView>
@code {
private sealed record Capability(string Name, string Description, string[] RequiredRoles);
// Kept in sync with Program.cs authorization policies + each page's [Authorize] attribute.
// When a new page or policy is added, extend this list so operators can self-service check
// whether their session has access without trial-and-error navigation.
private static readonly IReadOnlyList<Capability> Capabilities =
[
new("View clusters + fleet status",
"Read-only access to the cluster list, fleet dashboard, and generation history.",
[AdminRoles.ConfigViewer, AdminRoles.ConfigEditor, AdminRoles.FleetAdmin]),
new("Edit configuration drafts",
"Create and edit draft generations, manage namespace bindings and node ACLs. CanEdit policy.",
[AdminRoles.ConfigEditor, AdminRoles.FleetAdmin]),
new("Publish generations",
"Promote a draft to Published — triggers node roll-out. CanPublish policy.",
[AdminRoles.FleetAdmin]),
new("Manage certificate trust",
"Trust rejected client certs + revoke trust. FleetAdmin-only because the trust decision gates OPC UA client access.",
[AdminRoles.FleetAdmin]),
new("Manage external-ID reservations",
"Reserve / release external IDs that map into Galaxy contained names.",
[AdminRoles.ConfigEditor, AdminRoles.FleetAdmin]),
];
}

View File

@@ -0,0 +1,154 @@
@page "/certificates"
@attribute [Microsoft.AspNetCore.Authorization.Authorize(Roles = AdminRoles.FleetAdmin)]
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@inject CertTrustService Certs
@inject AuthenticationStateProvider AuthState
@inject ILogger<Certificates> Log
<h1 class="mb-4">Certificate trust</h1>
<div class="alert alert-info small mb-4">
PKI store root <code>@Certs.PkiStoreRoot</code>. Trusting a rejected cert moves the file into the trusted store — the OPC UA server picks up the change on the next client handshake, so operators should retry the rejected client's connection after trusting.
</div>
@if (_status is not null)
{
<div class="alert alert-@_statusKind alert-dismissible">
@_status
<button type="button" class="btn-close" @onclick="ClearStatus"></button>
</div>
}
<h2 class="h4">Rejected (@_rejected.Count)</h2>
@if (_rejected.Count == 0)
{
<p class="text-muted">No rejected certificates. Clients that fail to handshake with an untrusted cert land here.</p>
}
else
{
<table class="table table-sm align-middle">
<thead><tr><th>Subject</th><th>Issuer</th><th>Thumbprint</th><th>Valid</th><th class="text-end">Actions</th></tr></thead>
<tbody>
@foreach (var c in _rejected)
{
<tr>
<td>@c.Subject</td>
<td>@c.Issuer</td>
<td><code class="small">@c.Thumbprint</code></td>
<td class="small">@c.NotBefore.ToString("yyyy-MM-dd") → @c.NotAfter.ToString("yyyy-MM-dd")</td>
<td class="text-end">
<button class="btn btn-sm btn-success me-1" @onclick="() => TrustAsync(c)">Trust</button>
<button class="btn btn-sm btn-outline-danger" @onclick="() => DeleteRejectedAsync(c)">Delete</button>
</td>
</tr>
}
</tbody>
</table>
}
<h2 class="h4 mt-5">Trusted (@_trusted.Count)</h2>
@if (_trusted.Count == 0)
{
<p class="text-muted">No client certs have been explicitly trusted. The server's own application cert lives in <code>own/</code> and is not listed here.</p>
}
else
{
<table class="table table-sm align-middle">
<thead><tr><th>Subject</th><th>Issuer</th><th>Thumbprint</th><th>Valid</th><th class="text-end">Actions</th></tr></thead>
<tbody>
@foreach (var c in _trusted)
{
<tr>
<td>@c.Subject</td>
<td>@c.Issuer</td>
<td><code class="small">@c.Thumbprint</code></td>
<td class="small">@c.NotBefore.ToString("yyyy-MM-dd") → @c.NotAfter.ToString("yyyy-MM-dd")</td>
<td class="text-end">
<button class="btn btn-sm btn-outline-danger" @onclick="() => UntrustAsync(c)">Revoke</button>
</td>
</tr>
}
</tbody>
</table>
}
@code {
private IReadOnlyList<CertInfo> _rejected = [];
private IReadOnlyList<CertInfo> _trusted = [];
private string? _status;
private string _statusKind = "success";
protected override void OnInitialized() => Reload();
private void Reload()
{
_rejected = Certs.ListRejected();
_trusted = Certs.ListTrusted();
}
private async Task TrustAsync(CertInfo c)
{
if (Certs.TrustRejected(c.Thumbprint))
{
await LogActionAsync("cert.trust", c);
Set($"Trusted cert {c.Subject} ({Short(c.Thumbprint)}).", "success");
}
else
{
Set($"Could not trust {Short(c.Thumbprint)} — file missing; another admin may have already handled it.", "warning");
}
Reload();
}
private async Task DeleteRejectedAsync(CertInfo c)
{
if (Certs.DeleteRejected(c.Thumbprint))
{
await LogActionAsync("cert.delete.rejected", c);
Set($"Deleted rejected cert {c.Subject} ({Short(c.Thumbprint)}).", "success");
}
else
{
Set($"Could not delete {Short(c.Thumbprint)} — file missing.", "warning");
}
Reload();
}
private async Task UntrustAsync(CertInfo c)
{
if (Certs.UntrustCert(c.Thumbprint))
{
await LogActionAsync("cert.untrust", c);
Set($"Revoked trust for {c.Subject} ({Short(c.Thumbprint)}).", "success");
}
else
{
Set($"Could not revoke {Short(c.Thumbprint)} — file missing.", "warning");
}
Reload();
}
private async Task LogActionAsync(string action, CertInfo c)
{
// Cert trust changes are operator-initiated and security-sensitive — Serilog captures the
// user + thumbprint trail. CertTrustService also logs at Information on each filesystem
// move/delete; this line ties the action to the authenticated admin user so the two logs
// correlate. DB-level ConfigAuditLog persistence is deferred — its schema is
// cluster-scoped and cert actions are cluster-agnostic.
var state = await AuthState.GetAuthenticationStateAsync();
var user = state.User.Identity?.Name ?? "(anonymous)";
Log.LogInformation("Admin cert action: user={User} action={Action} thumbprint={Thumbprint} subject={Subject}",
user, action, c.Thumbprint, c.Subject);
}
private void Set(string message, string kind)
{
_status = message;
_statusKind = kind;
}
private void ClearStatus() => _status = null;
private static string Short(string thumbprint) =>
thumbprint.Length > 12 ? thumbprint[..12] + "…" : thumbprint;
}

View File

@@ -0,0 +1,172 @@
@page "/fleet"
@using Microsoft.EntityFrameworkCore
@using ZB.MOM.WW.OtOpcUa.Configuration
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@inject IServiceScopeFactory ScopeFactory
@implements IDisposable
<h1 class="mb-4">Fleet status</h1>
<div class="d-flex align-items-center mb-3 gap-2">
<button class="btn btn-sm btn-outline-primary" @onclick="RefreshAsync" disabled="@_refreshing">
@if (_refreshing) { <span class="spinner-border spinner-border-sm me-1" /> }
Refresh
</button>
<span class="text-muted small">
Auto-refresh every @RefreshIntervalSeconds s. Last updated: @(_lastRefreshUtc?.ToString("HH:mm:ss 'UTC'") ?? "—")
</span>
</div>
@if (_rows is null)
{
<p>Loading…</p>
}
else if (_rows.Count == 0)
{
<div class="alert alert-info">
No node state recorded yet. Nodes publish their state to the central DB on each poll; if
this list is empty, either no nodes have been registered or the poller hasn't run yet.
</div>
}
else
{
<div class="row g-3 mb-4">
<div class="col-md-3">
<div class="card"><div class="card-body">
<h6 class="text-muted mb-1">Nodes</h6>
<div class="fs-3">@_rows.Count</div>
</div></div>
</div>
<div class="col-md-3">
<div class="card border-success"><div class="card-body">
<h6 class="text-muted mb-1">Applied</h6>
<div class="fs-3 text-success">@_rows.Count(r => r.Status == "Applied")</div>
</div></div>
</div>
<div class="col-md-3">
<div class="card border-warning"><div class="card-body">
<h6 class="text-muted mb-1">Stale</h6>
<div class="fs-3 text-warning">@_rows.Count(r => IsStale(r))</div>
</div></div>
</div>
<div class="col-md-3">
<div class="card border-danger"><div class="card-body">
<h6 class="text-muted mb-1">Failed</h6>
<div class="fs-3 text-danger">@_rows.Count(r => r.Status == "Failed")</div>
</div></div>
</div>
</div>
<table class="table table-hover align-middle">
<thead>
<tr>
<th>Node</th>
<th>Cluster</th>
<th>Generation</th>
<th>Status</th>
<th>Last applied</th>
<th>Last seen</th>
<th>Error</th>
</tr>
</thead>
<tbody>
@foreach (var r in _rows)
{
<tr class="@RowClass(r)">
<td><code>@r.NodeId</code></td>
<td>@r.ClusterId</td>
<td>@(r.GenerationId?.ToString() ?? "—")</td>
<td>
<span class="badge @StatusBadge(r.Status)">@(r.Status ?? "—")</span>
</td>
<td>@FormatAge(r.AppliedAt)</td>
<td class="@(IsStale(r) ? "text-warning" : "")">@FormatAge(r.SeenAt)</td>
<td class="text-truncate" style="max-width: 320px;" title="@r.Error">@r.Error</td>
</tr>
}
</tbody>
</table>
}
@code {
// Refresh cadence. 5s matches FleetStatusPoller's poll interval — the dashboard always sees
// the most recent published state without polling ahead of the broadcaster.
private const int RefreshIntervalSeconds = 5;
private List<FleetNodeRow>? _rows;
private bool _refreshing;
private DateTime? _lastRefreshUtc;
private Timer? _timer;
protected override async Task OnInitializedAsync()
{
await RefreshAsync();
_timer = new Timer(async _ => await InvokeAsync(RefreshAsync),
state: null,
dueTime: TimeSpan.FromSeconds(RefreshIntervalSeconds),
period: TimeSpan.FromSeconds(RefreshIntervalSeconds));
}
private async Task RefreshAsync()
{
if (_refreshing) return;
_refreshing = true;
try
{
using var scope = ScopeFactory.CreateScope();
var db = scope.ServiceProvider.GetRequiredService<OtOpcUaConfigDbContext>();
var rows = await db.ClusterNodeGenerationStates.AsNoTracking()
.Join(db.ClusterNodes.AsNoTracking(), s => s.NodeId, n => n.NodeId, (s, n) => new FleetNodeRow(
s.NodeId, n.ClusterId, s.CurrentGenerationId,
s.LastAppliedStatus != null ? s.LastAppliedStatus.ToString() : null,
s.LastAppliedError, s.LastAppliedAt, s.LastSeenAt))
.OrderBy(r => r.ClusterId)
.ThenBy(r => r.NodeId)
.ToListAsync();
_rows = rows;
_lastRefreshUtc = DateTime.UtcNow;
}
finally
{
_refreshing = false;
StateHasChanged();
}
}
private static bool IsStale(FleetNodeRow r)
{
if (r.SeenAt is null) return true;
return (DateTime.UtcNow - r.SeenAt.Value) > TimeSpan.FromSeconds(30);
}
private static string RowClass(FleetNodeRow r) => r.Status switch
{
"Failed" => "table-danger",
_ when IsStale(r) => "table-warning",
_ => "",
};
private static string StatusBadge(string? status) => status switch
{
"Applied" => "bg-success",
"Failed" => "bg-danger",
"Applying" => "bg-info",
_ => "bg-secondary",
};
private static string FormatAge(DateTime? t)
{
if (t is null) return "—";
var age = DateTime.UtcNow - t.Value;
if (age.TotalSeconds < 60) return $"{(int)age.TotalSeconds}s ago";
if (age.TotalMinutes < 60) return $"{(int)age.TotalMinutes}m ago";
if (age.TotalHours < 24) return $"{(int)age.TotalHours}h ago";
return t.Value.ToString("yyyy-MM-dd HH:mm 'UTC'");
}
public void Dispose() => _timer?.Dispose();
internal sealed record FleetNodeRow(
string NodeId, string ClusterId, long? GenerationId,
string? Status, string? Error, DateTime? AppliedAt, DateTime? SeenAt);
}

View File

@@ -0,0 +1,160 @@
@page "/hosts"
@using Microsoft.EntityFrameworkCore
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Enums
@inject IServiceScopeFactory ScopeFactory
@implements IDisposable
<h1 class="mb-4">Driver host status</h1>
<div class="d-flex align-items-center mb-3 gap-2">
<button class="btn btn-sm btn-outline-primary" @onclick="RefreshAsync" disabled="@_refreshing">
@if (_refreshing) { <span class="spinner-border spinner-border-sm me-1" /> }
Refresh
</button>
<span class="text-muted small">
Auto-refresh every @RefreshIntervalSeconds s. Last updated: @(_lastRefreshUtc?.ToString("HH:mm:ss 'UTC'") ?? "—")
</span>
</div>
<div class="alert alert-info small mb-4">
Each row is one host reported by a driver instance on a server node. Galaxy drivers report
per-Platform / per-AppEngine entries; Modbus drivers report the PLC endpoint. Rows age out
of the Server's publisher on every 10-second heartbeat — rows whose LastSeen is older than
30s are flagged Stale, which usually means the owning Server process has crashed or lost
its DB connection.
</div>
@if (_rows is null)
{
<p>Loading…</p>
}
else if (_rows.Count == 0)
{
<div class="alert alert-secondary">
No host-status rows yet. The Server publishes its first tick 2s after startup; if this list stays empty, check that the Server is running and the driver implements <code>IHostConnectivityProbe</code>.
</div>
}
else
{
<div class="row g-3 mb-4">
<div class="col-md-3"><div class="card"><div class="card-body">
<h6 class="text-muted mb-1">Hosts</h6>
<div class="fs-3">@_rows.Count</div>
</div></div></div>
<div class="col-md-3"><div class="card border-success"><div class="card-body">
<h6 class="text-muted mb-1">Running</h6>
<div class="fs-3 text-success">@_rows.Count(r => r.State == DriverHostState.Running && !HostStatusService.IsStale(r))</div>
</div></div></div>
<div class="col-md-3"><div class="card border-warning"><div class="card-body">
<h6 class="text-muted mb-1">Stale</h6>
<div class="fs-3 text-warning">@_rows.Count(HostStatusService.IsStale)</div>
</div></div></div>
<div class="col-md-3"><div class="card border-danger"><div class="card-body">
<h6 class="text-muted mb-1">Faulted</h6>
<div class="fs-3 text-danger">@_rows.Count(r => r.State == DriverHostState.Faulted)</div>
</div></div></div>
</div>
@foreach (var cluster in _rows.GroupBy(r => r.ClusterId ?? "(unassigned)").OrderBy(g => g.Key))
{
<h2 class="h5 mt-4">Cluster: <code>@cluster.Key</code></h2>
<table class="table table-sm table-hover align-middle">
<thead>
<tr>
<th>Node</th>
<th>Driver</th>
<th>Host</th>
<th>State</th>
<th>Last transition</th>
<th>Last seen</th>
<th>Detail</th>
</tr>
</thead>
<tbody>
@foreach (var r in cluster)
{
<tr class="@RowClass(r)">
<td><code>@r.NodeId</code></td>
<td><code>@r.DriverInstanceId</code></td>
<td>@r.HostName</td>
<td>
<span class="badge @StateBadge(r.State)">@r.State</span>
@if (HostStatusService.IsStale(r))
{
<span class="badge bg-warning text-dark ms-1">Stale</span>
}
</td>
<td class="small">@FormatAge(r.StateChangedUtc)</td>
<td class="small @(HostStatusService.IsStale(r) ? "text-warning" : "")">@FormatAge(r.LastSeenUtc)</td>
<td class="text-truncate small" style="max-width: 320px;" title="@r.Detail">@r.Detail</td>
</tr>
}
</tbody>
</table>
}
}
@code {
// Mirrors HostStatusPublisher.HeartbeatInterval — polling ahead of the broadcaster
// produces stale-looking rows mid-cycle.
private const int RefreshIntervalSeconds = 10;
private List<HostStatusRow>? _rows;
private bool _refreshing;
private DateTime? _lastRefreshUtc;
private Timer? _timer;
protected override async Task OnInitializedAsync()
{
await RefreshAsync();
_timer = new Timer(async _ => await InvokeAsync(RefreshAsync),
state: null,
dueTime: TimeSpan.FromSeconds(RefreshIntervalSeconds),
period: TimeSpan.FromSeconds(RefreshIntervalSeconds));
}
private async Task RefreshAsync()
{
if (_refreshing) return;
_refreshing = true;
try
{
using var scope = ScopeFactory.CreateScope();
var svc = scope.ServiceProvider.GetRequiredService<HostStatusService>();
_rows = (await svc.ListAsync()).ToList();
_lastRefreshUtc = DateTime.UtcNow;
}
finally
{
_refreshing = false;
StateHasChanged();
}
}
private static string RowClass(HostStatusRow r) => r.State switch
{
DriverHostState.Faulted => "table-danger",
_ when HostStatusService.IsStale(r) => "table-warning",
_ => "",
};
private static string StateBadge(DriverHostState s) => s switch
{
DriverHostState.Running => "bg-success",
DriverHostState.Stopped => "bg-secondary",
DriverHostState.Faulted => "bg-danger",
_ => "bg-secondary",
};
private static string FormatAge(DateTime t)
{
var age = DateTime.UtcNow - t;
if (age.TotalSeconds < 60) return $"{(int)age.TotalSeconds}s ago";
if (age.TotalMinutes < 60) return $"{(int)age.TotalMinutes}m ago";
if (age.TotalHours < 24) return $"{(int)age.TotalHours}h ago";
return t.ToString("yyyy-MM-dd HH:mm 'UTC'");
}
public void Dispose() => _timer?.Dispose();
}

View File

@@ -47,6 +47,13 @@ builder.Services.AddScoped<NodeAclService>();
builder.Services.AddScoped<ReservationService>();
builder.Services.AddScoped<DraftValidationService>();
builder.Services.AddScoped<AuditLogService>();
builder.Services.AddScoped<HostStatusService>();
// Cert-trust management — reads the OPC UA server's PKI store root so rejected client certs
// can be promoted to trusted via the Admin UI. Singleton: no per-request state, just
// filesystem operations.
builder.Services.Configure<CertTrustOptions>(builder.Configuration.GetSection(CertTrustOptions.SectionName));
builder.Services.AddSingleton<CertTrustService>();
// LDAP auth — parity with ScadaLink's LdapAuthService (decision #102).
builder.Services.Configure<LdapOptions>(

View File

@@ -0,0 +1,22 @@
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// Points the Admin UI at the OPC UA Server's PKI store root so
/// <see cref="CertTrustService"/> can list and move certs between the
/// <c>rejected/</c> and <c>trusted/</c> directories the server maintains. Must match the
/// <c>OpcUaServer:PkiStoreRoot</c> the Server process is configured with.
/// </summary>
public sealed class CertTrustOptions
{
public const string SectionName = "CertTrust";
/// <summary>
/// Absolute path to the PKI root. Defaults to
/// <c>%ProgramData%\OtOpcUa\pki</c> — matches <c>OpcUaServerOptions.PkiStoreRoot</c>'s
/// default so a standard side-by-side install needs no override.
/// </summary>
public string PkiStoreRoot { get; init; } =
Path.Combine(
Environment.GetFolderPath(Environment.SpecialFolder.CommonApplicationData),
"OtOpcUa", "pki");
}

View File

@@ -0,0 +1,135 @@
using System.Security.Cryptography.X509Certificates;
using Microsoft.Extensions.Options;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// Metadata for a certificate file found in one of the OPC UA server's PKI stores. The
/// <see cref="FilePath"/> is the absolute path of the DER/CRT file the stack created when it
/// rejected the cert (for <see cref="CertStoreKind.Rejected"/>) or when an operator trusted
/// it (for <see cref="CertStoreKind.Trusted"/>).
/// </summary>
public sealed record CertInfo(
string Thumbprint,
string Subject,
string Issuer,
DateTime NotBefore,
DateTime NotAfter,
string FilePath,
CertStoreKind Store);
public enum CertStoreKind
{
Rejected,
Trusted,
}
/// <summary>
/// Filesystem-backed view over the OPC UA server's PKI store. The Opc.Ua stack uses a
/// Directory-typed store — each cert is a <c>.der</c> file under <c>{root}/{store}/certs/</c>
/// with a filename derived from subject + thumbprint. This service exposes operators for the
/// Admin UI: list rejected, list trusted, trust a rejected cert (move to trusted), remove a
/// rejected cert (delete), untrust a previously trusted cert (delete from trusted).
/// </summary>
/// <remarks>
/// The Admin process is separate from the Server process; this service deliberately has no
/// Opc.Ua dependency — it works on the on-disk layout directly so it can run on the Admin
/// host even when the Server isn't installed locally, as long as the PKI root is reachable
/// (typical deployment has Admin + Server side-by-side on the same machine).
///
/// Trust/untrust requires the Server to re-read its trust list. The Opc.Ua stack re-reads
/// the Directory store on each new incoming connection, so there's no explicit signal
/// needed — the next client handshake picks up the change. Operators should retry the
/// rejected client's connection after trusting.
/// </remarks>
public sealed class CertTrustService
{
private readonly CertTrustOptions _options;
private readonly ILogger<CertTrustService> _logger;
public CertTrustService(IOptions<CertTrustOptions> options, ILogger<CertTrustService> logger)
{
_options = options.Value;
_logger = logger;
}
public string PkiStoreRoot => _options.PkiStoreRoot;
public IReadOnlyList<CertInfo> ListRejected() => ListStore(CertStoreKind.Rejected);
public IReadOnlyList<CertInfo> ListTrusted() => ListStore(CertStoreKind.Trusted);
/// <summary>
/// Move the cert with <paramref name="thumbprint"/> from the rejected store to the
/// trusted store. No-op returns false if the rejected file doesn't exist (already moved
/// by another operator, or thumbprint mismatch). Overwrites an existing trusted copy
/// silently — idempotent.
/// </summary>
public bool TrustRejected(string thumbprint)
{
var cert = FindInStore(CertStoreKind.Rejected, thumbprint);
if (cert is null) return false;
var trustedDir = CertsDir(CertStoreKind.Trusted);
Directory.CreateDirectory(trustedDir);
var destPath = Path.Combine(trustedDir, Path.GetFileName(cert.FilePath));
File.Move(cert.FilePath, destPath, overwrite: true);
_logger.LogInformation("Trusted cert {Thumbprint} (subject={Subject}) — moved {From} → {To}",
cert.Thumbprint, cert.Subject, cert.FilePath, destPath);
return true;
}
public bool DeleteRejected(string thumbprint) => DeleteFromStore(CertStoreKind.Rejected, thumbprint);
public bool UntrustCert(string thumbprint) => DeleteFromStore(CertStoreKind.Trusted, thumbprint);
private bool DeleteFromStore(CertStoreKind store, string thumbprint)
{
var cert = FindInStore(store, thumbprint);
if (cert is null) return false;
File.Delete(cert.FilePath);
_logger.LogInformation("Deleted cert {Thumbprint} (subject={Subject}) from {Store} store",
cert.Thumbprint, cert.Subject, store);
return true;
}
private CertInfo? FindInStore(CertStoreKind store, string thumbprint) =>
ListStore(store).FirstOrDefault(c =>
string.Equals(c.Thumbprint, thumbprint, StringComparison.OrdinalIgnoreCase));
private IReadOnlyList<CertInfo> ListStore(CertStoreKind store)
{
var dir = CertsDir(store);
if (!Directory.Exists(dir)) return [];
var results = new List<CertInfo>();
foreach (var path in Directory.EnumerateFiles(dir))
{
// Skip CRL sidecars + private-key files — trust operations only concern public certs.
var ext = Path.GetExtension(path);
if (!ext.Equals(".der", StringComparison.OrdinalIgnoreCase) &&
!ext.Equals(".crt", StringComparison.OrdinalIgnoreCase) &&
!ext.Equals(".cer", StringComparison.OrdinalIgnoreCase))
{
continue;
}
try
{
var cert = X509CertificateLoader.LoadCertificateFromFile(path);
results.Add(new CertInfo(
cert.Thumbprint, cert.Subject, cert.Issuer,
cert.NotBefore.ToUniversalTime(), cert.NotAfter.ToUniversalTime(),
path, store));
}
catch (Exception ex)
{
// A malformed file in the store shouldn't take down the page. Surface it in logs
// but skip — operators see the other certs and can clean the bad file manually.
_logger.LogWarning(ex, "Failed to parse cert at {Path} — skipping", path);
}
}
return results;
}
private string CertsDir(CertStoreKind store) =>
Path.Combine(_options.PkiStoreRoot, store == CertStoreKind.Rejected ? "rejected" : "trusted", "certs");
}

View File

@@ -0,0 +1,63 @@
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// One row per <see cref="DriverHostStatus"/> record, enriched with the owning
/// <c>ClusterNode.ClusterId</c> when available (left-join). The Admin <c>/hosts</c> page
/// groups by cluster and renders a per-node → per-driver → per-host tree.
/// </summary>
public sealed record HostStatusRow(
string NodeId,
string? ClusterId,
string DriverInstanceId,
string HostName,
DriverHostState State,
DateTime StateChangedUtc,
DateTime LastSeenUtc,
string? Detail);
/// <summary>
/// Read-side service for the Admin UI's per-host drill-down. Loads
/// <see cref="DriverHostStatus"/> rows (written by the Server process's
/// <c>HostStatusPublisher</c>) and left-joins <c>ClusterNode</c> so each row knows which
/// cluster it belongs to — the Admin UI groups by cluster for the fleet-wide view.
/// </summary>
/// <remarks>
/// The publisher heartbeat is 10s (<c>HostStatusPublisher.HeartbeatInterval</c>). The
/// Admin page also polls every ~10s and treats rows with <c>LastSeenUtc</c> older than
/// <c>StaleThreshold</c> (30s) as stale — covers a missed heartbeat tolerance plus
/// a generous buffer for clock skew and publisher GC pauses.
/// </remarks>
public sealed class HostStatusService(OtOpcUaConfigDbContext db)
{
public static readonly TimeSpan StaleThreshold = TimeSpan.FromSeconds(30);
public async Task<IReadOnlyList<HostStatusRow>> ListAsync(CancellationToken ct = default)
{
// LEFT JOIN on NodeId so a row persists even when its owning ClusterNode row hasn't
// been created yet (first-boot bootstrap case — keeps the UI from losing sight of
// the reporting server).
var rows = await (from s in db.DriverHostStatuses.AsNoTracking()
join n in db.ClusterNodes.AsNoTracking()
on s.NodeId equals n.NodeId into nodeJoin
from n in nodeJoin.DefaultIfEmpty()
orderby s.NodeId, s.DriverInstanceId, s.HostName
select new HostStatusRow(
s.NodeId,
n != null ? n.ClusterId : null,
s.DriverInstanceId,
s.HostName,
s.State,
s.StateChangedUtc,
s.LastSeenUtc,
s.Detail)).ToListAsync(ct);
return rows;
}
public static bool IsStale(HostStatusRow row) =>
DateTime.UtcNow - row.LastSeenUtc > StaleThreshold;
}

View File

@@ -0,0 +1,61 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>
/// Per-host connectivity snapshot the Server publishes for each driver's
/// <c>IHostConnectivityProbe.GetHostStatuses</c> entry. One row per
/// (<see cref="NodeId"/>, <see cref="DriverInstanceId"/>, <see cref="HostName"/>) triple —
/// a redundant 2-node cluster with one Galaxy driver reporting 3 platforms produces 6
/// rows, not 3, because each server node owns its own runtime view.
/// </summary>
/// <remarks>
/// <para>
/// Closes the data-layer piece of LMX follow-up #7 (per-AppEngine Admin dashboard
/// drill-down). The publisher hosted service on the Server side subscribes to every
/// registered driver's <c>OnHostStatusChanged</c> and upserts rows on transitions +
/// periodic liveness heartbeats. <see cref="LastSeenUtc"/> advances on every
/// heartbeat so the Admin UI can flag stale rows from a crashed Server.
/// </para>
/// <para>
/// No foreign-key to <see cref="ClusterNode"/> — a Server may start reporting host
/// status before its ClusterNode row exists (e.g. first-boot bootstrap), and we'd
/// rather keep the status row than drop it. The Admin-side service left-joins on
/// NodeId when presenting rows.
/// </para>
/// </remarks>
public sealed class DriverHostStatus
{
/// <summary>Server node that's running the driver.</summary>
public required string NodeId { get; set; }
/// <summary>Driver instance's stable id (matches <c>IDriver.DriverInstanceId</c>).</summary>
public required string DriverInstanceId { get; set; }
/// <summary>
/// Driver-side host identifier — Galaxy Platform / AppEngine name, Modbus
/// <c>host:port</c>, whatever the probe returns. Opaque to the Admin UI except as
/// a display string.
/// </summary>
public required string HostName { get; set; }
public DriverHostState State { get; set; } = DriverHostState.Unknown;
/// <summary>Timestamp of the last state transition (not of the most recent heartbeat).</summary>
public DateTime StateChangedUtc { get; set; }
/// <summary>
/// Advances on every publisher heartbeat — the Admin UI uses
/// <c>now - LastSeenUtc &gt; threshold</c> to flag rows whose owning Server has
/// stopped reporting (crashed, network-partitioned, etc.), independent of
/// <see cref="State"/>.
/// </summary>
public DateTime LastSeenUtc { get; set; }
/// <summary>
/// Optional human-readable detail populated when <see cref="State"/> is
/// <see cref="DriverHostState.Faulted"/> — e.g. the exception message from the
/// driver's probe. Null for Running / Stopped / Unknown transitions.
/// </summary>
public string? Detail { get; set; }
}

View File

@@ -0,0 +1,21 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Enums;
/// <summary>
/// Persisted mirror of <c>Core.Abstractions.HostState</c> — the lifecycle state each
/// <c>IHostConnectivityProbe</c>-capable driver reports for its per-host topology
/// (Galaxy Platforms / AppEngines, Modbus PLC endpoints, future OPC UA gateway upstreams).
/// Defined here instead of re-using <c>Core.Abstractions.HostState</c> so the
/// Configuration project stays free of driver-runtime dependencies.
/// </summary>
/// <remarks>
/// The server-side publisher (follow-up PR) translates
/// <c>HostStatusChangedEventArgs.NewState</c> to this enum on every transition and
/// upserts into <see cref="Entities.DriverHostStatus"/>. Admin UI reads from the DB.
/// </remarks>
public enum DriverHostState
{
Unknown,
Running,
Stopped,
Faulted,
}

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,49 @@
using System;
using Microsoft.EntityFrameworkCore.Migrations;
#nullable disable
namespace ZB.MOM.WW.OtOpcUa.Configuration.Migrations
{
/// <inheritdoc />
public partial class AddDriverHostStatus : Migration
{
/// <inheritdoc />
protected override void Up(MigrationBuilder migrationBuilder)
{
migrationBuilder.CreateTable(
name: "DriverHostStatus",
columns: table => new
{
NodeId = table.Column<string>(type: "nvarchar(64)", maxLength: 64, nullable: false),
DriverInstanceId = table.Column<string>(type: "nvarchar(64)", maxLength: 64, nullable: false),
HostName = table.Column<string>(type: "nvarchar(256)", maxLength: 256, nullable: false),
State = table.Column<string>(type: "nvarchar(16)", maxLength: 16, nullable: false),
StateChangedUtc = table.Column<DateTime>(type: "datetime2(3)", nullable: false),
LastSeenUtc = table.Column<DateTime>(type: "datetime2(3)", nullable: false),
Detail = table.Column<string>(type: "nvarchar(1024)", maxLength: 1024, nullable: true)
},
constraints: table =>
{
table.PrimaryKey("PK_DriverHostStatus", x => new { x.NodeId, x.DriverInstanceId, x.HostName });
});
migrationBuilder.CreateIndex(
name: "IX_DriverHostStatus_LastSeen",
table: "DriverHostStatus",
column: "LastSeenUtc");
migrationBuilder.CreateIndex(
name: "IX_DriverHostStatus_Node",
table: "DriverHostStatus",
column: "NodeId");
}
/// <inheritdoc />
protected override void Down(MigrationBuilder migrationBuilder)
{
migrationBuilder.DropTable(
name: "DriverHostStatus");
}
}
}

View File

@@ -332,6 +332,46 @@ namespace ZB.MOM.WW.OtOpcUa.Configuration.Migrations
});
});
modelBuilder.Entity("ZB.MOM.WW.OtOpcUa.Configuration.Entities.DriverHostStatus", b =>
{
b.Property<string>("NodeId")
.HasMaxLength(64)
.HasColumnType("nvarchar(64)");
b.Property<string>("DriverInstanceId")
.HasMaxLength(64)
.HasColumnType("nvarchar(64)");
b.Property<string>("HostName")
.HasMaxLength(256)
.HasColumnType("nvarchar(256)");
b.Property<string>("Detail")
.HasMaxLength(1024)
.HasColumnType("nvarchar(1024)");
b.Property<DateTime>("LastSeenUtc")
.HasColumnType("datetime2(3)");
b.Property<string>("State")
.IsRequired()
.HasMaxLength(16)
.HasColumnType("nvarchar(16)");
b.Property<DateTime>("StateChangedUtc")
.HasColumnType("datetime2(3)");
b.HasKey("NodeId", "DriverInstanceId", "HostName");
b.HasIndex("LastSeenUtc")
.HasDatabaseName("IX_DriverHostStatus_LastSeen");
b.HasIndex("NodeId")
.HasDatabaseName("IX_DriverHostStatus_Node");
b.ToTable("DriverHostStatus", (string)null);
});
modelBuilder.Entity("ZB.MOM.WW.OtOpcUa.Configuration.Entities.DriverInstance", b =>
{
b.Property<Guid>("DriverInstanceRowId")

View File

@@ -27,6 +27,7 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
public DbSet<ClusterNodeGenerationState> ClusterNodeGenerationStates => Set<ClusterNodeGenerationState>();
public DbSet<ConfigAuditLog> ConfigAuditLogs => Set<ConfigAuditLog>();
public DbSet<ExternalIdReservation> ExternalIdReservations => Set<ExternalIdReservation>();
public DbSet<DriverHostStatus> DriverHostStatuses => Set<DriverHostStatus>();
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
@@ -47,6 +48,7 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
ConfigureClusterNodeGenerationState(modelBuilder);
ConfigureConfigAuditLog(modelBuilder);
ConfigureExternalIdReservation(modelBuilder);
ConfigureDriverHostStatus(modelBuilder);
}
private static void ConfigureServerCluster(ModelBuilder modelBuilder)
@@ -484,4 +486,30 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
e.HasIndex(x => x.EquipmentUuid).HasDatabaseName("IX_ExternalIdReservation_Equipment");
});
}
private static void ConfigureDriverHostStatus(ModelBuilder modelBuilder)
{
modelBuilder.Entity<DriverHostStatus>(e =>
{
e.ToTable("DriverHostStatus");
// Composite key — one row per (server node, driver instance, probe-reported host).
// A redundant 2-node cluster with one Galaxy driver reporting 3 platforms produces
// 6 rows because each server node owns its own runtime view; the composite key is
// what lets both views coexist without shadowing each other.
e.HasKey(x => new { x.NodeId, x.DriverInstanceId, x.HostName });
e.Property(x => x.NodeId).HasMaxLength(64);
e.Property(x => x.DriverInstanceId).HasMaxLength(64);
e.Property(x => x.HostName).HasMaxLength(256);
e.Property(x => x.State).HasConversion<string>().HasMaxLength(16);
e.Property(x => x.StateChangedUtc).HasColumnType("datetime2(3)");
e.Property(x => x.LastSeenUtc).HasColumnType("datetime2(3)");
e.Property(x => x.Detail).HasMaxLength(1024);
// NodeId-only index drives the Admin UI's per-cluster drill-down (select all host
// statuses for the nodes of a specific cluster via join on ClusterNode.ClusterId).
e.HasIndex(x => x.NodeId).HasDatabaseName("IX_DriverHostStatus_Node");
// LastSeenUtc index powers the Admin UI's stale-row query (now - LastSeen > N).
e.HasIndex(x => x.LastSeenUtc).HasDatabaseName("IX_DriverHostStatus_LastSeen");
});
}
}

View File

@@ -19,10 +19,17 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions;
/// <param name="ArrayDim">Declared array length when <see cref="IsArray"/> is true; null otherwise.</param>
/// <param name="SecurityClass">Write-authorization tier for this attribute.</param>
/// <param name="IsHistorized">True when this attribute is expected to feed historian / HistoryRead.</param>
/// <param name="IsAlarm">
/// True when this attribute represents an alarm condition (Galaxy: has an
/// <c>AlarmExtension</c> primitive). The generic node-manager enriches the variable with an
/// OPC UA <c>AlarmConditionState</c> when true. Defaults to false so existing non-Galaxy
/// drivers aren't forced to flow a flag they don't produce.
/// </param>
public sealed record DriverAttributeInfo(
string FullName,
DriverDataType DriverDataType,
bool IsArray,
uint? ArrayDim,
SecurityClassification SecurityClass,
bool IsHistorized);
bool IsHistorized,
bool IsAlarm = false);

View File

@@ -42,4 +42,39 @@ public interface IVariableHandle
{
/// <summary>Driver-side full reference for read/write addressing.</summary>
string FullReference { get; }
/// <summary>
/// Annotate this variable with an OPC UA <c>AlarmConditionState</c>. Drivers with
/// <see cref="DriverAttributeInfo.IsAlarm"/> = true call this during discovery so the
/// concrete address-space builder can materialize a sibling condition node. The returned
/// sink receives lifecycle transitions raised through <see cref="IAlarmSource.OnAlarmEvent"/>
/// — the generic node manager wires the subscription; the concrete builder decides how
/// to surface the state (e.g. OPC UA <c>AlarmConditionState.Activate</c>,
/// <c>Acknowledge</c>, <c>Deactivate</c>).
/// </summary>
IAlarmConditionSink MarkAsAlarmCondition(AlarmConditionInfo info);
}
/// <summary>
/// Metadata used to materialize an OPC UA <c>AlarmConditionState</c> sibling for a variable.
/// Populated by the driver's discovery step; concrete builders decide how to surface it.
/// </summary>
/// <param name="SourceName">Human-readable alarm name used for the <c>SourceName</c> event field.</param>
/// <param name="InitialSeverity">Severity at address-space build time; updates arrive via <see cref="IAlarmConditionSink"/>.</param>
/// <param name="InitialDescription">Initial description; updates arrive via <see cref="IAlarmConditionSink"/>.</param>
public sealed record AlarmConditionInfo(
string SourceName,
AlarmSeverity InitialSeverity,
string? InitialDescription);
/// <summary>
/// Sink a concrete address-space builder returns from <see cref="IVariableHandle.MarkAsAlarmCondition"/>.
/// The generic node manager routes per-alarm <see cref="IAlarmSource.OnAlarmEvent"/> payloads here —
/// the sink translates the transition into an OPC UA condition state change or whatever the
/// concrete builder's backing address space supports.
/// </summary>
public interface IAlarmConditionSink
{
/// <summary>Push an alarm transition (Active / Acknowledged / Inactive) for this condition.</summary>
void OnTransition(AlarmEventArgs args);
}

View File

@@ -30,6 +30,52 @@ public interface IHistoryProvider
TimeSpan interval,
HistoryAggregateType aggregate,
CancellationToken cancellationToken);
/// <summary>
/// Read one sample per requested timestamp — OPC UA HistoryReadAtTime service. The
/// driver interpolates (or returns the prior-boundary sample) when no exact match
/// exists. Optional; drivers that can't interpolate throw <see cref="NotSupportedException"/>.
/// </summary>
/// <remarks>
/// Default implementation throws. Drivers opt in by overriding; keeps existing
/// <c>IHistoryProvider</c> implementations compiling without forcing a ReadAtTime path
/// they may not have a backend for.
/// </remarks>
Task<HistoryReadResult> ReadAtTimeAsync(
string fullReference,
IReadOnlyList<DateTime> timestampsUtc,
CancellationToken cancellationToken)
=> throw new NotSupportedException(
$"{GetType().Name} does not implement ReadAtTimeAsync. " +
"Drivers whose backends support at-time reads override this method.");
/// <summary>
/// Read historical alarm/event records — OPC UA HistoryReadEvents service. Distinct
/// from the live event stream — historical rows come from an event historian (Galaxy's
/// Alarm Provider history log, etc.) rather than the driver's active subscription.
/// </summary>
/// <param name="sourceName">
/// Optional filter: null means "all sources", otherwise restrict to events from that
/// source-object name. Drivers may ignore the filter if the backend doesn't support it.
/// </param>
/// <param name="startUtc">Inclusive lower bound on <c>EventTimeUtc</c>.</param>
/// <param name="endUtc">Exclusive upper bound on <c>EventTimeUtc</c>.</param>
/// <param name="maxEvents">Upper cap on returned events — the driver's backend enforces this.</param>
/// <param name="cancellationToken">Request cancellation.</param>
/// <remarks>
/// Default implementation throws. Only drivers with an event historian (Galaxy via the
/// Wonderware Alarm &amp; Events log) override. Modbus / the OPC UA Client driver stay
/// with the default and let callers see <c>BadHistoryOperationUnsupported</c>.
/// </remarks>
Task<HistoricalEventsResult> ReadEventsAsync(
string? sourceName,
DateTime startUtc,
DateTime endUtc,
int maxEvents,
CancellationToken cancellationToken)
=> throw new NotSupportedException(
$"{GetType().Name} does not implement ReadEventsAsync. " +
"Drivers whose backends have an event historian override this method.");
}
/// <summary>Result of a HistoryRead call.</summary>
@@ -48,3 +94,29 @@ public enum HistoryAggregateType
Total,
Count,
}
/// <summary>
/// One row returned by <see cref="IHistoryProvider.ReadEventsAsync"/> — a historical
/// alarm/event record, not the OPC UA live-event stream. Fields match the minimum set the
/// Server needs to populate a <c>HistoryEventFieldList</c> for HistoryReadEvents responses.
/// </summary>
/// <param name="EventId">Stable unique id for the event — driver-specific format.</param>
/// <param name="SourceName">Source object that emitted the event. May differ from the <c>sourceName</c> filter the caller passed (fuzzy matches).</param>
/// <param name="EventTimeUtc">Process-side timestamp — when the event actually occurred.</param>
/// <param name="ReceivedTimeUtc">Historian-side timestamp — when the historian persisted the row; may lag <paramref name="EventTimeUtc"/> by the historian's buffer flush cadence.</param>
/// <param name="Message">Human-readable message text.</param>
/// <param name="Severity">OPC UA severity (1-1000). Drivers map their native priority scale onto this range.</param>
public sealed record HistoricalEvent(
string EventId,
string? SourceName,
DateTime EventTimeUtc,
DateTime ReceivedTimeUtc,
string? Message,
ushort Severity);
/// <summary>Result of a <see cref="IHistoryProvider.ReadEventsAsync"/> call.</summary>
/// <param name="Events">Events in chronological order by <c>EventTimeUtc</c>.</param>
/// <param name="ContinuationPoint">Opaque token for the next call when more events are available; null when complete.</param>
public sealed record HistoricalEventsResult(
IReadOnlyList<HistoricalEvent> Events,
byte[]? ContinuationPoint);

View File

@@ -24,6 +24,17 @@ public sealed class DriverHost : IAsyncDisposable
return _drivers.TryGetValue(driverInstanceId, out var d) ? d.GetHealth() : null;
}
/// <summary>
/// Look up a registered driver by instance id. Used by the OPC UA server runtime
/// (<c>OtOpcUaServer</c>) to instantiate one <c>DriverNodeManager</c> per driver at
/// startup. Returns null when the driver is not registered.
/// </summary>
public IDriver? GetDriver(string driverInstanceId)
{
lock (_lock)
return _drivers.TryGetValue(driverInstanceId, out var d) ? d : null;
}
/// <summary>
/// Registers the driver and calls <see cref="IDriver.InitializeAsync"/>. If initialization
/// throws, the driver is kept in the registry so the operator can retry; quality on its

View File

@@ -1,27 +1,41 @@
using System.Collections.Concurrent;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Core.OpcUa;
/// <summary>
/// Generic, driver-agnostic backbone for populating the OPC UA address space from an
/// <see cref="IDriver"/>. The Galaxy-specific subclass (<c>GalaxyNodeManager</c>) is deferred
/// to Phase 2 per decision #62 — this class is the foundation that Phase 2 ports the v1
/// <c>LmxNodeManager</c> logic into.
/// <see cref="IDriver"/>. Walks the driver's discovery, wires the alarm + data-change +
/// rediscovery subscription events, and hands each variable to the supplied
/// <see cref="IAddressSpaceBuilder"/>. Concrete OPC UA server implementations provide the
/// builder — see the Server project's <c>OpcUaAddressSpaceBuilder</c> for the materialization
/// against <c>CustomNodeManager2</c>.
/// </summary>
/// <remarks>
/// Phase 1 status: scaffold only. The v1 <c>LmxNodeManager</c> in the legacy Host is unchanged
/// so IntegrationTests continue to pass. Phase 2 will lift-and-shift its logic here, swapping
/// <c>IMxAccessClient</c> for <see cref="IDriver"/> and <c>GalaxyAttributeInfo</c> for
/// <see cref="DriverAttributeInfo"/>.
/// Per <c>docs/v2/plan.md</c> decision #52 + #62 — Core owns the node tree, drivers stream
/// <c>Folder</c>/<c>Variable</c> calls, alarm-bearing variables are annotated via
/// <see cref="IVariableHandle.MarkAsAlarmCondition"/> and subsequent
/// <see cref="IAlarmSource.OnAlarmEvent"/> payloads route to the sink the builder returned.
/// </remarks>
public abstract class GenericDriverNodeManager(IDriver driver)
public class GenericDriverNodeManager(IDriver driver) : IDisposable
{
protected IDriver Driver { get; } = driver ?? throw new ArgumentNullException(nameof(driver));
public string DriverInstanceId => Driver.DriverInstanceId;
// Source tag (DriverAttributeInfo.FullName) → alarm-condition sink. Populated during
// BuildAddressSpaceAsync by a recording IAddressSpaceBuilder implementation that captures the
// IVariableHandle per attr.IsAlarm=true variable and calls MarkAsAlarmCondition.
private readonly ConcurrentDictionary<string, IAlarmConditionSink> _alarmSinks =
new(StringComparer.OrdinalIgnoreCase);
private EventHandler<AlarmEventArgs>? _alarmForwarder;
private bool _disposed;
/// <summary>
/// Populates the address space by streaming nodes from the driver into the supplied builder.
/// Populates the address space by streaming nodes from the driver into the supplied builder,
/// wraps the builder so alarm-condition sinks are captured, subscribes to the driver's
/// alarm event stream, and routes each transition to the matching sink by <c>SourceNodeId</c>.
/// Driver exceptions are isolated per decision #12 — the driver's subtree is marked Faulted,
/// but other drivers remain available.
/// </summary>
@@ -32,6 +46,73 @@ public abstract class GenericDriverNodeManager(IDriver driver)
if (Driver is not ITagDiscovery discovery)
throw new NotSupportedException($"Driver '{Driver.DriverInstanceId}' does not implement ITagDiscovery.");
await discovery.DiscoverAsync(builder, ct);
var capturing = new CapturingBuilder(builder, _alarmSinks);
await discovery.DiscoverAsync(capturing, ct);
if (Driver is IAlarmSource alarmSource)
{
_alarmForwarder = (_, e) =>
{
// Route the alarm to the sink registered for the originating variable, if any.
// Unknown source ids are dropped silently — they may belong to another driver or
// to a variable the builder chose not to flag as an alarm condition.
if (_alarmSinks.TryGetValue(e.SourceNodeId, out var sink))
sink.OnTransition(e);
};
alarmSource.OnAlarmEvent += _alarmForwarder;
}
}
public void Dispose()
{
if (_disposed) return;
_disposed = true;
if (_alarmForwarder is not null && Driver is IAlarmSource alarmSource)
{
alarmSource.OnAlarmEvent -= _alarmForwarder;
}
_alarmSinks.Clear();
}
/// <summary>
/// Snapshot the current alarm-sink registry by source node id. Diagnostic + test hook;
/// not part of the hot path.
/// </summary>
internal IReadOnlyCollection<string> TrackedAlarmSources => _alarmSinks.Keys.ToList();
/// <summary>
/// Wraps the caller-supplied <see cref="IAddressSpaceBuilder"/> so every
/// <see cref="IVariableHandle.MarkAsAlarmCondition"/> call registers the returned sink in
/// the node manager's source-node-id map. The builder itself drives materialization;
/// this wrapper only observes.
/// </summary>
private sealed class CapturingBuilder(
IAddressSpaceBuilder inner,
ConcurrentDictionary<string, IAlarmConditionSink> sinks) : IAddressSpaceBuilder
{
public IAddressSpaceBuilder Folder(string browseName, string displayName)
=> new CapturingBuilder(inner.Folder(browseName, displayName), sinks);
public IVariableHandle Variable(string browseName, string displayName, DriverAttributeInfo attributeInfo)
=> new CapturingHandle(inner.Variable(browseName, displayName, attributeInfo), sinks);
public void AddProperty(string browseName, DriverDataType dataType, object? value)
=> inner.AddProperty(browseName, dataType, value);
}
private sealed class CapturingHandle(
IVariableHandle inner,
ConcurrentDictionary<string, IAlarmConditionSink> sinks) : IVariableHandle
{
public string FullReference => inner.FullReference;
public IAlarmConditionSink MarkAsAlarmCondition(AlarmConditionInfo info)
{
var sink = inner.MarkAsAlarmCondition(info);
// Register by the driver-side full reference so the alarm forwarder can look it up
// using AlarmEventArgs.SourceNodeId (which the driver populates with the same tag).
sinks[inner.FullReference] = sink;
return sink;
}
}
}

View File

@@ -16,6 +16,10 @@
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Configuration\ZB.MOM.WW.OtOpcUa.Configuration.csproj"/>
</ItemGroup>
<ItemGroup>
<InternalsVisibleTo Include="ZB.MOM.WW.OtOpcUa.Core.Tests"/>
</ItemGroup>
<ItemGroup>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>

View File

@@ -0,0 +1,260 @@
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Alarms;
/// <summary>
/// Subscribes to the four Galaxy alarm attributes (<c>.InAlarm</c>, <c>.Priority</c>,
/// <c>.DescAttrName</c>, <c>.Acked</c>) per alarm-bearing attribute discovered during
/// <c>DiscoverAsync</c>. Maintains one <see cref="AlarmState"/> per alarm, raises
/// <see cref="AlarmTransition"/> on lifecycle transitions (Active / Unacknowledged /
/// Acknowledged / Inactive). Ack path writes <c>.AckMsg</c>. Pure-logic state machine
/// with delegate-based subscribe/write so it's testable against in-memory fakes.
/// </summary>
/// <remarks>
/// Transitions emitted (OPC UA Part 9 alarm lifecycle, simplified for the Galaxy model):
/// <list type="bullet">
/// <item><c>Active</c> — InAlarm false → true. Default to Unacknowledged.</item>
/// <item><c>Acknowledged</c> — Acked false → true while InAlarm is still true.</item>
/// <item><c>Inactive</c> — InAlarm true → false. If still unacknowledged the alarm
/// is marked latched-inactive-unack; next Ack transitions straight to Inactive.</item>
/// </list>
/// </remarks>
public sealed class GalaxyAlarmTracker : IDisposable
{
public const string InAlarmAttr = ".InAlarm";
public const string PriorityAttr = ".Priority";
public const string DescAttrNameAttr = ".DescAttrName";
public const string AckedAttr = ".Acked";
public const string AckMsgAttr = ".AckMsg";
private readonly Func<string, Action<string, Vtq>, Task> _subscribe;
private readonly Func<string, Task> _unsubscribe;
private readonly Func<string, object, Task<bool>> _write;
private readonly Func<DateTime> _clock;
// Alarm tag (attribute full ref, e.g. "Tank.Level.HiHi") → state.
private readonly ConcurrentDictionary<string, AlarmState> _alarms =
new(StringComparer.OrdinalIgnoreCase);
// Reverse lookup: probed tag (".InAlarm" etc.) → owning alarm tag.
private readonly ConcurrentDictionary<string, (string AlarmTag, AlarmField Field)> _probeToAlarm =
new(StringComparer.OrdinalIgnoreCase);
private bool _disposed;
public event EventHandler<AlarmTransition>? TransitionRaised;
public GalaxyAlarmTracker(
Func<string, Action<string, Vtq>, Task> subscribe,
Func<string, Task> unsubscribe,
Func<string, object, Task<bool>> write)
: this(subscribe, unsubscribe, write, () => DateTime.UtcNow) { }
internal GalaxyAlarmTracker(
Func<string, Action<string, Vtq>, Task> subscribe,
Func<string, Task> unsubscribe,
Func<string, object, Task<bool>> write,
Func<DateTime> clock)
{
_subscribe = subscribe ?? throw new ArgumentNullException(nameof(subscribe));
_unsubscribe = unsubscribe ?? throw new ArgumentNullException(nameof(unsubscribe));
_write = write ?? throw new ArgumentNullException(nameof(write));
_clock = clock ?? throw new ArgumentNullException(nameof(clock));
}
public int TrackedAlarmCount => _alarms.Count;
/// <summary>
/// Advise the four alarm attributes for <paramref name="alarmTag"/>. Idempotent —
/// repeat calls for the same alarm tag are a no-op. Subscribe failure for any of the
/// four rolls back the alarm entry so a stale callback cannot promote a phantom.
/// </summary>
public async Task TrackAsync(string alarmTag)
{
if (_disposed || string.IsNullOrWhiteSpace(alarmTag)) return;
if (_alarms.ContainsKey(alarmTag)) return;
var state = new AlarmState { AlarmTag = alarmTag };
if (!_alarms.TryAdd(alarmTag, state)) return;
var probes = new[]
{
(Tag: alarmTag + InAlarmAttr, Field: AlarmField.InAlarm),
(Tag: alarmTag + PriorityAttr, Field: AlarmField.Priority),
(Tag: alarmTag + DescAttrNameAttr, Field: AlarmField.DescAttrName),
(Tag: alarmTag + AckedAttr, Field: AlarmField.Acked),
};
foreach (var p in probes)
{
_probeToAlarm[p.Tag] = (alarmTag, p.Field);
}
try
{
foreach (var p in probes)
{
await _subscribe(p.Tag, OnProbeCallback).ConfigureAwait(false);
}
}
catch
{
// Rollback so a partial advise doesn't leak state.
_alarms.TryRemove(alarmTag, out _);
foreach (var p in probes)
{
_probeToAlarm.TryRemove(p.Tag, out _);
try { await _unsubscribe(p.Tag).ConfigureAwait(false); } catch { }
}
throw;
}
}
/// <summary>
/// Drop every tracked alarm. Unadvises all 4 probes per alarm as best-effort.
/// </summary>
public async Task ClearAsync()
{
_alarms.Clear();
foreach (var kv in _probeToAlarm.ToList())
{
_probeToAlarm.TryRemove(kv.Key, out _);
try { await _unsubscribe(kv.Key).ConfigureAwait(false); } catch { }
}
}
/// <summary>
/// Operator ack — write the comment text into <c>&lt;alarmTag&gt;.AckMsg</c>.
/// Returns false when the runtime reports the write failed.
/// </summary>
public Task<bool> AcknowledgeAsync(string alarmTag, string comment)
{
if (_disposed || string.IsNullOrWhiteSpace(alarmTag))
return Task.FromResult(false);
return _write(alarmTag + AckMsgAttr, comment ?? string.Empty);
}
/// <summary>
/// Subscription callback entry point. Exposed for tests and for the Backend to route
/// fan-out callbacks through. Runs the state machine and fires TransitionRaised
/// outside the lock.
/// </summary>
public void OnProbeCallback(string probeTag, Vtq vtq)
{
if (_disposed) return;
if (!_probeToAlarm.TryGetValue(probeTag, out var link)) return;
if (!_alarms.TryGetValue(link.AlarmTag, out var state)) return;
AlarmTransition? transition = null;
var now = _clock();
lock (state.Lock)
{
switch (link.Field)
{
case AlarmField.InAlarm:
{
var wasActive = state.InAlarm;
var isActive = vtq.Value is bool b && b;
state.InAlarm = isActive;
state.LastUpdateUtc = now;
if (!wasActive && isActive)
{
state.Acked = false;
state.LastTransitionUtc = now;
transition = new AlarmTransition(state.AlarmTag, AlarmStateTransition.Active, state.Priority, state.DescAttrName, now);
}
else if (wasActive && !isActive)
{
state.LastTransitionUtc = now;
transition = new AlarmTransition(state.AlarmTag, AlarmStateTransition.Inactive, state.Priority, state.DescAttrName, now);
}
break;
}
case AlarmField.Priority:
if (vtq.Value is int pi) state.Priority = pi;
else if (vtq.Value is short ps) state.Priority = ps;
else if (vtq.Value is long pl && pl <= int.MaxValue) state.Priority = (int)pl;
state.LastUpdateUtc = now;
break;
case AlarmField.DescAttrName:
state.DescAttrName = vtq.Value as string;
state.LastUpdateUtc = now;
break;
case AlarmField.Acked:
{
var wasAcked = state.Acked;
var isAcked = vtq.Value is bool b && b;
state.Acked = isAcked;
state.LastUpdateUtc = now;
// Fire Acknowledged only when transitioning false→true. Don't fire on initial
// subscribe callback (wasAcked==isAcked in that case because the state starts
// with Acked=false and the initial probe is usually true for an un-active alarm).
if (!wasAcked && isAcked && state.InAlarm)
{
state.LastTransitionUtc = now;
transition = new AlarmTransition(state.AlarmTag, AlarmStateTransition.Acknowledged, state.Priority, state.DescAttrName, now);
}
break;
}
}
}
if (transition is { } t)
{
TransitionRaised?.Invoke(this, t);
}
}
public IReadOnlyList<AlarmSnapshot> SnapshotStates()
{
return _alarms.Values.Select(s =>
{
lock (s.Lock)
return new AlarmSnapshot(s.AlarmTag, s.InAlarm, s.Acked, s.Priority, s.DescAttrName);
}).ToList();
}
public void Dispose()
{
if (_disposed) return;
_disposed = true;
_alarms.Clear();
_probeToAlarm.Clear();
}
private sealed class AlarmState
{
public readonly object Lock = new();
public string AlarmTag = "";
public bool InAlarm;
public bool Acked = true; // default ack'd so first false→true on subscribe doesn't misfire
public int Priority;
public string? DescAttrName;
public DateTime LastUpdateUtc;
public DateTime LastTransitionUtc;
}
private enum AlarmField { InAlarm, Priority, DescAttrName, Acked }
}
public enum AlarmStateTransition { Active, Acknowledged, Inactive }
public sealed record AlarmTransition(
string AlarmTag,
AlarmStateTransition Transition,
int Priority,
string? DescAttrName,
DateTime AtUtc);
public sealed record AlarmSnapshot(
string AlarmTag,
bool InAlarm,
bool Acked,
int Priority,
string? DescAttrName);

View File

@@ -127,6 +127,33 @@ public sealed class DbBackedGalaxyBackend(GalaxyRepository repository) : IGalaxy
Tags = System.Array.Empty<HistoryTagValues>(),
});
public Task<HistoryReadProcessedResponse> HistoryReadProcessedAsync(
HistoryReadProcessedRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadProcessedResponse
{
Success = false,
Error = "MXAccess + Historian code lift pending (Phase 2 Task B.1)",
Values = System.Array.Empty<GalaxyDataValue>(),
});
public Task<HistoryReadAtTimeResponse> HistoryReadAtTimeAsync(
HistoryReadAtTimeRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadAtTimeResponse
{
Success = false,
Error = "MXAccess + Historian code lift pending (Phase 2 Task B.1)",
Values = System.Array.Empty<GalaxyDataValue>(),
});
public Task<HistoryReadEventsResponse> HistoryReadEventsAsync(
HistoryReadEventsRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadEventsResponse
{
Success = false,
Error = "MXAccess + Historian code lift pending (Phase 2 Task B.1)",
Events = System.Array.Empty<GalaxyHistoricalEvent>(),
});
public Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct)
=> Task.FromResult(new RecycleStatusResponse { Accepted = true, GraceSeconds = 15 });
@@ -138,6 +165,7 @@ public sealed class DbBackedGalaxyBackend(GalaxyRepository repository) : IGalaxy
ArrayDim = row.ArrayDimension is int d and > 0 ? (uint)d : null,
SecurityClassification = row.SecurityClassification,
IsHistorized = row.IsHistorized,
IsAlarm = row.IsAlarm,
};
/// <summary>

View File

@@ -0,0 +1,46 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
/// <summary>
/// Maps a raw OPC DA quality byte (as returned by Wonderware Historian's <c>OpcQuality</c>)
/// to an OPC UA <c>StatusCode</c> uint. Preserves specific codes (BadNotConnected,
/// UncertainSubNormal, etc.) instead of collapsing to Good/Uncertain/Bad categories.
/// Mirrors v1 <c>QualityMapper.MapToOpcUaStatusCode</c> without pulling in OPC UA types —
/// the returned value is the 32-bit OPC UA <c>StatusCode</c> wire encoding that the Proxy
/// surfaces directly as <c>DataValueSnapshot.StatusCode</c>.
/// </summary>
public static class HistorianQualityMapper
{
/// <summary>
/// Map an 8-bit OPC DA quality byte to the corresponding OPC UA StatusCode. The byte
/// family bits decide the category (Good &gt;= 192, Uncertain 64-191, Bad 0-63); the
/// low-nibble subcode selects the specific code.
/// </summary>
public static uint Map(byte q) => q switch
{
// Good family (192+)
192 => 0x00000000u, // Good
216 => 0x00D80000u, // Good_LocalOverride
// Uncertain family (64-191)
64 => 0x40000000u, // Uncertain
68 => 0x40900000u, // Uncertain_LastUsableValue
80 => 0x40930000u, // Uncertain_SensorNotAccurate
84 => 0x40940000u, // Uncertain_EngineeringUnitsExceeded
88 => 0x40950000u, // Uncertain_SubNormal
// Bad family (0-63)
0 => 0x80000000u, // Bad
4 => 0x80890000u, // Bad_ConfigurationError
8 => 0x808A0000u, // Bad_NotConnected
12 => 0x808B0000u, // Bad_DeviceFailure
16 => 0x808C0000u, // Bad_SensorFailure
20 => 0x80050000u, // Bad_CommunicationError
24 => 0x808D0000u, // Bad_OutOfService
32 => 0x80320000u, // Bad_WaitingForInitialData
// Unknown code — fall back to the category so callers still get a sensible bucket.
_ when q >= 192 => 0x00000000u,
_ when q >= 64 => 0x40000000u,
_ => 0x80000000u,
};
}

View File

@@ -38,6 +38,9 @@ public interface IGalaxyBackend
Task AcknowledgeAlarmAsync(AlarmAckRequest req, CancellationToken ct);
Task<HistoryReadResponse> HistoryReadAsync(HistoryReadRequest req, CancellationToken ct);
Task<HistoryReadProcessedResponse> HistoryReadProcessedAsync(HistoryReadProcessedRequest req, CancellationToken ct);
Task<HistoryReadAtTimeResponse> HistoryReadAtTimeAsync(HistoryReadAtTimeRequest req, CancellationToken ct);
Task<HistoryReadEventsResponse> HistoryReadEventsAsync(HistoryReadEventsRequest req, CancellationToken ct);
Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct);
}

View File

@@ -4,6 +4,7 @@ using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using ArchestrA.MxAccess;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
@@ -18,6 +19,8 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
/// </summary>
public sealed class MxAccessClient : IDisposable
{
private static readonly ILogger Log = Serilog.Log.ForContext<MxAccessClient>();
private readonly StaPump _pump;
private readonly IMxProxy _proxy;
private readonly string _clientName;
@@ -40,6 +43,16 @@ public sealed class MxAccessClient : IDisposable
/// <summary>Fires whenever the connection transitions Connected ↔ Disconnected.</summary>
public event EventHandler<bool>? ConnectionStateChanged;
/// <summary>
/// Fires once per failed subscription replay after a reconnect. Carries the tag reference
/// and the exception so the backend can propagate the degradation signal (e.g. mark the
/// subscription bad on the Proxy side rather than silently losing its callback). Added for
/// PR 6 low finding #2 — the replay loop previously ate per-tag failures silently and an
/// operator would only find out that a specific subscription stopped updating through a
/// data-quality complaint from downstream.
/// </summary>
public event EventHandler<SubscriptionReplayFailedEventArgs>? SubscriptionReplayFailed;
public MxAccessClient(StaPump pump, IMxProxy proxy, string clientName, MxAccessClientOptions? options = null)
{
_pump = pump;
@@ -124,16 +137,29 @@ public sealed class MxAccessClient : IDisposable
if (idle <= _options.StaleThreshold) continue;
// Probe: try a no-op COM call. If the proxy is dead, the call will throw — that's
// our reconnect signal.
// our reconnect signal. PR 6 low finding #1: AddItem allocates an MXAccess item
// handle; we must RemoveItem it on the same pump turn or the long-running monitor
// leaks one handle per probe cycle (one every MonitorInterval seconds, indefinitely).
bool probeOk;
try
{
probeOk = await _pump.InvokeAsync(() =>
{
// AddItem on the connection handle is cheap and round-trips through COM.
// We use a sentinel "$Heartbeat" reference; if it fails the connection is gone.
try { _proxy.AddItem(_connectionHandle, "$Heartbeat"); return true; }
int probeHandle = 0;
try
{
probeHandle = _proxy.AddItem(_connectionHandle, "$Heartbeat");
return probeHandle > 0;
}
catch { return false; }
finally
{
if (probeHandle > 0)
{
try { _proxy.RemoveItem(_connectionHandle, probeHandle); }
catch { /* proxy is dying; best-effort cleanup */ }
}
}
});
}
catch { probeOk = false; }
@@ -162,16 +188,33 @@ public sealed class MxAccessClient : IDisposable
_reconnectCount++;
ConnectionStateChanged?.Invoke(this, true);
// Replay every subscription that was active before the disconnect.
// Replay every subscription that was active before the disconnect. PR 6 low
// finding #2: surface per-tag failures — log them and raise
// SubscriptionReplayFailed so the backend can propagate the degraded state
// (previously swallowed silently; downstream quality dropped without a signal).
var snapshot = _addressToHandle.Keys.ToArray();
_addressToHandle.Clear();
_handleToAddress.Clear();
var failed = 0;
foreach (var fullRef in snapshot)
{
try { await SubscribeOnPumpAsync(fullRef); }
catch { /* skip — operator can re-subscribe */ }
catch (Exception subEx)
{
failed++;
Log.Warning(subEx,
"MXAccess subscription replay failed for {TagReference} after reconnect #{Reconnect}",
fullRef, _reconnectCount);
SubscriptionReplayFailed?.Invoke(this,
new SubscriptionReplayFailedEventArgs(fullRef, subEx));
}
}
if (failed > 0)
Log.Warning("Subscription replay completed — {Failed} of {Total} failed", failed, snapshot.Length);
else
Log.Information("Subscription replay completed — {Total} re-subscribed cleanly", snapshot.Length);
_lastObservedActivityUtc = DateTime.UtcNow;
}
catch

View File

@@ -0,0 +1,20 @@
using System;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
/// <summary>
/// Fired by <see cref="MxAccessClient.SubscriptionReplayFailed"/> when a previously-active
/// subscription fails to be restored after a reconnect. The backend should treat the tag as
/// unhealthy until the next successful resubscribe.
/// </summary>
public sealed class SubscriptionReplayFailedEventArgs : EventArgs
{
public SubscriptionReplayFailedEventArgs(string tagReference, Exception exception)
{
TagReference = tagReference;
Exception = exception;
}
public string TagReference { get; }
public Exception Exception { get; }
}

View File

@@ -4,9 +4,11 @@ using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Alarms;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Stability;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
@@ -34,12 +36,18 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend, IDisposable
_refToSubs = new(System.StringComparer.OrdinalIgnoreCase);
public event System.EventHandler<OnDataChangeNotification>? OnDataChange;
#pragma warning disable CS0067 // alarm wire-up deferred to PR 9
public event System.EventHandler<GalaxyAlarmEvent>? OnAlarmEvent;
#pragma warning restore CS0067
public event System.EventHandler<HostConnectivityStatus>? OnHostStatusChanged;
private readonly System.EventHandler<bool> _onConnectionStateChanged;
private readonly GalaxyRuntimeProbeManager _probeManager;
private readonly System.EventHandler<HostStateTransition> _onProbeStateChanged;
private readonly GalaxyAlarmTracker _alarmTracker;
private readonly System.EventHandler<AlarmTransition> _onAlarmTransition;
// Cached during DiscoverAsync so SubscribeAlarmsAsync knows which attributes to advise.
// One entry per IsAlarm=true attribute in the last discovered hierarchy.
private readonly System.Collections.Concurrent.ConcurrentBag<string> _discoveredAlarmTags = new();
public MxAccessGalaxyBackend(GalaxyRepository repository, MxAccessClient mx, IHistorianDataSource? historian = null)
{
@@ -62,8 +70,65 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend, IDisposable
});
};
_mx.ConnectionStateChanged += _onConnectionStateChanged;
// PR 13: per-platform runtime probes. ScanState subscriptions fire OnProbeCallback,
// which runs the state machine and raises StateChanged on transitions we care about.
// We forward each transition through the same OnHostStatusChanged IPC event that the
// gateway-level ConnectionStateChanged uses — tagged with the platform's TagName so the
// Admin UI can show per-host health independently from the top-level transport status.
_probeManager = new GalaxyRuntimeProbeManager(
subscribe: (probe, cb) => _mx.SubscribeAsync(probe, cb),
unsubscribe: probe => _mx.UnsubscribeAsync(probe));
_onProbeStateChanged = (_, t) =>
{
OnHostStatusChanged?.Invoke(this, new HostConnectivityStatus
{
HostName = t.TagName,
RuntimeStatus = t.NewState switch
{
HostRuntimeState.Running => "Running",
HostRuntimeState.Stopped => "Stopped",
_ => "Unknown",
},
LastObservedUtcUnixMs = new DateTimeOffset(t.AtUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
});
};
_probeManager.StateChanged += _onProbeStateChanged;
// PR 14: alarm subsystem. Per IsAlarm=true attribute discovered, subscribe to the four
// alarm-state attributes (.InAlarm/.Priority/.DescAttrName/.Acked), track lifecycle,
// and raise GalaxyAlarmEvent on transitions — forwarded through the existing
// OnAlarmEvent IPC event that the PR 4 ConnectionSink already wires into AlarmEvent frames.
_alarmTracker = new GalaxyAlarmTracker(
subscribe: (tag, cb) => _mx.SubscribeAsync(tag, cb),
unsubscribe: tag => _mx.UnsubscribeAsync(tag),
write: (tag, v) => _mx.WriteAsync(tag, v));
_onAlarmTransition = (_, t) => OnAlarmEvent?.Invoke(this, new GalaxyAlarmEvent
{
EventId = Guid.NewGuid().ToString("N"),
ObjectTagName = t.AlarmTag,
AlarmName = t.AlarmTag,
Severity = t.Priority,
StateTransition = t.Transition switch
{
AlarmStateTransition.Active => "Active",
AlarmStateTransition.Acknowledged => "Acknowledged",
AlarmStateTransition.Inactive => "Inactive",
_ => "Unknown",
},
Message = t.DescAttrName ?? t.AlarmTag,
UtcUnixMs = new DateTimeOffset(t.AtUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
});
_alarmTracker.TransitionRaised += _onAlarmTransition;
}
/// <summary>
/// Exposed for tests. Production flow: DiscoverAsync completes → backend calls
/// <c>SyncProbesAsync</c> with the runtime hosts (WinPlatform + AppEngine gobjects) to
/// advise ScanState per host.
/// </summary>
internal GalaxyRuntimeProbeManager ProbeManager => _probeManager;
public async Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct)
{
try
@@ -103,6 +168,34 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend, IDisposable
Attributes = attrsByGobject.TryGetValue(o.GobjectId, out var a) ? a : Array.Empty<GalaxyAttributeInfo>(),
}).ToArray();
// PR 14: cache alarm-bearing attribute full refs so SubscribeAlarmsAsync can advise
// them on demand. Format matches the Galaxy reference grammar <tag>.<attr>.
var freshAlarmTags = attributes
.Where(a => a.IsAlarm)
.Select(a => nameByGobject.TryGetValue(a.GobjectId, out var tn)
? tn + "." + a.AttributeName
: null)
.Where(s => !string.IsNullOrWhiteSpace(s))
.Cast<string>()
.ToArray();
while (_discoveredAlarmTags.TryTake(out _)) { }
foreach (var t in freshAlarmTags) _discoveredAlarmTags.Add(t);
// PR 13: Sync the per-platform probe manager against the just-discovered hierarchy
// so ScanState subscriptions track the current runtime set. Best-effort — probe
// failures don't block Discover from returning, since the gateway-level signal from
// MxAccessClient.ConnectionStateChanged still flows and the Admin UI degrades to
// that level if any per-host probe couldn't advise.
try
{
var targets = hierarchy
.Where(o => o.CategoryId == GalaxyRuntimeProbeManager.CategoryWinPlatform
|| o.CategoryId == GalaxyRuntimeProbeManager.CategoryAppEngine)
.Select(o => new HostProbeTarget(o.TagName, o.CategoryId));
await _probeManager.SyncAsync(targets).ConfigureAwait(false);
}
catch { /* swallow — Discover succeeded; probes are a diagnostic enrichment */ }
return new DiscoverHierarchyResponse { Success = true, Objects = objects };
}
catch (Exception ex)
@@ -240,8 +333,40 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend, IDisposable
}
}
public Task SubscribeAlarmsAsync(AlarmSubscribeRequest req, CancellationToken ct) => Task.CompletedTask;
public Task AcknowledgeAlarmAsync(AlarmAckRequest req, CancellationToken ct) => Task.CompletedTask;
/// <summary>
/// PR 14: advise every alarm-bearing attribute's 4-attr quartet. Best-effort per-alarm —
/// a subscribe failure on one alarm doesn't abort the whole call, since operators prefer
/// partial alarm coverage to none. Idempotent on repeat calls (tracker internally
/// skips already-tracked alarms).
/// </summary>
public async Task SubscribeAlarmsAsync(AlarmSubscribeRequest req, CancellationToken ct)
{
foreach (var tag in _discoveredAlarmTags)
{
try { await _alarmTracker.TrackAsync(tag).ConfigureAwait(false); }
catch { /* swallow per-alarm — tracker rolls back its own state on failure */ }
}
}
/// <summary>
/// PR 14: route operator ack through the tracker's AckMsg write path. EventId on the
/// incoming request maps directly to the alarm full reference (Proxy-side naming
/// convention from GalaxyProxyDriver.RaiseAlarmEvent → ev.EventId).
/// </summary>
public async Task AcknowledgeAlarmAsync(AlarmAckRequest req, CancellationToken ct)
{
// EventId carries a per-transition Guid.ToString("N"); there's no reverse map from
// event id to alarm tag yet, so v1's convention (ack targets the condition) is matched
// by reading the alarm name from the Comment envelope: v1 packed "<tag>|<comment>".
// Until the Proxy is updated to send the alarm tag separately, fall back to treating
// the EventId as the alarm tag — Client CLI passes it through unchanged.
var tag = req.EventId;
if (!string.IsNullOrWhiteSpace(tag))
{
try { await _alarmTracker.AcknowledgeAsync(tag, req.Comment ?? string.Empty).ConfigureAwait(false); }
catch { /* swallow — ack failures surface via MxAccessClient.WriteAsync logs */ }
}
}
public async Task<HistoryReadResponse> HistoryReadAsync(HistoryReadRequest req, CancellationToken ct)
{
@@ -282,11 +407,133 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend, IDisposable
}
}
public async Task<HistoryReadProcessedResponse> HistoryReadProcessedAsync(
HistoryReadProcessedRequest req, CancellationToken ct)
{
if (_historian is null)
return new HistoryReadProcessedResponse
{
Success = false,
Error = "Historian disabled — no OTOPCUA_HISTORIAN_ENABLED configuration",
Values = Array.Empty<GalaxyDataValue>(),
};
if (req.IntervalMs <= 0)
return new HistoryReadProcessedResponse
{
Success = false,
Error = "HistoryReadProcessed requires IntervalMs > 0",
Values = Array.Empty<GalaxyDataValue>(),
};
var start = DateTimeOffset.FromUnixTimeMilliseconds(req.StartUtcUnixMs).UtcDateTime;
var end = DateTimeOffset.FromUnixTimeMilliseconds(req.EndUtcUnixMs).UtcDateTime;
try
{
var samples = await _historian.ReadAggregateAsync(
req.TagReference, start, end, req.IntervalMs, req.AggregateColumn, ct).ConfigureAwait(false);
var wire = samples.Select(s => ToWire(req.TagReference, s)).ToArray();
return new HistoryReadProcessedResponse { Success = true, Values = wire };
}
catch (OperationCanceledException) { throw; }
catch (Exception ex)
{
return new HistoryReadProcessedResponse
{
Success = false,
Error = $"Historian aggregate read failed: {ex.Message}",
Values = Array.Empty<GalaxyDataValue>(),
};
}
}
public async Task<HistoryReadAtTimeResponse> HistoryReadAtTimeAsync(
HistoryReadAtTimeRequest req, CancellationToken ct)
{
if (_historian is null)
return new HistoryReadAtTimeResponse
{
Success = false,
Error = "Historian disabled — no OTOPCUA_HISTORIAN_ENABLED configuration",
Values = Array.Empty<GalaxyDataValue>(),
};
if (req.TimestampsUtcUnixMs.Length == 0)
return new HistoryReadAtTimeResponse { Success = true, Values = Array.Empty<GalaxyDataValue>() };
var timestamps = req.TimestampsUtcUnixMs
.Select(ms => DateTimeOffset.FromUnixTimeMilliseconds(ms).UtcDateTime)
.ToArray();
try
{
var samples = await _historian.ReadAtTimeAsync(req.TagReference, timestamps, ct).ConfigureAwait(false);
var wire = samples.Select(s => ToWire(req.TagReference, s)).ToArray();
return new HistoryReadAtTimeResponse { Success = true, Values = wire };
}
catch (OperationCanceledException) { throw; }
catch (Exception ex)
{
return new HistoryReadAtTimeResponse
{
Success = false,
Error = $"Historian at-time read failed: {ex.Message}",
Values = Array.Empty<GalaxyDataValue>(),
};
}
}
public async Task<HistoryReadEventsResponse> HistoryReadEventsAsync(
HistoryReadEventsRequest req, CancellationToken ct)
{
if (_historian is null)
return new HistoryReadEventsResponse
{
Success = false,
Error = "Historian disabled — no OTOPCUA_HISTORIAN_ENABLED configuration",
Events = Array.Empty<GalaxyHistoricalEvent>(),
};
var start = DateTimeOffset.FromUnixTimeMilliseconds(req.StartUtcUnixMs).UtcDateTime;
var end = DateTimeOffset.FromUnixTimeMilliseconds(req.EndUtcUnixMs).UtcDateTime;
try
{
var events = await _historian.ReadEventsAsync(req.SourceName, start, end, req.MaxEvents, ct).ConfigureAwait(false);
var wire = events.Select(e => new GalaxyHistoricalEvent
{
EventId = e.Id.ToString(),
SourceName = e.Source,
EventTimeUtcUnixMs = new DateTimeOffset(DateTime.SpecifyKind(e.EventTime, DateTimeKind.Utc), TimeSpan.Zero).ToUnixTimeMilliseconds(),
ReceivedTimeUtcUnixMs = new DateTimeOffset(DateTime.SpecifyKind(e.ReceivedTime, DateTimeKind.Utc), TimeSpan.Zero).ToUnixTimeMilliseconds(),
DisplayText = e.DisplayText,
Severity = e.Severity,
}).ToArray();
return new HistoryReadEventsResponse { Success = true, Events = wire };
}
catch (OperationCanceledException) { throw; }
catch (Exception ex)
{
return new HistoryReadEventsResponse
{
Success = false,
Error = $"Historian event read failed: {ex.Message}",
Events = Array.Empty<GalaxyHistoricalEvent>(),
};
}
}
public Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct)
=> Task.FromResult(new RecycleStatusResponse { Accepted = true, GraceSeconds = 15 });
public void Dispose()
{
_alarmTracker.TransitionRaised -= _onAlarmTransition;
_alarmTracker.Dispose();
_probeManager.StateChanged -= _onProbeStateChanged;
_probeManager.Dispose();
_mx.ConnectionStateChanged -= _onConnectionStateChanged;
_historian?.Dispose();
}
@@ -313,19 +560,26 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend, IDisposable
TagReference = reference,
ValueBytes = sample.Value is null ? null : MessagePackSerializer.Serialize(sample.Value),
ValueMessagePackType = 0,
StatusCode = MapHistorianQualityToOpcUa(sample.Quality),
StatusCode = HistorianQualityMapper.Map(sample.Quality),
SourceTimestampUtcUnixMs = new DateTimeOffset(sample.TimestampUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
};
private static uint MapHistorianQualityToOpcUa(byte q)
/// <summary>
/// Maps a <see cref="HistorianAggregateSample"/> (one aggregate bucket) to the IPC wire
/// shape. A null <see cref="HistorianAggregateSample.Value"/> means the aggregate was
/// unavailable for the bucket — the Proxy translates that to OPC UA <c>BadNoData</c>.
/// </summary>
private static GalaxyDataValue ToWire(string reference, HistorianAggregateSample sample) => new()
{
// Category-only mapping — mirrors QualityMapper.MapToOpcUaStatusCode for the common ranges.
// The Proxy may refine this when it decodes the wire frame.
if (q >= 192) return 0x00000000u; // Good
if (q >= 64) return 0x40000000u; // Uncertain
return 0x80000000u; // Bad
}
TagReference = reference,
ValueBytes = sample.Value is null ? null : MessagePackSerializer.Serialize(sample.Value.Value),
ValueMessagePackType = 0,
StatusCode = sample.Value is null ? 0x800E0000u /* BadNoData */ : 0x00000000u,
SourceTimestampUtcUnixMs = new DateTimeOffset(sample.TimestampUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
};
private static GalaxyAttributeInfo MapAttribute(GalaxyAttributeRow row) => new()
{
@@ -335,6 +589,7 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend, IDisposable
ArrayDim = row.ArrayDimension is int d and > 0 ? (uint)d : null,
SecurityClassification = row.SecurityClassification,
IsHistorized = row.IsHistorized,
IsAlarm = row.IsAlarm,
};
private static string MapCategory(int categoryId) => categoryId switch

View File

@@ -0,0 +1,273 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Stability;
/// <summary>
/// Per-platform + per-AppEngine runtime probe. Subscribes to <c>&lt;TagName&gt;.ScanState</c>
/// for each $WinPlatform and $AppEngine gobject, tracks Unknown → Running → Stopped
/// transitions, and fires <see cref="StateChanged"/> so <see cref="Backend.MxAccessGalaxyBackend"/>
/// can forward per-host events through the existing IPC <c>OnHostStatusChanged</c> event.
/// Pure-logic state machine with an injected clock so it's deterministically testable —
/// port of v1 <c>GalaxyRuntimeProbeManager</c> without the OPC UA node-manager coupling.
/// </summary>
/// <remarks>
/// State machine rules (documented in v1's <c>runtimestatus.md</c> and preserved here):
/// <list type="bullet">
/// <item><c>ScanState</c> is on-change-only — a stably-Running host may go hours without a
/// callback. Running → Stopped is driven by an explicit <c>ScanState=false</c> callback,
/// never by starvation.</item>
/// <item>Unknown → Running is a startup transition and does NOT fire StateChanged (would
/// paint every host as "just recovered" at startup, which is noise).</item>
/// <item>Stopped → Running and Running → Stopped fire StateChanged. Unknown → Stopped
/// fires StateChanged because that's a first-known-bad signal operators need.</item>
/// <item>All public methods are thread-safe. Callbacks fire outside the internal lock to
/// avoid lock inversion with caller-owned state.</item>
/// </list>
/// </remarks>
public sealed class GalaxyRuntimeProbeManager : IDisposable
{
public const int CategoryWinPlatform = 1;
public const int CategoryAppEngine = 3;
public const string ProbeAttribute = ".ScanState";
private readonly Func<DateTime> _clock;
private readonly Func<string, Action<string, Vtq>, Task> _subscribe;
private readonly Func<string, Task> _unsubscribe;
private readonly object _lock = new();
// probe tag → per-host state
private readonly Dictionary<string, HostProbeState> _byProbe = new(StringComparer.OrdinalIgnoreCase);
// tag name → probe tag (for reverse lookup on the desired-set diff)
private readonly Dictionary<string, string> _probeByTagName = new(StringComparer.OrdinalIgnoreCase);
private bool _disposed;
/// <summary>
/// Fires on every state transition that operators should react to. See class remarks
/// for the rules on which transitions fire.
/// </summary>
public event EventHandler<HostStateTransition>? StateChanged;
public GalaxyRuntimeProbeManager(
Func<string, Action<string, Vtq>, Task> subscribe,
Func<string, Task> unsubscribe)
: this(subscribe, unsubscribe, () => DateTime.UtcNow) { }
internal GalaxyRuntimeProbeManager(
Func<string, Action<string, Vtq>, Task> subscribe,
Func<string, Task> unsubscribe,
Func<DateTime> clock)
{
_subscribe = subscribe ?? throw new ArgumentNullException(nameof(subscribe));
_unsubscribe = unsubscribe ?? throw new ArgumentNullException(nameof(unsubscribe));
_clock = clock ?? throw new ArgumentNullException(nameof(clock));
}
/// <summary>Number of probes currently advised. Test/dashboard hook.</summary>
public int ActiveProbeCount
{
get { lock (_lock) return _byProbe.Count; }
}
/// <summary>
/// Snapshot every currently-tracked host's state. One entry per probe.
/// </summary>
public IReadOnlyList<HostProbeSnapshot> SnapshotStates()
{
lock (_lock)
{
return _byProbe.Select(kv => new HostProbeSnapshot(
TagName: kv.Value.TagName,
State: kv.Value.State,
LastChangedUtc: kv.Value.LastStateChangeUtc)).ToList();
}
}
/// <summary>
/// Query the current runtime state for <paramref name="tagName"/>. Returns
/// <see cref="HostRuntimeState.Unknown"/> when the host is not tracked.
/// </summary>
public HostRuntimeState GetState(string tagName)
{
lock (_lock)
{
if (_probeByTagName.TryGetValue(tagName, out var probe)
&& _byProbe.TryGetValue(probe, out var state))
return state.State;
return HostRuntimeState.Unknown;
}
}
/// <summary>
/// Diff the desired host set (filtered $WinPlatform / $AppEngine from the latest Discover)
/// against the currently-tracked set and advise / unadvise as needed. Idempotent:
/// calling twice with the same set does nothing.
/// </summary>
public async Task SyncAsync(IEnumerable<HostProbeTarget> desiredHosts)
{
if (_disposed) return;
var desired = desiredHosts
.Where(h => !string.IsNullOrWhiteSpace(h.TagName))
.ToDictionary(h => h.TagName, StringComparer.OrdinalIgnoreCase);
List<string> toAdvise;
List<string> toUnadvise;
lock (_lock)
{
toAdvise = desired.Keys
.Where(tag => !_probeByTagName.ContainsKey(tag))
.ToList();
toUnadvise = _probeByTagName.Keys
.Where(tag => !desired.ContainsKey(tag))
.Select(tag => _probeByTagName[tag])
.ToList();
foreach (var tag in toAdvise)
{
var probe = tag + ProbeAttribute;
_probeByTagName[tag] = probe;
_byProbe[probe] = new HostProbeState
{
TagName = tag,
State = HostRuntimeState.Unknown,
LastStateChangeUtc = _clock(),
};
}
foreach (var probe in toUnadvise)
{
_byProbe.Remove(probe);
}
foreach (var removedTag in _probeByTagName.Keys.Where(t => !desired.ContainsKey(t)).ToList())
{
_probeByTagName.Remove(removedTag);
}
}
foreach (var tag in toAdvise)
{
var probe = tag + ProbeAttribute;
try
{
await _subscribe(probe, OnProbeCallback);
}
catch
{
// Rollback on subscribe failure so a later Tick can't transition a never-advised
// probe into a false Stopped state. Callers can re-Sync later to retry.
lock (_lock)
{
_byProbe.Remove(probe);
_probeByTagName.Remove(tag);
}
}
}
foreach (var probe in toUnadvise)
{
try { await _unsubscribe(probe); } catch { /* best-effort cleanup */ }
}
}
/// <summary>
/// Public entry point for tests and internal callbacks. Production flow: MxAccessClient's
/// SubscribeAsync delivers VTQ updates through the callback wired in <see cref="SyncAsync"/>,
/// which calls this method under the lock to update state and fires
/// <see cref="StateChanged"/> outside the lock for any transition that matters.
/// </summary>
public void OnProbeCallback(string probeTag, Vtq vtq)
{
if (_disposed) return;
HostStateTransition? transition = null;
lock (_lock)
{
if (!_byProbe.TryGetValue(probeTag, out var state)) return;
var isRunning = vtq.Quality >= 192 && vtq.Value is bool b && b;
var now = _clock();
var previous = state.State;
state.LastCallbackUtc = now;
if (isRunning)
{
state.GoodUpdateCount++;
if (previous != HostRuntimeState.Running)
{
state.State = HostRuntimeState.Running;
state.LastStateChangeUtc = now;
if (previous == HostRuntimeState.Stopped)
{
transition = new HostStateTransition(state.TagName, previous, HostRuntimeState.Running, now);
}
}
}
else
{
state.FailureCount++;
if (previous != HostRuntimeState.Stopped)
{
state.State = HostRuntimeState.Stopped;
state.LastStateChangeUtc = now;
transition = new HostStateTransition(state.TagName, previous, HostRuntimeState.Stopped, now);
}
}
}
if (transition is { } t)
{
StateChanged?.Invoke(this, t);
}
}
public void Dispose()
{
if (_disposed) return;
_disposed = true;
lock (_lock)
{
_byProbe.Clear();
_probeByTagName.Clear();
}
}
private sealed class HostProbeState
{
public string TagName { get; set; } = "";
public HostRuntimeState State { get; set; }
public DateTime LastStateChangeUtc { get; set; }
public DateTime? LastCallbackUtc { get; set; }
public long GoodUpdateCount { get; set; }
public long FailureCount { get; set; }
}
}
public enum HostRuntimeState
{
Unknown,
Running,
Stopped,
}
public sealed record HostStateTransition(
string TagName,
HostRuntimeState OldState,
HostRuntimeState NewState,
DateTime AtUtc);
public sealed record HostProbeSnapshot(
string TagName,
HostRuntimeState State,
DateTime LastChangedUtc);
public readonly record struct HostProbeTarget(string TagName, int CategoryId)
{
public bool IsRuntimeHost =>
CategoryId == GalaxyRuntimeProbeManager.CategoryWinPlatform
|| CategoryId == GalaxyRuntimeProbeManager.CategoryAppEngine;
}

View File

@@ -85,6 +85,33 @@ public sealed class StubGalaxyBackend : IGalaxyBackend
Tags = System.Array.Empty<HistoryTagValues>(),
});
public Task<HistoryReadProcessedResponse> HistoryReadProcessedAsync(
HistoryReadProcessedRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadProcessedResponse
{
Success = false,
Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
Values = System.Array.Empty<GalaxyDataValue>(),
});
public Task<HistoryReadAtTimeResponse> HistoryReadAtTimeAsync(
HistoryReadAtTimeRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadAtTimeResponse
{
Success = false,
Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
Values = System.Array.Empty<GalaxyDataValue>(),
});
public Task<HistoryReadEventsResponse> HistoryReadEventsAsync(
HistoryReadEventsRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadEventsResponse
{
Success = false,
Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
Events = System.Array.Empty<GalaxyHistoricalEvent>(),
});
public Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct)
=> Task.FromResult(new RecycleStatusResponse
{

View File

@@ -80,6 +80,27 @@ public sealed class GalaxyFrameHandler(IGalaxyBackend backend, ILogger logger) :
await writer.WriteAsync(MessageKind.HistoryReadResponse, resp, ct);
return;
}
case MessageKind.HistoryReadProcessedRequest:
{
var resp = await backend.HistoryReadProcessedAsync(
Deserialize<HistoryReadProcessedRequest>(body), ct);
await writer.WriteAsync(MessageKind.HistoryReadProcessedResponse, resp, ct);
return;
}
case MessageKind.HistoryReadAtTimeRequest:
{
var resp = await backend.HistoryReadAtTimeAsync(
Deserialize<HistoryReadAtTimeRequest>(body), ct);
await writer.WriteAsync(MessageKind.HistoryReadAtTimeResponse, resp, ct);
return;
}
case MessageKind.HistoryReadEventsRequest:
{
var resp = await backend.HistoryReadEventsAsync(
Deserialize<HistoryReadEventsRequest>(body), ct);
await writer.WriteAsync(MessageKind.HistoryReadEventsResponse, resp, ct);
return;
}
case MessageKind.RecycleHostRequest:
{
var resp = await backend.RecycleAsync(Deserialize<RecycleHostRequest>(body), ct);

View File

@@ -114,16 +114,31 @@ public sealed class GalaxyProxyDriver(GalaxyProxyOptions options)
var folder = builder.Folder(obj.ContainedName, obj.ContainedName);
foreach (var attr in obj.Attributes)
{
folder.Variable(
var fullName = $"{obj.TagName}.{attr.AttributeName}";
var handle = folder.Variable(
attr.AttributeName,
attr.AttributeName,
new DriverAttributeInfo(
FullName: $"{obj.TagName}.{attr.AttributeName}",
FullName: fullName,
DriverDataType: MapDataType(attr.MxDataType),
IsArray: attr.IsArray,
ArrayDim: attr.ArrayDim,
SecurityClass: MapSecurity(attr.SecurityClassification),
IsHistorized: attr.IsHistorized));
IsHistorized: attr.IsHistorized,
IsAlarm: attr.IsAlarm));
// PR 15: when Galaxy flags the attribute as alarm-bearing (AlarmExtension
// primitive), register an alarm-condition sink so the generic node manager
// can route OnAlarmEvent payloads for this tag to the concrete address-space
// builder. Severity default Medium — the live severity arrives through
// AlarmEventArgs once MxAccessGalaxyBackend's tracker starts firing.
if (attr.IsAlarm)
{
handle.MarkAsAlarmCondition(new AlarmConditionInfo(
SourceName: fullName,
InitialSeverity: AlarmSeverity.Medium,
InitialDescription: null));
}
}
}
}
@@ -296,10 +311,108 @@ public sealed class GalaxyProxyDriver(GalaxyProxyOptions options)
return new HistoryReadResult(samples, ContinuationPoint: null);
}
public Task<HistoryReadResult> ReadProcessedAsync(
public async Task<HistoryReadResult> ReadProcessedAsync(
string fullReference, DateTime startUtc, DateTime endUtc, TimeSpan interval,
HistoryAggregateType aggregate, CancellationToken cancellationToken)
=> throw new NotSupportedException("Galaxy historian processed reads are not supported in v2; use ReadRawAsync.");
{
var client = RequireClient();
var column = MapAggregateToColumn(aggregate);
var resp = await client.CallAsync<HistoryReadProcessedRequest, HistoryReadProcessedResponse>(
MessageKind.HistoryReadProcessedRequest,
new HistoryReadProcessedRequest
{
SessionId = _sessionId,
TagReference = fullReference,
StartUtcUnixMs = new DateTimeOffset(startUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
EndUtcUnixMs = new DateTimeOffset(endUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
IntervalMs = (long)interval.TotalMilliseconds,
AggregateColumn = column,
},
MessageKind.HistoryReadProcessedResponse,
cancellationToken);
if (!resp.Success)
throw new InvalidOperationException($"Galaxy.Host HistoryReadProcessed failed: {resp.Error}");
IReadOnlyList<DataValueSnapshot> samples = [.. resp.Values.Select(ToSnapshot)];
return new HistoryReadResult(samples, ContinuationPoint: null);
}
public async Task<HistoryReadResult> ReadAtTimeAsync(
string fullReference, IReadOnlyList<DateTime> timestampsUtc, CancellationToken cancellationToken)
{
var client = RequireClient();
var resp = await client.CallAsync<HistoryReadAtTimeRequest, HistoryReadAtTimeResponse>(
MessageKind.HistoryReadAtTimeRequest,
new HistoryReadAtTimeRequest
{
SessionId = _sessionId,
TagReference = fullReference,
TimestampsUtcUnixMs = [.. timestampsUtc.Select(t => new DateTimeOffset(t, TimeSpan.Zero).ToUnixTimeMilliseconds())],
},
MessageKind.HistoryReadAtTimeResponse,
cancellationToken);
if (!resp.Success)
throw new InvalidOperationException($"Galaxy.Host HistoryReadAtTime failed: {resp.Error}");
// ReadAtTime returns one sample per requested timestamp in the same order — the Host
// pads with bad-quality snapshots when a timestamp can't be interpolated, so response
// length matches request length exactly. We trust that contract rather than
// re-aligning here, because the Host is the source-of-truth for interpolation policy.
IReadOnlyList<DataValueSnapshot> samples = [.. resp.Values.Select(ToSnapshot)];
return new HistoryReadResult(samples, ContinuationPoint: null);
}
public async Task<HistoricalEventsResult> ReadEventsAsync(
string? sourceName, DateTime startUtc, DateTime endUtc, int maxEvents, CancellationToken cancellationToken)
{
var client = RequireClient();
var resp = await client.CallAsync<HistoryReadEventsRequest, HistoryReadEventsResponse>(
MessageKind.HistoryReadEventsRequest,
new HistoryReadEventsRequest
{
SessionId = _sessionId,
SourceName = sourceName,
StartUtcUnixMs = new DateTimeOffset(startUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
EndUtcUnixMs = new DateTimeOffset(endUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
MaxEvents = maxEvents,
},
MessageKind.HistoryReadEventsResponse,
cancellationToken);
if (!resp.Success)
throw new InvalidOperationException($"Galaxy.Host HistoryReadEvents failed: {resp.Error}");
IReadOnlyList<HistoricalEvent> events = [.. resp.Events.Select(ToHistoricalEvent)];
return new HistoricalEventsResult(events, ContinuationPoint: null);
}
internal static HistoricalEvent ToHistoricalEvent(GalaxyHistoricalEvent wire) => new(
EventId: wire.EventId,
SourceName: wire.SourceName,
EventTimeUtc: DateTimeOffset.FromUnixTimeMilliseconds(wire.EventTimeUtcUnixMs).UtcDateTime,
ReceivedTimeUtc: DateTimeOffset.FromUnixTimeMilliseconds(wire.ReceivedTimeUtcUnixMs).UtcDateTime,
Message: wire.DisplayText,
Severity: wire.Severity);
/// <summary>
/// Maps the OPC UA Part 13 aggregate enum onto the Wonderware Historian
/// AnalogSummaryQuery column names consumed by <c>HistorianDataSource.ReadAggregateAsync</c>.
/// Kept on the Proxy side so Galaxy.Host stays OPC-UA-free.
/// </summary>
internal static string MapAggregateToColumn(HistoryAggregateType aggregate) => aggregate switch
{
HistoryAggregateType.Average => "Average",
HistoryAggregateType.Minimum => "Minimum",
HistoryAggregateType.Maximum => "Maximum",
HistoryAggregateType.Count => "ValueCount",
HistoryAggregateType.Total => throw new NotSupportedException(
"HistoryAggregateType.Total is not supported by the Wonderware Historian AnalogSummary " +
"query — use Average × Count on the caller side, or switch to Average/Minimum/Maximum/Count."),
_ => throw new NotSupportedException($"Unknown HistoryAggregateType {aggregate}"),
};
// ---- IRediscoverable ----

View File

@@ -16,6 +16,10 @@
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.csproj"/>
</ItemGroup>
<ItemGroup>
<InternalsVisibleTo Include="ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests"/>
</ItemGroup>
<ItemGroup>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>

View File

@@ -30,6 +30,15 @@ public sealed class GalaxyAttributeInfo
[Key(3)] public uint? ArrayDim { get; set; }
[Key(4)] public int SecurityClassification { get; set; }
[Key(5)] public bool IsHistorized { get; set; }
/// <summary>
/// True when the attribute has an AlarmExtension primitive in the Galaxy repository
/// (<c>primitive_definition.primitive_name = 'AlarmExtension'</c>). The generic
/// node-manager uses this to enrich the variable's OPC UA node with an
/// <c>AlarmConditionState</c> during address-space build. Added in PR 9 as the
/// discovery-side foundation for the alarm event wire-up that follows in PR 10+.
/// </summary>
[Key(6)] public bool IsAlarm { get; set; }
}
[MessagePackObject]

View File

@@ -48,8 +48,14 @@ public enum MessageKind : byte
AlarmEvent = 0x51,
AlarmAckRequest = 0x52,
HistoryReadRequest = 0x60,
HistoryReadResponse = 0x61,
HistoryReadRequest = 0x60,
HistoryReadResponse = 0x61,
HistoryReadProcessedRequest = 0x62,
HistoryReadProcessedResponse = 0x63,
HistoryReadAtTimeRequest = 0x64,
HistoryReadAtTimeResponse = 0x65,
HistoryReadEventsRequest = 0x66,
HistoryReadEventsResponse = 0x67,
HostConnectivityStatus = 0x70,
RuntimeStatusChange = 0x71,

View File

@@ -26,3 +26,85 @@ public sealed class HistoryReadResponse
[Key(1)] public string? Error { get; set; }
[Key(2)] public HistoryTagValues[] Tags { get; set; } = System.Array.Empty<HistoryTagValues>();
}
/// <summary>
/// Processed (aggregated) historian read — OPC UA HistoryReadProcessed service. The
/// aggregate column is a string (e.g. "Average", "Minimum") mapped by the Proxy from the
/// OPC UA HistoryAggregateType enum so Galaxy.Host stays OPC-UA-free.
/// </summary>
[MessagePackObject]
public sealed class HistoryReadProcessedRequest
{
[Key(0)] public long SessionId { get; set; }
[Key(1)] public string TagReference { get; set; } = string.Empty;
[Key(2)] public long StartUtcUnixMs { get; set; }
[Key(3)] public long EndUtcUnixMs { get; set; }
[Key(4)] public long IntervalMs { get; set; }
[Key(5)] public string AggregateColumn { get; set; } = "Average";
}
[MessagePackObject]
public sealed class HistoryReadProcessedResponse
{
[Key(0)] public bool Success { get; set; }
[Key(1)] public string? Error { get; set; }
[Key(2)] public GalaxyDataValue[] Values { get; set; } = System.Array.Empty<GalaxyDataValue>();
}
/// <summary>
/// At-time historian read — OPC UA HistoryReadAtTime service. Returns one sample per
/// requested timestamp (interpolated when no exact match exists). The per-timestamp array
/// is flow-encoded as Unix milliseconds to avoid MessagePack DateTime quirks.
/// </summary>
[MessagePackObject]
public sealed class HistoryReadAtTimeRequest
{
[Key(0)] public long SessionId { get; set; }
[Key(1)] public string TagReference { get; set; } = string.Empty;
[Key(2)] public long[] TimestampsUtcUnixMs { get; set; } = System.Array.Empty<long>();
}
[MessagePackObject]
public sealed class HistoryReadAtTimeResponse
{
[Key(0)] public bool Success { get; set; }
[Key(1)] public string? Error { get; set; }
[Key(2)] public GalaxyDataValue[] Values { get; set; } = System.Array.Empty<GalaxyDataValue>();
}
/// <summary>
/// Historical events read — OPC UA HistoryReadEvents service and Alarm &amp; Condition
/// history. <c>SourceName</c> null means "all sources". Distinct from the live
/// <see cref="GalaxyAlarmEvent"/> stream because historical rows carry both
/// <c>EventTime</c> (when the event occurred in the process) and <c>ReceivedTime</c>
/// (when the Historian persisted it) and have no StateTransition — the Historian logs
/// the instantaneous event, not the OPC UA alarm lifecycle.
/// </summary>
[MessagePackObject]
public sealed class HistoryReadEventsRequest
{
[Key(0)] public long SessionId { get; set; }
[Key(1)] public string? SourceName { get; set; }
[Key(2)] public long StartUtcUnixMs { get; set; }
[Key(3)] public long EndUtcUnixMs { get; set; }
[Key(4)] public int MaxEvents { get; set; } = 1000;
}
[MessagePackObject]
public sealed class GalaxyHistoricalEvent
{
[Key(0)] public string EventId { get; set; } = string.Empty;
[Key(1)] public string? SourceName { get; set; }
[Key(2)] public long EventTimeUtcUnixMs { get; set; }
[Key(3)] public long ReceivedTimeUtcUnixMs { get; set; }
[Key(4)] public string? DisplayText { get; set; }
[Key(5)] public ushort Severity { get; set; }
}
[MessagePackObject]
public sealed class HistoryReadEventsResponse
{
[Key(0)] public bool Success { get; set; }
[Key(1)] public string? Error { get; set; }
[Key(2)] public GalaxyHistoricalEvent[] Events { get; set; } = System.Array.Empty<GalaxyHistoricalEvent>();
}

View File

@@ -0,0 +1,165 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.Modbus;
/// <summary>
/// AutomationDirect DirectLOGIC address-translation helpers. DL205 / DL260 / DL350 CPUs
/// address V-memory in OCTAL while the Modbus wire uses DECIMAL PDU addresses — operators
/// see "V2000" in the PLC ladder-logic editor but the Modbus client must write PDU 0x0400.
/// The formulas differ between user V-memory (simple octal-to-decimal) and system V-memory
/// (fixed bank mappings), so the two cases are separate methods rather than one overloaded
/// "ToPdu" call.
/// </summary>
/// <remarks>
/// See <c>docs/v2/dl205.md</c> §V-memory for the full CPU-family matrix + rationale.
/// References: D2-USER-M appendix (DL205/D2-260), H2-ECOM-M §6.5 (absolute vs relative
/// addressing), AutomationDirect forum guidance on V40400 system-base.
/// </remarks>
public static class DirectLogicAddress
{
/// <summary>
/// Convert a DirectLOGIC user V-memory address (octal) to a 0-based Modbus PDU address.
/// Accepts bare octal (<c>"2000"</c>) or <c>V</c>-prefixed (<c>"V2000"</c>). Range
/// depends on CPU model — DL205 D2-260 user memory is V1400-V7377 + V10000-V17777
/// octal, DL260 extends to V77777 octal.
/// </summary>
/// <exception cref="ArgumentException">Input is null / empty / contains non-octal digits (8,9).</exception>
/// <exception cref="OverflowException">Parsed value exceeds ushort.MaxValue (0xFFFF).</exception>
public static ushort UserVMemoryToPdu(string vAddress)
{
if (string.IsNullOrWhiteSpace(vAddress))
throw new ArgumentException("V-memory address must not be empty", nameof(vAddress));
var s = vAddress.Trim();
if (s[0] == 'V' || s[0] == 'v') s = s.Substring(1);
if (s.Length == 0)
throw new ArgumentException($"V-memory address '{vAddress}' has no digits", nameof(vAddress));
// Octal conversion. Reject 8/9 digits up-front — int.Parse in the obvious base would
// accept them silently because .NET has no built-in base-8 parser.
uint result = 0;
foreach (var ch in s)
{
if (ch < '0' || ch > '7')
throw new ArgumentException(
$"V-memory address '{vAddress}' contains non-octal digit '{ch}' — DirectLOGIC V-addresses are octal (0-7)",
nameof(vAddress));
result = result * 8 + (uint)(ch - '0');
if (result > ushort.MaxValue)
throw new OverflowException(
$"V-memory address '{vAddress}' exceeds the 16-bit Modbus PDU address range");
}
return (ushort)result;
}
/// <summary>
/// DirectLOGIC system V-memory starts at octal V40400 on DL260 / H2-ECOM100 in factory
/// "absolute" addressing mode. Unlike user V-memory, the mapping is NOT a simple
/// octal-to-decimal conversion — the CPU relocates the system bank to Modbus PDU 0x2100
/// (decimal 8448). This helper returns the CPU-family base plus a user-supplied offset
/// within the system bank.
/// </summary>
public const ushort SystemVMemoryBasePdu = 0x2100;
/// <param name="offsetWithinSystemBank">
/// 0-based register offset within the system bank. Pass 0 for V40400 itself; pass 1 for
/// V40401 (octal), and so on. NOT an octal-decoded value — the system bank lives at
/// consecutive PDU addresses, so the offset is plain decimal.
/// </param>
public static ushort SystemVMemoryToPdu(ushort offsetWithinSystemBank)
{
var pdu = SystemVMemoryBasePdu + offsetWithinSystemBank;
if (pdu > ushort.MaxValue)
throw new OverflowException(
$"System V-memory offset {offsetWithinSystemBank} maps past 0xFFFF");
return (ushort)pdu;
}
// Bit-memory bases per DL260 user manual §I/O-configuration.
// Numbers after X / Y / C / SP are OCTAL in DirectLOGIC notation. The Modbus base is
// added to the octal-decoded offset; e.g. Y017 = Modbus coil 2048 + octal(17) = 2048 + 15 = 2063.
/// <summary>
/// DL260 Y-output coil base. Y0 octal → Modbus coil address 2048 (0-based).
/// </summary>
public const ushort YOutputBaseCoil = 2048;
/// <summary>
/// DL260 C-relay coil base. C0 octal → Modbus coil address 3072 (0-based).
/// </summary>
public const ushort CRelayBaseCoil = 3072;
/// <summary>
/// DL260 X-input discrete-input base. X0 octal → Modbus discrete input 0.
/// </summary>
public const ushort XInputBaseDiscrete = 0;
/// <summary>
/// DL260 SP special-relay discrete-input base. SP0 octal → Modbus discrete input 1024.
/// Read-only; writing SP relays is rejected with Illegal Data Address.
/// </summary>
public const ushort SpecialBaseDiscrete = 1024;
/// <summary>
/// Translate a DirectLOGIC Y-output address (e.g. <c>"Y0"</c>, <c>"Y17"</c>) to its
/// 0-based Modbus coil address on DL260. The trailing number is OCTAL, matching the
/// ladder-logic editor's notation.
/// </summary>
public static ushort YOutputToCoil(string yAddress) =>
AddOctalOffset(YOutputBaseCoil, StripPrefix(yAddress, 'Y'));
/// <summary>
/// Translate a DirectLOGIC C-relay address (e.g. <c>"C0"</c>, <c>"C1777"</c>) to its
/// 0-based Modbus coil address.
/// </summary>
public static ushort CRelayToCoil(string cAddress) =>
AddOctalOffset(CRelayBaseCoil, StripPrefix(cAddress, 'C'));
/// <summary>
/// Translate a DirectLOGIC X-input address (e.g. <c>"X0"</c>, <c>"X17"</c>) to its
/// 0-based Modbus discrete-input address. Reading an unpopulated X returns 0, not an
/// exception — the CPU sizes the table to configured I/O, not installed modules.
/// </summary>
public static ushort XInputToDiscrete(string xAddress) =>
AddOctalOffset(XInputBaseDiscrete, StripPrefix(xAddress, 'X'));
/// <summary>
/// Translate a DirectLOGIC SP-special-relay address (e.g. <c>"SP0"</c>) to its 0-based
/// Modbus discrete-input address. Accepts <c>"SP"</c> prefix case-insensitively.
/// </summary>
public static ushort SpecialToDiscrete(string spAddress)
{
if (string.IsNullOrWhiteSpace(spAddress))
throw new ArgumentException("SP address must not be empty", nameof(spAddress));
var s = spAddress.Trim();
if (s.Length >= 2 && (s[0] == 'S' || s[0] == 's') && (s[1] == 'P' || s[1] == 'p'))
s = s.Substring(2);
return AddOctalOffset(SpecialBaseDiscrete, s);
}
private static string StripPrefix(string address, char expectedPrefix)
{
if (string.IsNullOrWhiteSpace(address))
throw new ArgumentException("Address must not be empty", nameof(address));
var s = address.Trim();
if (s.Length > 0 && char.ToUpperInvariant(s[0]) == char.ToUpperInvariant(expectedPrefix))
s = s.Substring(1);
return s;
}
private static ushort AddOctalOffset(ushort baseAddr, string octalDigits)
{
if (octalDigits.Length == 0)
throw new ArgumentException("Address has no digits", nameof(octalDigits));
uint offset = 0;
foreach (var ch in octalDigits)
{
if (ch < '0' || ch > '7')
throw new ArgumentException(
$"Address contains non-octal digit '{ch}' — DirectLOGIC I/O addresses are octal (0-7)",
nameof(octalDigits));
offset = offset * 8 + (uint)(ch - '0');
}
var result = baseAddr + offset;
if (result > ushort.MaxValue)
throw new OverflowException($"Address {baseAddr}+{offset} exceeds 0xFFFF");
return (ushort)result;
}
}

View File

@@ -0,0 +1,25 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.Modbus;
/// <summary>
/// Abstraction over the Modbus TCP socket. Takes a <c>PDU</c> (function code + data, excluding
/// the 7-byte MBAP header) and returns the response PDU — the transport owns transaction-id
/// pairing, framing, and socket I/O. Tests supply in-memory fakes.
/// </summary>
public interface IModbusTransport : IAsyncDisposable
{
Task ConnectAsync(CancellationToken ct);
/// <summary>
/// Send a Modbus PDU (function code + function-specific data) and read the response PDU.
/// Throws <see cref="ModbusException"/> when the server returns an exception PDU
/// (function code + 0x80 + exception code).
/// </summary>
Task<byte[]> SendAsync(byte unitId, byte[] pdu, CancellationToken ct);
}
public sealed class ModbusException(byte functionCode, byte exceptionCode, string message)
: Exception(message)
{
public byte FunctionCode { get; } = functionCode;
public byte ExceptionCode { get; } = exceptionCode;
}

View File

@@ -0,0 +1,161 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.Modbus;
/// <summary>
/// Mitsubishi MELSEC PLC family selector for address-translation helpers. The Q/L/iQ-R
/// families write bit-device addresses (X, Y) in <b>hexadecimal</b> in GX Works and the
/// CPU manuals; the FX and iQ-F families write them in <b>octal</b> (same convention as
/// AutomationDirect DirectLOGIC). Mixing the two up is the #1 MELSEC driver bug source —
/// an operator typing <c>X20</c> into a Q-series tag config means decimal 32, but the
/// same string on an FX3U means decimal 16, so the helper must know the family to route
/// correctly.
/// </summary>
public enum MelsecFamily
{
/// <summary>
/// MELSEC-Q / MELSEC-L / MELSEC iQ-R. X and Y device numbers are interpreted as
/// <b>hexadecimal</b>; <c>X20</c> means decimal 32.
/// </summary>
Q_L_iQR,
/// <summary>
/// MELSEC-F (FX3U / FX3GE / FX3G) and MELSEC iQ-F (FX5U). X and Y device numbers
/// are interpreted as <b>octal</b> (same as DirectLOGIC); <c>X20</c> means decimal 16.
/// iQ-F has a GX Works3 project toggle that can flip to decimal — if a site uses
/// that, configure the tag's Address directly as a decimal PDU address and do not
/// route through this helper.
/// </summary>
F_iQF,
}
/// <summary>
/// Mitsubishi MELSEC address-translation helpers for the QJ71MT91 / LJ71MT91 / RJ71EN71 /
/// iQ-R built-in / iQ-F / FX3U-ENET-P502 Modbus modules. MELSEC does NOT hard-wire
/// Modbus-to-device mappings like DL260 does — every site configures its own "Modbus
/// Device Assignment Parameter" block of up to 16 entries. The helpers here cover only
/// the <b>address-notation</b> portion of the translation (hex X20 vs octal X20 + adding
/// the bank base); the caller is still responsible for knowing the assignment-block
/// offset for their site.
/// </summary>
/// <remarks>
/// See <c>docs/v2/mitsubishi.md</c> §device-assignment + §X-Y-hex-trap for the full
/// matrix and primary-source citations.
/// </remarks>
public static class MelsecAddress
{
/// <summary>
/// Translate a MELSEC X-input address (e.g. <c>"X0"</c>, <c>"X10"</c>) to a 0-based
/// Modbus discrete-input address, given the PLC family's address notation (hex or
/// octal) and the Modbus Device Assignment block's X-range base.
/// </summary>
/// <param name="xAddress">MELSEC X address. <c>X</c> prefix optional, case-insensitive.</param>
/// <param name="family">The PLC family — determines whether the trailing digits are hex or octal.</param>
/// <param name="xBankBase">
/// 0-based Modbus DI address the assignment-block has configured X0 to land at.
/// Typical default on QJ71MT91 sample projects: 0. Pass the site-specific value.
/// </param>
public static ushort XInputToDiscrete(string xAddress, MelsecFamily family, ushort xBankBase = 0) =>
AddFamilyOffset(xBankBase, StripPrefix(xAddress, 'X'), family);
/// <summary>
/// Translate a MELSEC Y-output address to a 0-based Modbus coil address. Same rules
/// as <see cref="XInputToDiscrete"/> for hex/octal parsing.
/// </summary>
public static ushort YOutputToCoil(string yAddress, MelsecFamily family, ushort yBankBase = 0) =>
AddFamilyOffset(yBankBase, StripPrefix(yAddress, 'Y'), family);
/// <summary>
/// Translate a MELSEC M-relay address (internal relay) to a 0-based Modbus coil
/// address. M-addresses are <b>decimal</b> on every MELSEC family — unlike X/Y which
/// are hex on Q/L/iQ-R. Includes the bank base that the assignment-block configured.
/// </summary>
public static ushort MRelayToCoil(string mAddress, ushort mBankBase = 0)
{
var digits = StripPrefix(mAddress, 'M');
if (!ushort.TryParse(digits, out var offset))
throw new ArgumentException(
$"M-relay address '{mAddress}' is not a valid decimal integer", nameof(mAddress));
var result = mBankBase + offset;
if (result > ushort.MaxValue)
throw new OverflowException($"M-relay {mAddress} + base {mBankBase} exceeds 0xFFFF");
return (ushort)result;
}
/// <summary>
/// Translate a MELSEC D-register address (data register) to a 0-based Modbus holding
/// register address. D-addresses are <b>decimal</b>. Default assignment convention is
/// D0 → HR 0 (pass <paramref name="dBankBase"/> = 0); sites with shifted layouts pass
/// their configured base.
/// </summary>
public static ushort DRegisterToHolding(string dAddress, ushort dBankBase = 0)
{
var digits = StripPrefix(dAddress, 'D');
if (!ushort.TryParse(digits, out var offset))
throw new ArgumentException(
$"D-register address '{dAddress}' is not a valid decimal integer", nameof(dAddress));
var result = dBankBase + offset;
if (result > ushort.MaxValue)
throw new OverflowException($"D-register {dAddress} + base {dBankBase} exceeds 0xFFFF");
return (ushort)result;
}
private static string StripPrefix(string address, char expectedPrefix)
{
if (string.IsNullOrWhiteSpace(address))
throw new ArgumentException("Address must not be empty", nameof(address));
var s = address.Trim();
if (s.Length > 0 && char.ToUpperInvariant(s[0]) == char.ToUpperInvariant(expectedPrefix))
s = s.Substring(1);
if (s.Length == 0)
throw new ArgumentException($"Address '{address}' has no digits after prefix", nameof(address));
return s;
}
private static ushort AddFamilyOffset(ushort baseAddr, string digits, MelsecFamily family)
{
uint offset = family switch
{
MelsecFamily.Q_L_iQR => ParseHex(digits),
MelsecFamily.F_iQF => ParseOctal(digits),
_ => throw new ArgumentOutOfRangeException(nameof(family), family, "Unknown MELSEC family"),
};
var result = baseAddr + offset;
if (result > ushort.MaxValue)
throw new OverflowException($"Address {baseAddr}+{offset} exceeds 0xFFFF");
return (ushort)result;
}
private static uint ParseHex(string digits)
{
uint result = 0;
foreach (var ch in digits)
{
uint nibble;
if (ch >= '0' && ch <= '9') nibble = (uint)(ch - '0');
else if (ch >= 'A' && ch <= 'F') nibble = (uint)(ch - 'A' + 10);
else if (ch >= 'a' && ch <= 'f') nibble = (uint)(ch - 'a' + 10);
else throw new ArgumentException(
$"Address contains non-hex digit '{ch}' — Q/L/iQ-R X/Y addresses are hexadecimal",
nameof(digits));
result = result * 16 + nibble;
if (result > ushort.MaxValue)
throw new OverflowException($"Hex address exceeds 0xFFFF");
}
return result;
}
private static uint ParseOctal(string digits)
{
uint result = 0;
foreach (var ch in digits)
{
if (ch < '0' || ch > '7')
throw new ArgumentException(
$"Address contains non-octal digit '{ch}' — FX/iQ-F X/Y addresses are octal (0-7)",
nameof(digits));
result = result * 8 + (uint)(ch - '0');
if (result > ushort.MaxValue)
throw new OverflowException($"Octal address exceeds 0xFFFF");
}
return result;
}
}

View File

@@ -0,0 +1,732 @@
using System.Buffers.Binary;
using System.Text.Json;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Driver.Modbus;
/// <summary>
/// Modbus TCP implementation of <see cref="IDriver"/> + <see cref="ITagDiscovery"/> +
/// <see cref="IReadable"/> + <see cref="IWritable"/>. First native-protocol greenfield
/// driver for the v2 stack — validates the driver-agnostic <c>IAddressSpaceBuilder</c> +
/// <c>IReadable</c>/<c>IWritable</c> abstractions generalize beyond Galaxy.
/// </summary>
/// <remarks>
/// Scope limits: synchronous Read/Write only, no subscriptions (Modbus has no push model;
/// subscriptions would need a polling loop over the declared tags — additive PR). Historian
/// + alarm capabilities are out of scope (the protocol doesn't express them).
/// </remarks>
public sealed class ModbusDriver(ModbusDriverOptions options, string driverInstanceId,
Func<ModbusDriverOptions, IModbusTransport>? transportFactory = null)
: IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IDisposable, IAsyncDisposable
{
// Active polling subscriptions. Each subscription owns a background Task that polls the
// tags at its configured interval, diffs against _lastKnownValues, and fires OnDataChange
// per changed tag. UnsubscribeAsync cancels the task via the CTS stored on the handle.
private readonly System.Collections.Concurrent.ConcurrentDictionary<long, SubscriptionState> _subscriptions = new();
private long _nextSubscriptionId;
public event EventHandler<DataChangeEventArgs>? OnDataChange;
public event EventHandler<HostStatusChangedEventArgs>? OnHostStatusChanged;
// Single-host probe state — Modbus driver talks to exactly one endpoint so the "hosts"
// collection has at most one entry. HostName is the Host:Port string so the Admin UI can
// display the PLC endpoint uniformly with Galaxy platforms/engines.
private readonly object _probeLock = new();
private HostState _hostState = HostState.Unknown;
private DateTime _hostStateChangedUtc = DateTime.UtcNow;
private CancellationTokenSource? _probeCts;
private readonly ModbusDriverOptions _options = options;
private readonly Func<ModbusDriverOptions, IModbusTransport> _transportFactory =
transportFactory ?? (o => new ModbusTcpTransport(o.Host, o.Port, o.Timeout, o.AutoReconnect));
private IModbusTransport? _transport;
private DriverHealth _health = new(DriverState.Unknown, null, null);
private readonly Dictionary<string, ModbusTagDefinition> _tagsByName = new(StringComparer.OrdinalIgnoreCase);
public string DriverInstanceId => driverInstanceId;
public string DriverType => "Modbus";
public async Task InitializeAsync(string driverConfigJson, CancellationToken cancellationToken)
{
_health = new DriverHealth(DriverState.Initializing, null, null);
try
{
_transport = _transportFactory(_options);
await _transport.ConnectAsync(cancellationToken).ConfigureAwait(false);
foreach (var t in _options.Tags) _tagsByName[t.Name] = t;
_health = new DriverHealth(DriverState.Healthy, DateTime.UtcNow, null);
// PR 23: kick off the probe loop once the transport is up. Initial state stays
// Unknown until the first probe tick succeeds — avoids broadcasting a premature
// Running transition before any register round-trip has happened.
if (_options.Probe.Enabled)
{
_probeCts = new CancellationTokenSource();
_ = Task.Run(() => ProbeLoopAsync(_probeCts.Token), _probeCts.Token);
}
}
catch (Exception ex)
{
_health = new DriverHealth(DriverState.Faulted, null, ex.Message);
throw;
}
}
public async Task ReinitializeAsync(string driverConfigJson, CancellationToken cancellationToken)
{
await ShutdownAsync(cancellationToken);
await InitializeAsync(driverConfigJson, cancellationToken);
}
public async Task ShutdownAsync(CancellationToken cancellationToken)
{
try { _probeCts?.Cancel(); } catch { }
_probeCts?.Dispose();
_probeCts = null;
foreach (var state in _subscriptions.Values)
{
try { state.Cts.Cancel(); } catch { }
state.Cts.Dispose();
}
_subscriptions.Clear();
if (_transport is not null) await _transport.DisposeAsync().ConfigureAwait(false);
_transport = null;
_health = new DriverHealth(DriverState.Unknown, _health.LastSuccessfulRead, null);
}
public DriverHealth GetHealth() => _health;
public long GetMemoryFootprint() => 0;
public Task FlushOptionalCachesAsync(CancellationToken cancellationToken) => Task.CompletedTask;
// ---- ITagDiscovery ----
public Task DiscoverAsync(IAddressSpaceBuilder builder, CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(builder);
var folder = builder.Folder("Modbus", "Modbus");
foreach (var t in _options.Tags)
{
folder.Variable(t.Name, t.Name, new DriverAttributeInfo(
FullName: t.Name,
DriverDataType: MapDataType(t.DataType),
IsArray: false,
ArrayDim: null,
SecurityClass: t.Writable ? SecurityClassification.Operate : SecurityClassification.ViewOnly,
IsHistorized: false,
IsAlarm: false));
}
return Task.CompletedTask;
}
// ---- IReadable ----
public async Task<IReadOnlyList<DataValueSnapshot>> ReadAsync(
IReadOnlyList<string> fullReferences, CancellationToken cancellationToken)
{
var transport = RequireTransport();
var now = DateTime.UtcNow;
var results = new DataValueSnapshot[fullReferences.Count];
for (var i = 0; i < fullReferences.Count; i++)
{
if (!_tagsByName.TryGetValue(fullReferences[i], out var tag))
{
results[i] = new DataValueSnapshot(null, StatusBadNodeIdUnknown, null, now);
continue;
}
try
{
var value = await ReadOneAsync(transport, tag, cancellationToken).ConfigureAwait(false);
results[i] = new DataValueSnapshot(value, 0u, now, now);
_health = new DriverHealth(DriverState.Healthy, now, null);
}
catch (ModbusException mex)
{
results[i] = new DataValueSnapshot(null, MapModbusExceptionToStatus(mex.ExceptionCode), null, now);
_health = new DriverHealth(DriverState.Degraded, _health.LastSuccessfulRead, mex.Message);
}
catch (Exception ex)
{
// Non-Modbus-layer failure: socket dropped, timeout, malformed response. Surface
// as communication error so callers can distinguish it from tag-level faults.
results[i] = new DataValueSnapshot(null, StatusBadCommunicationError, null, now);
_health = new DriverHealth(DriverState.Degraded, _health.LastSuccessfulRead, ex.Message);
}
}
return results;
}
private async Task<object> ReadOneAsync(IModbusTransport transport, ModbusTagDefinition tag, CancellationToken ct)
{
switch (tag.Region)
{
case ModbusRegion.Coils:
{
var pdu = new byte[] { 0x01, (byte)(tag.Address >> 8), (byte)(tag.Address & 0xFF), 0x00, 0x01 };
var resp = await transport.SendAsync(_options.UnitId, pdu, ct).ConfigureAwait(false);
return (resp[2] & 0x01) == 1;
}
case ModbusRegion.DiscreteInputs:
{
var pdu = new byte[] { 0x02, (byte)(tag.Address >> 8), (byte)(tag.Address & 0xFF), 0x00, 0x01 };
var resp = await transport.SendAsync(_options.UnitId, pdu, ct).ConfigureAwait(false);
return (resp[2] & 0x01) == 1;
}
case ModbusRegion.HoldingRegisters:
case ModbusRegion.InputRegisters:
{
var quantity = RegisterCount(tag);
var fc = tag.Region == ModbusRegion.HoldingRegisters ? (byte)0x03 : (byte)0x04;
// Auto-chunk when the tag's register span exceeds the caller-configured cap.
// Affects long strings (FC03/04 > 125 regs is spec-forbidden; DL205 caps at 128,
// Mitsubishi Q caps at 64). Non-string tags max out at 4 regs so the cap never
// triggers for numerics.
var cap = _options.MaxRegistersPerRead == 0 ? (ushort)125 : _options.MaxRegistersPerRead;
var data = quantity <= cap
? await ReadRegisterBlockAsync(transport, fc, tag.Address, quantity, ct).ConfigureAwait(false)
: await ReadRegisterBlockChunkedAsync(transport, fc, tag.Address, quantity, cap, ct).ConfigureAwait(false);
return DecodeRegister(data, tag);
}
default:
throw new InvalidOperationException($"Unknown region {tag.Region}");
}
}
private async Task<byte[]> ReadRegisterBlockAsync(
IModbusTransport transport, byte fc, ushort address, ushort quantity, CancellationToken ct)
{
var pdu = new byte[] { fc, (byte)(address >> 8), (byte)(address & 0xFF),
(byte)(quantity >> 8), (byte)(quantity & 0xFF) };
var resp = await transport.SendAsync(_options.UnitId, pdu, ct).ConfigureAwait(false);
// resp = [fc][byte-count][data...]
var data = new byte[resp[1]];
Buffer.BlockCopy(resp, 2, data, 0, resp[1]);
return data;
}
private async Task<byte[]> ReadRegisterBlockChunkedAsync(
IModbusTransport transport, byte fc, ushort address, ushort totalRegs, ushort cap, CancellationToken ct)
{
var assembled = new byte[totalRegs * 2];
ushort done = 0;
while (done < totalRegs)
{
var chunk = (ushort)Math.Min(cap, totalRegs - done);
var chunkBytes = await ReadRegisterBlockAsync(transport, fc, (ushort)(address + done), chunk, ct).ConfigureAwait(false);
Buffer.BlockCopy(chunkBytes, 0, assembled, done * 2, chunkBytes.Length);
done += chunk;
}
return assembled;
}
// ---- IWritable ----
public async Task<IReadOnlyList<WriteResult>> WriteAsync(
IReadOnlyList<WriteRequest> writes, CancellationToken cancellationToken)
{
var transport = RequireTransport();
var results = new WriteResult[writes.Count];
for (var i = 0; i < writes.Count; i++)
{
var w = writes[i];
if (!_tagsByName.TryGetValue(w.FullReference, out var tag))
{
results[i] = new WriteResult(StatusBadNodeIdUnknown);
continue;
}
if (!tag.Writable || tag.Region is ModbusRegion.DiscreteInputs or ModbusRegion.InputRegisters)
{
results[i] = new WriteResult(StatusBadNotWritable);
continue;
}
try
{
await WriteOneAsync(transport, tag, w.Value, cancellationToken).ConfigureAwait(false);
results[i] = new WriteResult(0u);
}
catch (ModbusException mex)
{
results[i] = new WriteResult(MapModbusExceptionToStatus(mex.ExceptionCode));
}
catch (Exception)
{
results[i] = new WriteResult(StatusBadInternalError);
}
}
return results;
}
private async Task WriteOneAsync(IModbusTransport transport, ModbusTagDefinition tag, object? value, CancellationToken ct)
{
switch (tag.Region)
{
case ModbusRegion.Coils:
{
var on = Convert.ToBoolean(value);
var pdu = new byte[] { 0x05, (byte)(tag.Address >> 8), (byte)(tag.Address & 0xFF),
on ? (byte)0xFF : (byte)0x00, 0x00 };
await transport.SendAsync(_options.UnitId, pdu, ct).ConfigureAwait(false);
return;
}
case ModbusRegion.HoldingRegisters:
{
var bytes = EncodeRegister(value, tag);
if (bytes.Length == 2)
{
var pdu = new byte[] { 0x06, (byte)(tag.Address >> 8), (byte)(tag.Address & 0xFF),
bytes[0], bytes[1] };
await transport.SendAsync(_options.UnitId, pdu, ct).ConfigureAwait(false);
}
else
{
// FC 16 (Write Multiple Registers) for 32-bit types.
var qty = (ushort)(bytes.Length / 2);
var writeCap = _options.MaxRegistersPerWrite == 0 ? (ushort)123 : _options.MaxRegistersPerWrite;
if (qty > writeCap)
throw new InvalidOperationException(
$"Write of {qty} registers to {tag.Name} exceeds MaxRegistersPerWrite={writeCap}. " +
$"Split the tag (e.g. shorter StringLength) — partial FC16 chunks would lose atomicity.");
var pdu = new byte[6 + 1 + bytes.Length];
pdu[0] = 0x10;
pdu[1] = (byte)(tag.Address >> 8); pdu[2] = (byte)(tag.Address & 0xFF);
pdu[3] = (byte)(qty >> 8); pdu[4] = (byte)(qty & 0xFF);
pdu[5] = (byte)bytes.Length;
Buffer.BlockCopy(bytes, 0, pdu, 6, bytes.Length);
await transport.SendAsync(_options.UnitId, pdu, ct).ConfigureAwait(false);
}
return;
}
default:
throw new InvalidOperationException($"Writes not supported for region {tag.Region}");
}
}
// ---- ISubscribable (polling overlay) ----
public Task<ISubscriptionHandle> SubscribeAsync(
IReadOnlyList<string> fullReferences, TimeSpan publishingInterval, CancellationToken cancellationToken)
{
var id = Interlocked.Increment(ref _nextSubscriptionId);
var cts = new CancellationTokenSource();
var interval = publishingInterval < TimeSpan.FromMilliseconds(100)
? TimeSpan.FromMilliseconds(100) // floor — Modbus can't sustain < 100ms polling reliably
: publishingInterval;
var handle = new ModbusSubscriptionHandle(id);
var state = new SubscriptionState(handle, [.. fullReferences], interval, cts);
_subscriptions[id] = state;
_ = Task.Run(() => PollLoopAsync(state, cts.Token), cts.Token);
return Task.FromResult<ISubscriptionHandle>(handle);
}
public Task UnsubscribeAsync(ISubscriptionHandle handle, CancellationToken cancellationToken)
{
if (handle is ModbusSubscriptionHandle h && _subscriptions.TryRemove(h.Id, out var state))
{
state.Cts.Cancel();
state.Cts.Dispose();
}
return Task.CompletedTask;
}
private async Task PollLoopAsync(SubscriptionState state, CancellationToken ct)
{
// Initial-data push: read every tag once at subscribe time so OPC UA clients see the
// current value per Part 4 convention, even if the value never changes thereafter.
try { await PollOnceAsync(state, forceRaise: true, ct).ConfigureAwait(false); }
catch (OperationCanceledException) { return; }
catch { /* first-read error — polling continues */ }
while (!ct.IsCancellationRequested)
{
try { await Task.Delay(state.Interval, ct).ConfigureAwait(false); }
catch (OperationCanceledException) { return; }
try { await PollOnceAsync(state, forceRaise: false, ct).ConfigureAwait(false); }
catch (OperationCanceledException) { return; }
catch { /* transient polling error — loop continues, health surface reflects it */ }
}
}
private async Task PollOnceAsync(SubscriptionState state, bool forceRaise, CancellationToken ct)
{
var snapshots = await ReadAsync(state.TagReferences, ct).ConfigureAwait(false);
for (var i = 0; i < state.TagReferences.Count; i++)
{
var tagRef = state.TagReferences[i];
var current = snapshots[i];
var lastSeen = state.LastValues.TryGetValue(tagRef, out var prev) ? prev : default;
// Raise on first read (forceRaise) OR when the boxed value differs from last-known.
if (forceRaise || !Equals(lastSeen?.Value, current.Value) || lastSeen?.StatusCode != current.StatusCode)
{
state.LastValues[tagRef] = current;
OnDataChange?.Invoke(this, new DataChangeEventArgs(state.Handle, tagRef, current));
}
}
}
private sealed record SubscriptionState(
ModbusSubscriptionHandle Handle,
IReadOnlyList<string> TagReferences,
TimeSpan Interval,
CancellationTokenSource Cts)
{
public System.Collections.Concurrent.ConcurrentDictionary<string, DataValueSnapshot> LastValues { get; }
= new(StringComparer.OrdinalIgnoreCase);
}
private sealed record ModbusSubscriptionHandle(long Id) : ISubscriptionHandle
{
public string DiagnosticId => $"modbus-sub-{Id}";
}
// ---- IHostConnectivityProbe ----
public IReadOnlyList<HostConnectivityStatus> GetHostStatuses()
{
lock (_probeLock)
return [new HostConnectivityStatus(HostName, _hostState, _hostStateChangedUtc)];
}
/// <summary>
/// Host identifier surfaced to <c>IHostConnectivityProbe.GetHostStatuses</c> and the Admin UI.
/// Formatted as <c>host:port</c> so multiple Modbus drivers in the same server disambiguate
/// by endpoint without needing the driver-instance-id in the Admin dashboard.
/// </summary>
public string HostName => $"{_options.Host}:{_options.Port}";
private async Task ProbeLoopAsync(CancellationToken ct)
{
var transport = _transport; // captured reference; disposal tears the loop down via ct
while (!ct.IsCancellationRequested)
{
var success = false;
try
{
using var probeCts = CancellationTokenSource.CreateLinkedTokenSource(ct);
probeCts.CancelAfter(_options.Probe.Timeout);
var pdu = new byte[] { 0x03,
(byte)(_options.Probe.ProbeAddress >> 8),
(byte)(_options.Probe.ProbeAddress & 0xFF), 0x00, 0x01 };
_ = await transport!.SendAsync(_options.UnitId, pdu, probeCts.Token).ConfigureAwait(false);
success = true;
}
catch (OperationCanceledException) when (ct.IsCancellationRequested)
{
return;
}
catch
{
// transport / timeout / exception PDU — treated as Stopped below
}
TransitionTo(success ? HostState.Running : HostState.Stopped);
try { await Task.Delay(_options.Probe.Interval, ct).ConfigureAwait(false); }
catch (OperationCanceledException) { return; }
}
}
private void TransitionTo(HostState newState)
{
HostState old;
lock (_probeLock)
{
old = _hostState;
if (old == newState) return;
_hostState = newState;
_hostStateChangedUtc = DateTime.UtcNow;
}
OnHostStatusChanged?.Invoke(this, new HostStatusChangedEventArgs(HostName, old, newState));
}
// ---- codec ----
/// <summary>
/// How many 16-bit registers a given tag occupies. Accounts for multi-register logical
/// types (Int32/Float32 = 2 regs, Int64/Float64 = 4 regs) and for strings (rounded up
/// from 2 chars per register).
/// </summary>
internal static ushort RegisterCount(ModbusTagDefinition tag) => tag.DataType switch
{
ModbusDataType.Int16 or ModbusDataType.UInt16 or ModbusDataType.BitInRegister or ModbusDataType.Bcd16 => 1,
ModbusDataType.Int32 or ModbusDataType.UInt32 or ModbusDataType.Float32 or ModbusDataType.Bcd32 => 2,
ModbusDataType.Int64 or ModbusDataType.UInt64 or ModbusDataType.Float64 => 4,
ModbusDataType.String => (ushort)((tag.StringLength + 1) / 2), // 2 chars per register
_ => throw new InvalidOperationException($"Non-register data type {tag.DataType}"),
};
/// <summary>
/// Word-swap the input into the big-endian layout the decoders expect. For 2-register
/// types this reverses the two words; for 4-register types it reverses the four words
/// (PLC stored [hi-mid, low-mid, hi-high, low-high] → memory [hi-high, low-high, hi-mid, low-mid]).
/// </summary>
private static byte[] NormalizeWordOrder(ReadOnlySpan<byte> data, ModbusByteOrder order)
{
if (order == ModbusByteOrder.BigEndian) return data.ToArray();
var result = new byte[data.Length];
for (var word = 0; word < data.Length / 2; word++)
{
var srcWord = data.Length / 2 - 1 - word;
result[word * 2] = data[srcWord * 2];
result[word * 2 + 1] = data[srcWord * 2 + 1];
}
return result;
}
internal static object DecodeRegister(ReadOnlySpan<byte> data, ModbusTagDefinition tag)
{
switch (tag.DataType)
{
case ModbusDataType.Int16: return BinaryPrimitives.ReadInt16BigEndian(data);
case ModbusDataType.UInt16: return BinaryPrimitives.ReadUInt16BigEndian(data);
case ModbusDataType.Bcd16:
{
var raw = BinaryPrimitives.ReadUInt16BigEndian(data);
return (int)DecodeBcd(raw, nibbles: 4);
}
case ModbusDataType.Bcd32:
{
var b = NormalizeWordOrder(data, tag.ByteOrder);
var raw = BinaryPrimitives.ReadUInt32BigEndian(b);
return (int)DecodeBcd(raw, nibbles: 8);
}
case ModbusDataType.BitInRegister:
{
var raw = BinaryPrimitives.ReadUInt16BigEndian(data);
return (raw & (1 << tag.BitIndex)) != 0;
}
case ModbusDataType.Int32:
{
var b = NormalizeWordOrder(data, tag.ByteOrder);
return BinaryPrimitives.ReadInt32BigEndian(b);
}
case ModbusDataType.UInt32:
{
var b = NormalizeWordOrder(data, tag.ByteOrder);
return BinaryPrimitives.ReadUInt32BigEndian(b);
}
case ModbusDataType.Float32:
{
var b = NormalizeWordOrder(data, tag.ByteOrder);
return BinaryPrimitives.ReadSingleBigEndian(b);
}
case ModbusDataType.Int64:
{
var b = NormalizeWordOrder(data, tag.ByteOrder);
return BinaryPrimitives.ReadInt64BigEndian(b);
}
case ModbusDataType.UInt64:
{
var b = NormalizeWordOrder(data, tag.ByteOrder);
return BinaryPrimitives.ReadUInt64BigEndian(b);
}
case ModbusDataType.Float64:
{
var b = NormalizeWordOrder(data, tag.ByteOrder);
return BinaryPrimitives.ReadDoubleBigEndian(b);
}
case ModbusDataType.String:
{
// ASCII, 2 chars per register. HighByteFirst (standard) packs the first char in
// the high byte of each register; LowByteFirst (DL205/DL260) packs the first char
// in the low byte. Respect StringLength (truncate nul-padded regions).
var chars = new char[tag.StringLength];
for (var i = 0; i < tag.StringLength; i++)
{
var regIdx = i / 2;
var highByte = data[regIdx * 2];
var lowByte = data[regIdx * 2 + 1];
byte b;
if (tag.StringByteOrder == ModbusStringByteOrder.HighByteFirst)
b = (i % 2 == 0) ? highByte : lowByte;
else
b = (i % 2 == 0) ? lowByte : highByte;
if (b == 0) return new string(chars, 0, i);
chars[i] = (char)b;
}
return new string(chars);
}
default:
throw new InvalidOperationException($"Non-register data type {tag.DataType}");
}
}
internal static byte[] EncodeRegister(object? value, ModbusTagDefinition tag)
{
switch (tag.DataType)
{
case ModbusDataType.Int16:
{
var v = Convert.ToInt16(value);
var b = new byte[2]; BinaryPrimitives.WriteInt16BigEndian(b, v); return b;
}
case ModbusDataType.UInt16:
{
var v = Convert.ToUInt16(value);
var b = new byte[2]; BinaryPrimitives.WriteUInt16BigEndian(b, v); return b;
}
case ModbusDataType.Bcd16:
{
var v = Convert.ToUInt32(value);
if (v > 9999) throw new OverflowException($"BCD16 value {v} exceeds 4 decimal digits");
var raw = (ushort)EncodeBcd(v, nibbles: 4);
var b = new byte[2]; BinaryPrimitives.WriteUInt16BigEndian(b, raw); return b;
}
case ModbusDataType.Bcd32:
{
var v = Convert.ToUInt32(value);
if (v > 99_999_999u) throw new OverflowException($"BCD32 value {v} exceeds 8 decimal digits");
var raw = EncodeBcd(v, nibbles: 8);
var b = new byte[4]; BinaryPrimitives.WriteUInt32BigEndian(b, raw);
return NormalizeWordOrder(b, tag.ByteOrder);
}
case ModbusDataType.Int32:
{
var v = Convert.ToInt32(value);
var b = new byte[4]; BinaryPrimitives.WriteInt32BigEndian(b, v);
return NormalizeWordOrder(b, tag.ByteOrder);
}
case ModbusDataType.UInt32:
{
var v = Convert.ToUInt32(value);
var b = new byte[4]; BinaryPrimitives.WriteUInt32BigEndian(b, v);
return NormalizeWordOrder(b, tag.ByteOrder);
}
case ModbusDataType.Float32:
{
var v = Convert.ToSingle(value);
var b = new byte[4]; BinaryPrimitives.WriteSingleBigEndian(b, v);
return NormalizeWordOrder(b, tag.ByteOrder);
}
case ModbusDataType.Int64:
{
var v = Convert.ToInt64(value);
var b = new byte[8]; BinaryPrimitives.WriteInt64BigEndian(b, v);
return NormalizeWordOrder(b, tag.ByteOrder);
}
case ModbusDataType.UInt64:
{
var v = Convert.ToUInt64(value);
var b = new byte[8]; BinaryPrimitives.WriteUInt64BigEndian(b, v);
return NormalizeWordOrder(b, tag.ByteOrder);
}
case ModbusDataType.Float64:
{
var v = Convert.ToDouble(value);
var b = new byte[8]; BinaryPrimitives.WriteDoubleBigEndian(b, v);
return NormalizeWordOrder(b, tag.ByteOrder);
}
case ModbusDataType.String:
{
var s = Convert.ToString(value) ?? string.Empty;
var regs = (tag.StringLength + 1) / 2;
var b = new byte[regs * 2];
for (var i = 0; i < tag.StringLength && i < s.Length; i++)
{
var regIdx = i / 2;
var destIdx = tag.StringByteOrder == ModbusStringByteOrder.HighByteFirst
? (i % 2 == 0 ? regIdx * 2 : regIdx * 2 + 1)
: (i % 2 == 0 ? regIdx * 2 + 1 : regIdx * 2);
b[destIdx] = (byte)s[i];
}
// remaining bytes stay 0 — nul-padded per PLC convention
return b;
}
case ModbusDataType.BitInRegister:
throw new InvalidOperationException(
"BitInRegister writes require a read-modify-write; not supported in PR 24 (separate follow-up).");
default:
throw new InvalidOperationException($"Non-register data type {tag.DataType}");
}
}
private static DriverDataType MapDataType(ModbusDataType t) => t switch
{
ModbusDataType.Bool or ModbusDataType.BitInRegister => DriverDataType.Boolean,
ModbusDataType.Int16 or ModbusDataType.Int32 => DriverDataType.Int32,
ModbusDataType.UInt16 or ModbusDataType.UInt32 => DriverDataType.Int32,
ModbusDataType.Int64 or ModbusDataType.UInt64 => DriverDataType.Int32, // widening to Int32 loses precision; PR 25 adds Int64 to DriverDataType
ModbusDataType.Float32 => DriverDataType.Float32,
ModbusDataType.Float64 => DriverDataType.Float64,
ModbusDataType.String => DriverDataType.String,
ModbusDataType.Bcd16 or ModbusDataType.Bcd32 => DriverDataType.Int32,
_ => DriverDataType.Int32,
};
/// <summary>
/// Decode an N-nibble binary-coded-decimal value. Each nibble of <paramref name="raw"/>
/// encodes one decimal digit (most-significant nibble first). Rejects nibbles &gt; 9 —
/// the hardware sometimes produces garbage during transitions and silent non-BCD reads
/// would quietly corrupt the caller's data.
/// </summary>
internal static uint DecodeBcd(uint raw, int nibbles)
{
uint result = 0;
for (var i = nibbles - 1; i >= 0; i--)
{
var digit = (raw >> (i * 4)) & 0xF;
if (digit > 9)
throw new InvalidDataException(
$"Non-BCD nibble 0x{digit:X} at position {i} of raw=0x{raw:X}");
result = result * 10 + digit;
}
return result;
}
/// <summary>
/// Encode a decimal value as N-nibble BCD. Caller is responsible for range-checking
/// against the nibble capacity (10^nibbles - 1).
/// </summary>
internal static uint EncodeBcd(uint value, int nibbles)
{
uint result = 0;
for (var i = 0; i < nibbles; i++)
{
var digit = value % 10;
result |= digit << (i * 4);
value /= 10;
}
return result;
}
private IModbusTransport RequireTransport() =>
_transport ?? throw new InvalidOperationException("ModbusDriver not initialized");
private const uint StatusBadInternalError = 0x80020000u;
private const uint StatusBadNodeIdUnknown = 0x80340000u;
private const uint StatusBadNotWritable = 0x803B0000u;
private const uint StatusBadOutOfRange = 0x803C0000u;
private const uint StatusBadNotSupported = 0x803D0000u;
private const uint StatusBadDeviceFailure = 0x80550000u;
private const uint StatusBadCommunicationError = 0x80050000u;
/// <summary>
/// Map a server-returned Modbus exception code to the most informative OPC UA
/// StatusCode. Keeps the driver's outward-facing status surface aligned with what a
/// Modbus engineer would expect when reading the spec: exception 02 (Illegal Data
/// Address) surfaces as BadOutOfRange so clients can distinguish "tag wrong" from
/// generic BadInternalError, exception 04 (Server Failure) as BadDeviceFailure so
/// operators see a CPU-mode problem rather than a driver bug, etc. Per
/// <c>docs/v2/dl205.md</c>, DL205/DL260 returns only codes 01-04 — no proprietary
/// extensions.
/// </summary>
internal static uint MapModbusExceptionToStatus(byte exceptionCode) => exceptionCode switch
{
0x01 => StatusBadNotSupported, // Illegal Function — FC not in supported list
0x02 => StatusBadOutOfRange, // Illegal Data Address — register outside mapped range
0x03 => StatusBadOutOfRange, // Illegal Data Value — quantity over per-FC cap
0x04 => StatusBadDeviceFailure, // Server Failure — CPU in PROGRAM mode during protected write
0x05 or 0x06 => StatusBadDeviceFailure, // Acknowledge / Server Busy — long-running op / busy
0x0A or 0x0B => StatusBadCommunicationError, // Gateway path unavailable / target failed to respond
_ => StatusBadInternalError,
};
public void Dispose() => DisposeAsync().AsTask().GetAwaiter().GetResult();
public async ValueTask DisposeAsync()
{
if (_transport is not null) await _transport.DisposeAsync().ConfigureAwait(false);
_transport = null;
}
}

View File

@@ -0,0 +1,161 @@
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Driver.Modbus;
/// <summary>
/// Modbus TCP driver configuration. Bound from the driver's <c>DriverConfig</c> JSON at
/// <c>DriverHost.RegisterAsync</c>. Every register the driver exposes appears in
/// <see cref="Tags"/>; names become the OPC UA browse name + full reference.
/// </summary>
public sealed class ModbusDriverOptions
{
public string Host { get; init; } = "127.0.0.1";
public int Port { get; init; } = 502;
public byte UnitId { get; init; } = 1;
public TimeSpan Timeout { get; init; } = TimeSpan.FromSeconds(2);
/// <summary>Pre-declared tag map. Modbus has no discovery protocol — the driver returns exactly these.</summary>
public IReadOnlyList<ModbusTagDefinition> Tags { get; init; } = [];
/// <summary>
/// Background connectivity-probe settings. When <see cref="ModbusProbeOptions.Enabled"/>
/// is true the driver runs a tick loop that issues a cheap FC03 at register 0 every
/// <see cref="ModbusProbeOptions.Interval"/> and raises <c>OnHostStatusChanged</c> on
/// Running ↔ Stopped transitions. The Admin UI / OPC UA clients see the state through
/// <see cref="IHostConnectivityProbe"/>.
/// </summary>
public ModbusProbeOptions Probe { get; init; } = new();
/// <summary>
/// Maximum registers per FC03 (Read Holding Registers) / FC04 (Read Input Registers)
/// transaction. Modbus-TCP spec allows 125; many device families impose lower caps:
/// AutomationDirect DL205/DL260 cap at <c>128</c>, Mitsubishi Q/FX3U cap at <c>64</c>,
/// Omron CJ/CS cap at <c>125</c>. Set to the lowest cap across the devices this driver
/// instance talks to; the driver auto-chunks larger reads into consecutive requests.
/// Default <c>125</c> — the spec maximum, safe against any conforming server. Setting
/// to <c>0</c> disables the cap (discouraged — the spec upper bound still applies).
/// </summary>
public ushort MaxRegistersPerRead { get; init; } = 125;
/// <summary>
/// Maximum registers per FC16 (Write Multiple Registers) transaction. Spec maximum is
/// <c>123</c>; DL205/DL260 cap at <c>100</c>. Matching caller-vs-device semantics:
/// exceeding the cap currently throws (writes aren't auto-chunked because a partial
/// write across two FC16 calls is no longer atomic — caller must explicitly opt in
/// by shortening the tag's <c>StringLength</c> or splitting it into multiple tags).
/// </summary>
public ushort MaxRegistersPerWrite { get; init; } = 123;
/// <summary>
/// When <c>true</c> (default) the built-in <see cref="ModbusTcpTransport"/> detects
/// mid-transaction socket failures (<see cref="System.IO.EndOfStreamException"/>,
/// <see cref="System.Net.Sockets.SocketException"/>) and transparently reconnects +
/// retries the PDU exactly once. Required for DL205/DL260 because the H2-ECOM100
/// does not send TCP keepalives — intermediate NAT / firewall devices silently close
/// idle sockets and the first send after the drop would otherwise surface as a
/// connection error to the caller even though the PLC is up.
/// </summary>
public bool AutoReconnect { get; init; } = true;
}
public sealed class ModbusProbeOptions
{
public bool Enabled { get; init; } = true;
public TimeSpan Interval { get; init; } = TimeSpan.FromSeconds(5);
public TimeSpan Timeout { get; init; } = TimeSpan.FromSeconds(2);
/// <summary>Register to read for the probe. Zero is usually safe; override for PLCs that lock register 0.</summary>
public ushort ProbeAddress { get; init; } = 0;
}
/// <summary>
/// One Modbus-backed OPC UA variable. Address is zero-based (Modbus spec numbering, not
/// the documentation's 1-based coil/register conventions). Multi-register types
/// (Int32/UInt32/Float32 = 2 regs; Int64/UInt64/Float64 = 4 regs) respect the
/// <see cref="ByteOrder"/> field — real-world PLCs disagree on word ordering.
/// </summary>
/// <param name="Name">
/// Tag name, used for both the OPC UA browse name and the driver's full reference. Must be
/// unique within the driver.
/// </param>
/// <param name="Region">Coils / DiscreteInputs / InputRegisters / HoldingRegisters.</param>
/// <param name="Address">Zero-based address within the region.</param>
/// <param name="DataType">
/// Logical data type. See <see cref="ModbusDataType"/> for the register count each encodes.
/// </param>
/// <param name="Writable">When true and Region supports writes (Coils / HoldingRegisters), IWritable routes writes here.</param>
/// <param name="ByteOrder">Word ordering for multi-register types. Ignored for Bool / Int16 / UInt16 / BitInRegister / String.</param>
/// <param name="BitIndex">For <c>DataType = BitInRegister</c>: which bit of the holding register (0-15, LSB-first).</param>
/// <param name="StringLength">For <c>DataType = String</c>: number of ASCII characters (2 per register, rounded up).</param>
/// <param name="StringByteOrder">
/// Per-register byte order for <c>DataType = String</c>. Standard Modbus packs the first
/// character in the high byte (<see cref="ModbusStringByteOrder.HighByteFirst"/>).
/// AutomationDirect DirectLOGIC (DL205/DL260) and a few legacy families pack the first
/// character in the low byte instead — see <c>docs/v2/dl205.md</c> §strings.
/// </param>
public sealed record ModbusTagDefinition(
string Name,
ModbusRegion Region,
ushort Address,
ModbusDataType DataType,
bool Writable = true,
ModbusByteOrder ByteOrder = ModbusByteOrder.BigEndian,
byte BitIndex = 0,
ushort StringLength = 0,
ModbusStringByteOrder StringByteOrder = ModbusStringByteOrder.HighByteFirst);
public enum ModbusRegion { Coils, DiscreteInputs, InputRegisters, HoldingRegisters }
public enum ModbusDataType
{
Bool,
Int16,
UInt16,
Int32,
UInt32,
Int64,
UInt64,
Float32,
Float64,
/// <summary>Single bit within a holding register. <see cref="ModbusTagDefinition.BitIndex"/> selects 0-15 LSB-first.</summary>
BitInRegister,
/// <summary>ASCII string packed 2 chars per register, <see cref="ModbusTagDefinition.StringLength"/> characters long.</summary>
String,
/// <summary>
/// 16-bit binary-coded decimal. Each nibble encodes one decimal digit (0-9). Register
/// value <c>0x1234</c> decodes as decimal <c>1234</c> — NOT binary <c>0x04D2 = 4660</c>.
/// DL205/DL260 and several Mitsubishi / Omron families store timers, counters, and
/// operator-facing numerics as BCD by default.
/// </summary>
Bcd16,
/// <summary>
/// 32-bit (two-register) BCD. Decodes 8 decimal digits. Word ordering follows
/// <see cref="ModbusTagDefinition.ByteOrder"/> the same way <see cref="Int32"/> does.
/// </summary>
Bcd32,
}
/// <summary>
/// Word ordering for multi-register types. Modbus TCP standard is <see cref="BigEndian"/>
/// (ABCD for 32-bit: high word at the lower address). Many PLCs — Siemens S7, several
/// Allen-Bradley series, some Modicon families — use <see cref="WordSwap"/> (CDAB), which
/// keeps bytes big-endian within each register but reverses the word pair(s).
/// </summary>
public enum ModbusByteOrder
{
BigEndian,
WordSwap,
}
/// <summary>
/// Per-register byte order for ASCII strings packed 2 chars per register. Standard Modbus
/// convention is <see cref="HighByteFirst"/> — the first character of each pair occupies
/// the high byte of the register. AutomationDirect DirectLOGIC (DL205, DL260, DL350) and a
/// handful of legacy controllers pack <see cref="LowByteFirst"/>, which inverts that within
/// each register. Word ordering across multiple registers is always ascending address for
/// strings — only the byte order inside each register flips.
/// </summary>
public enum ModbusStringByteOrder
{
HighByteFirst,
LowByteFirst,
}

View File

@@ -0,0 +1,200 @@
using System.Net.Sockets;
namespace ZB.MOM.WW.OtOpcUa.Driver.Modbus;
/// <summary>
/// Concrete Modbus TCP transport. Wraps a single <see cref="TcpClient"/> and serializes
/// requests so at most one transaction is in-flight at a time — Modbus servers typically
/// support concurrent transactions, but the single-flight model keeps the wire trace
/// easy to diagnose and avoids interleaved-response correlation bugs.
/// </summary>
/// <remarks>
/// <para>
/// Survives mid-transaction socket drops: when a send/read fails with a socket-level
/// error (<see cref="IOException"/>, <see cref="SocketException"/>, <see cref="EndOfStreamException"/>)
/// the transport disposes the dead socket, reconnects, and retries the PDU exactly
/// once. Deliberately limited to a single retry — further failures bubble up so the
/// driver's health surface reflects the real state instead of masking a dead PLC.
/// </para>
/// <para>
/// Why this matters for DL205/DL260: the AutomationDirect H2-ECOM100 does NOT send
/// TCP keepalives per <c>docs/v2/dl205.md</c> §behavioral-oddities, so any NAT/firewall
/// between the gateway and PLC can silently close an idle socket after 2-5 minutes.
/// Also enables OS-level <c>SO_KEEPALIVE</c> so the driver's own side detects a stuck
/// socket in reasonable time even when the application is mostly idle.
/// </para>
/// </remarks>
public sealed class ModbusTcpTransport : IModbusTransport
{
private readonly string _host;
private readonly int _port;
private readonly TimeSpan _timeout;
private readonly bool _autoReconnect;
private readonly SemaphoreSlim _gate = new(1, 1);
private TcpClient? _client;
private NetworkStream? _stream;
private ushort _nextTx;
private bool _disposed;
public ModbusTcpTransport(string host, int port, TimeSpan timeout, bool autoReconnect = true)
{
_host = host;
_port = port;
_timeout = timeout;
_autoReconnect = autoReconnect;
}
public async Task ConnectAsync(CancellationToken ct)
{
// Resolve the host explicitly + prefer IPv4. .NET's TcpClient default-constructor is
// dual-stack (IPv6 first, fallback to IPv4) — but most Modbus TCP devices (PLCs and
// simulators like pymodbus) bind 0.0.0.0 only, so the IPv6 attempt times out and we
// burn the entire ConnectAsync budget before even trying IPv4. Resolving first +
// dialing the IPv4 address directly sidesteps that.
var addresses = await System.Net.Dns.GetHostAddressesAsync(_host, ct).ConfigureAwait(false);
var ipv4 = System.Linq.Enumerable.FirstOrDefault(addresses,
a => a.AddressFamily == System.Net.Sockets.AddressFamily.InterNetwork);
var target = ipv4 ?? (addresses.Length > 0 ? addresses[0] : System.Net.IPAddress.Loopback);
_client = new TcpClient(target.AddressFamily);
EnableKeepAlive(_client);
using var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
cts.CancelAfter(_timeout);
await _client.ConnectAsync(target, _port, cts.Token).ConfigureAwait(false);
_stream = _client.GetStream();
}
/// <summary>
/// Enable SO_KEEPALIVE with aggressive probe timing. DL205/DL260 doesn't send keepalives
/// itself; having the OS probe the socket every ~30s lets the driver notice a dead PLC
/// or broken NAT path long before the default 2-hour Windows idle timeout fires.
/// Non-fatal if the underlying OS rejects the option (some older Linux / container
/// sandboxes don't expose the fine-grained timing levers — the driver still works,
/// application-level probe still detects problems).
/// </summary>
private static void EnableKeepAlive(TcpClient client)
{
try
{
client.Client.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.KeepAlive, true);
client.Client.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveTime, 30);
client.Client.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveInterval, 10);
client.Client.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveRetryCount, 3);
}
catch { /* best-effort; older OSes may not expose the granular knobs */ }
}
public async Task<byte[]> SendAsync(byte unitId, byte[] pdu, CancellationToken ct)
{
if (_disposed) throw new ObjectDisposedException(nameof(ModbusTcpTransport));
if (_stream is null) throw new InvalidOperationException("Transport not connected");
await _gate.WaitAsync(ct).ConfigureAwait(false);
try
{
try
{
return await SendOnceAsync(unitId, pdu, ct).ConfigureAwait(false);
}
catch (Exception ex) when (_autoReconnect && IsSocketLevelFailure(ex))
{
// Mid-transaction drop: tear down the dead socket, reconnect, resend. Single
// retry — if it fails again, let it propagate so health/status reflect reality.
await TearDownAsync().ConfigureAwait(false);
await ConnectAsync(ct).ConfigureAwait(false);
return await SendOnceAsync(unitId, pdu, ct).ConfigureAwait(false);
}
}
finally
{
_gate.Release();
}
}
private async Task<byte[]> SendOnceAsync(byte unitId, byte[] pdu, CancellationToken ct)
{
if (_stream is null) throw new InvalidOperationException("Transport not connected");
var txId = ++_nextTx;
// MBAP: [TxId(2)][Proto=0(2)][Length(2)][UnitId(1)] + PDU
var adu = new byte[7 + pdu.Length];
adu[0] = (byte)(txId >> 8);
adu[1] = (byte)(txId & 0xFF);
// protocol id already zero
var len = (ushort)(1 + pdu.Length); // unit id + pdu
adu[4] = (byte)(len >> 8);
adu[5] = (byte)(len & 0xFF);
adu[6] = unitId;
Buffer.BlockCopy(pdu, 0, adu, 7, pdu.Length);
using var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
cts.CancelAfter(_timeout);
await _stream.WriteAsync(adu.AsMemory(), cts.Token).ConfigureAwait(false);
await _stream.FlushAsync(cts.Token).ConfigureAwait(false);
var header = new byte[7];
await ReadExactlyAsync(_stream, header, cts.Token).ConfigureAwait(false);
var respTxId = (ushort)((header[0] << 8) | header[1]);
if (respTxId != txId)
throw new InvalidDataException($"Modbus TxId mismatch: expected {txId} got {respTxId}");
var respLen = (ushort)((header[4] << 8) | header[5]);
if (respLen < 1) throw new InvalidDataException($"Modbus response length too small: {respLen}");
var respPdu = new byte[respLen - 1];
await ReadExactlyAsync(_stream, respPdu, cts.Token).ConfigureAwait(false);
// Exception PDU: function code has high bit set.
if ((respPdu[0] & 0x80) != 0)
{
var fc = (byte)(respPdu[0] & 0x7F);
var ex = respPdu[1];
throw new ModbusException(fc, ex, $"Modbus exception fc={fc} code={ex}");
}
return respPdu;
}
/// <summary>
/// Distinguish socket-layer failures (eligible for reconnect-and-retry) from
/// protocol-layer failures (must propagate — retrying the same PDU won't help if the
/// PLC just returned exception 02 Illegal Data Address).
/// </summary>
private static bool IsSocketLevelFailure(Exception ex) =>
ex is EndOfStreamException
|| ex is IOException
|| ex is SocketException
|| ex is ObjectDisposedException;
private async Task TearDownAsync()
{
try { if (_stream is not null) await _stream.DisposeAsync().ConfigureAwait(false); }
catch { /* best-effort */ }
_stream = null;
try { _client?.Dispose(); } catch { }
_client = null;
}
private static async Task ReadExactlyAsync(Stream s, byte[] buf, CancellationToken ct)
{
var read = 0;
while (read < buf.Length)
{
var n = await s.ReadAsync(buf.AsMemory(read), ct).ConfigureAwait(false);
if (n == 0) throw new EndOfStreamException("Modbus socket closed mid-response");
read += n;
}
}
public async ValueTask DisposeAsync()
{
if (_disposed) return;
_disposed = true;
try
{
if (_stream is not null) await _stream.DisposeAsync().ConfigureAwait(false);
}
catch { /* best-effort */ }
_client?.Dispose();
_gate.Dispose();
}
}

View File

@@ -0,0 +1,27 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net10.0</TargetFramework>
<Nullable>enable</Nullable>
<ImplicitUsings>enable</ImplicitUsings>
<LangVersion>latest</LangVersion>
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
<GenerateDocumentationFile>true</GenerateDocumentationFile>
<NoWarn>$(NoWarn);CS1591</NoWarn>
<RootNamespace>ZB.MOM.WW.OtOpcUa.Driver.Modbus</RootNamespace>
</PropertyGroup>
<ItemGroup>
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Core.Abstractions\ZB.MOM.WW.OtOpcUa.Core.Abstractions.csproj"/>
</ItemGroup>
<ItemGroup>
<InternalsVisibleTo Include="ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests"/>
</ItemGroup>
<ItemGroup>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>
</ItemGroup>
</Project>

View File

@@ -0,0 +1,497 @@
using Opc.Ua;
using Opc.Ua.Client;
using Opc.Ua.Configuration;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient;
/// <summary>
/// OPC UA Client (gateway) driver. Opens a <see cref="Session"/> against a remote OPC UA
/// server and re-exposes its address space through the local OtOpcUa server. PR 66 ships
/// the scaffold: <see cref="IDriver"/> only (connect / close / health). Browse, read,
/// write, subscribe, and probe land in PRs 67-69.
/// </summary>
/// <remarks>
/// <para>
/// Builds its own <see cref="ApplicationConfiguration"/> rather than reusing
/// <c>Client.Shared</c> — Client.Shared is oriented at the interactive CLI; this
/// driver is an always-on service component with different session-lifetime needs
/// (keep-alive monitor, session transfer on reconnect, multi-year uptime).
/// </para>
/// <para>
/// <b>Session lifetime</b>: a single <see cref="Session"/> per driver instance.
/// Subscriptions multiplex onto that session; SDK reconnect handler takes the session
/// down and brings it back up on remote-server restart — the driver must re-send
/// subscriptions + TransferSubscriptions on reconnect to avoid dangling
/// monitored-item handles. That mechanic lands in PR 69.
/// </para>
/// </remarks>
public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string driverInstanceId)
: IDriver, ITagDiscovery, IReadable, IWritable, IDisposable, IAsyncDisposable
{
// OPC UA StatusCode constants the driver surfaces for local-side faults. Upstream-server
// StatusCodes are passed through verbatim per driver-specs.md §8 "cascading quality" —
// downstream clients need to distinguish 'remote source down' from 'local driver failure'.
private const uint StatusBadNodeIdInvalid = 0x80330000u;
private const uint StatusBadInternalError = 0x80020000u;
private const uint StatusBadCommunicationError = 0x80050000u;
private readonly OpcUaClientDriverOptions _options = options;
private readonly SemaphoreSlim _gate = new(1, 1);
/// <summary>Active OPC UA session. Null until <see cref="InitializeAsync"/> returns cleanly.</summary>
internal ISession? Session { get; private set; }
/// <summary>Per-connection gate. PRs 67+ serialize read/write/browse on this.</summary>
internal SemaphoreSlim Gate => _gate;
private DriverHealth _health = new(DriverState.Unknown, null, null);
private bool _disposed;
public string DriverInstanceId => driverInstanceId;
public string DriverType => "OpcUaClient";
public async Task InitializeAsync(string driverConfigJson, CancellationToken cancellationToken)
{
_health = new DriverHealth(DriverState.Initializing, null, null);
try
{
var appConfig = await BuildApplicationConfigurationAsync(cancellationToken).ConfigureAwait(false);
// Endpoint selection: let the stack pick the best matching endpoint for the
// requested security policy/mode so the driver doesn't have to hand-validate.
// UseSecurity=false when SecurityMode=None shortcuts around cert validation
// entirely and is the typical dev-bench configuration.
var useSecurity = _options.SecurityMode != OpcUaSecurityMode.None;
// The non-obsolete SelectEndpointAsync overloads all require an ITelemetryContext
// parameter. Passing null is valid — the SDK falls through to its built-in default
// trace sink. Plumbing a telemetry context through every driver surface is out of
// scope; the driver emits its own logs via the health surface anyway.
var selected = await CoreClientUtils.SelectEndpointAsync(
appConfig, _options.EndpointUrl, useSecurity,
telemetry: null!,
ct: cancellationToken).ConfigureAwait(false);
var endpointConfig = EndpointConfiguration.Create(appConfig);
endpointConfig.OperationTimeout = (int)_options.Timeout.TotalMilliseconds;
var endpoint = new ConfiguredEndpoint(null, selected, endpointConfig);
var identity = _options.AuthType switch
{
OpcUaAuthType.Anonymous => new UserIdentity(new AnonymousIdentityToken()),
// The UserIdentity(string, string) overload was removed in favour of
// (string, byte[]) to make the password encoding explicit. UTF-8 is the
// overwhelmingly common choice for Basic256Sha256-secured sessions.
OpcUaAuthType.Username => new UserIdentity(
_options.Username ?? string.Empty,
System.Text.Encoding.UTF8.GetBytes(_options.Password ?? string.Empty)),
OpcUaAuthType.Certificate => throw new NotSupportedException(
"Certificate authentication lands in a follow-up PR; for now use Anonymous or Username"),
_ => new UserIdentity(new AnonymousIdentityToken()),
};
// All Session.Create* static methods are marked [Obsolete] in SDK 1.5.378; the
// non-obsolete path is DefaultSessionFactory.Instance.CreateAsync (which is the
// 8-arg signature matching our driver config — ApplicationConfiguration +
// ConfiguredEndpoint, no transport-waiting-connection or reverse-connect-manager
// required for the standard opc.tcp direct-connect case).
// DefaultSessionFactory's parameterless ctor is also obsolete in 1.5.378; the
// current constructor requires an ITelemetryContext. Passing null is tolerated —
// the factory falls back to its internal default sink, same as the telemetry:null
// on SelectEndpointAsync above.
var session = await new DefaultSessionFactory(telemetry: null!).CreateAsync(
appConfig,
endpoint,
false, // updateBeforeConnect
_options.SessionName,
(uint)_options.SessionTimeout.TotalMilliseconds,
identity,
null, // preferredLocales
cancellationToken).ConfigureAwait(false);
session.KeepAliveInterval = (int)_options.KeepAliveInterval.TotalMilliseconds;
Session = session;
_health = new DriverHealth(DriverState.Healthy, DateTime.UtcNow, null);
}
catch (Exception ex)
{
try { if (Session is Session s) await s.CloseAsync().ConfigureAwait(false); } catch { }
Session = null;
_health = new DriverHealth(DriverState.Faulted, null, ex.Message);
throw;
}
}
/// <summary>
/// Build a minimal in-memory <see cref="ApplicationConfiguration"/>. Certificates live
/// under the OS user profile — on Windows that's <c>%LocalAppData%\OtOpcUa\pki</c>
/// — so multiple driver instances in the same OtOpcUa server process share one
/// certificate store without extra config.
/// </summary>
private async Task<ApplicationConfiguration> BuildApplicationConfigurationAsync(CancellationToken ct)
{
// The default ctor is obsolete in favour of the ITelemetryContext overload; suppress
// locally rather than plumbing a telemetry context all the way through the driver
// surface — the driver emits no per-request telemetry of its own and the SDK's
// internal fallback is fine for a gateway use case.
#pragma warning disable CS0618
var app = new ApplicationInstance
{
ApplicationName = _options.SessionName,
ApplicationType = ApplicationType.Client,
};
#pragma warning restore CS0618
var pkiRoot = Path.Combine(
Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData),
"OtOpcUa", "pki");
var config = new ApplicationConfiguration
{
ApplicationName = _options.SessionName,
ApplicationType = ApplicationType.Client,
ApplicationUri = _options.ApplicationUri,
SecurityConfiguration = new SecurityConfiguration
{
ApplicationCertificate = new CertificateIdentifier
{
StoreType = CertificateStoreType.Directory,
StorePath = Path.Combine(pkiRoot, "own"),
SubjectName = $"CN={_options.SessionName}",
},
TrustedPeerCertificates = new CertificateTrustList
{
StoreType = CertificateStoreType.Directory,
StorePath = Path.Combine(pkiRoot, "trusted"),
},
TrustedIssuerCertificates = new CertificateTrustList
{
StoreType = CertificateStoreType.Directory,
StorePath = Path.Combine(pkiRoot, "issuers"),
},
RejectedCertificateStore = new CertificateTrustList
{
StoreType = CertificateStoreType.Directory,
StorePath = Path.Combine(pkiRoot, "rejected"),
},
AutoAcceptUntrustedCertificates = _options.AutoAcceptCertificates,
},
TransportQuotas = new TransportQuotas { OperationTimeout = (int)_options.Timeout.TotalMilliseconds },
ClientConfiguration = new ClientConfiguration
{
DefaultSessionTimeout = (int)_options.SessionTimeout.TotalMilliseconds,
},
DisableHiResClock = true,
};
await config.ValidateAsync(ApplicationType.Client, ct).ConfigureAwait(false);
// Attach a cert-validator handler that honours the AutoAccept flag. Without this,
// AutoAcceptUntrustedCertificates on the config alone isn't always enough in newer
// SDK versions — the validator raises an event the app has to handle.
if (_options.AutoAcceptCertificates)
{
config.CertificateValidator.CertificateValidation += (s, e) =>
{
if (e.Error.StatusCode == StatusCodes.BadCertificateUntrusted)
e.Accept = true;
};
}
// Ensure an application certificate exists. The SDK auto-generates one if missing.
app.ApplicationConfiguration = config;
await app.CheckApplicationInstanceCertificatesAsync(silent: true, lifeTimeInMonths: null, ct)
.ConfigureAwait(false);
return config;
}
public async Task ReinitializeAsync(string driverConfigJson, CancellationToken cancellationToken)
{
await ShutdownAsync(cancellationToken).ConfigureAwait(false);
await InitializeAsync(driverConfigJson, cancellationToken).ConfigureAwait(false);
}
public async Task ShutdownAsync(CancellationToken cancellationToken)
{
try { if (Session is Session s) await s.CloseAsync(cancellationToken).ConfigureAwait(false); }
catch { /* best-effort */ }
try { Session?.Dispose(); } catch { }
Session = null;
_health = new DriverHealth(DriverState.Unknown, _health.LastSuccessfulRead, null);
}
public DriverHealth GetHealth() => _health;
public long GetMemoryFootprint() => 0;
public Task FlushOptionalCachesAsync(CancellationToken cancellationToken) => Task.CompletedTask;
// ---- IReadable ----
public async Task<IReadOnlyList<DataValueSnapshot>> ReadAsync(
IReadOnlyList<string> fullReferences, CancellationToken cancellationToken)
{
var session = RequireSession();
var results = new DataValueSnapshot[fullReferences.Count];
var now = DateTime.UtcNow;
// Parse NodeIds up-front. Tags whose reference doesn't parse get BadNodeIdInvalid
// and are omitted from the wire request — saves a round-trip against the upstream
// server for a fault the driver can detect locally.
var toSend = new ReadValueIdCollection();
var indexMap = new List<int>(fullReferences.Count); // maps wire-index -> results-index
for (var i = 0; i < fullReferences.Count; i++)
{
if (!TryParseNodeId(session, fullReferences[i], out var nodeId))
{
results[i] = new DataValueSnapshot(null, StatusBadNodeIdInvalid, null, now);
continue;
}
toSend.Add(new ReadValueId { NodeId = nodeId, AttributeId = Attributes.Value });
indexMap.Add(i);
}
if (toSend.Count == 0) return results;
await _gate.WaitAsync(cancellationToken).ConfigureAwait(false);
try
{
try
{
var resp = await session.ReadAsync(
requestHeader: null,
maxAge: 0,
timestampsToReturn: TimestampsToReturn.Both,
nodesToRead: toSend,
ct: cancellationToken).ConfigureAwait(false);
var values = resp.Results;
for (var w = 0; w < values.Count; w++)
{
var r = indexMap[w];
var dv = values[w];
// Preserve the upstream StatusCode verbatim — including Bad codes per
// §8's cascading-quality rule. Also preserve SourceTimestamp so downstream
// clients can detect stale upstream data.
results[r] = new DataValueSnapshot(
Value: dv.Value,
StatusCode: dv.StatusCode.Code,
SourceTimestampUtc: dv.SourceTimestamp == DateTime.MinValue ? null : dv.SourceTimestamp,
ServerTimestampUtc: dv.ServerTimestamp == DateTime.MinValue ? now : dv.ServerTimestamp);
}
_health = new DriverHealth(DriverState.Healthy, now, null);
}
catch (Exception ex)
{
// Transport / timeout / session-dropped — fan out the same fault across every
// tag in this batch. Per-tag StatusCode stays BadCommunicationError (not
// BadInternalError) so operators distinguish "upstream unreachable" from
// "driver bug".
for (var w = 0; w < indexMap.Count; w++)
{
var r = indexMap[w];
results[r] = new DataValueSnapshot(null, StatusBadCommunicationError, null, now);
}
_health = new DriverHealth(DriverState.Degraded, _health.LastSuccessfulRead, ex.Message);
}
}
finally { _gate.Release(); }
return results;
}
// ---- IWritable ----
public async Task<IReadOnlyList<WriteResult>> WriteAsync(
IReadOnlyList<Core.Abstractions.WriteRequest> writes, CancellationToken cancellationToken)
{
var session = RequireSession();
var results = new WriteResult[writes.Count];
var toSend = new WriteValueCollection();
var indexMap = new List<int>(writes.Count);
for (var i = 0; i < writes.Count; i++)
{
if (!TryParseNodeId(session, writes[i].FullReference, out var nodeId))
{
results[i] = new WriteResult(StatusBadNodeIdInvalid);
continue;
}
toSend.Add(new WriteValue
{
NodeId = nodeId,
AttributeId = Attributes.Value,
Value = new DataValue(new Variant(writes[i].Value)),
});
indexMap.Add(i);
}
if (toSend.Count == 0) return results;
await _gate.WaitAsync(cancellationToken).ConfigureAwait(false);
try
{
try
{
var resp = await session.WriteAsync(
requestHeader: null,
nodesToWrite: toSend,
ct: cancellationToken).ConfigureAwait(false);
var codes = resp.Results;
for (var w = 0; w < codes.Count; w++)
{
var r = indexMap[w];
// Pass upstream WriteResult StatusCode through verbatim. Success codes
// include Good (0) and any warning-level Good* variants; anything with
// the severity bits set is a Bad.
results[r] = new WriteResult(codes[w].Code);
}
}
catch (Exception)
{
for (var w = 0; w < indexMap.Count; w++)
results[indexMap[w]] = new WriteResult(StatusBadCommunicationError);
}
}
finally { _gate.Release(); }
return results;
}
/// <summary>
/// Parse a tag's full-reference string as a NodeId. Accepts the standard OPC UA
/// serialized forms (<c>ns=2;s=…</c>, <c>i=2253</c>, <c>ns=4;g=…</c>, <c>ns=3;b=…</c>).
/// Empty + malformed strings return false; the driver surfaces that as
/// <see cref="StatusBadNodeIdInvalid"/> without a wire round-trip.
/// </summary>
internal static bool TryParseNodeId(ISession session, string fullReference, out NodeId nodeId)
{
nodeId = NodeId.Null;
if (string.IsNullOrWhiteSpace(fullReference)) return false;
try
{
nodeId = NodeId.Parse(session.MessageContext, fullReference);
return !NodeId.IsNull(nodeId);
}
catch
{
return false;
}
}
private ISession RequireSession() =>
Session ?? throw new InvalidOperationException("OpcUaClientDriver not initialized");
// ---- ITagDiscovery ----
public async Task DiscoverAsync(IAddressSpaceBuilder builder, CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(builder);
var session = RequireSession();
var root = !string.IsNullOrEmpty(_options.BrowseRoot)
? NodeId.Parse(session.MessageContext, _options.BrowseRoot)
: ObjectIds.ObjectsFolder;
var rootFolder = builder.Folder("Remote", "Remote");
var visited = new HashSet<NodeId>();
var discovered = 0;
await _gate.WaitAsync(cancellationToken).ConfigureAwait(false);
try
{
await BrowseRecursiveAsync(session, root, rootFolder, visited,
depth: 0, discovered: () => discovered, increment: () => discovered++,
ct: cancellationToken).ConfigureAwait(false);
}
finally { _gate.Release(); }
}
private async Task BrowseRecursiveAsync(
ISession session, NodeId node, IAddressSpaceBuilder folder, HashSet<NodeId> visited,
int depth, Func<int> discovered, Action increment, CancellationToken ct)
{
if (depth >= _options.MaxBrowseDepth) return;
if (discovered() >= _options.MaxDiscoveredNodes) return;
if (!visited.Add(node)) return;
var browseDescriptions = new BrowseDescriptionCollection
{
new()
{
NodeId = node,
BrowseDirection = BrowseDirection.Forward,
ReferenceTypeId = ReferenceTypeIds.HierarchicalReferences,
IncludeSubtypes = true,
NodeClassMask = (uint)(NodeClass.Object | NodeClass.Variable),
ResultMask = (uint)(BrowseResultMask.BrowseName | BrowseResultMask.DisplayName
| BrowseResultMask.NodeClass | BrowseResultMask.TypeDefinition),
}
};
BrowseResponse resp;
try
{
resp = await session.BrowseAsync(
requestHeader: null,
view: null,
requestedMaxReferencesPerNode: 0,
nodesToBrowse: browseDescriptions,
ct: ct).ConfigureAwait(false);
}
catch
{
// Transient browse failure on a sub-tree — don't kill the whole discovery, just
// skip this branch. The driver's health surface will reflect the cascade via the
// probe loop (PR 69).
return;
}
if (resp.Results.Count == 0) return;
var refs = resp.Results[0].References;
foreach (var rf in refs)
{
if (discovered() >= _options.MaxDiscoveredNodes) break;
var childId = ExpandedNodeId.ToNodeId(rf.NodeId, session.NamespaceUris);
if (NodeId.IsNull(childId)) continue;
var browseName = rf.BrowseName?.Name ?? childId.ToString();
var displayName = rf.DisplayName?.Text ?? browseName;
if (rf.NodeClass == NodeClass.Object)
{
var subFolder = folder.Folder(browseName, displayName);
increment();
await BrowseRecursiveAsync(session, childId, subFolder, visited,
depth + 1, discovered, increment, ct).ConfigureAwait(false);
}
else if (rf.NodeClass == NodeClass.Variable)
{
// Serialize the NodeId so the IReadable/IWritable surface receives a
// round-trippable string. Deferring the DataType + AccessLevel fetch to a
// follow-up PR — initial browse uses a conservative ViewOnly + Int32 default.
var nodeIdString = childId.ToString() ?? string.Empty;
folder.Variable(browseName, displayName, new DriverAttributeInfo(
FullName: nodeIdString,
DriverDataType: DriverDataType.Int32,
IsArray: false,
ArrayDim: null,
SecurityClass: SecurityClassification.ViewOnly,
IsHistorized: false,
IsAlarm: false));
increment();
}
}
}
public void Dispose() => DisposeAsync().AsTask().GetAwaiter().GetResult();
public async ValueTask DisposeAsync()
{
if (_disposed) return;
_disposed = true;
try { await ShutdownAsync(CancellationToken.None).ConfigureAwait(false); }
catch { /* disposal is best-effort */ }
_gate.Dispose();
}
}

View File

@@ -0,0 +1,105 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient;
/// <summary>
/// OPC UA Client (gateway) driver configuration. Bound from <c>DriverConfig</c> JSON at
/// driver-host registration time. Models the settings documented in
/// <c>docs/v2/driver-specs.md</c> §8.
/// </summary>
/// <remarks>
/// This driver connects to a REMOTE OPC UA server and re-exposes its address space
/// through the local OtOpcUa server — the opposite direction from the usual "server
/// exposes PLC data" flow. Tier A (pure managed, OPC Foundation reference SDK); universal
/// protections cover it.
/// </remarks>
public sealed class OpcUaClientDriverOptions
{
/// <summary>Remote OPC UA endpoint URL, e.g. <c>opc.tcp://plc.internal:4840</c>.</summary>
public string EndpointUrl { get; init; } = "opc.tcp://localhost:4840";
/// <summary>Security policy. One of <c>None</c>, <c>Basic256Sha256</c>, <c>Aes128_Sha256_RsaOaep</c>.</summary>
public string SecurityPolicy { get; init; } = "None";
/// <summary>Security mode.</summary>
public OpcUaSecurityMode SecurityMode { get; init; } = OpcUaSecurityMode.None;
/// <summary>Authentication type.</summary>
public OpcUaAuthType AuthType { get; init; } = OpcUaAuthType.Anonymous;
/// <summary>User name (required only for <see cref="OpcUaAuthType.Username"/>).</summary>
public string? Username { get; init; }
/// <summary>Password (required only for <see cref="OpcUaAuthType.Username"/>).</summary>
public string? Password { get; init; }
/// <summary>Server-negotiated session timeout. Default 120s per driver-specs.md §8.</summary>
public TimeSpan SessionTimeout { get; init; } = TimeSpan.FromSeconds(120);
/// <summary>Client-side keep-alive interval.</summary>
public TimeSpan KeepAliveInterval { get; init; } = TimeSpan.FromSeconds(5);
/// <summary>Initial reconnect delay after a session drop.</summary>
public TimeSpan ReconnectPeriod { get; init; } = TimeSpan.FromSeconds(5);
/// <summary>
/// When <c>true</c>, the driver accepts any self-signed / untrusted server certificate.
/// Dev-only — must be <c>false</c> in production so MITM attacks against the opc.tcp
/// channel fail closed.
/// </summary>
public bool AutoAcceptCertificates { get; init; } = false;
/// <summary>
/// Application URI the driver reports during session creation. Must match the
/// subject-alt-name on the client certificate if one is used, which is why it's a
/// config knob rather than hard-coded.
/// </summary>
public string ApplicationUri { get; init; } = "urn:localhost:OtOpcUa:GatewayClient";
/// <summary>
/// Friendly name sent to the remote server for diagnostics. Shows up in the remote
/// server's session-list so operators can identify which gateway instance is calling.
/// </summary>
public string SessionName { get; init; } = "OtOpcUa-Gateway";
/// <summary>Connect + per-operation timeout.</summary>
public TimeSpan Timeout { get; init; } = TimeSpan.FromSeconds(10);
/// <summary>
/// Root NodeId to mirror. Default <c>null</c> = <c>ObjectsFolder</c> (i=85). Set to
/// a scoped root to restrict the address space the driver exposes locally — useful
/// when the remote server has tens of thousands of nodes and only a subset is
/// needed downstream.
/// </summary>
public string? BrowseRoot { get; init; }
/// <summary>
/// Cap on total nodes discovered during <c>DiscoverAsync</c>. Default 10_000 —
/// bounds memory on runaway remote servers without being so low that normal
/// deployments hit it. When the cap is reached discovery stops and a warning is
/// written to the driver health surface; the partially-discovered tree is still
/// projected into the local address space.
/// </summary>
public int MaxDiscoveredNodes { get; init; } = 10_000;
/// <summary>
/// Max hierarchical depth of the browse. Default 10 — deep enough for realistic
/// OPC UA information models, shallow enough that cyclic graphs can't spin the
/// browse forever.
/// </summary>
public int MaxBrowseDepth { get; init; } = 10;
}
/// <summary>OPC UA message security mode.</summary>
public enum OpcUaSecurityMode
{
None,
Sign,
SignAndEncrypt,
}
/// <summary>User authentication type sent to the remote server.</summary>
public enum OpcUaAuthType
{
Anonymous,
Username,
Certificate,
}

View File

@@ -0,0 +1,28 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net10.0</TargetFramework>
<Nullable>enable</Nullable>
<ImplicitUsings>enable</ImplicitUsings>
<LangVersion>latest</LangVersion>
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
<GenerateDocumentationFile>true</GenerateDocumentationFile>
<NoWarn>$(NoWarn);CS1591</NoWarn>
<RootNamespace>ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient</RootNamespace>
<AssemblyName>ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient</AssemblyName>
</PropertyGroup>
<ItemGroup>
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Core.Abstractions\ZB.MOM.WW.OtOpcUa.Core.Abstractions.csproj"/>
</ItemGroup>
<ItemGroup>
<PackageReference Include="OPCFoundation.NetStandard.Opc.Ua.Client" Version="1.5.378.106"/>
<PackageReference Include="OPCFoundation.NetStandard.Opc.Ua.Configuration" Version="1.5.378.106"/>
</ItemGroup>
<ItemGroup>
<InternalsVisibleTo Include="ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.Tests"/>
</ItemGroup>
</Project>

View File

@@ -0,0 +1,216 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.S7;
/// <summary>
/// Siemens S7 memory area. The driver's tag-address parser maps every S7 tag string into
/// exactly one of these + an offset. Values match the on-wire S7 area codes only
/// incidentally — S7.Net uses its own <c>DataType</c> enum (<c>DataBlock</c>, <c>Memory</c>,
/// <c>Input</c>, <c>Output</c>, <c>Timer</c>, <c>Counter</c>) so the adapter layer translates.
/// </summary>
public enum S7Area
{
DataBlock,
Memory, // M (Merker / marker byte)
Input, // I (process-image input)
Output, // Q (process-image output)
Timer,
Counter,
}
/// <summary>
/// Access width for a DB / M / I / Q address. Timers and counters are always 16-bit
/// opaque (not user-addressable via size suffixes).
/// </summary>
public enum S7Size
{
Bit, // X
Byte, // B
Word, // W — 16-bit
DWord, // D — 32-bit
}
/// <summary>
/// Parsed form of an S7 tag-address string. Produced by <see cref="S7AddressParser.Parse"/>.
/// </summary>
/// <param name="Area">Memory area (DB, M, I, Q, T, C).</param>
/// <param name="DbNumber">Data block number; only meaningful when <paramref name="Area"/> is <see cref="S7Area.DataBlock"/>.</param>
/// <param name="Size">Access width. Always <see cref="S7Size.Word"/> for Timer and Counter.</param>
/// <param name="ByteOffset">Byte offset into the area (for DB/M/I/Q) or the timer/counter number.</param>
/// <param name="BitOffset">Bit position 0-7 when <paramref name="Size"/> is <see cref="S7Size.Bit"/>; 0 otherwise.</param>
public readonly record struct S7ParsedAddress(
S7Area Area,
int DbNumber,
S7Size Size,
int ByteOffset,
int BitOffset);
/// <summary>
/// Parses Siemens S7 address strings into <see cref="S7ParsedAddress"/>. Accepts the
/// Siemens TIA-Portal / STEP 7 Classic syntax documented in <c>docs/v2/driver-specs.md</c> §5:
/// <list type="bullet">
/// <item><c>DB{n}.DB{X|B|W|D}{offset}[.bit]</c> — e.g. <c>DB1.DBX0.0</c>, <c>DB1.DBW0</c>, <c>DB1.DBD4</c></item>
/// <item><c>M{B|W|D}{offset}</c> or <c>M{offset}.{bit}</c> — e.g. <c>MB0</c>, <c>MW0</c>, <c>MD4</c>, <c>M0.0</c></item>
/// <item><c>I{B|W|D}{offset}</c> or <c>I{offset}.{bit}</c> — e.g. <c>IB0</c>, <c>IW0</c>, <c>ID0</c>, <c>I0.0</c></item>
/// <item><c>Q{B|W|D}{offset}</c> or <c>Q{offset}.{bit}</c> — e.g. <c>QB0</c>, <c>QW0</c>, <c>QD0</c>, <c>Q0.0</c></item>
/// <item><c>T{n}</c> — e.g. <c>T0</c>, <c>T15</c></item>
/// <item><c>C{n}</c> — e.g. <c>C0</c>, <c>C10</c></item>
/// </list>
/// Grammar is case-insensitive. Leading/trailing whitespace tolerated. Bit specifiers
/// must be 0-7; byte offsets must be non-negative; DB numbers must be &gt;= 1.
/// </summary>
/// <remarks>
/// Parse is deliberately strict — the parser rejects syntactic garbage up-front so a bad
/// tag config fails at driver init time instead of surfacing as a misleading
/// <c>BadInternalError</c> on every Read against that tag.
/// </remarks>
public static class S7AddressParser
{
/// <summary>
/// Parse an S7 address. Throws <see cref="FormatException"/> on any syntax error with
/// the offending input echoed in the message so operators can correlate to the tag
/// config that produced the fault.
/// </summary>
public static S7ParsedAddress Parse(string address)
{
if (string.IsNullOrWhiteSpace(address))
throw new FormatException("S7 address must not be empty");
var s = address.Trim().ToUpperInvariant();
// --- DB{n}.DB{X|B|W|D}{offset}[.bit] ---
if (s.StartsWith("DB") && TryParseDataBlock(s, out var dbResult))
return dbResult;
if (s.Length < 2)
throw new FormatException($"S7 address '{address}' is too short to parse");
var areaChar = s[0];
var rest = s.Substring(1);
switch (areaChar)
{
case 'M': return ParseMIQ(S7Area.Memory, rest, address);
case 'I': return ParseMIQ(S7Area.Input, rest, address);
case 'Q': return ParseMIQ(S7Area.Output, rest, address);
case 'T': return ParseTimerOrCounter(S7Area.Timer, rest, address);
case 'C': return ParseTimerOrCounter(S7Area.Counter, rest, address);
default:
throw new FormatException($"S7 address '{address}' starts with unknown area '{areaChar}' (expected DB/M/I/Q/T/C)");
}
}
/// <summary>
/// Try-parse variant for callers that can't afford an exception on bad input (e.g.
/// config validation pages in the Admin UI). Returns <c>false</c> for any input that
/// would throw from <see cref="Parse"/>.
/// </summary>
public static bool TryParse(string address, out S7ParsedAddress result)
{
try
{
result = Parse(address);
return true;
}
catch (FormatException)
{
result = default;
return false;
}
}
private static bool TryParseDataBlock(string s, out S7ParsedAddress result)
{
result = default;
// Split on first '.': left side must be DB{n}, right side DB{X|B|W|D}{offset}[.bit]
var dot = s.IndexOf('.');
if (dot < 0) return false;
var head = s.Substring(0, dot); // DB{n}
var tail = s.Substring(dot + 1); // DB{X|B|W|D}{offset}[.bit]
if (head.Length < 3) return false;
if (!int.TryParse(head.AsSpan(2), out var dbNumber) || dbNumber < 1)
throw new FormatException($"S7 DB number in '{s}' must be a positive integer");
if (!tail.StartsWith("DB") || tail.Length < 4)
throw new FormatException($"S7 DB address tail '{tail}' must start with DB{{X|B|W|D}}");
var sizeChar = tail[2];
var offsetStart = 3;
var size = sizeChar switch
{
'X' => S7Size.Bit,
'B' => S7Size.Byte,
'W' => S7Size.Word,
'D' => S7Size.DWord,
_ => throw new FormatException($"S7 DB size '{sizeChar}' in '{s}' must be X/B/W/D"),
};
var (byteOffset, bitOffset) = ParseOffsetAndOptionalBit(tail, offsetStart, size, s);
result = new S7ParsedAddress(S7Area.DataBlock, dbNumber, size, byteOffset, bitOffset);
return true;
}
private static S7ParsedAddress ParseMIQ(S7Area area, string rest, string original)
{
if (rest.Length == 0)
throw new FormatException($"S7 address '{original}' has no offset");
var first = rest[0];
S7Size size;
int offsetStart;
switch (first)
{
case 'B': size = S7Size.Byte; offsetStart = 1; break;
case 'W': size = S7Size.Word; offsetStart = 1; break;
case 'D': size = S7Size.DWord; offsetStart = 1; break;
default:
// No size prefix => bit-level address requires explicit .bit. Size stays Bit;
// ParseOffsetAndOptionalBit will demand the dot.
size = S7Size.Bit;
offsetStart = 0;
break;
}
var (byteOffset, bitOffset) = ParseOffsetAndOptionalBit(rest, offsetStart, size, original);
return new S7ParsedAddress(area, DbNumber: 0, size, byteOffset, bitOffset);
}
private static S7ParsedAddress ParseTimerOrCounter(S7Area area, string rest, string original)
{
if (rest.Length == 0)
throw new FormatException($"S7 address '{original}' has no {area} number");
if (!int.TryParse(rest, out var number) || number < 0)
throw new FormatException($"S7 {area} number in '{original}' must be a non-negative integer");
return new S7ParsedAddress(area, DbNumber: 0, S7Size.Word, number, BitOffset: 0);
}
private static (int byteOffset, int bitOffset) ParseOffsetAndOptionalBit(
string s, int start, S7Size size, string original)
{
var offsetEnd = start;
while (offsetEnd < s.Length && s[offsetEnd] >= '0' && s[offsetEnd] <= '9')
offsetEnd++;
if (offsetEnd == start)
throw new FormatException($"S7 address '{original}' has no byte-offset digits");
if (!int.TryParse(s.AsSpan(start, offsetEnd - start), out var byteOffset) || byteOffset < 0)
throw new FormatException($"S7 byte offset in '{original}' must be non-negative");
// No bit-suffix: done unless size is Bit with no prefix, which requires one.
if (offsetEnd == s.Length)
{
if (size == S7Size.Bit)
throw new FormatException($"S7 address '{original}' needs a .{{bit}} suffix for bit access");
return (byteOffset, 0);
}
if (s[offsetEnd] != '.')
throw new FormatException($"S7 address '{original}' has unexpected character after offset");
if (size != S7Size.Bit)
throw new FormatException($"S7 address '{original}' has a bit suffix but the size is {size} — bit access needs X (DB) or no size prefix (M/I/Q)");
if (!int.TryParse(s.AsSpan(offsetEnd + 1), out var bitOffset) || bitOffset is < 0 or > 7)
throw new FormatException($"S7 bit offset in '{original}' must be 0-7");
return (byteOffset, bitOffset);
}
}

View File

@@ -0,0 +1,513 @@
using S7.Net;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Driver.S7;
/// <summary>
/// Siemens S7 native driver — speaks S7comm over ISO-on-TCP (port 102) via the S7netplus
/// library. First implementation of <see cref="IDriver"/> for an in-process .NET Standard
/// PLC protocol that is NOT Modbus, validating that the v2 driver-capability interfaces
/// generalize beyond Modbus + Galaxy.
/// </summary>
/// <remarks>
/// <para>
/// PR 62 ships the scaffold: <see cref="IDriver"/> only (Initialize / Reinitialize /
/// Shutdown / GetHealth). <see cref="ITagDiscovery"/>, <see cref="IReadable"/>,
/// <see cref="IWritable"/>, <see cref="ISubscribable"/>, <see cref="IHostConnectivityProbe"/>
/// land in PRs 63-65 once the address parser (PR 63) is in place.
/// </para>
/// <para>
/// <b>Single-connection policy</b>: S7netplus documented pattern is one
/// <c>Plc</c> instance per PLC, serialized with a <see cref="SemaphoreSlim"/>.
/// Parallelising reads against a single S7 CPU doesn't help — the CPU scans the
/// communication mailbox at most once per cycle (2-10 ms) and queues concurrent
/// requests wire-side anyway. Multiple client-side connections just waste the CPU's
/// 8-64 connection-resource budget.
/// </para>
/// </remarks>
public sealed class S7Driver(S7DriverOptions options, string driverInstanceId)
: IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IDisposable, IAsyncDisposable
{
// ---- ISubscribable + IHostConnectivityProbe state ----
private readonly System.Collections.Concurrent.ConcurrentDictionary<long, SubscriptionState> _subscriptions = new();
private long _nextSubscriptionId;
private readonly object _probeLock = new();
private HostState _hostState = HostState.Unknown;
private DateTime _hostStateChangedUtc = DateTime.UtcNow;
private CancellationTokenSource? _probeCts;
public event EventHandler<DataChangeEventArgs>? OnDataChange;
public event EventHandler<HostStatusChangedEventArgs>? OnHostStatusChanged;
/// <summary>OPC UA StatusCode used when the tag name isn't in the driver's tag map.</summary>
private const uint StatusBadNodeIdUnknown = 0x80340000u;
/// <summary>OPC UA StatusCode used when the tag's data type isn't implemented yet.</summary>
private const uint StatusBadNotSupported = 0x803D0000u;
/// <summary>OPC UA StatusCode used when the tag is declared read-only.</summary>
private const uint StatusBadNotWritable = 0x803B0000u;
/// <summary>OPC UA StatusCode used when write fails validation (e.g. out-of-range value).</summary>
private const uint StatusBadInternalError = 0x80020000u;
/// <summary>OPC UA StatusCode used for socket / timeout / protocol-layer faults.</summary>
private const uint StatusBadCommunicationError = 0x80050000u;
/// <summary>OPC UA StatusCode used when S7 returns <c>ErrorCode.WrongCPU</c> / PUT/GET disabled.</summary>
private const uint StatusBadDeviceFailure = 0x80550000u;
private readonly Dictionary<string, S7TagDefinition> _tagsByName = new(StringComparer.OrdinalIgnoreCase);
private readonly Dictionary<string, S7ParsedAddress> _parsedByName = new(StringComparer.OrdinalIgnoreCase);
private readonly S7DriverOptions _options = options;
private readonly SemaphoreSlim _gate = new(1, 1);
/// <summary>
/// Per-connection gate. Internal so PRs 63-65 (read/write/subscribe) can serialize on
/// the same semaphore without exposing it publicly. Single-connection-per-PLC is a
/// hard requirement of S7netplus — see class remarks.
/// </summary>
internal SemaphoreSlim Gate => _gate;
/// <summary>
/// Active S7.Net PLC connection. Null until <see cref="InitializeAsync"/> returns; null
/// after <see cref="ShutdownAsync"/>. Read-only outside this class; PR 64's Read/Write
/// will take the <see cref="_gate"/> before touching it.
/// </summary>
internal Plc? Plc { get; private set; }
private DriverHealth _health = new(DriverState.Unknown, null, null);
private bool _disposed;
public string DriverInstanceId => driverInstanceId;
public string DriverType => "S7";
public async Task InitializeAsync(string driverConfigJson, CancellationToken cancellationToken)
{
_health = new DriverHealth(DriverState.Initializing, null, null);
try
{
var plc = new Plc(_options.CpuType, _options.Host, _options.Rack, _options.Slot);
// S7netplus writes timeouts into the underlying TcpClient via Plc.WriteTimeout /
// Plc.ReadTimeout (milliseconds). Set before OpenAsync so the handshake itself
// honours the bound.
plc.WriteTimeout = (int)_options.Timeout.TotalMilliseconds;
plc.ReadTimeout = (int)_options.Timeout.TotalMilliseconds;
using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
cts.CancelAfter(_options.Timeout);
await plc.OpenAsync(cts.Token).ConfigureAwait(false);
Plc = plc;
// Parse every tag's address once at init so config typos fail fast here instead
// of surfacing as BadInternalError on every Read against the bad tag. The parser
// also rejects bit-offset > 7, DB 0, unknown area letters, etc.
_tagsByName.Clear();
_parsedByName.Clear();
foreach (var t in _options.Tags)
{
var parsed = S7AddressParser.Parse(t.Address); // throws FormatException
_tagsByName[t.Name] = t;
_parsedByName[t.Name] = parsed;
}
_health = new DriverHealth(DriverState.Healthy, DateTime.UtcNow, null);
// Kick off the probe loop once the connection is up. Initial HostState stays
// Unknown until the first probe tick succeeds — avoids broadcasting a premature
// Running transition before any PDU round-trip has happened.
if (_options.Probe.Enabled)
{
_probeCts = new CancellationTokenSource();
_ = Task.Run(() => ProbeLoopAsync(_probeCts.Token), _probeCts.Token);
}
}
catch (Exception ex)
{
// Clean up a partially-constructed Plc so a retry from the caller doesn't leak
// the TcpClient. S7netplus's Close() is best-effort and idempotent.
try { Plc?.Close(); } catch { }
Plc = null;
_health = new DriverHealth(DriverState.Faulted, null, ex.Message);
throw;
}
}
public async Task ReinitializeAsync(string driverConfigJson, CancellationToken cancellationToken)
{
await ShutdownAsync(cancellationToken).ConfigureAwait(false);
await InitializeAsync(driverConfigJson, cancellationToken).ConfigureAwait(false);
}
public Task ShutdownAsync(CancellationToken cancellationToken)
{
try { _probeCts?.Cancel(); } catch { }
_probeCts?.Dispose();
_probeCts = null;
foreach (var state in _subscriptions.Values)
{
try { state.Cts.Cancel(); } catch { }
state.Cts.Dispose();
}
_subscriptions.Clear();
try { Plc?.Close(); } catch { /* best-effort — tearing down anyway */ }
Plc = null;
_health = new DriverHealth(DriverState.Unknown, _health.LastSuccessfulRead, null);
return Task.CompletedTask;
}
public DriverHealth GetHealth() => _health;
/// <summary>
/// Approximate memory footprint. The Plc instance + one 240-960 byte PDU buffer is
/// under 4 KB; return 0 because the <see cref="IDriver"/> contract asks for a
/// driver-attributable growth number and S7.Net doesn't expose one.
/// </summary>
public long GetMemoryFootprint() => 0;
public Task FlushOptionalCachesAsync(CancellationToken cancellationToken) => Task.CompletedTask;
// ---- IReadable ----
public async Task<IReadOnlyList<DataValueSnapshot>> ReadAsync(
IReadOnlyList<string> fullReferences, CancellationToken cancellationToken)
{
var plc = RequirePlc();
var now = DateTime.UtcNow;
var results = new DataValueSnapshot[fullReferences.Count];
await _gate.WaitAsync(cancellationToken).ConfigureAwait(false);
try
{
for (var i = 0; i < fullReferences.Count; i++)
{
var name = fullReferences[i];
if (!_tagsByName.TryGetValue(name, out var tag))
{
results[i] = new DataValueSnapshot(null, StatusBadNodeIdUnknown, null, now);
continue;
}
try
{
var value = await ReadOneAsync(plc, tag, cancellationToken).ConfigureAwait(false);
results[i] = new DataValueSnapshot(value, 0u, now, now);
_health = new DriverHealth(DriverState.Healthy, now, null);
}
catch (NotSupportedException)
{
results[i] = new DataValueSnapshot(null, StatusBadNotSupported, null, now);
}
catch (global::S7.Net.PlcException pex)
{
// S7.Net's PlcException carries an ErrorCode; PUT/GET-disabled on
// S7-1200/1500 surfaces here. Map to BadDeviceFailure so operators see a
// device-config problem (toggle PUT/GET in TIA Portal) rather than a
// transient fault — per driver-specs.md §5.
results[i] = new DataValueSnapshot(null, StatusBadDeviceFailure, null, now);
_health = new DriverHealth(DriverState.Degraded, _health.LastSuccessfulRead, pex.Message);
}
catch (Exception ex)
{
results[i] = new DataValueSnapshot(null, StatusBadCommunicationError, null, now);
_health = new DriverHealth(DriverState.Degraded, _health.LastSuccessfulRead, ex.Message);
}
}
}
finally { _gate.Release(); }
return results;
}
private async Task<object> ReadOneAsync(global::S7.Net.Plc plc, S7TagDefinition tag, CancellationToken ct)
{
var addr = _parsedByName[tag.Name];
// S7.Net's string-based ReadAsync returns object where the boxed .NET type depends on
// the size suffix: DBX=bool, DBB=byte, DBW=ushort, DBD=uint. Our S7DataType enum
// specifies the SEMANTIC type (Int16 vs UInt16 vs Float32 etc.); the reinterpret below
// converts the raw unsigned boxed value into the requested type without issuing an
// extra PLC round-trip.
var raw = await plc.ReadAsync(tag.Address, ct).ConfigureAwait(false)
?? throw new System.IO.InvalidDataException($"S7.Net returned null for '{tag.Address}'");
return (tag.DataType, addr.Size, raw) switch
{
(S7DataType.Bool, S7Size.Bit, bool b) => b,
(S7DataType.Byte, S7Size.Byte, byte by) => by,
(S7DataType.UInt16, S7Size.Word, ushort u16) => u16,
(S7DataType.Int16, S7Size.Word, ushort u16) => unchecked((short)u16),
(S7DataType.UInt32, S7Size.DWord, uint u32) => u32,
(S7DataType.Int32, S7Size.DWord, uint u32) => unchecked((int)u32),
(S7DataType.Float32, S7Size.DWord, uint u32) => BitConverter.UInt32BitsToSingle(u32),
(S7DataType.Int64, _, _) => throw new NotSupportedException("S7 Int64 reads land in a follow-up PR"),
(S7DataType.UInt64, _, _) => throw new NotSupportedException("S7 UInt64 reads land in a follow-up PR"),
(S7DataType.Float64, _, _) => throw new NotSupportedException("S7 Float64 (LReal) reads land in a follow-up PR"),
(S7DataType.String, _, _) => throw new NotSupportedException("S7 STRING reads land in a follow-up PR"),
(S7DataType.DateTime, _, _) => throw new NotSupportedException("S7 DateTime reads land in a follow-up PR"),
_ => throw new System.IO.InvalidDataException(
$"S7 Read type-mismatch: tag '{tag.Name}' declared {tag.DataType} but address '{tag.Address}' " +
$"parsed as Size={addr.Size}; S7.Net returned {raw.GetType().Name}"),
};
}
// ---- IWritable ----
public async Task<IReadOnlyList<WriteResult>> WriteAsync(
IReadOnlyList<WriteRequest> writes, CancellationToken cancellationToken)
{
var plc = RequirePlc();
var results = new WriteResult[writes.Count];
await _gate.WaitAsync(cancellationToken).ConfigureAwait(false);
try
{
for (var i = 0; i < writes.Count; i++)
{
var w = writes[i];
if (!_tagsByName.TryGetValue(w.FullReference, out var tag))
{
results[i] = new WriteResult(StatusBadNodeIdUnknown);
continue;
}
if (!tag.Writable)
{
results[i] = new WriteResult(StatusBadNotWritable);
continue;
}
try
{
await WriteOneAsync(plc, tag, w.Value, cancellationToken).ConfigureAwait(false);
results[i] = new WriteResult(0u);
}
catch (NotSupportedException)
{
results[i] = new WriteResult(StatusBadNotSupported);
}
catch (global::S7.Net.PlcException)
{
results[i] = new WriteResult(StatusBadDeviceFailure);
}
catch (Exception)
{
results[i] = new WriteResult(StatusBadInternalError);
}
}
}
finally { _gate.Release(); }
return results;
}
private async Task WriteOneAsync(global::S7.Net.Plc plc, S7TagDefinition tag, object? value, CancellationToken ct)
{
// S7.Net's Plc.WriteAsync(string address, object value) expects the boxed value to
// match the address's size-suffix type: DBX=bool, DBB=byte, DBW=ushort, DBD=uint.
// Our S7DataType lets the caller pass short/int/float; convert to the unsigned
// wire representation before handing off.
var boxed = tag.DataType switch
{
S7DataType.Bool => (object)Convert.ToBoolean(value),
S7DataType.Byte => (object)Convert.ToByte(value),
S7DataType.UInt16 => (object)Convert.ToUInt16(value),
S7DataType.Int16 => (object)unchecked((ushort)Convert.ToInt16(value)),
S7DataType.UInt32 => (object)Convert.ToUInt32(value),
S7DataType.Int32 => (object)unchecked((uint)Convert.ToInt32(value)),
S7DataType.Float32 => (object)BitConverter.SingleToUInt32Bits(Convert.ToSingle(value)),
S7DataType.Int64 => throw new NotSupportedException("S7 Int64 writes land in a follow-up PR"),
S7DataType.UInt64 => throw new NotSupportedException("S7 UInt64 writes land in a follow-up PR"),
S7DataType.Float64 => throw new NotSupportedException("S7 Float64 (LReal) writes land in a follow-up PR"),
S7DataType.String => throw new NotSupportedException("S7 STRING writes land in a follow-up PR"),
S7DataType.DateTime => throw new NotSupportedException("S7 DateTime writes land in a follow-up PR"),
_ => throw new InvalidOperationException($"Unknown S7DataType {tag.DataType}"),
};
await plc.WriteAsync(tag.Address, boxed, ct).ConfigureAwait(false);
}
private global::S7.Net.Plc RequirePlc() =>
Plc ?? throw new InvalidOperationException("S7Driver not initialized");
// ---- ITagDiscovery ----
public Task DiscoverAsync(IAddressSpaceBuilder builder, CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(builder);
var folder = builder.Folder("S7", "S7");
foreach (var t in _options.Tags)
{
folder.Variable(t.Name, t.Name, new DriverAttributeInfo(
FullName: t.Name,
DriverDataType: MapDataType(t.DataType),
IsArray: false,
ArrayDim: null,
SecurityClass: t.Writable ? SecurityClassification.Operate : SecurityClassification.ViewOnly,
IsHistorized: false,
IsAlarm: false));
}
return Task.CompletedTask;
}
private static DriverDataType MapDataType(S7DataType t) => t switch
{
S7DataType.Bool => DriverDataType.Boolean,
S7DataType.Byte => DriverDataType.Int32, // no 8-bit in DriverDataType yet
S7DataType.Int16 or S7DataType.UInt16 or S7DataType.Int32 or S7DataType.UInt32 => DriverDataType.Int32,
S7DataType.Int64 or S7DataType.UInt64 => DriverDataType.Int32, // widens; lossy for >2^31-1
S7DataType.Float32 => DriverDataType.Float32,
S7DataType.Float64 => DriverDataType.Float64,
S7DataType.String => DriverDataType.String,
S7DataType.DateTime => DriverDataType.DateTime,
_ => DriverDataType.Int32,
};
// ---- ISubscribable (polling overlay) ----
public Task<ISubscriptionHandle> SubscribeAsync(
IReadOnlyList<string> fullReferences, TimeSpan publishingInterval, CancellationToken cancellationToken)
{
var id = Interlocked.Increment(ref _nextSubscriptionId);
var cts = new CancellationTokenSource();
// Floor at 100 ms — S7 CPUs scan 2-10 ms but the comms mailbox is processed at most
// once per scan; sub-100 ms polling just queues wire-side with worse latency.
var interval = publishingInterval < TimeSpan.FromMilliseconds(100)
? TimeSpan.FromMilliseconds(100)
: publishingInterval;
var handle = new S7SubscriptionHandle(id);
var state = new SubscriptionState(handle, [.. fullReferences], interval, cts);
_subscriptions[id] = state;
_ = Task.Run(() => PollLoopAsync(state, cts.Token), cts.Token);
return Task.FromResult<ISubscriptionHandle>(handle);
}
public Task UnsubscribeAsync(ISubscriptionHandle handle, CancellationToken cancellationToken)
{
if (handle is S7SubscriptionHandle h && _subscriptions.TryRemove(h.Id, out var state))
{
state.Cts.Cancel();
state.Cts.Dispose();
}
return Task.CompletedTask;
}
private async Task PollLoopAsync(SubscriptionState state, CancellationToken ct)
{
// Initial-data push per OPC UA Part 4 convention.
try { await PollOnceAsync(state, forceRaise: true, ct).ConfigureAwait(false); }
catch (OperationCanceledException) { return; }
catch { /* first-read error — polling continues */ }
while (!ct.IsCancellationRequested)
{
try { await Task.Delay(state.Interval, ct).ConfigureAwait(false); }
catch (OperationCanceledException) { return; }
try { await PollOnceAsync(state, forceRaise: false, ct).ConfigureAwait(false); }
catch (OperationCanceledException) { return; }
catch { /* transient polling error — loop continues, health surface reflects it */ }
}
}
private async Task PollOnceAsync(SubscriptionState state, bool forceRaise, CancellationToken ct)
{
var snapshots = await ReadAsync(state.TagReferences, ct).ConfigureAwait(false);
for (var i = 0; i < state.TagReferences.Count; i++)
{
var tagRef = state.TagReferences[i];
var current = snapshots[i];
var lastSeen = state.LastValues.TryGetValue(tagRef, out var prev) ? prev : default;
if (forceRaise || !Equals(lastSeen?.Value, current.Value) || lastSeen?.StatusCode != current.StatusCode)
{
state.LastValues[tagRef] = current;
OnDataChange?.Invoke(this, new DataChangeEventArgs(state.Handle, tagRef, current));
}
}
}
private sealed record SubscriptionState(
S7SubscriptionHandle Handle,
IReadOnlyList<string> TagReferences,
TimeSpan Interval,
CancellationTokenSource Cts)
{
public System.Collections.Concurrent.ConcurrentDictionary<string, DataValueSnapshot> LastValues { get; }
= new(StringComparer.OrdinalIgnoreCase);
}
private sealed record S7SubscriptionHandle(long Id) : ISubscriptionHandle
{
public string DiagnosticId => $"s7-sub-{Id}";
}
// ---- IHostConnectivityProbe ----
/// <summary>
/// Host identifier surfaced in <see cref="GetHostStatuses"/>. <c>host:port</c> format
/// matches the Modbus driver's convention so the Admin UI dashboard renders both
/// family's rows uniformly.
/// </summary>
public string HostName => $"{_options.Host}:{_options.Port}";
public IReadOnlyList<HostConnectivityStatus> GetHostStatuses()
{
lock (_probeLock)
return [new HostConnectivityStatus(HostName, _hostState, _hostStateChangedUtc)];
}
private async Task ProbeLoopAsync(CancellationToken ct)
{
while (!ct.IsCancellationRequested)
{
var success = false;
try
{
// Probe via S7.Net's low-cost GetCpuStatus — returns the CPU state (Run/Stop)
// and is intentionally light on the comms mailbox. Single-word Plc.ReadAsync
// would also work but GetCpuStatus doubles as a "PLC actually up" check.
using var probeCts = CancellationTokenSource.CreateLinkedTokenSource(ct);
probeCts.CancelAfter(_options.Probe.Timeout);
var plc = Plc;
if (plc is null) throw new InvalidOperationException("Plc dropped during probe");
await _gate.WaitAsync(probeCts.Token).ConfigureAwait(false);
try
{
_ = await plc.ReadStatusAsync(probeCts.Token).ConfigureAwait(false);
success = true;
}
finally { _gate.Release(); }
}
catch (OperationCanceledException) when (ct.IsCancellationRequested) { return; }
catch { /* transport/timeout/exception — treated as Stopped below */ }
TransitionTo(success ? HostState.Running : HostState.Stopped);
try { await Task.Delay(_options.Probe.Interval, ct).ConfigureAwait(false); }
catch (OperationCanceledException) { return; }
}
}
private void TransitionTo(HostState newState)
{
HostState old;
lock (_probeLock)
{
old = _hostState;
if (old == newState) return;
_hostState = newState;
_hostStateChangedUtc = DateTime.UtcNow;
}
OnHostStatusChanged?.Invoke(this, new HostStatusChangedEventArgs(HostName, old, newState));
}
public void Dispose() => DisposeAsync().AsTask().GetAwaiter().GetResult();
public async ValueTask DisposeAsync()
{
if (_disposed) return;
_disposed = true;
try { await ShutdownAsync(CancellationToken.None).ConfigureAwait(false); }
catch { /* disposal is best-effort */ }
_gate.Dispose();
}
}

View File

@@ -0,0 +1,112 @@
using S7NetCpuType = global::S7.Net.CpuType;
namespace ZB.MOM.WW.OtOpcUa.Driver.S7;
/// <summary>
/// Siemens S7 native (S7comm / ISO-on-TCP port 102) driver configuration. Bound from the
/// driver's <c>DriverConfig</c> JSON at <c>DriverHost.RegisterAsync</c>. Unlike the Modbus
/// driver the S7 driver uses the PLC's *native* protocol — port 102 ISO-on-TCP rather
/// than Modbus's 502, and S7-specific area codes (DB, M, I, Q) rather than holding-
/// register / coil tables.
/// </summary>
/// <remarks>
/// <para>
/// The driver requires <b>PUT/GET communication enabled</b> in the TIA Portal
/// hardware config for S7-1200/1500. The factory default disables PUT/GET access,
/// so a driver configured against a freshly-flashed CPU will see a hard error
/// (S7.Net surfaces it as <c>Plc.ReadAsync</c> returning <c>ErrorCode.Accessing</c>).
/// The driver maps that specifically to <c>BadNotSupported</c> and flags it as a
/// configuration alert rather than a transient fault — blind Polly retry is wasted
/// effort when the PLC will keep refusing every request.
/// </para>
/// <para>
/// See <c>docs/v2/driver-specs.md</c> §5 for the full specification.
/// </para>
/// </remarks>
public sealed class S7DriverOptions
{
/// <summary>PLC IP address or hostname.</summary>
public string Host { get; init; } = "127.0.0.1";
/// <summary>TCP port. ISO-on-TCP is 102 on every S7 model; override only for unusual NAT setups.</summary>
public int Port { get; init; } = 102;
/// <summary>
/// CPU family. Determines the ISO-TSAP slot byte that S7.Net uses during connection
/// setup — pick the family that matches the target PLC exactly.
/// </summary>
public S7NetCpuType CpuType { get; init; } = S7NetCpuType.S71500;
/// <summary>
/// Hardware rack number. Almost always 0; relevant only for distributed S7-400 racks
/// with multiple CPUs.
/// </summary>
public short Rack { get; init; } = 0;
/// <summary>
/// CPU slot. Conventions per family: S7-300 = slot 2, S7-400 = slot 2 or 3,
/// S7-1200 / S7-1500 = slot 0 (onboard PN). S7.Net uses this to build the remote
/// TSAP. Wrong slot → connection refused during handshake.
/// </summary>
public short Slot { get; init; } = 0;
/// <summary>Connect + per-operation timeout.</summary>
public TimeSpan Timeout { get; init; } = TimeSpan.FromSeconds(5);
/// <summary>Pre-declared tag map. S7 has a symbol-table protocol but S7.Net does not expose it, so the driver operates off a static tag list configured per-site. Address grammar documented in S7AddressParser (PR 63).</summary>
public IReadOnlyList<S7TagDefinition> Tags { get; init; } = [];
/// <summary>
/// Background connectivity-probe settings. When enabled, the driver runs a tick loop
/// that issues a cheap read against <see cref="S7ProbeOptions.ProbeAddress"/> every
/// <see cref="S7ProbeOptions.Interval"/> and raises <c>OnHostStatusChanged</c> on
/// Running ↔ Stopped transitions.
/// </summary>
public S7ProbeOptions Probe { get; init; } = new();
}
public sealed class S7ProbeOptions
{
public bool Enabled { get; init; } = true;
public TimeSpan Interval { get; init; } = TimeSpan.FromSeconds(5);
public TimeSpan Timeout { get; init; } = TimeSpan.FromSeconds(2);
/// <summary>
/// Address to probe for liveness. DB1.DBW0 is the convention if the PLC project
/// reserves a small fingerprint DB for health checks (per <c>docs/v2/s7.md</c>);
/// if not, pick any valid Merker word like <c>MW0</c>.
/// </summary>
public string ProbeAddress { get; init; } = "MW0";
}
/// <summary>
/// One S7 variable as exposed by the driver. Addresses use S7.Net syntax — see
/// <c>S7AddressParser</c> (PR 63) for the grammar.
/// </summary>
/// <param name="Name">Tag name; OPC UA browse name + driver full reference.</param>
/// <param name="Address">S7 address string, e.g. <c>DB1.DBW0</c>, <c>M0.0</c>, <c>I0.0</c>, <c>QD4</c>. Grammar documented in <c>S7AddressParser</c> (PR 63).</param>
/// <param name="DataType">Logical data type — drives the underlying S7.Net read/write width.</param>
/// <param name="Writable">When true the driver accepts writes for this tag.</param>
/// <param name="StringLength">For <c>DataType = String</c>: S7-string max length. Default 254 (S7 max).</param>
public sealed record S7TagDefinition(
string Name,
string Address,
S7DataType DataType,
bool Writable = true,
int StringLength = 254);
public enum S7DataType
{
Bool,
Byte,
Int16,
UInt16,
Int32,
UInt32,
Int64,
UInt64,
Float32,
Float64,
String,
DateTime,
}

View File

@@ -0,0 +1,27 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net10.0</TargetFramework>
<Nullable>enable</Nullable>
<ImplicitUsings>enable</ImplicitUsings>
<LangVersion>latest</LangVersion>
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
<GenerateDocumentationFile>true</GenerateDocumentationFile>
<NoWarn>$(NoWarn);CS1591</NoWarn>
<RootNamespace>ZB.MOM.WW.OtOpcUa.Driver.S7</RootNamespace>
<AssemblyName>ZB.MOM.WW.OtOpcUa.Driver.S7</AssemblyName>
</PropertyGroup>
<ItemGroup>
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Core.Abstractions\ZB.MOM.WW.OtOpcUa.Core.Abstractions.csproj"/>
</ItemGroup>
<ItemGroup>
<PackageReference Include="S7netplus" Version="0.20.0"/>
</ItemGroup>
<ItemGroup>
<InternalsVisibleTo Include="ZB.MOM.WW.OtOpcUa.Driver.S7.Tests"/>
</ItemGroup>
</Project>

View File

@@ -1,15 +0,0 @@
using ZB.MOM.WW.OtOpcUa.Host.Configuration;
using ZB.MOM.WW.OtOpcUa.Host.Historian;
namespace ZB.MOM.WW.OtOpcUa.Historian.Aveva
{
/// <summary>
/// Reflection entry point invoked by <c>HistorianPluginLoader</c> in the Host. Kept
/// deliberately simple so the plugin contract is a single static factory method.
/// </summary>
public static class AvevaHistorianPluginEntry
{
public static IHistorianDataSource Create(HistorianConfiguration config)
=> new HistorianDataSource(config);
}
}

View File

@@ -1,181 +0,0 @@
using System;
using System.Collections.Generic;
using System.Linq;
using ZB.MOM.WW.OtOpcUa.Host.Configuration;
using ZB.MOM.WW.OtOpcUa.Host.Historian;
namespace ZB.MOM.WW.OtOpcUa.Historian.Aveva
{
/// <summary>
/// Thread-safe, pure-logic endpoint picker for the Wonderware Historian cluster. Tracks which
/// configured nodes are healthy, places failed nodes in a time-bounded cooldown, and hands
/// out an ordered list of eligible candidates for the data source to try in sequence.
/// </summary>
/// <remarks>
/// Design notes:
/// <list type="bullet">
/// <item>No SDK dependency — fully unit-testable with an injected clock.</item>
/// <item>Per-node state is guarded by a single lock; operations are microsecond-scale
/// so contention is a non-issue.</item>
/// <item>Cooldown is purely passive: a node re-enters the healthy pool the next time
/// it is queried after its cooldown window elapses. There is no background probe.</item>
/// <item>Nodes are returned in configuration order so operators can express a
/// preference (primary first, fallback second).</item>
/// <item>When <see cref="HistorianConfiguration.ServerNames"/> is empty, the picker is
/// initialized with a single entry from <see cref="HistorianConfiguration.ServerName"/>
/// so legacy deployments continue to work unchanged.</item>
/// </list>
/// </remarks>
internal sealed class HistorianClusterEndpointPicker
{
private readonly Func<DateTime> _clock;
private readonly TimeSpan _cooldown;
private readonly object _lock = new object();
private readonly List<NodeEntry> _nodes;
public HistorianClusterEndpointPicker(HistorianConfiguration config)
: this(config, () => DateTime.UtcNow) { }
internal HistorianClusterEndpointPicker(HistorianConfiguration config, Func<DateTime> clock)
{
_clock = clock ?? throw new ArgumentNullException(nameof(clock));
_cooldown = TimeSpan.FromSeconds(Math.Max(0, config.FailureCooldownSeconds));
var names = (config.ServerNames != null && config.ServerNames.Count > 0)
? config.ServerNames
: new List<string> { config.ServerName };
_nodes = names
.Where(n => !string.IsNullOrWhiteSpace(n))
.Select(n => n.Trim())
.Distinct(StringComparer.OrdinalIgnoreCase)
.Select(n => new NodeEntry { Name = n })
.ToList();
}
/// <summary>
/// Gets the total number of configured cluster nodes. Stable — nodes are never added
/// or removed after construction.
/// </summary>
public int NodeCount
{
get
{
lock (_lock)
return _nodes.Count;
}
}
/// <summary>
/// Returns an ordered snapshot of nodes currently eligible for a connection attempt,
/// with any node whose cooldown has elapsed automatically restored to the pool.
/// An empty list means all nodes are in active cooldown.
/// </summary>
public IReadOnlyList<string> GetHealthyNodes()
{
lock (_lock)
{
var now = _clock();
return _nodes
.Where(n => IsHealthyAt(n, now))
.Select(n => n.Name)
.ToList();
}
}
/// <summary>
/// Gets the count of nodes currently eligible for a connection attempt (i.e., not in cooldown).
/// </summary>
public int HealthyNodeCount
{
get
{
lock (_lock)
{
var now = _clock();
return _nodes.Count(n => IsHealthyAt(n, now));
}
}
}
/// <summary>
/// Places <paramref name="node"/> into cooldown starting at the current clock time.
/// Increments the node's failure counter and stores the latest error message for
/// surfacing on the dashboard. Unknown node names are ignored.
/// </summary>
public void MarkFailed(string node, string? error)
{
lock (_lock)
{
var entry = FindEntry(node);
if (entry == null)
return;
var now = _clock();
entry.FailureCount++;
entry.LastError = error;
entry.LastFailureTime = now;
entry.CooldownUntil = _cooldown.TotalMilliseconds > 0 ? now + _cooldown : (DateTime?)null;
}
}
/// <summary>
/// Marks <paramref name="node"/> as healthy immediately — clears any active cooldown but
/// leaves the cumulative failure counter intact for operator diagnostics. Unknown node
/// names are ignored.
/// </summary>
public void MarkHealthy(string node)
{
lock (_lock)
{
var entry = FindEntry(node);
if (entry == null)
return;
entry.CooldownUntil = null;
}
}
/// <summary>
/// Captures the current per-node state for the health dashboard. Freshly computed from
/// <see cref="_clock"/> so recently-expired cooldowns are reported as healthy.
/// </summary>
public List<HistorianClusterNodeState> SnapshotNodeStates()
{
lock (_lock)
{
var now = _clock();
return _nodes.Select(n => new HistorianClusterNodeState
{
Name = n.Name,
IsHealthy = IsHealthyAt(n, now),
CooldownUntil = IsHealthyAt(n, now) ? null : n.CooldownUntil,
FailureCount = n.FailureCount,
LastError = n.LastError,
LastFailureTime = n.LastFailureTime
}).ToList();
}
}
private static bool IsHealthyAt(NodeEntry entry, DateTime now)
{
return entry.CooldownUntil == null || entry.CooldownUntil <= now;
}
private NodeEntry? FindEntry(string node)
{
for (var i = 0; i < _nodes.Count; i++)
if (string.Equals(_nodes[i].Name, node, StringComparison.OrdinalIgnoreCase))
return _nodes[i];
return null;
}
private sealed class NodeEntry
{
public string Name { get; set; } = "";
public DateTime? CooldownUntil { get; set; }
public int FailureCount { get; set; }
public string? LastError { get; set; }
public DateTime? LastFailureTime { get; set; }
}
}
}

View File

@@ -1,704 +0,0 @@
using System;
using System.Collections.Generic;
using StringCollection = System.Collections.Specialized.StringCollection;
using System.Threading;
using System.Threading.Tasks;
using ArchestrA;
using Opc.Ua;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Host.Configuration;
using ZB.MOM.WW.OtOpcUa.Host.Domain;
using ZB.MOM.WW.OtOpcUa.Host.Historian;
namespace ZB.MOM.WW.OtOpcUa.Historian.Aveva
{
/// <summary>
/// Reads historical data from the Wonderware Historian via the aahClientManaged SDK.
/// </summary>
public sealed class HistorianDataSource : IHistorianDataSource
{
private static readonly ILogger Log = Serilog.Log.ForContext<HistorianDataSource>();
private readonly HistorianConfiguration _config;
private readonly object _connectionLock = new object();
private readonly object _eventConnectionLock = new object();
private readonly IHistorianConnectionFactory _factory;
private HistorianAccess? _connection;
private HistorianAccess? _eventConnection;
private bool _disposed;
// Runtime query health state. Guarded by _healthLock — updated on every read
// method exit (success or failure) so the dashboard can distinguish "plugin
// loaded but never queried" from "plugin loaded and queries are failing".
private readonly object _healthLock = new object();
private long _totalSuccesses;
private long _totalFailures;
private int _consecutiveFailures;
private DateTime? _lastSuccessTime;
private DateTime? _lastFailureTime;
private string? _lastError;
private string? _activeProcessNode;
private string? _activeEventNode;
// Cluster endpoint picker — shared across process + event paths so a node that
// fails on one silo is skipped on the other. Initialized from config at construction.
private readonly HistorianClusterEndpointPicker _picker;
/// <summary>
/// Initializes a Historian reader that translates OPC UA history requests into Wonderware Historian SDK queries.
/// </summary>
/// <param name="config">The Historian SDK connection settings used for runtime history lookups.</param>
public HistorianDataSource(HistorianConfiguration config)
: this(config, new SdkHistorianConnectionFactory(), null) { }
/// <summary>
/// Initializes a Historian reader with a custom connection factory for testing. When
/// <paramref name="picker"/> is <see langword="null"/> a new picker is built from
/// <paramref name="config"/>, preserving backward compatibility with existing tests.
/// </summary>
internal HistorianDataSource(
HistorianConfiguration config,
IHistorianConnectionFactory factory,
HistorianClusterEndpointPicker? picker = null)
{
_config = config;
_factory = factory;
_picker = picker ?? new HistorianClusterEndpointPicker(config);
}
/// <summary>
/// Iterates the picker's healthy node list, cloning the configuration per attempt and
/// handing it to the factory. Marks each tried node as healthy on success or failed on
/// exception. Returns the winning connection + node name; throws when no nodes succeed.
/// </summary>
private (HistorianAccess Connection, string Node) ConnectToAnyHealthyNode(HistorianConnectionType type)
{
var candidates = _picker.GetHealthyNodes();
if (candidates.Count == 0)
{
var total = _picker.NodeCount;
throw new InvalidOperationException(
total == 0
? "No historian nodes configured"
: $"All {total} historian nodes are in cooldown — no healthy endpoints to connect to");
}
Exception? lastException = null;
foreach (var node in candidates)
{
var attemptConfig = CloneConfigWithServerName(node);
try
{
var conn = _factory.CreateAndConnect(attemptConfig, type);
_picker.MarkHealthy(node);
return (conn, node);
}
catch (Exception ex)
{
_picker.MarkFailed(node, ex.Message);
lastException = ex;
Log.Warning(ex,
"Historian node {Node} failed during connect attempt; trying next candidate", node);
}
}
var inner = lastException?.Message ?? "(no detail)";
throw new InvalidOperationException(
$"All {candidates.Count} healthy historian candidate(s) failed during connect: {inner}",
lastException);
}
private HistorianConfiguration CloneConfigWithServerName(string serverName)
{
return new HistorianConfiguration
{
Enabled = _config.Enabled,
ServerName = serverName,
ServerNames = _config.ServerNames,
FailureCooldownSeconds = _config.FailureCooldownSeconds,
IntegratedSecurity = _config.IntegratedSecurity,
UserName = _config.UserName,
Password = _config.Password,
Port = _config.Port,
CommandTimeoutSeconds = _config.CommandTimeoutSeconds,
MaxValuesPerRead = _config.MaxValuesPerRead
};
}
/// <inheritdoc />
public HistorianHealthSnapshot GetHealthSnapshot()
{
var nodeStates = _picker.SnapshotNodeStates();
var healthyCount = 0;
foreach (var n in nodeStates)
if (n.IsHealthy)
healthyCount++;
lock (_healthLock)
{
return new HistorianHealthSnapshot
{
TotalQueries = _totalSuccesses + _totalFailures,
TotalSuccesses = _totalSuccesses,
TotalFailures = _totalFailures,
ConsecutiveFailures = _consecutiveFailures,
LastSuccessTime = _lastSuccessTime,
LastFailureTime = _lastFailureTime,
LastError = _lastError,
ProcessConnectionOpen = Volatile.Read(ref _connection) != null,
EventConnectionOpen = Volatile.Read(ref _eventConnection) != null,
ActiveProcessNode = _activeProcessNode,
ActiveEventNode = _activeEventNode,
NodeCount = nodeStates.Count,
HealthyNodeCount = healthyCount,
Nodes = nodeStates
};
}
}
private void RecordSuccess()
{
lock (_healthLock)
{
_totalSuccesses++;
_lastSuccessTime = DateTime.UtcNow;
_consecutiveFailures = 0;
_lastError = null;
}
}
private void RecordFailure(string error)
{
lock (_healthLock)
{
_totalFailures++;
_lastFailureTime = DateTime.UtcNow;
_consecutiveFailures++;
_lastError = error;
}
}
private void EnsureConnected()
{
if (_disposed)
throw new ObjectDisposedException(nameof(HistorianDataSource));
// Fast path: already connected (no lock needed)
if (Volatile.Read(ref _connection) != null)
return;
// Create and wait for connection outside the lock so concurrent history
// requests are not serialized behind a slow Historian handshake. The cluster
// picker iterates configured nodes and returns the first that successfully connects.
var (conn, winningNode) = ConnectToAnyHealthyNode(HistorianConnectionType.Process);
lock (_connectionLock)
{
if (_disposed)
{
conn.CloseConnection(out _);
conn.Dispose();
throw new ObjectDisposedException(nameof(HistorianDataSource));
}
if (_connection != null)
{
// Another thread connected while we were waiting
conn.CloseConnection(out _);
conn.Dispose();
return;
}
_connection = conn;
lock (_healthLock)
_activeProcessNode = winningNode;
Log.Information("Historian SDK connection opened to {Server}:{Port}", winningNode, _config.Port);
}
}
private void HandleConnectionError(Exception? ex = null)
{
lock (_connectionLock)
{
if (_connection == null)
return;
try
{
_connection.CloseConnection(out _);
_connection.Dispose();
}
catch (Exception disposeEx)
{
Log.Debug(disposeEx, "Error disposing Historian SDK connection during error recovery");
}
_connection = null;
string? failedNode;
lock (_healthLock)
{
failedNode = _activeProcessNode;
_activeProcessNode = null;
}
if (failedNode != null)
_picker.MarkFailed(failedNode, ex?.Message ?? "mid-query failure");
Log.Warning(ex, "Historian SDK connection reset (node={Node}) — will reconnect on next request",
failedNode ?? "(unknown)");
}
}
private void EnsureEventConnected()
{
if (_disposed)
throw new ObjectDisposedException(nameof(HistorianDataSource));
if (Volatile.Read(ref _eventConnection) != null)
return;
var (conn, winningNode) = ConnectToAnyHealthyNode(HistorianConnectionType.Event);
lock (_eventConnectionLock)
{
if (_disposed)
{
conn.CloseConnection(out _);
conn.Dispose();
throw new ObjectDisposedException(nameof(HistorianDataSource));
}
if (_eventConnection != null)
{
conn.CloseConnection(out _);
conn.Dispose();
return;
}
_eventConnection = conn;
lock (_healthLock)
_activeEventNode = winningNode;
Log.Information("Historian SDK event connection opened to {Server}:{Port}",
winningNode, _config.Port);
}
}
private void HandleEventConnectionError(Exception? ex = null)
{
lock (_eventConnectionLock)
{
if (_eventConnection == null)
return;
try
{
_eventConnection.CloseConnection(out _);
_eventConnection.Dispose();
}
catch (Exception disposeEx)
{
Log.Debug(disposeEx, "Error disposing Historian SDK event connection during error recovery");
}
_eventConnection = null;
string? failedNode;
lock (_healthLock)
{
failedNode = _activeEventNode;
_activeEventNode = null;
}
if (failedNode != null)
_picker.MarkFailed(failedNode, ex?.Message ?? "mid-query failure");
Log.Warning(ex, "Historian SDK event connection reset (node={Node}) — will reconnect on next request",
failedNode ?? "(unknown)");
}
}
/// <inheritdoc />
public Task<List<DataValue>> ReadRawAsync(
string tagName, DateTime startTime, DateTime endTime, int maxValues,
CancellationToken ct = default)
{
var results = new List<DataValue>();
try
{
EnsureConnected();
using var query = _connection!.CreateHistoryQuery();
var args = new HistoryQueryArgs
{
TagNames = new StringCollection { tagName },
StartDateTime = startTime,
EndDateTime = endTime,
RetrievalMode = HistorianRetrievalMode.Full
};
if (maxValues > 0)
args.BatchSize = (uint)maxValues;
else if (_config.MaxValuesPerRead > 0)
args.BatchSize = (uint)_config.MaxValuesPerRead;
if (!query.StartQuery(args, out var error))
{
Log.Warning("Historian SDK raw query start failed for {Tag}: {Error}", tagName, error.ErrorCode);
RecordFailure($"raw StartQuery: {error.ErrorCode}");
HandleConnectionError();
return Task.FromResult(results);
}
var count = 0;
var limit = maxValues > 0 ? maxValues : _config.MaxValuesPerRead;
while (query.MoveNext(out error))
{
ct.ThrowIfCancellationRequested();
var result = query.QueryResult;
var timestamp = DateTime.SpecifyKind(result.StartDateTime, DateTimeKind.Utc);
object? value;
if (!string.IsNullOrEmpty(result.StringValue) && result.Value == 0)
value = result.StringValue;
else
value = result.Value;
var quality = (byte)(result.OpcQuality & 0xFF);
results.Add(new DataValue
{
Value = new Variant(value),
SourceTimestamp = timestamp,
ServerTimestamp = timestamp,
StatusCode = QualityMapper.MapToOpcUaStatusCode(QualityMapper.MapFromMxAccessQuality(quality))
});
count++;
if (limit > 0 && count >= limit)
break;
}
query.EndQuery(out _);
RecordSuccess();
}
catch (OperationCanceledException)
{
throw;
}
catch (ObjectDisposedException)
{
throw;
}
catch (Exception ex)
{
Log.Warning(ex, "HistoryRead raw failed for {Tag}", tagName);
RecordFailure($"raw: {ex.Message}");
HandleConnectionError(ex);
}
Log.Debug("HistoryRead raw: {Tag} returned {Count} values ({Start} to {End})",
tagName, results.Count, startTime, endTime);
return Task.FromResult(results);
}
/// <inheritdoc />
public Task<List<DataValue>> ReadAggregateAsync(
string tagName, DateTime startTime, DateTime endTime,
double intervalMs, string aggregateColumn,
CancellationToken ct = default)
{
var results = new List<DataValue>();
try
{
EnsureConnected();
using var query = _connection!.CreateAnalogSummaryQuery();
var args = new AnalogSummaryQueryArgs
{
TagNames = new StringCollection { tagName },
StartDateTime = startTime,
EndDateTime = endTime,
Resolution = (ulong)intervalMs
};
if (!query.StartQuery(args, out var error))
{
Log.Warning("Historian SDK aggregate query start failed for {Tag}: {Error}", tagName,
error.ErrorCode);
RecordFailure($"aggregate StartQuery: {error.ErrorCode}");
HandleConnectionError();
return Task.FromResult(results);
}
while (query.MoveNext(out error))
{
ct.ThrowIfCancellationRequested();
var result = query.QueryResult;
var timestamp = DateTime.SpecifyKind(result.StartDateTime, DateTimeKind.Utc);
var value = ExtractAggregateValue(result, aggregateColumn);
results.Add(new DataValue
{
Value = new Variant(value),
SourceTimestamp = timestamp,
ServerTimestamp = timestamp,
StatusCode = value != null ? StatusCodes.Good : StatusCodes.BadNoData
});
}
query.EndQuery(out _);
RecordSuccess();
}
catch (OperationCanceledException)
{
throw;
}
catch (ObjectDisposedException)
{
throw;
}
catch (Exception ex)
{
Log.Warning(ex, "HistoryRead aggregate failed for {Tag}", tagName);
RecordFailure($"aggregate: {ex.Message}");
HandleConnectionError(ex);
}
Log.Debug("HistoryRead aggregate ({Aggregate}): {Tag} returned {Count} values",
aggregateColumn, tagName, results.Count);
return Task.FromResult(results);
}
/// <inheritdoc />
public Task<List<DataValue>> ReadAtTimeAsync(
string tagName, DateTime[] timestamps,
CancellationToken ct = default)
{
var results = new List<DataValue>();
if (timestamps == null || timestamps.Length == 0)
return Task.FromResult(results);
try
{
EnsureConnected();
foreach (var timestamp in timestamps)
{
ct.ThrowIfCancellationRequested();
using var query = _connection!.CreateHistoryQuery();
var args = new HistoryQueryArgs
{
TagNames = new StringCollection { tagName },
StartDateTime = timestamp,
EndDateTime = timestamp,
RetrievalMode = HistorianRetrievalMode.Interpolated,
BatchSize = 1
};
if (!query.StartQuery(args, out var error))
{
results.Add(new DataValue
{
Value = Variant.Null,
SourceTimestamp = timestamp,
ServerTimestamp = timestamp,
StatusCode = StatusCodes.BadNoData
});
continue;
}
if (query.MoveNext(out error))
{
var result = query.QueryResult;
object? value;
if (!string.IsNullOrEmpty(result.StringValue) && result.Value == 0)
value = result.StringValue;
else
value = result.Value;
var quality = (byte)(result.OpcQuality & 0xFF);
results.Add(new DataValue
{
Value = new Variant(value),
SourceTimestamp = timestamp,
ServerTimestamp = timestamp,
StatusCode = QualityMapper.MapToOpcUaStatusCode(
QualityMapper.MapFromMxAccessQuality(quality))
});
}
else
{
results.Add(new DataValue
{
Value = Variant.Null,
SourceTimestamp = timestamp,
ServerTimestamp = timestamp,
StatusCode = StatusCodes.BadNoData
});
}
query.EndQuery(out _);
}
RecordSuccess();
}
catch (OperationCanceledException)
{
throw;
}
catch (ObjectDisposedException)
{
throw;
}
catch (Exception ex)
{
Log.Warning(ex, "HistoryRead at-time failed for {Tag}", tagName);
RecordFailure($"at-time: {ex.Message}");
HandleConnectionError(ex);
}
Log.Debug("HistoryRead at-time: {Tag} returned {Count} values for {Timestamps} timestamps",
tagName, results.Count, timestamps.Length);
return Task.FromResult(results);
}
/// <inheritdoc />
public Task<List<HistorianEventDto>> ReadEventsAsync(
string? sourceName, DateTime startTime, DateTime endTime, int maxEvents,
CancellationToken ct = default)
{
var results = new List<HistorianEventDto>();
try
{
EnsureEventConnected();
using var query = _eventConnection!.CreateEventQuery();
var args = new EventQueryArgs
{
StartDateTime = startTime,
EndDateTime = endTime,
EventCount = maxEvents > 0 ? (uint)maxEvents : (uint)_config.MaxValuesPerRead,
QueryType = HistorianEventQueryType.Events,
EventOrder = HistorianEventOrder.Ascending
};
if (!string.IsNullOrEmpty(sourceName))
{
query.AddEventFilter("Source", HistorianComparisionType.Equal, sourceName, out _);
}
if (!query.StartQuery(args, out var error))
{
Log.Warning("Historian SDK event query start failed: {Error}", error.ErrorCode);
RecordFailure($"events StartQuery: {error.ErrorCode}");
HandleEventConnectionError();
return Task.FromResult(results);
}
var count = 0;
while (query.MoveNext(out error))
{
ct.ThrowIfCancellationRequested();
results.Add(ToDto(query.QueryResult));
count++;
if (maxEvents > 0 && count >= maxEvents)
break;
}
query.EndQuery(out _);
RecordSuccess();
}
catch (OperationCanceledException)
{
throw;
}
catch (ObjectDisposedException)
{
throw;
}
catch (Exception ex)
{
Log.Warning(ex, "HistoryRead events failed for source {Source}", sourceName ?? "(all)");
RecordFailure($"events: {ex.Message}");
HandleEventConnectionError(ex);
}
Log.Debug("HistoryRead events: source={Source} returned {Count} events ({Start} to {End})",
sourceName ?? "(all)", results.Count, startTime, endTime);
return Task.FromResult(results);
}
private static HistorianEventDto ToDto(HistorianEvent evt)
{
return new HistorianEventDto
{
Id = evt.Id,
Source = evt.Source,
EventTime = evt.EventTime,
ReceivedTime = evt.ReceivedTime,
DisplayText = evt.DisplayText,
Severity = (ushort)evt.Severity
};
}
/// <summary>
/// Extracts the requested aggregate value from an <see cref="AnalogSummaryQueryResult"/> by column name.
/// </summary>
internal static double? ExtractAggregateValue(AnalogSummaryQueryResult result, string column)
{
switch (column)
{
case "Average": return result.Average;
case "Minimum": return result.Minimum;
case "Maximum": return result.Maximum;
case "ValueCount": return result.ValueCount;
case "First": return result.First;
case "Last": return result.Last;
case "StdDev": return result.StdDev;
default: return null;
}
}
/// <summary>
/// Closes the Historian SDK connection and releases resources.
/// </summary>
public void Dispose()
{
if (_disposed)
return;
_disposed = true;
try
{
_connection?.CloseConnection(out _);
_connection?.Dispose();
}
catch (Exception ex)
{
Log.Warning(ex, "Error closing Historian SDK connection");
}
try
{
_eventConnection?.CloseConnection(out _);
_eventConnection?.Dispose();
}
catch (Exception ex)
{
Log.Warning(ex, "Error closing Historian SDK event connection");
}
_connection = null;
_eventConnection = null;
}
}
}

View File

@@ -1,81 +0,0 @@
using System;
using System.Threading;
using ArchestrA;
using ZB.MOM.WW.OtOpcUa.Host.Configuration;
namespace ZB.MOM.WW.OtOpcUa.Historian.Aveva
{
/// <summary>
/// Creates and opens Historian SDK connections. Extracted so tests can inject
/// fakes that control connection success, failure, and timeout behavior.
/// </summary>
internal interface IHistorianConnectionFactory
{
/// <summary>
/// Creates a new Historian SDK connection, opens it, and waits until it is ready.
/// Throws on connection failure or timeout.
/// </summary>
HistorianAccess CreateAndConnect(HistorianConfiguration config, HistorianConnectionType type);
}
/// <summary>
/// Production implementation that creates real Historian SDK connections.
/// </summary>
internal sealed class SdkHistorianConnectionFactory : IHistorianConnectionFactory
{
public HistorianAccess CreateAndConnect(HistorianConfiguration config, HistorianConnectionType type)
{
var conn = new HistorianAccess();
var args = new HistorianConnectionArgs
{
ServerName = config.ServerName,
TcpPort = (ushort)config.Port,
IntegratedSecurity = config.IntegratedSecurity,
UseArchestrAUser = config.IntegratedSecurity,
ConnectionType = type,
ReadOnly = true,
PacketTimeout = (uint)(config.CommandTimeoutSeconds * 1000)
};
if (!config.IntegratedSecurity)
{
args.UserName = config.UserName ?? string.Empty;
args.Password = config.Password ?? string.Empty;
}
if (!conn.OpenConnection(args, out var error))
{
conn.Dispose();
throw new InvalidOperationException(
$"Failed to open Historian SDK connection to {config.ServerName}:{config.Port}: {error.ErrorCode}");
}
// The SDK connects asynchronously — poll until the connection is ready
var timeoutMs = config.CommandTimeoutSeconds * 1000;
var elapsed = 0;
while (elapsed < timeoutMs)
{
var status = new HistorianConnectionStatus();
conn.GetConnectionStatus(ref status);
if (status.ConnectedToServer)
return conn;
if (status.ErrorOccurred)
{
conn.Dispose();
throw new InvalidOperationException(
$"Historian SDK connection failed: {status.Error}");
}
Thread.Sleep(250);
elapsed += 250;
}
conn.Dispose();
throw new TimeoutException(
$"Historian SDK connection to {config.ServerName}:{config.Port} timed out after {config.CommandTimeoutSeconds}s");
}
}
}

View File

@@ -1,93 +0,0 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net48</TargetFramework>
<PlatformTarget>x86</PlatformTarget>
<LangVersion>9.0</LangVersion>
<Nullable>enable</Nullable>
<RootNamespace>ZB.MOM.WW.OtOpcUa.Historian.Aveva</RootNamespace>
<AssemblyName>ZB.MOM.WW.OtOpcUa.Historian.Aveva</AssemblyName>
<!-- Plugin is loaded at runtime via Assembly.LoadFrom; never copy it as a CopyLocal dep. -->
<CopyLocalLockFileAssemblies>false</CopyLocalLockFileAssemblies>
<!-- Deploy next to Host.exe under bin/<cfg>/Historian/ so F5 works without a manual copy. -->
<HistorianPluginOutputDir>$(MSBuildThisFileDirectory)..\ZB.MOM.WW.OtOpcUa.Host\bin\$(Configuration)\net48\Historian\</HistorianPluginOutputDir>
<!--
Phase 2 Stream D — V1 ARCHIVE. Plugs into the legacy in-process Host's
Wonderware Historian loader. Will be ported into Driver.Galaxy.Host's
Backend/Historian/ subtree when MxAccessGalaxyBackend.HistoryReadAsync is
wired (Task B.1.h follow-up). See docs/v2/V1_ARCHIVE_STATUS.md.
-->
</PropertyGroup>
<ItemGroup>
<InternalsVisibleTo Include="ZB.MOM.WW.OtOpcUa.Historian.Aveva.Tests"/>
</ItemGroup>
<ItemGroup>
<!-- Logging -->
<PackageReference Include="Serilog" Version="2.10.0"/>
<!-- OPC UA (for DataValue/StatusCodes used by the IHistorianDataSource surface) -->
<PackageReference Include="OPCFoundation.NetStandard.Opc.Ua.Server" Version="1.5.374.126"/>
</ItemGroup>
<ItemGroup>
<!-- Private=false: the plugin binds to Host types at compile time but Host.exe must not be
copied into the plugin's output folder (it is already in the process). -->
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Host\ZB.MOM.WW.OtOpcUa.Host.csproj">
<Private>false</Private>
<ReferenceOutputAssembly>true</ReferenceOutputAssembly>
</ProjectReference>
</ItemGroup>
<ItemGroup>
<!-- Wonderware Historian SDK -->
<Reference Include="aahClientManaged">
<HintPath>..\..\lib\aahClientManaged.dll</HintPath>
<EmbedInteropTypes>false</EmbedInteropTypes>
</Reference>
<Reference Include="aahClientCommon">
<HintPath>..\..\lib\aahClientCommon.dll</HintPath>
<EmbedInteropTypes>false</EmbedInteropTypes>
</Reference>
</ItemGroup>
<ItemGroup>
<!-- Historian SDK native dependencies — copied beside the plugin DLL so the AssemblyResolve
handler in HistorianPluginLoader can find them when the plugin first JITs. -->
<None Include="..\..\lib\aahClient.dll">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Include="..\..\lib\aahClientCommon.dll">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Include="..\..\lib\aahClientManaged.dll">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Include="..\..\lib\Historian.CBE.dll">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Include="..\..\lib\Historian.DPAPI.dll">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Include="..\..\lib\ArchestrA.CloudHistorian.Contract.dll">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
</ItemGroup>
<Target Name="StageHistorianPluginForHost" AfterTargets="Build">
<ItemGroup>
<_HistorianStageFiles Include="$(OutDir)aahClient.dll"/>
<_HistorianStageFiles Include="$(OutDir)aahClientCommon.dll"/>
<_HistorianStageFiles Include="$(OutDir)aahClientManaged.dll"/>
<_HistorianStageFiles Include="$(OutDir)Historian.CBE.dll"/>
<_HistorianStageFiles Include="$(OutDir)Historian.DPAPI.dll"/>
<_HistorianStageFiles Include="$(OutDir)ArchestrA.CloudHistorian.Contract.dll"/>
<_HistorianStageFiles Include="$(OutDir)$(AssemblyName).dll"/>
<_HistorianStageFiles Include="$(OutDir)$(AssemblyName).pdb" Condition="Exists('$(OutDir)$(AssemblyName).pdb')"/>
</ItemGroup>
<MakeDir Directories="$(HistorianPluginOutputDir)"/>
<Copy SourceFiles="@(_HistorianStageFiles)" DestinationFolder="$(HistorianPluginOutputDir)" SkipUnchangedFiles="true"/>
</Target>
</Project>

View File

@@ -1,27 +0,0 @@
using System.Collections.Generic;
namespace ZB.MOM.WW.OtOpcUa.Host.Configuration
{
/// <summary>
/// Configures the template-based alarm object filter under <c>OpcUa.AlarmFilter</c>.
/// </summary>
/// <remarks>
/// Each entry in <see cref="ObjectFilters"/> is a wildcard pattern matched against the template
/// derivation chain of every Galaxy object. Supported wildcard: <c>*</c>. Matching is case-insensitive
/// and the leading <c>$</c> used by Galaxy template tag_names is normalized away, so operators can
/// write <c>TestMachine*</c> instead of <c>$TestMachine*</c>. An entry may itself contain comma-separated
/// patterns for convenience (e.g., <c>"TestMachine*, Pump_*"</c>). An empty list disables the filter,
/// restoring current behavior: all alarm-bearing objects are monitored when
/// <see cref="OpcUaConfiguration.AlarmTrackingEnabled"/> is <see langword="true"/>.
/// </remarks>
public class AlarmFilterConfiguration
{
/// <summary>
/// Gets or sets the wildcard patterns that select which Galaxy objects contribute alarm conditions.
/// An object is included when any template in its derivation chain matches any pattern, and the
/// inclusion propagates to all descendants in the containment hierarchy. Each object is evaluated
/// once: overlapping matches never create duplicate alarm subscriptions.
/// </summary>
public List<string> ObjectFilters { get; set; } = new();
}
}

View File

@@ -1,48 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Host.Configuration
{
/// <summary>
/// Top-level configuration holder binding all sections from appsettings.json. (SVC-003)
/// </summary>
public class AppConfiguration
{
/// <summary>
/// Gets or sets the OPC UA endpoint settings exposed to downstream clients that browse the LMX address space.
/// </summary>
public OpcUaConfiguration OpcUa { get; set; } = new();
/// <summary>
/// Gets or sets the MXAccess runtime connection settings used to read and write live Galaxy attributes.
/// </summary>
public MxAccessConfiguration MxAccess { get; set; } = new();
/// <summary>
/// Gets or sets the repository settings used to query Galaxy metadata for address-space construction.
/// </summary>
public GalaxyRepositoryConfiguration GalaxyRepository { get; set; } = new();
/// <summary>
/// Gets or sets the embedded dashboard settings used to surface service health to operators.
/// </summary>
public DashboardConfiguration Dashboard { get; set; } = new();
/// <summary>
/// Gets or sets the Wonderware Historian connection settings used to serve OPC UA historical data.
/// </summary>
public HistorianConfiguration Historian { get; set; } = new();
/// <summary>
/// Gets or sets the authentication and role-based access control settings.
/// </summary>
public AuthenticationConfiguration Authentication { get; set; } = new();
/// <summary>
/// Gets or sets the transport security settings that control which OPC UA security profiles are exposed.
/// </summary>
public SecurityProfileConfiguration Security { get; set; } = new();
/// <summary>
/// Gets or sets the redundancy settings that control how this server participates in a redundant pair.
/// </summary>
public RedundancyConfiguration Redundancy { get; set; } = new();
}
}

View File

@@ -1,25 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Host.Configuration
{
/// <summary>
/// Authentication and role-based access control settings for the OPC UA server.
/// </summary>
public class AuthenticationConfiguration
{
/// <summary>
/// Gets or sets a value indicating whether anonymous OPC UA connections are accepted.
/// </summary>
public bool AllowAnonymous { get; set; } = true;
/// <summary>
/// Gets or sets a value indicating whether anonymous users can write tag values.
/// When false, only authenticated users can write. Existing security classification restrictions still apply.
/// </summary>
public bool AnonymousCanWrite { get; set; } = true;
/// <summary>
/// Gets or sets the LDAP authentication settings. When Ldap.Enabled is true,
/// credentials are validated against the LDAP server and group membership determines permissions.
/// </summary>
public LdapConfiguration Ldap { get; set; } = new();
}
}

View File

@@ -1,314 +0,0 @@
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using Opc.Ua;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Host.OpcUa;
namespace ZB.MOM.WW.OtOpcUa.Host.Configuration
{
/// <summary>
/// Validates and logs effective configuration at startup. (SVC-003, SVC-005)
/// </summary>
public static class ConfigurationValidator
{
private static readonly ILogger Log = Serilog.Log.ForContext(typeof(ConfigurationValidator));
/// <summary>
/// Validates the effective host configuration and writes the resolved values to the startup log before service
/// initialization continues.
/// </summary>
/// <param name="config">
/// The bound service configuration that drives OPC UA hosting, MXAccess connectivity, Galaxy queries,
/// and dashboard behavior.
/// </param>
/// <returns>
/// <see langword="true" /> when the required settings are present and within supported bounds; otherwise,
/// <see langword="false" />.
/// </returns>
public static bool ValidateAndLog(AppConfiguration config)
{
var valid = true;
Log.Information("=== Effective Configuration ===");
// OPC UA
Log.Information(
"OpcUa.BindAddress={BindAddress}, Port={Port}, EndpointPath={EndpointPath}, ServerName={ServerName}, GalaxyName={GalaxyName}",
config.OpcUa.BindAddress, config.OpcUa.Port, config.OpcUa.EndpointPath, config.OpcUa.ServerName,
config.OpcUa.GalaxyName);
Log.Information("OpcUa.MaxSessions={MaxSessions}, SessionTimeoutMinutes={SessionTimeout}",
config.OpcUa.MaxSessions, config.OpcUa.SessionTimeoutMinutes);
if (config.OpcUa.Port < 1 || config.OpcUa.Port > 65535)
{
Log.Error("OpcUa.Port must be between 1 and 65535");
valid = false;
}
if (string.IsNullOrWhiteSpace(config.OpcUa.GalaxyName))
{
Log.Error("OpcUa.GalaxyName must not be empty");
valid = false;
}
// Alarm filter
var alarmFilterCount = config.OpcUa.AlarmFilter?.ObjectFilters?.Count ?? 0;
Log.Information(
"OpcUa.AlarmTrackingEnabled={AlarmEnabled}, AlarmFilter.ObjectFilters=[{Filters}]",
config.OpcUa.AlarmTrackingEnabled,
alarmFilterCount == 0 ? "(none)" : string.Join(", ", config.OpcUa.AlarmFilter!.ObjectFilters));
if (alarmFilterCount > 0 && !config.OpcUa.AlarmTrackingEnabled)
Log.Warning(
"OpcUa.AlarmFilter.ObjectFilters has {Count} patterns but OpcUa.AlarmTrackingEnabled is false — filter will have no effect",
alarmFilterCount);
// MxAccess
Log.Information(
"MxAccess.ClientName={ClientName}, ReadTimeout={ReadTimeout}s, WriteTimeout={WriteTimeout}s, MaxConcurrent={MaxConcurrent}",
config.MxAccess.ClientName, config.MxAccess.ReadTimeoutSeconds, config.MxAccess.WriteTimeoutSeconds,
config.MxAccess.MaxConcurrentOperations);
Log.Information(
"MxAccess.MonitorInterval={MonitorInterval}s, AutoReconnect={AutoReconnect}, ProbeTag={ProbeTag}, ProbeStaleThreshold={ProbeStale}s",
config.MxAccess.MonitorIntervalSeconds, config.MxAccess.AutoReconnect,
config.MxAccess.ProbeTag ?? "(none)", config.MxAccess.ProbeStaleThresholdSeconds);
Log.Information(
"MxAccess.RuntimeStatusProbesEnabled={Enabled}, RuntimeStatusUnknownTimeoutSeconds={Timeout}s, RequestTimeoutSeconds={RequestTimeout}s",
config.MxAccess.RuntimeStatusProbesEnabled, config.MxAccess.RuntimeStatusUnknownTimeoutSeconds,
config.MxAccess.RequestTimeoutSeconds);
if (string.IsNullOrWhiteSpace(config.MxAccess.ClientName))
{
Log.Error("MxAccess.ClientName must not be empty");
valid = false;
}
if (config.MxAccess.RuntimeStatusUnknownTimeoutSeconds < 5)
Log.Warning(
"MxAccess.RuntimeStatusUnknownTimeoutSeconds={Timeout} is below the recommended floor of 5s; initial probe resolution may time out before MxAccess has delivered the first callback",
config.MxAccess.RuntimeStatusUnknownTimeoutSeconds);
if (config.MxAccess.RequestTimeoutSeconds < 1)
{
Log.Error("MxAccess.RequestTimeoutSeconds must be at least 1");
valid = false;
}
else if (config.MxAccess.RequestTimeoutSeconds <
Math.Max(config.MxAccess.ReadTimeoutSeconds, config.MxAccess.WriteTimeoutSeconds))
{
Log.Warning(
"MxAccess.RequestTimeoutSeconds={RequestTimeout} is below Read/Write inner timeouts ({Read}s/{Write}s); outer safety bound may fire before the inner client completes its own error path",
config.MxAccess.RequestTimeoutSeconds,
config.MxAccess.ReadTimeoutSeconds, config.MxAccess.WriteTimeoutSeconds);
}
// Galaxy Repository
Log.Information(
"GalaxyRepository.ConnectionString={ConnectionString}, ChangeDetectionInterval={ChangeInterval}s, CommandTimeout={CmdTimeout}s, ExtendedAttributes={ExtendedAttributes}",
SanitizeConnectionString(config.GalaxyRepository.ConnectionString), config.GalaxyRepository.ChangeDetectionIntervalSeconds,
config.GalaxyRepository.CommandTimeoutSeconds, config.GalaxyRepository.ExtendedAttributes);
var effectivePlatformName = string.IsNullOrWhiteSpace(config.GalaxyRepository.PlatformName)
? Environment.MachineName
: config.GalaxyRepository.PlatformName;
Log.Information(
"GalaxyRepository.Scope={Scope}, PlatformName={PlatformName}",
config.GalaxyRepository.Scope,
config.GalaxyRepository.Scope == GalaxyScope.LocalPlatform
? effectivePlatformName
: "(n/a)");
if (config.GalaxyRepository.Scope == GalaxyScope.LocalPlatform &&
string.IsNullOrWhiteSpace(config.GalaxyRepository.PlatformName))
Log.Information(
"GalaxyRepository.PlatformName not set — using Environment.MachineName '{MachineName}'",
Environment.MachineName);
if (string.IsNullOrWhiteSpace(config.GalaxyRepository.ConnectionString))
{
Log.Error("GalaxyRepository.ConnectionString must not be empty");
valid = false;
}
// Dashboard
Log.Information("Dashboard.Enabled={Enabled}, Port={Port}, RefreshInterval={Refresh}s",
config.Dashboard.Enabled, config.Dashboard.Port, config.Dashboard.RefreshIntervalSeconds);
// Security
Log.Information(
"Security.Profiles=[{Profiles}], AutoAcceptClientCertificates={AutoAccept}, RejectSHA1={RejectSHA1}, MinKeySize={MinKeySize}",
string.Join(", ", config.Security.Profiles), config.Security.AutoAcceptClientCertificates,
config.Security.RejectSHA1Certificates, config.Security.MinimumCertificateKeySize);
Log.Information("Security.PkiRootPath={PkiRootPath}", config.Security.PkiRootPath ?? "(default)");
Log.Information("Security.CertificateSubject={CertificateSubject}", config.Security.CertificateSubject ?? "(default)");
Log.Information("Security.CertificateLifetimeMonths={Months}", config.Security.CertificateLifetimeMonths);
var unknownProfiles = config.Security.Profiles
.Where(p => !SecurityProfileResolver.ValidProfileNames.Contains(p, StringComparer.OrdinalIgnoreCase))
.ToList();
if (unknownProfiles.Count > 0)
Log.Warning("Unknown security profile(s): {Profiles}. Valid values: {ValidProfiles}",
string.Join(", ", unknownProfiles), string.Join(", ", SecurityProfileResolver.ValidProfileNames));
if (config.Security.MinimumCertificateKeySize < 2048)
{
Log.Error("Security.MinimumCertificateKeySize must be at least 2048");
valid = false;
}
if (config.Security.AutoAcceptClientCertificates)
Log.Warning(
"Security.AutoAcceptClientCertificates is enabled — client certificate trust is not enforced. Set to false in production");
if (config.Security.Profiles.Count == 1 &&
config.Security.Profiles[0].Equals("None", StringComparison.OrdinalIgnoreCase))
Log.Warning("Only the 'None' security profile is configured — transport security is disabled");
// Historian
var clusterNodes = config.Historian.ServerNames ?? new List<string>();
var effectiveNodes = clusterNodes.Count > 0
? string.Join(",", clusterNodes)
: config.Historian.ServerName;
Log.Information(
"Historian.Enabled={Enabled}, Nodes=[{Nodes}], IntegratedSecurity={IntegratedSecurity}, Port={Port}",
config.Historian.Enabled, effectiveNodes, config.Historian.IntegratedSecurity,
config.Historian.Port);
Log.Information(
"Historian.CommandTimeoutSeconds={Timeout}, MaxValuesPerRead={MaxValues}, FailureCooldownSeconds={Cooldown}, RequestTimeoutSeconds={RequestTimeout}",
config.Historian.CommandTimeoutSeconds, config.Historian.MaxValuesPerRead,
config.Historian.FailureCooldownSeconds, config.Historian.RequestTimeoutSeconds);
if (config.Historian.Enabled)
{
if (clusterNodes.Count == 0 && string.IsNullOrWhiteSpace(config.Historian.ServerName))
{
Log.Error("Historian.ServerName (or ServerNames) must not be empty when Historian is enabled");
valid = false;
}
if (config.Historian.FailureCooldownSeconds < 0)
{
Log.Error("Historian.FailureCooldownSeconds must be zero or positive");
valid = false;
}
if (config.Historian.RequestTimeoutSeconds < 1)
{
Log.Error("Historian.RequestTimeoutSeconds must be at least 1");
valid = false;
}
else if (config.Historian.RequestTimeoutSeconds < config.Historian.CommandTimeoutSeconds)
{
Log.Warning(
"Historian.RequestTimeoutSeconds={RequestTimeout} is below CommandTimeoutSeconds={CmdTimeout}; outer safety bound may fire before the inner SDK completes its own error path",
config.Historian.RequestTimeoutSeconds, config.Historian.CommandTimeoutSeconds);
}
if (clusterNodes.Count > 0 && !string.IsNullOrWhiteSpace(config.Historian.ServerName)
&& config.Historian.ServerName != "localhost")
Log.Warning(
"Historian.ServerName='{ServerName}' is ignored because Historian.ServerNames has {Count} entries",
config.Historian.ServerName, clusterNodes.Count);
if (config.Historian.Port < 1 || config.Historian.Port > 65535)
{
Log.Error("Historian.Port must be between 1 and 65535");
valid = false;
}
if (!config.Historian.IntegratedSecurity && string.IsNullOrWhiteSpace(config.Historian.UserName))
{
Log.Error("Historian.UserName must not be empty when IntegratedSecurity is disabled");
valid = false;
}
if (!config.Historian.IntegratedSecurity && string.IsNullOrWhiteSpace(config.Historian.Password))
Log.Warning("Historian.Password is empty — authentication may fail");
}
// Authentication
Log.Information("Authentication.AllowAnonymous={AllowAnonymous}, AnonymousCanWrite={AnonymousCanWrite}",
config.Authentication.AllowAnonymous, config.Authentication.AnonymousCanWrite);
if (config.Authentication.Ldap.Enabled)
{
Log.Information("Authentication.Ldap.Enabled=true, Host={Host}, Port={Port}, BaseDN={BaseDN}",
config.Authentication.Ldap.Host, config.Authentication.Ldap.Port,
config.Authentication.Ldap.BaseDN);
Log.Information(
"Authentication.Ldap groups: ReadOnly={ReadOnly}, WriteOperate={WriteOperate}, WriteTune={WriteTune}, WriteConfigure={WriteConfigure}, AlarmAck={AlarmAck}",
config.Authentication.Ldap.ReadOnlyGroup, config.Authentication.Ldap.WriteOperateGroup,
config.Authentication.Ldap.WriteTuneGroup, config.Authentication.Ldap.WriteConfigureGroup,
config.Authentication.Ldap.AlarmAckGroup);
if (string.IsNullOrWhiteSpace(config.Authentication.Ldap.ServiceAccountDn))
Log.Warning("Authentication.Ldap.ServiceAccountDn is empty — group lookups will fail");
}
// Redundancy
if (config.OpcUa.ApplicationUri != null)
Log.Information("OpcUa.ApplicationUri={ApplicationUri}", config.OpcUa.ApplicationUri);
Log.Information(
"Redundancy.Enabled={Enabled}, Mode={Mode}, Role={Role}, ServiceLevelBase={ServiceLevelBase}",
config.Redundancy.Enabled, config.Redundancy.Mode, config.Redundancy.Role,
config.Redundancy.ServiceLevelBase);
if (config.Redundancy.ServerUris.Count > 0)
Log.Information("Redundancy.ServerUris=[{ServerUris}]",
string.Join(", ", config.Redundancy.ServerUris));
if (config.Redundancy.Enabled)
{
if (string.IsNullOrWhiteSpace(config.OpcUa.ApplicationUri))
{
Log.Error(
"OpcUa.ApplicationUri must be set when redundancy is enabled — each instance needs a unique identity");
valid = false;
}
if (config.Redundancy.ServerUris.Count < 2)
Log.Warning(
"Redundancy.ServerUris contains fewer than 2 entries — a redundant set typically has at least 2 servers");
if (config.OpcUa.ApplicationUri != null &&
!config.Redundancy.ServerUris.Contains(config.OpcUa.ApplicationUri))
Log.Warning("Local OpcUa.ApplicationUri '{ApplicationUri}' is not listed in Redundancy.ServerUris",
config.OpcUa.ApplicationUri);
var mode = RedundancyModeResolver.Resolve(config.Redundancy.Mode, true);
if (mode == RedundancySupport.None)
Log.Warning("Redundancy is enabled but Mode '{Mode}' is not recognized — will fall back to None",
config.Redundancy.Mode);
}
if (config.Redundancy.ServiceLevelBase < 1 || config.Redundancy.ServiceLevelBase > 255)
{
Log.Error("Redundancy.ServiceLevelBase must be between 1 and 255");
valid = false;
}
Log.Information("=== Configuration {Status} ===", valid ? "Valid" : "INVALID");
return valid;
}
private static string SanitizeConnectionString(string connectionString)
{
if (string.IsNullOrWhiteSpace(connectionString))
return "(empty)";
try
{
var builder = new SqlConnectionStringBuilder(connectionString);
if (!string.IsNullOrEmpty(builder.Password))
builder.Password = "********";
return builder.ConnectionString;
}
catch
{
return "(unparseable)";
}
}
}
}

View File

@@ -1,23 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Host.Configuration
{
/// <summary>
/// Status dashboard configuration. (SVC-003, DASH-001)
/// </summary>
public class DashboardConfiguration
{
/// <summary>
/// Gets or sets a value indicating whether the operator dashboard is hosted alongside the OPC UA service.
/// </summary>
public bool Enabled { get; set; } = true;
/// <summary>
/// Gets or sets the HTTP port used by the dashboard endpoint that exposes service health and rebuild state.
/// </summary>
public int Port { get; set; } = 8081;
/// <summary>
/// Gets or sets the refresh interval, in seconds, for recalculating the dashboard status snapshot.
/// </summary>
public int RefreshIntervalSeconds { get; set; } = 10;
}
}

View File

@@ -1,42 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Host.Configuration
{
/// <summary>
/// Galaxy repository database configuration. (SVC-003, GR-005)
/// </summary>
public class GalaxyRepositoryConfiguration
{
/// <summary>
/// Gets or sets the database connection string used to read Galaxy hierarchy and attribute metadata.
/// </summary>
public string ConnectionString { get; set; } = "Server=localhost;Database=ZB;Integrated Security=true;";
/// <summary>
/// Gets or sets how often, in seconds, the service polls for Galaxy deploy changes that require an address-space
/// rebuild.
/// </summary>
public int ChangeDetectionIntervalSeconds { get; set; } = 30;
/// <summary>
/// Gets or sets the SQL command timeout, in seconds, for repository queries against the Galaxy catalog.
/// </summary>
public int CommandTimeoutSeconds { get; set; } = 30;
/// <summary>
/// Gets or sets a value indicating whether extended Galaxy attribute metadata should be loaded into the OPC UA model.
/// </summary>
public bool ExtendedAttributes { get; set; } = false;
/// <summary>
/// Gets or sets the scope of Galaxy objects loaded into the OPC UA address space.
/// <c>Galaxy</c> loads all deployed objects (default). <c>LocalPlatform</c> loads only
/// objects hosted by the platform deployed on this machine.
/// </summary>
public GalaxyScope Scope { get; set; } = GalaxyScope.Galaxy;
/// <summary>
/// Gets or sets an explicit platform node name for <see cref="GalaxyScope.LocalPlatform" /> filtering.
/// When <see langword="null" />, the local machine name (<c>Environment.MachineName</c>) is used.
/// </summary>
public string? PlatformName { get; set; }
}
}

View File

@@ -1,18 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Host.Configuration
{
/// <summary>
/// Controls how much of the Galaxy object hierarchy is loaded into the OPC UA address space.
/// </summary>
public enum GalaxyScope
{
/// <summary>
/// Load all deployed objects from the entire Galaxy (default, backward-compatible behavior).
/// </summary>
Galaxy,
/// <summary>
/// Load only objects hosted by the local platform and the structural areas needed to reach them.
/// </summary>
LocalPlatform
}
}

View File

@@ -1,76 +0,0 @@
using System.Collections.Generic;
namespace ZB.MOM.WW.OtOpcUa.Host.Configuration
{
/// <summary>
/// Wonderware Historian SDK configuration for OPC UA historical data access.
/// </summary>
public class HistorianConfiguration
{
/// <summary>
/// Gets or sets a value indicating whether OPC UA historical data access is enabled.
/// </summary>
public bool Enabled { get; set; } = false;
/// <summary>
/// Gets or sets the single Historian server hostname used when <see cref="ServerNames"/>
/// is empty. Preserved for backward compatibility with pre-cluster deployments.
/// </summary>
public string ServerName { get; set; } = "localhost";
/// <summary>
/// Gets or sets the ordered list of Historian cluster nodes. When non-empty, this list
/// supersedes <see cref="ServerName"/>: the data source attempts each node in order on
/// connect, falling through to the next on failure. A failed node is placed in cooldown
/// for <see cref="FailureCooldownSeconds"/> before being re-eligible.
/// </summary>
public List<string> ServerNames { get; set; } = new();
/// <summary>
/// Gets or sets the cooldown window, in seconds, that a historian node is skipped after
/// a connection failure. A value of zero retries the node on every request. Default 60s.
/// </summary>
public int FailureCooldownSeconds { get; set; } = 60;
/// <summary>
/// Gets or sets a value indicating whether Windows Integrated Security is used.
/// When false, <see cref="UserName"/> and <see cref="Password"/> are used instead.
/// </summary>
public bool IntegratedSecurity { get; set; } = true;
/// <summary>
/// Gets or sets the username for Historian authentication when <see cref="IntegratedSecurity"/> is false.
/// </summary>
public string? UserName { get; set; }
/// <summary>
/// Gets or sets the password for Historian authentication when <see cref="IntegratedSecurity"/> is false.
/// </summary>
public string? Password { get; set; }
/// <summary>
/// Gets or sets the Historian server TCP port.
/// </summary>
public int Port { get; set; } = 32568;
/// <summary>
/// Gets or sets the packet timeout in seconds for Historian SDK operations.
/// </summary>
public int CommandTimeoutSeconds { get; set; } = 30;
/// <summary>
/// Gets or sets the maximum number of values returned per HistoryRead request.
/// </summary>
public int MaxValuesPerRead { get; set; } = 10000;
/// <summary>
/// Gets or sets an outer safety timeout, in seconds, applied to sync-over-async Historian
/// operations invoked from the OPC UA stack thread (HistoryReadRaw, HistoryReadProcessed,
/// HistoryReadAtTime, HistoryReadEvents). This is a backstop for the case where a
/// historian query hangs outside <see cref="CommandTimeoutSeconds"/> — e.g., a slow SDK
/// reconnect or mid-failover cluster node. Must be comfortably larger than
/// <see cref="CommandTimeoutSeconds"/> so normal operation is never affected. Default 60s.
/// </summary>
public int RequestTimeoutSeconds { get; set; } = 60;
}
}

View File

@@ -1,75 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Host.Configuration
{
/// <summary>
/// LDAP authentication and group-to-role mapping settings.
/// </summary>
public class LdapConfiguration
{
/// <summary>
/// Gets or sets whether LDAP authentication is enabled.
/// When true, user credentials are validated against the configured LDAP server
/// and group membership determines OPC UA permissions.
/// </summary>
public bool Enabled { get; set; } = false;
/// <summary>
/// Gets or sets the LDAP server hostname or IP address.
/// </summary>
public string Host { get; set; } = "localhost";
/// <summary>
/// Gets or sets the LDAP server port.
/// </summary>
public int Port { get; set; } = 3893;
/// <summary>
/// Gets or sets the base DN for LDAP operations.
/// </summary>
public string BaseDN { get; set; } = "dc=lmxopcua,dc=local";
/// <summary>
/// Gets or sets the bind DN template. Use {username} as a placeholder.
/// </summary>
public string BindDnTemplate { get; set; } = "cn={username},dc=lmxopcua,dc=local";
/// <summary>
/// Gets or sets the service account DN used for LDAP searches (group lookups).
/// </summary>
public string ServiceAccountDn { get; set; } = "";
/// <summary>
/// Gets or sets the service account password.
/// </summary>
public string ServiceAccountPassword { get; set; } = "";
/// <summary>
/// Gets or sets the LDAP connection timeout in seconds.
/// </summary>
public int TimeoutSeconds { get; set; } = 5;
/// <summary>
/// Gets or sets the LDAP group name that grants read-only access.
/// </summary>
public string ReadOnlyGroup { get; set; } = "ReadOnly";
/// <summary>
/// Gets or sets the LDAP group name that grants write access for FreeAccess/Operate attributes.
/// </summary>
public string WriteOperateGroup { get; set; } = "WriteOperate";
/// <summary>
/// Gets or sets the LDAP group name that grants write access for Tune attributes.
/// </summary>
public string WriteTuneGroup { get; set; } = "WriteTune";
/// <summary>
/// Gets or sets the LDAP group name that grants write access for Configure attributes.
/// </summary>
public string WriteConfigureGroup { get; set; } = "WriteConfigure";
/// <summary>
/// Gets or sets the LDAP group name that grants alarm acknowledgment access.
/// </summary>
public string AlarmAckGroup { get; set; } = "AlarmAck";
}
}

View File

@@ -1,86 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Host.Configuration
{
/// <summary>
/// MXAccess client configuration. (SVC-003, MXA-008, MXA-009)
/// </summary>
public class MxAccessConfiguration
{
/// <summary>
/// Gets or sets the client name registered with the MXAccess runtime for this bridge instance.
/// </summary>
public string ClientName { get; set; } = "LmxOpcUa";
/// <summary>
/// Gets or sets the Galaxy node name to target when the service connects to a specific runtime node.
/// </summary>
public string? NodeName { get; set; }
/// <summary>
/// Gets or sets the Galaxy name used when resolving MXAccess references and diagnostics.
/// </summary>
public string? GalaxyName { get; set; }
/// <summary>
/// Gets or sets the maximum time, in seconds, to wait for a live tag read to complete.
/// </summary>
public int ReadTimeoutSeconds { get; set; } = 5;
/// <summary>
/// Gets or sets the maximum time, in seconds, to wait for a tag write acknowledgment from the runtime.
/// </summary>
public int WriteTimeoutSeconds { get; set; } = 5;
/// <summary>
/// Gets or sets an outer safety timeout, in seconds, applied to sync-over-async MxAccess
/// operations invoked from the OPC UA stack thread (Read, Write, address-space rebuild probe
/// sync). This is a backstop for the case where an async path hangs outside the inner
/// <see cref="ReadTimeoutSeconds"/> / <see cref="WriteTimeoutSeconds"/> bounds — e.g., a
/// slow reconnect or a scheduler stall. Must be comfortably larger than the inner timeouts
/// so normal operation is never affected. Default 30s.
/// </summary>
public int RequestTimeoutSeconds { get; set; } = 30;
/// <summary>
/// Gets or sets the cap on concurrent MXAccess operations so the bridge does not overload the runtime.
/// </summary>
public int MaxConcurrentOperations { get; set; } = 10;
/// <summary>
/// Gets or sets how often, in seconds, the connectivity monitor probes the runtime connection.
/// </summary>
public int MonitorIntervalSeconds { get; set; } = 5;
/// <summary>
/// Gets or sets a value indicating whether the bridge should automatically attempt to re-establish a dropped MXAccess
/// session.
/// </summary>
public bool AutoReconnect { get; set; } = true;
/// <summary>
/// Gets or sets the optional probe tag used to verify that the MXAccess runtime is still returning fresh data.
/// </summary>
public string? ProbeTag { get; set; }
/// <summary>
/// Gets or sets the number of seconds a probe value may remain unchanged before the connection is considered stale.
/// </summary>
public int ProbeStaleThresholdSeconds { get; set; } = 60;
/// <summary>
/// Gets or sets a value indicating whether the bridge advises <c>&lt;ObjectName&gt;.ScanState</c> for every
/// deployed <c>$WinPlatform</c> and <c>$AppEngine</c>, reporting per-host runtime state on the status
/// dashboard and proactively invalidating OPC UA variable quality when a host transitions to Stopped.
/// Enabled by default. Disable to return to legacy behavior where host runtime state is invisible and
/// MxAccess's per-tag bad-quality fan-out is the only stop signal.
/// </summary>
public bool RuntimeStatusProbesEnabled { get; set; } = true;
/// <summary>
/// Gets or sets the maximum seconds to wait for the initial probe callback before marking a host as
/// Stopped. Only applies to the Unknown → Stopped transition. Because <c>ScanState</c> is delivered
/// on-change only, a stably Running host does not time out — no starvation check runs on Running
/// entries. Default 15s.
/// </summary>
public int RuntimeStatusUnknownTimeoutSeconds { get; set; } = 15;
}
}

View File

@@ -1,64 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Host.Configuration
{
/// <summary>
/// OPC UA server configuration. (SVC-003, OPC-001, OPC-012, OPC-013)
/// </summary>
public class OpcUaConfiguration
{
/// <summary>
/// Gets or sets the IP address or hostname the OPC UA server binds to.
/// Defaults to <c>0.0.0.0</c> (all interfaces). Set to a specific IP or hostname to restrict listening.
/// </summary>
public string BindAddress { get; set; } = "0.0.0.0";
/// <summary>
/// Gets or sets the TCP port on which the OPC UA server listens for client sessions.
/// </summary>
public int Port { get; set; } = 4840;
/// <summary>
/// Gets or sets the endpoint path appended to the host URI for the LMX OPC UA server.
/// </summary>
public string EndpointPath { get; set; } = "/LmxOpcUa";
/// <summary>
/// Gets or sets the server name presented to OPC UA clients and used in diagnostics.
/// </summary>
public string ServerName { get; set; } = "LmxOpcUa";
/// <summary>
/// Gets or sets the Galaxy name represented by the published OPC UA namespace.
/// </summary>
public string GalaxyName { get; set; } = "ZB";
/// <summary>
/// Gets or sets the explicit application URI for this server instance.
/// When <see langword="null" />, defaults to <c>urn:{GalaxyName}:LmxOpcUa</c>.
/// Must be set to a unique value per instance when redundancy is enabled.
/// </summary>
public string? ApplicationUri { get; set; }
/// <summary>
/// Gets or sets the maximum number of simultaneous OPC UA sessions accepted by the host.
/// </summary>
public int MaxSessions { get; set; } = 100;
/// <summary>
/// Gets or sets the session timeout, in minutes, before idle client sessions are closed.
/// </summary>
public int SessionTimeoutMinutes { get; set; } = 30;
/// <summary>
/// Gets or sets a value indicating whether alarm tracking is enabled.
/// When enabled, AlarmConditionState nodes are created for alarm attributes and InAlarm transitions are monitored.
/// </summary>
public bool AlarmTrackingEnabled { get; set; } = false;
/// <summary>
/// Gets or sets the template-based alarm object filter. When <see cref="AlarmFilterConfiguration.ObjectFilters"/>
/// is empty, all alarm-bearing objects are monitored (current behavior). When patterns are supplied, only
/// objects whose template derivation chain matches a pattern (and their descendants) have alarms monitored.
/// </summary>
public AlarmFilterConfiguration AlarmFilter { get; set; } = new();
}
}

View File

@@ -1,41 +0,0 @@
using System.Collections.Generic;
namespace ZB.MOM.WW.OtOpcUa.Host.Configuration
{
/// <summary>
/// Non-transparent redundancy settings that control how the server advertises itself
/// within a redundant pair and computes its dynamic ServiceLevel.
/// </summary>
public class RedundancyConfiguration
{
/// <summary>
/// Gets or sets whether redundancy is enabled. When <see langword="false" /> (default),
/// the server reports <c>RedundancySupport.None</c> and <c>ServiceLevel = 255</c>.
/// </summary>
public bool Enabled { get; set; } = false;
/// <summary>
/// Gets or sets the redundancy mode. Valid values: <c>Warm</c>, <c>Hot</c>.
/// </summary>
public string Mode { get; set; } = "Warm";
/// <summary>
/// Gets or sets the role of this instance. Valid values: <c>Primary</c>, <c>Secondary</c>.
/// The primary advertises a higher ServiceLevel than the secondary when both are healthy.
/// </summary>
public string Role { get; set; } = "Primary";
/// <summary>
/// Gets or sets the ApplicationUri values for all servers in the redundant set.
/// Must include this instance's own <c>OpcUa.ApplicationUri</c>.
/// </summary>
public List<string> ServerUris { get; set; } = new();
/// <summary>
/// Gets or sets the base ServiceLevel when the server is fully healthy.
/// The secondary automatically receives <c>ServiceLevelBase - 50</c>.
/// Valid range: 1-255.
/// </summary>
public int ServiceLevelBase { get; set; } = 200;
}
}

View File

@@ -1,52 +0,0 @@
using System.Collections.Generic;
namespace ZB.MOM.WW.OtOpcUa.Host.Configuration
{
/// <summary>
/// Transport security settings that control which OPC UA security profiles the server exposes and how client
/// certificates are handled.
/// </summary>
public class SecurityProfileConfiguration
{
/// <summary>
/// Gets or sets the list of security profile names to expose as server endpoints.
/// Valid values: "None", "Basic256Sha256-Sign", "Basic256Sha256-SignAndEncrypt".
/// Defaults to ["None"] for backward compatibility.
/// </summary>
public List<string> Profiles { get; set; } = new() { "None" };
/// <summary>
/// Gets or sets a value indicating whether the server automatically accepts client certificates
/// that are not in the trusted store. Should be <see langword="false" /> in production.
/// </summary>
public bool AutoAcceptClientCertificates { get; set; } = true;
/// <summary>
/// Gets or sets a value indicating whether client certificates signed with SHA-1 are rejected.
/// </summary>
public bool RejectSHA1Certificates { get; set; } = true;
/// <summary>
/// Gets or sets the minimum RSA key size required for client certificates.
/// </summary>
public int MinimumCertificateKeySize { get; set; } = 2048;
/// <summary>
/// Gets or sets an optional override for the PKI root directory.
/// When <see langword="null" />, defaults to <c>%LOCALAPPDATA%\OPC Foundation\pki</c>.
/// </summary>
public string? PkiRootPath { get; set; }
/// <summary>
/// Gets or sets an optional override for the server certificate subject name.
/// When <see langword="null" />, defaults to <c>CN={ServerName}, O=ZB MOM, DC=localhost</c>.
/// </summary>
public string? CertificateSubject { get; set; }
/// <summary>
/// Gets or sets the lifetime of the auto-generated server certificate in months.
/// Defaults to 60 months (5 years).
/// </summary>
public int CertificateLifetimeMonths { get; set; } = 60;
}
}

View File

@@ -1,215 +0,0 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Host.Configuration;
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// Compiles and applies wildcard template patterns against Galaxy objects to decide which
/// objects should contribute alarm conditions. The filter is pure data — no OPC UA, no DB —
/// so it is fully unit-testable with synthetic hierarchies.
/// </summary>
/// <remarks>
/// <para>Matching rules:</para>
/// <list type="bullet">
/// <item>An object is included when any template name in its derivation chain matches
/// any configured pattern.</item>
/// <item>Matching is case-insensitive and ignores the Galaxy leading <c>$</c> prefix on
/// both the chain entry and the user pattern, so <c>TestMachine*</c> matches the stored
/// <c>$TestMachine</c>.</item>
/// <item>Inclusion propagates to every descendant of a matched object (containment subtree).</item>
/// <item>Each object is evaluated once — overlapping matches never produce duplicate
/// inclusions (set semantics).</item>
/// </list>
/// <para>Pattern syntax: literal text plus <c>*</c> wildcards (zero or more characters).
/// Other regex metacharacters in the raw pattern are escaped and treated literally.</para>
/// </remarks>
public class AlarmObjectFilter
{
private static readonly ILogger Log = Serilog.Log.ForContext<AlarmObjectFilter>();
private readonly List<Regex> _patterns;
private readonly List<string> _rawPatterns;
private readonly HashSet<string> _matchedRawPatterns;
/// <summary>
/// Initializes a new alarm object filter from the supplied configuration section.
/// </summary>
/// <param name="config">The alarm filter configuration whose <see cref="AlarmFilterConfiguration.ObjectFilters"/>
/// entries are parsed into regular expressions. Entries may themselves contain comma-separated patterns.</param>
public AlarmObjectFilter(AlarmFilterConfiguration? config)
{
_patterns = new List<Regex>();
_rawPatterns = new List<string>();
_matchedRawPatterns = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
if (config?.ObjectFilters == null)
return;
foreach (var entry in config.ObjectFilters)
{
if (string.IsNullOrWhiteSpace(entry))
continue;
foreach (var piece in entry.Split(','))
{
var trimmed = piece.Trim();
if (trimmed.Length == 0)
continue;
try
{
var normalized = Normalize(trimmed);
var regex = GlobToRegex(normalized);
_patterns.Add(regex);
_rawPatterns.Add(trimmed);
}
catch (Exception ex)
{
Log.Warning(ex, "Failed to compile alarm filter pattern {Pattern} — skipping", trimmed);
}
}
}
}
/// <summary>
/// Gets a value indicating whether the filter has any compiled patterns. When <see langword="false"/>,
/// callers should treat alarm tracking as unfiltered (current behavior preserved).
/// </summary>
public bool Enabled => _patterns.Count > 0;
/// <summary>
/// Gets the number of compiled patterns the filter will evaluate against each object.
/// </summary>
public int PatternCount => _patterns.Count;
/// <summary>
/// Gets the raw pattern strings that did not match any object in the most recent call to
/// <see cref="ResolveIncludedObjects"/>. Useful for startup warnings about operator typos.
/// </summary>
public IReadOnlyList<string> UnmatchedPatterns =>
_rawPatterns.Where(p => !_matchedRawPatterns.Contains(p)).ToList();
/// <summary>
/// Gets the raw pattern strings exactly as supplied by the operator after comma-splitting
/// and trimming. Surfaced on the status dashboard so operators can confirm the active filter.
/// </summary>
public IReadOnlyList<string> RawPatterns => _rawPatterns;
/// <summary>
/// Returns <see langword="true"/> when any template name in <paramref name="chain"/> matches any
/// compiled pattern. An empty chain never matches unless the operator explicitly supplied a pattern
/// equal to <c>*</c> (which collapses to an empty-matching regex after normalization).
/// </summary>
/// <param name="chain">The template derivation chain to test (own template first, ancestors after).</param>
public bool MatchesTemplateChain(IReadOnlyList<string>? chain)
{
if (chain == null || chain.Count == 0 || _patterns.Count == 0)
return false;
for (var i = 0; i < _patterns.Count; i++)
{
var regex = _patterns[i];
for (var j = 0; j < chain.Count; j++)
{
var entry = chain[j];
if (string.IsNullOrEmpty(entry))
continue;
if (regex.IsMatch(Normalize(entry)))
{
_matchedRawPatterns.Add(_rawPatterns[i]);
return true;
}
}
}
return false;
}
/// <summary>
/// Walks the hierarchy top-down from each root and returns the set of gobject IDs whose alarms
/// should be monitored, honoring both template matching and descendant propagation. Returns
/// <see langword="null"/> when the filter is disabled so callers can skip the containment check
/// entirely.
/// </summary>
/// <param name="hierarchy">The full deployed Galaxy hierarchy, as returned by the repository service.</param>
/// <returns>The set of included gobject IDs, or <see langword="null"/> when filtering is disabled.</returns>
public HashSet<int>? ResolveIncludedObjects(IReadOnlyList<GalaxyObjectInfo>? hierarchy)
{
if (!Enabled)
return null;
_matchedRawPatterns.Clear();
var included = new HashSet<int>();
if (hierarchy == null || hierarchy.Count == 0)
return included;
var byId = new Dictionary<int, GalaxyObjectInfo>(hierarchy.Count);
foreach (var obj in hierarchy)
byId[obj.GobjectId] = obj;
var childrenByParent = new Dictionary<int, List<int>>();
foreach (var obj in hierarchy)
{
var parentId = obj.ParentGobjectId;
if (parentId != 0 && !byId.ContainsKey(parentId))
parentId = 0; // orphan → treat as root
if (!childrenByParent.TryGetValue(parentId, out var list))
{
list = new List<int>();
childrenByParent[parentId] = list;
}
list.Add(obj.GobjectId);
}
var roots = childrenByParent.TryGetValue(0, out var rootList)
? rootList
: new List<int>();
var visited = new HashSet<int>();
var queue = new Queue<(int Id, bool ParentIncluded)>();
foreach (var rootId in roots)
queue.Enqueue((rootId, false));
while (queue.Count > 0)
{
var (id, parentIncluded) = queue.Dequeue();
if (!visited.Add(id))
continue; // cycle defense
if (!byId.TryGetValue(id, out var obj))
continue;
var nodeIncluded = parentIncluded || MatchesTemplateChain(obj.TemplateChain);
if (nodeIncluded)
included.Add(id);
if (childrenByParent.TryGetValue(id, out var children))
foreach (var childId in children)
queue.Enqueue((childId, nodeIncluded));
}
return included;
}
private static Regex GlobToRegex(string normalized)
{
var segments = normalized.Split('*');
var parts = segments.Select(Regex.Escape);
var body = string.Join(".*", parts);
return new Regex("^" + body + "$",
RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled);
}
private static string Normalize(string value)
{
var trimmed = value.Trim();
if (trimmed.StartsWith("$", StringComparison.Ordinal))
return trimmed.Substring(1);
return trimmed;
}
}
}

View File

@@ -1,38 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// MXAccess connection lifecycle states. (MXA-002)
/// </summary>
public enum ConnectionState
{
/// <summary>
/// No active session exists to the Galaxy runtime.
/// </summary>
Disconnected,
/// <summary>
/// The bridge is opening a new MXAccess session to the runtime.
/// </summary>
Connecting,
/// <summary>
/// The bridge has an active MXAccess session and can service reads, writes, and subscriptions.
/// </summary>
Connected,
/// <summary>
/// The bridge is closing the current MXAccess session and draining runtime resources.
/// </summary>
Disconnecting,
/// <summary>
/// The bridge detected a connection fault that requires operator attention or recovery logic.
/// </summary>
Error,
/// <summary>
/// The bridge is attempting to restore service after a runtime communication failure.
/// </summary>
Reconnecting
}
}

View File

@@ -1,38 +0,0 @@
using System;
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// Event args for connection state transitions. (MXA-002)
/// </summary>
public class ConnectionStateChangedEventArgs : EventArgs
{
/// <summary>
/// Initializes a new instance of the <see cref="ConnectionStateChangedEventArgs" /> class.
/// </summary>
/// <param name="previous">The connection state being exited.</param>
/// <param name="current">The connection state being entered.</param>
/// <param name="message">Additional context about the transition, such as a connection fault or reconnect attempt.</param>
public ConnectionStateChangedEventArgs(ConnectionState previous, ConnectionState current, string message = "")
{
PreviousState = previous;
CurrentState = current;
Message = message ?? "";
}
/// <summary>
/// Gets the previous MXAccess connection state before the transition was raised.
/// </summary>
public ConnectionState PreviousState { get; }
/// <summary>
/// Gets the new MXAccess connection state that the bridge moved into.
/// </summary>
public ConnectionState CurrentState { get; }
/// <summary>
/// Gets an operator-facing message that explains why the connection state changed.
/// </summary>
public string Message { get; }
}
}

View File

@@ -1,76 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// DTO matching attributes.sql result columns. (GR-002)
/// </summary>
public class GalaxyAttributeInfo
{
/// <summary>
/// Gets or sets the Galaxy object identifier that owns the attribute.
/// </summary>
public int GobjectId { get; set; }
/// <summary>
/// Gets or sets the Wonderware tag name used to associate the attribute with its runtime object.
/// </summary>
public string TagName { get; set; } = "";
/// <summary>
/// Gets or sets the attribute name as defined on the Galaxy template or instance.
/// </summary>
public string AttributeName { get; set; } = "";
/// <summary>
/// Gets or sets the fully qualified MXAccess reference used for runtime reads and writes.
/// </summary>
public string FullTagReference { get; set; } = "";
/// <summary>
/// Gets or sets the numeric Galaxy data type code used to map the attribute into OPC UA.
/// </summary>
public int MxDataType { get; set; }
/// <summary>
/// Gets or sets the human-readable Galaxy data type name returned by the repository query.
/// </summary>
public string DataTypeName { get; set; } = "";
/// <summary>
/// Gets or sets a value indicating whether the attribute is an array and should be exposed as a collection node.
/// </summary>
public bool IsArray { get; set; }
/// <summary>
/// Gets or sets the array length when the Galaxy attribute is modeled as a fixed-size array.
/// </summary>
public int? ArrayDimension { get; set; }
/// <summary>
/// Gets or sets the primitive data type name used when flattening the attribute for OPC UA clients.
/// </summary>
public string PrimitiveName { get; set; } = "";
/// <summary>
/// Gets or sets the source classification that explains whether the attribute comes from configuration, calculation,
/// or runtime data.
/// </summary>
public string AttributeSource { get; set; } = "";
/// <summary>
/// Gets or sets the Galaxy security classification that determines OPC UA write access.
/// 0=FreeAccess, 1=Operate (default), 2=SecuredWrite, 3=VerifiedWrite, 4=Tune, 5=Configure, 6=ViewOnly.
/// </summary>
public int SecurityClassification { get; set; } = 1;
/// <summary>
/// Gets or sets a value indicating whether the attribute has a HistoryExtension primitive and is historized by the
/// Wonderware Historian.
/// </summary>
public bool IsHistorized { get; set; }
/// <summary>
/// Gets or sets a value indicating whether the attribute has an AlarmExtension primitive and is an alarm.
/// </summary>
public bool IsAlarm { get; set; }
}
}

View File

@@ -1,64 +0,0 @@
using System.Collections.Generic;
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// DTO matching hierarchy.sql result columns. (GR-001)
/// </summary>
public class GalaxyObjectInfo
{
/// <summary>
/// Gets or sets the Galaxy object identifier used to connect hierarchy rows to attribute rows.
/// </summary>
public int GobjectId { get; set; }
/// <summary>
/// Gets or sets the runtime tag name for the Galaxy object represented in the OPC UA tree.
/// </summary>
public string TagName { get; set; } = "";
/// <summary>
/// Gets or sets the contained name shown for the object inside its parent area or object.
/// </summary>
public string ContainedName { get; set; } = "";
/// <summary>
/// Gets or sets the browse name emitted into OPC UA so clients can navigate the Galaxy hierarchy.
/// </summary>
public string BrowseName { get; set; } = "";
/// <summary>
/// Gets or sets the parent Galaxy object identifier that establishes the hierarchy relationship.
/// </summary>
public int ParentGobjectId { get; set; }
/// <summary>
/// Gets or sets a value indicating whether the row represents a Galaxy area rather than a contained object.
/// </summary>
public bool IsArea { get; set; }
/// <summary>
/// Gets or sets the template derivation chain for this object. Index 0 is the object's own template;
/// subsequent entries walk up toward the most ancestral template before <c>$Object</c>. Populated by
/// the recursive CTE in <c>hierarchy.sql</c> on <c>gobject.derived_from_gobject_id</c>. Used by
/// <see cref="AlarmObjectFilter"/> to decide whether an object's alarms should be monitored.
/// </summary>
public List<string> TemplateChain { get; set; } = new();
/// <summary>
/// Gets or sets the Galaxy template category id for this object. Category 1 is $WinPlatform,
/// 3 is $AppEngine, 13 is $Area, 10 is $UserDefined, and so on. Populated from
/// <c>template_definition.category_id</c> by <c>hierarchy.sql</c> and consumed by the runtime
/// status probe manager to identify hosts that should receive a <c>ScanState</c> probe.
/// </summary>
public int CategoryId { get; set; }
/// <summary>
/// Gets or sets the Galaxy object id of this object's runtime host, populated from
/// <c>gobject.hosted_by_gobject_id</c>. Walk this chain upward to find the nearest
/// <c>$WinPlatform</c> or <c>$AppEngine</c> ancestor for subtree quality invalidation when
/// a runtime host is reported Stopped. Zero for root objects that have no host.
/// </summary>
public int HostedByGobjectId { get; set; }
}
}

View File

@@ -1,29 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// Runtime state of a deployed Galaxy runtime host ($WinPlatform or $AppEngine) as
/// observed by the bridge via its <c>ScanState</c> probe.
/// </summary>
public enum GalaxyRuntimeState
{
/// <summary>
/// Probe advised but no callback received yet. Transitions to <see cref="Running"/>
/// on the first successful <c>ScanState = true</c> callback, or to <see cref="Stopped"/>
/// once the unknown-resolution timeout elapses.
/// </summary>
Unknown,
/// <summary>
/// Last probe callback reported <c>ScanState = true</c> with a successful item status.
/// The host is on scan and executing.
/// </summary>
Running,
/// <summary>
/// Last probe callback reported <c>ScanState != true</c>, or a failed item status, or
/// the initial probe never resolved before the unknown timeout elapsed. The host is
/// off scan or unreachable.
/// </summary>
Stopped
}
}

View File

@@ -1,72 +0,0 @@
using System;
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// Point-in-time runtime state of a single Galaxy runtime host ($WinPlatform or $AppEngine)
/// as tracked by the <c>GalaxyRuntimeProbeManager</c>. Surfaced on the status dashboard and
/// consumed by <c>HealthCheckService</c> so operators can detect a stopped host before
/// downstream clients notice the stale data.
/// </summary>
public sealed class GalaxyRuntimeStatus
{
/// <summary>
/// Gets or sets the Galaxy tag_name of the host (e.g., <c>DevPlatform</c> or
/// <c>DevAppEngine</c>).
/// </summary>
public string ObjectName { get; set; } = "";
/// <summary>
/// Gets or sets the Galaxy gobject_id of the host.
/// </summary>
public int GobjectId { get; set; }
/// <summary>
/// Gets or sets the Galaxy template category name — <c>$WinPlatform</c> or
/// <c>$AppEngine</c>. Used by the dashboard to group hosts by kind.
/// </summary>
public string Kind { get; set; } = "";
/// <summary>
/// Gets or sets the current runtime state.
/// </summary>
public GalaxyRuntimeState State { get; set; }
/// <summary>
/// Gets or sets the UTC timestamp of the most recent probe callback, whether it
/// reported success or failure. <see langword="null"/> before the first callback.
/// </summary>
public DateTime? LastStateCallbackTime { get; set; }
/// <summary>
/// Gets or sets the UTC timestamp of the most recent <see cref="State"/> transition.
/// Backs the dashboard "Since" column. <see langword="null"/> in the initial Unknown
/// state before any transition.
/// </summary>
public DateTime? LastStateChangeTime { get; set; }
/// <summary>
/// Gets or sets the last <c>ScanState</c> value received from the probe, or
/// <see langword="null"/> before the first update or when the last callback carried
/// a non-success item status (no value delivered).
/// </summary>
public bool? LastScanState { get; set; }
/// <summary>
/// Gets or sets the detail message from the most recent failure callback, cleared on
/// the next successful <c>ScanState = true</c> delivery.
/// </summary>
public string? LastError { get; set; }
/// <summary>
/// Gets or sets the cumulative number of callbacks where <c>ScanState = true</c>.
/// </summary>
public long GoodUpdateCount { get; set; }
/// <summary>
/// Gets or sets the cumulative number of callbacks where <c>ScanState != true</c>
/// or the item status reported failure.
/// </summary>
public long FailureCount { get; set; }
}
}

View File

@@ -1,46 +0,0 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// Interface for Galaxy repository database queries. (GR-001 through GR-004)
/// </summary>
public interface IGalaxyRepository
{
/// <summary>
/// Retrieves the Galaxy object hierarchy used to construct the OPC UA browse tree.
/// </summary>
/// <param name="ct">A token that cancels the repository query.</param>
/// <returns>A list of Galaxy objects ordered for address-space construction.</returns>
Task<List<GalaxyObjectInfo>> GetHierarchyAsync(CancellationToken ct = default);
/// <summary>
/// Retrieves the Galaxy attributes that become OPC UA variables under the object hierarchy.
/// </summary>
/// <param name="ct">A token that cancels the repository query.</param>
/// <returns>A list of attribute definitions with MXAccess references and type metadata.</returns>
Task<List<GalaxyAttributeInfo>> GetAttributesAsync(CancellationToken ct = default);
/// <summary>
/// Gets the last Galaxy deploy timestamp used to detect metadata changes that require an address-space rebuild.
/// </summary>
/// <param name="ct">A token that cancels the repository query.</param>
/// <returns>The latest deploy timestamp, or <see langword="null" /> when it cannot be determined.</returns>
Task<DateTime?> GetLastDeployTimeAsync(CancellationToken ct = default);
/// <summary>
/// Verifies that the service can reach the Galaxy repository before it attempts to build the address space.
/// </summary>
/// <param name="ct">A token that cancels the connectivity check.</param>
/// <returns><see langword="true" /> when repository access succeeds; otherwise, <see langword="false" />.</returns>
Task<bool> TestConnectionAsync(CancellationToken ct = default);
/// <summary>
/// Occurs when the repository detects a Galaxy deployment change that should trigger an OPC UA rebuild.
/// </summary>
event Action? OnGalaxyChanged;
}
}

View File

@@ -1,79 +0,0 @@
using System;
using System.Threading;
using System.Threading.Tasks;
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// Abstraction over MXAccess COM client for tag read/write/subscribe operations.
/// (MXA-001 through MXA-009, OPC-007, OPC-008, OPC-009)
/// </summary>
public interface IMxAccessClient : IDisposable
{
/// <summary>
/// Gets the current runtime connectivity state for the bridge.
/// </summary>
ConnectionState State { get; }
/// <summary>
/// Gets the number of active runtime subscriptions currently being mirrored into OPC UA.
/// </summary>
int ActiveSubscriptionCount { get; }
/// <summary>
/// Gets the number of reconnect cycles attempted since the client was created.
/// </summary>
int ReconnectCount { get; }
/// <summary>
/// Occurs when the MXAccess session changes state so the host can update diagnostics and retry logic.
/// </summary>
event EventHandler<ConnectionStateChangedEventArgs>? ConnectionStateChanged;
/// <summary>
/// Occurs when a subscribed Galaxy attribute publishes a new runtime value.
/// </summary>
event Action<string, Vtq>? OnTagValueChanged;
/// <summary>
/// Opens the MXAccess session required for runtime reads, writes, and subscriptions.
/// </summary>
/// <param name="ct">A token that cancels the connection attempt.</param>
Task ConnectAsync(CancellationToken ct = default);
/// <summary>
/// Closes the MXAccess session and releases runtime resources.
/// </summary>
Task DisconnectAsync();
/// <summary>
/// Starts monitoring a Galaxy attribute so value changes can be pushed to OPC UA subscribers.
/// </summary>
/// <param name="fullTagReference">The fully qualified MXAccess reference for the target attribute.</param>
/// <param name="callback">The callback to invoke when the runtime publishes a new value for the attribute.</param>
Task SubscribeAsync(string fullTagReference, Action<string, Vtq> callback);
/// <summary>
/// Stops monitoring a Galaxy attribute when it is no longer needed by the OPC UA layer.
/// </summary>
/// <param name="fullTagReference">The fully qualified MXAccess reference for the target attribute.</param>
Task UnsubscribeAsync(string fullTagReference);
/// <summary>
/// Reads the current runtime value for a Galaxy attribute.
/// </summary>
/// <param name="fullTagReference">The fully qualified MXAccess reference for the target attribute.</param>
/// <param name="ct">A token that cancels the read.</param>
/// <returns>The value, timestamp, and quality returned by the runtime.</returns>
Task<Vtq> ReadAsync(string fullTagReference, CancellationToken ct = default);
/// <summary>
/// Writes a new runtime value to a writable Galaxy attribute.
/// </summary>
/// <param name="fullTagReference">The fully qualified MXAccess reference for the target attribute.</param>
/// <param name="value">The value to write to the runtime.</param>
/// <param name="ct">A token that cancels the write.</param>
/// <returns><see langword="true" /> when the write is accepted by the runtime; otherwise, <see langword="false" />.</returns>
Task<bool> WriteAsync(string fullTagReference, object value, CancellationToken ct = default);
}
}

View File

@@ -1,99 +0,0 @@
using ArchestrA.MxAccess;
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// Delegate matching LMXProxyServer.OnDataChange COM event signature.
/// </summary>
/// <param name="hLMXServerHandle">The runtime connection handle that raised the change.</param>
/// <param name="phItemHandle">The runtime item handle for the attribute that changed.</param>
/// <param name="pvItemValue">The new raw runtime value for the attribute.</param>
/// <param name="pwItemQuality">The OPC DA quality code supplied by the runtime.</param>
/// <param name="pftItemTimeStamp">The timestamp object supplied by the runtime for the value.</param>
/// <param name="ItemStatus">The MXAccess status payload associated with the callback.</param>
public delegate void MxDataChangeHandler(
int hLMXServerHandle,
int phItemHandle,
object pvItemValue,
int pwItemQuality,
object pftItemTimeStamp,
ref MXSTATUS_PROXY[] ItemStatus);
/// <summary>
/// Delegate matching LMXProxyServer.OnWriteComplete COM event signature.
/// </summary>
/// <param name="hLMXServerHandle">The runtime connection handle that processed the write.</param>
/// <param name="phItemHandle">The runtime item handle that was written.</param>
/// <param name="ItemStatus">The MXAccess status payload describing the write outcome.</param>
public delegate void MxWriteCompleteHandler(
int hLMXServerHandle,
int phItemHandle,
ref MXSTATUS_PROXY[] ItemStatus);
/// <summary>
/// Abstraction over LMXProxyServer COM object to enable testing without the COM runtime. (MXA-001)
/// </summary>
public interface IMxProxy
{
/// <summary>
/// Registers the bridge as an MXAccess client with the runtime proxy.
/// </summary>
/// <param name="clientName">The client identity reported to the runtime for diagnostics and session tracking.</param>
/// <returns>The runtime connection handle assigned to the client session.</returns>
int Register(string clientName);
/// <summary>
/// Unregisters the bridge from the runtime proxy and releases the connection handle.
/// </summary>
/// <param name="handle">The connection handle returned by <see cref="Register(string)" />.</param>
void Unregister(int handle);
/// <summary>
/// Adds a Galaxy attribute reference to the active runtime session.
/// </summary>
/// <param name="handle">The runtime connection handle.</param>
/// <param name="address">The fully qualified attribute reference to resolve.</param>
/// <returns>The runtime item handle assigned to the attribute.</returns>
int AddItem(int handle, string address);
/// <summary>
/// Removes a previously registered attribute from the runtime session.
/// </summary>
/// <param name="handle">The runtime connection handle.</param>
/// <param name="itemHandle">The item handle returned by <see cref="AddItem(int, string)" />.</param>
void RemoveItem(int handle, int itemHandle);
/// <summary>
/// Starts supervisory updates for an attribute so runtime changes are pushed to the bridge.
/// </summary>
/// <param name="handle">The runtime connection handle.</param>
/// <param name="itemHandle">The item handle to monitor.</param>
void AdviseSupervisory(int handle, int itemHandle);
/// <summary>
/// Stops supervisory updates for an attribute.
/// </summary>
/// <param name="handle">The runtime connection handle.</param>
/// <param name="itemHandle">The item handle to stop monitoring.</param>
void UnAdviseSupervisory(int handle, int itemHandle);
/// <summary>
/// Writes a new value to a runtime attribute through the COM proxy.
/// </summary>
/// <param name="handle">The runtime connection handle.</param>
/// <param name="itemHandle">The item handle to write.</param>
/// <param name="value">The new value to push into the runtime.</param>
/// <param name="securityClassification">The Wonderware security classification applied to the write.</param>
void Write(int handle, int itemHandle, object value, int securityClassification);
/// <summary>
/// Occurs when the runtime pushes a data-change callback for a subscribed attribute.
/// </summary>
event MxDataChangeHandler? OnDataChange;
/// <summary>
/// Occurs when the runtime acknowledges completion of a write request.
/// </summary>
event MxWriteCompleteHandler? OnWriteComplete;
}
}

View File

@@ -1,41 +0,0 @@
using System.Collections.Generic;
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// Pluggable interface for validating user credentials. Implement for different backing stores (config file, LDAP,
/// etc.).
/// </summary>
public interface IUserAuthenticationProvider
{
/// <summary>
/// Validates a username/password combination.
/// </summary>
bool ValidateCredentials(string username, string password);
}
/// <summary>
/// Extended interface for providers that can resolve application-level roles for authenticated users.
/// When the auth provider implements this interface, OnImpersonateUser uses the returned roles
/// to control write and alarm-ack permissions.
/// </summary>
public interface IRoleProvider
{
/// <summary>
/// Returns the set of application-level roles granted to the user.
/// </summary>
IReadOnlyList<string> GetUserRoles(string username);
}
/// <summary>
/// Well-known application-level role names used for permission enforcement.
/// </summary>
public static class AppRoles
{
public const string ReadOnly = "ReadOnly";
public const string WriteOperate = "WriteOperate";
public const string WriteTune = "WriteTune";
public const string WriteConfigure = "WriteConfigure";
public const string AlarmAck = "AlarmAck";
}
}

View File

@@ -1,148 +0,0 @@
using System;
using System.Collections.Generic;
using System.DirectoryServices.Protocols;
using System.Net;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Host.Configuration;
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// Validates credentials via LDAP bind and resolves group membership to application roles.
/// </summary>
public class LdapAuthenticationProvider : IUserAuthenticationProvider, IRoleProvider
{
private static readonly ILogger Log = Serilog.Log.ForContext<LdapAuthenticationProvider>();
private readonly LdapConfiguration _config;
private readonly Dictionary<string, string> _groupToRole;
public LdapAuthenticationProvider(LdapConfiguration config)
{
_config = config;
_groupToRole = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
{ config.ReadOnlyGroup, AppRoles.ReadOnly },
{ config.WriteOperateGroup, AppRoles.WriteOperate },
{ config.WriteTuneGroup, AppRoles.WriteTune },
{ config.WriteConfigureGroup, AppRoles.WriteConfigure },
{ config.AlarmAckGroup, AppRoles.AlarmAck }
};
}
public IReadOnlyList<string> GetUserRoles(string username)
{
try
{
using (var connection = CreateConnection())
{
// Bind with service account to search
connection.Bind(new NetworkCredential(_config.ServiceAccountDn, _config.ServiceAccountPassword));
var request = new SearchRequest(
_config.BaseDN,
$"(cn={EscapeLdapFilter(username)})",
SearchScope.Subtree,
"memberOf");
var response = (SearchResponse)connection.SendRequest(request);
if (response.Entries.Count == 0)
{
Log.Warning("LDAP search returned no entries for {Username}", username);
return new[] { AppRoles.ReadOnly }; // safe fallback
}
var entry = response.Entries[0];
var memberOf = entry.Attributes["memberOf"];
if (memberOf == null || memberOf.Count == 0)
{
Log.Debug("No memberOf attributes for {Username}, defaulting to ReadOnly", username);
return new[] { AppRoles.ReadOnly };
}
var roles = new List<string>();
for (var i = 0; i < memberOf.Count; i++)
{
var dn = memberOf[i]?.ToString() ?? "";
// Extract the OU/CN from the memberOf DN (e.g., "ou=ReadWrite,ou=groups,dc=...")
var groupName = ExtractGroupName(dn);
if (groupName != null && _groupToRole.TryGetValue(groupName, out var role)) roles.Add(role);
}
if (roles.Count == 0)
{
Log.Debug("No matching role groups for {Username}, defaulting to ReadOnly", username);
roles.Add(AppRoles.ReadOnly);
}
Log.Debug("LDAP roles for {Username}: [{Roles}]", username, string.Join(", ", roles));
return roles;
}
}
catch (Exception ex)
{
Log.Warning(ex, "Failed to resolve LDAP roles for {Username}, defaulting to ReadOnly", username);
return new[] { AppRoles.ReadOnly };
}
}
public bool ValidateCredentials(string username, string password)
{
try
{
var bindDn = _config.BindDnTemplate.Replace("{username}", username);
using (var connection = CreateConnection())
{
connection.Bind(new NetworkCredential(bindDn, password));
}
Log.Debug("LDAP bind succeeded for {Username}", username);
return true;
}
catch (LdapException ex)
{
Log.Debug("LDAP bind failed for {Username}: {Error}", username, ex.Message);
return false;
}
catch (Exception ex)
{
Log.Warning(ex, "LDAP error during credential validation for {Username}", username);
return false;
}
}
private LdapConnection CreateConnection()
{
var identifier = new LdapDirectoryIdentifier(_config.Host, _config.Port);
var connection = new LdapConnection(identifier)
{
AuthType = AuthType.Basic,
Timeout = TimeSpan.FromSeconds(_config.TimeoutSeconds)
};
connection.SessionOptions.ProtocolVersion = 3;
return connection;
}
private static string? ExtractGroupName(string dn)
{
// Parse "ou=ReadWrite,ou=groups,dc=..." or "cn=ReadWrite,..."
if (string.IsNullOrEmpty(dn)) return null;
var parts = dn.Split(',');
if (parts.Length == 0) return null;
var first = parts[0].Trim();
var eqIdx = first.IndexOf('=');
return eqIdx >= 0 ? first.Substring(eqIdx + 1) : null;
}
private static string EscapeLdapFilter(string input)
{
return input
.Replace("\\", "\\5c")
.Replace("*", "\\2a")
.Replace("(", "\\28")
.Replace(")", "\\29")
.Replace("\0", "\\00");
}
}
}

View File

@@ -1,18 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// Stable identifiers for custom OPC UA roles mapped from LDAP groups.
/// The namespace URI is registered in the server namespace table at startup,
/// and the string identifiers are resolved to runtime NodeIds before use.
/// </summary>
public static class LmxRoleIds
{
public const string NamespaceUri = "urn:zbmom:lmxopcua:roles";
public const string ReadOnly = "Role.ReadOnly";
public const string WriteOperate = "Role.WriteOperate";
public const string WriteTune = "Role.WriteTune";
public const string WriteConfigure = "Role.WriteConfigure";
public const string AlarmAck = "Role.AlarmAck";
}
}

View File

@@ -1,87 +0,0 @@
using System;
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// Maps Galaxy mx_data_type integers to OPC UA data types and CLR types. (OPC-005)
/// See gr/data_type_mapping.md for full mapping table.
/// </summary>
public static class MxDataTypeMapper
{
/// <summary>
/// Maps mx_data_type to OPC UA DataType NodeId numeric identifier.
/// Unknown types default to String (i=12).
/// </summary>
/// <param name="mxDataType">The Galaxy MX data type code.</param>
/// <returns>The OPC UA built-in data type node identifier.</returns>
public static uint MapToOpcUaDataType(int mxDataType)
{
return mxDataType switch
{
1 => 1, // Boolean → i=1
2 => 6, // Integer → Int32 i=6
3 => 10, // Float → Float i=10
4 => 11, // Double → Double i=11
5 => 12, // String → String i=12
6 => 13, // Time → DateTime i=13
7 => 11, // ElapsedTime → Double i=11 (seconds)
8 => 12, // Reference → String i=12
13 => 6, // Enumeration → Int32 i=6
14 => 12, // Custom → String i=12
15 => 21, // InternationalizedString → LocalizedText i=21
16 => 12, // Custom → String i=12
_ => 12 // Unknown → String i=12
};
}
/// <summary>
/// Maps mx_data_type to the corresponding CLR type.
/// </summary>
/// <param name="mxDataType">The Galaxy MX data type code.</param>
/// <returns>The CLR type used to represent runtime values for the MX type.</returns>
public static Type MapToClrType(int mxDataType)
{
return mxDataType switch
{
1 => typeof(bool),
2 => typeof(int),
3 => typeof(float),
4 => typeof(double),
5 => typeof(string),
6 => typeof(DateTime),
7 => typeof(double), // ElapsedTime as seconds
8 => typeof(string), // Reference as string
13 => typeof(int), // Enum backing integer
14 => typeof(string),
15 => typeof(string), // LocalizedText stored as string
16 => typeof(string),
_ => typeof(string)
};
}
/// <summary>
/// Returns the OPC UA type name for a given mx_data_type.
/// </summary>
/// <param name="mxDataType">The Galaxy MX data type code.</param>
/// <returns>The OPC UA type name used in diagnostics.</returns>
public static string GetOpcUaTypeName(int mxDataType)
{
return mxDataType switch
{
1 => "Boolean",
2 => "Int32",
3 => "Float",
4 => "Double",
5 => "String",
6 => "DateTime",
7 => "Double",
8 => "String",
13 => "Int32",
14 => "String",
15 => "LocalizedText",
16 => "String",
_ => "String"
};
}
}
}

View File

@@ -1,76 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// Translates MXAccess error codes (1008, 1012, 1013, etc.) to human-readable messages. (MXA-009)
/// </summary>
public static class MxErrorCodes
{
/// <summary>
/// The requested Galaxy attribute reference does not resolve in the runtime.
/// </summary>
public const int MX_E_InvalidReference = 1008;
/// <summary>
/// The supplied value does not match the attribute's configured data type.
/// </summary>
public const int MX_E_WrongDataType = 1012;
/// <summary>
/// The target attribute cannot be written because it is read-only or protected.
/// </summary>
public const int MX_E_NotWritable = 1013;
/// <summary>
/// The runtime did not complete the operation within the configured timeout.
/// </summary>
public const int MX_E_RequestTimedOut = 1014;
/// <summary>
/// Communication with the MXAccess runtime failed during the operation.
/// </summary>
public const int MX_E_CommFailure = 1015;
/// <summary>
/// The operation was attempted without an active MXAccess session.
/// </summary>
public const int MX_E_NotConnected = 1016;
/// <summary>
/// Converts a numeric MXAccess error code into an operator-facing message.
/// </summary>
/// <param name="errorCode">The MXAccess error code returned by the runtime.</param>
/// <returns>A human-readable description of the runtime failure.</returns>
public static string GetMessage(int errorCode)
{
return errorCode switch
{
1008 => "Invalid reference: the tag address does not exist or is malformed",
1012 => "Wrong data type: the value type does not match the attribute's expected type",
1013 => "Not writable: the attribute is read-only or locked",
1014 => "Request timed out: the operation did not complete within the allowed time",
1015 => "Communication failure: lost connection to the runtime",
1016 => "Not connected: no active connection to the Galaxy runtime",
_ => $"Unknown MXAccess error code: {errorCode}"
};
}
/// <summary>
/// Maps an MXAccess error code to the OPC quality state that should be exposed to clients.
/// </summary>
/// <param name="errorCode">The MXAccess error code returned by the runtime.</param>
/// <returns>The quality classification that best represents the runtime failure.</returns>
public static Quality MapToQuality(int errorCode)
{
return errorCode switch
{
1008 => Quality.BadConfigError,
1012 => Quality.BadConfigError,
1013 => Quality.BadOutOfService,
1014 => Quality.BadCommFailure,
1015 => Quality.BadCommFailure,
1016 => Quality.BadNotConnected,
_ => Quality.Bad
};
}
}
}

View File

@@ -1,18 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// Maps a deployed Galaxy platform to the hostname where it executes.
/// </summary>
public class PlatformInfo
{
/// <summary>
/// Gets or sets the gobject_id of the platform object in the Galaxy repository.
/// </summary>
public int GobjectId { get; set; }
/// <summary>
/// Gets or sets the hostname (node_name) where the platform is deployed.
/// </summary>
public string NodeName { get; set; } = "";
}
}

View File

@@ -1,122 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// OPC DA quality codes mapped from MXAccess quality values. (MXA-009, OPC-005)
/// </summary>
public enum Quality : byte
{
// Bad family (0-63)
/// <summary>
/// No valid process value is available.
/// </summary>
Bad = 0,
/// <summary>
/// The value is invalid because the Galaxy attribute definition or mapping is wrong.
/// </summary>
BadConfigError = 4,
/// <summary>
/// The bridge is not currently connected to the Galaxy runtime.
/// </summary>
BadNotConnected = 8,
/// <summary>
/// The runtime device or adapter failed while obtaining the value.
/// </summary>
BadDeviceFailure = 12,
/// <summary>
/// The underlying field source reported a bad sensor condition.
/// </summary>
BadSensorFailure = 16,
/// <summary>
/// Communication with the runtime failed while retrieving the value.
/// </summary>
BadCommFailure = 20,
/// <summary>
/// The attribute is intentionally unavailable for service, such as a locked or unwritable value.
/// </summary>
BadOutOfService = 24,
/// <summary>
/// The bridge is still waiting for the first usable value after startup or resubscription.
/// </summary>
BadWaitingForInitialData = 32,
// Uncertain family (64-191)
/// <summary>
/// A value is available, but it should be treated cautiously.
/// </summary>
Uncertain = 64,
/// <summary>
/// The last usable value is being repeated because a newer one is unavailable.
/// </summary>
UncertainLastUsable = 68,
/// <summary>
/// The sensor or source is providing a value with reduced accuracy.
/// </summary>
UncertainSensorNotAccurate = 80,
/// <summary>
/// The value exceeds its engineered limits.
/// </summary>
UncertainEuExceeded = 84,
/// <summary>
/// The source is operating in a degraded or subnormal state.
/// </summary>
UncertainSubNormal = 88,
// Good family (192+)
/// <summary>
/// The value is current and suitable for normal client use.
/// </summary>
Good = 192,
/// <summary>
/// The value is good but currently overridden locally rather than flowing from the live source.
/// </summary>
GoodLocalOverride = 216
}
/// <summary>
/// Helper methods for reasoning about OPC quality families used by the bridge.
/// </summary>
public static class QualityExtensions
{
/// <summary>
/// Determines whether the quality represents a good runtime value that can be trusted by OPC UA clients.
/// </summary>
/// <param name="q">The quality code to inspect.</param>
/// <returns><see langword="true" /> when the value is in the good quality range; otherwise, <see langword="false" />.</returns>
public static bool IsGood(this Quality q)
{
return (byte)q >= 192;
}
/// <summary>
/// Determines whether the quality represents an uncertain runtime value that should be treated cautiously.
/// </summary>
/// <param name="q">The quality code to inspect.</param>
/// <returns><see langword="true" /> when the value is in the uncertain range; otherwise, <see langword="false" />.</returns>
public static bool IsUncertain(this Quality q)
{
return (byte)q >= 64 && (byte)q < 192;
}
/// <summary>
/// Determines whether the quality represents a bad runtime value that should not be used as valid process data.
/// </summary>
/// <param name="q">The quality code to inspect.</param>
/// <returns><see langword="true" /> when the value is in the bad range; otherwise, <see langword="false" />.</returns>
public static bool IsBad(this Quality q)
{
return (byte)q < 64;
}
}
}

View File

@@ -1,60 +0,0 @@
using System;
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// Maps MXAccess integer quality to domain Quality enum and OPC UA StatusCodes. (MXA-009, OPC-005)
/// </summary>
public static class QualityMapper
{
/// <summary>
/// Maps an MXAccess quality integer (OPC DA quality byte) to domain Quality.
/// Uses category bits: 192+ = Good, 64-191 = Uncertain, 0-63 = Bad.
/// </summary>
/// <param name="mxQuality">The raw MXAccess quality integer.</param>
/// <returns>The mapped bridge quality value.</returns>
public static Quality MapFromMxAccessQuality(int mxQuality)
{
var b = (byte)(mxQuality & 0xFF);
// Try exact match first
if (Enum.IsDefined(typeof(Quality), b))
return (Quality)b;
// Fall back to category
if (b >= 192) return Quality.Good;
if (b >= 64) return Quality.Uncertain;
return Quality.Bad;
}
/// <summary>
/// Maps domain Quality to OPC UA StatusCode uint32.
/// </summary>
/// <param name="quality">The bridge quality value.</param>
/// <returns>The OPC UA status code represented as a 32-bit unsigned integer.</returns>
public static uint MapToOpcUaStatusCode(Quality quality)
{
return quality switch
{
Quality.Good => 0x00000000u, // Good
Quality.GoodLocalOverride => 0x00D80000u, // Good_LocalOverride
Quality.Uncertain => 0x40000000u, // Uncertain
Quality.UncertainLastUsable => 0x40900000u,
Quality.UncertainSensorNotAccurate => 0x40930000u,
Quality.UncertainEuExceeded => 0x40940000u,
Quality.UncertainSubNormal => 0x40950000u,
Quality.Bad => 0x80000000u, // Bad
Quality.BadConfigError => 0x80890000u,
Quality.BadNotConnected => 0x808A0000u,
Quality.BadDeviceFailure => 0x808B0000u,
Quality.BadSensorFailure => 0x808C0000u,
Quality.BadCommFailure => 0x80050000u,
Quality.BadOutOfService => 0x808D0000u,
Quality.BadWaitingForInitialData => 0x80320000u,
_ => quality.IsGood() ? 0x00000000u :
quality.IsUncertain() ? 0x40000000u :
0x80000000u
};
}
}
}

View File

@@ -1,30 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// Maps Galaxy security classification values to OPC UA write access decisions.
/// See gr/data_type_mapping.md for the full mapping table.
/// </summary>
public static class SecurityClassificationMapper
{
/// <summary>
/// Determines whether an attribute with the given security classification should allow writes.
/// </summary>
/// <param name="securityClassification">The Galaxy security classification value.</param>
/// <returns>
/// <see langword="true" /> for FreeAccess (0), Operate (1), Tune (4), Configure (5);
/// <see langword="false" /> for SecuredWrite (2), VerifiedWrite (3), ViewOnly (6).
/// </returns>
public static bool IsWritable(int securityClassification)
{
switch (securityClassification)
{
case 2: // SecuredWrite
case 3: // VerifiedWrite
case 6: // ViewOnly
return false;
default:
return true;
}
}
}
}

View File

@@ -1,96 +0,0 @@
using System;
namespace ZB.MOM.WW.OtOpcUa.Host.Domain
{
/// <summary>
/// Value-Timestamp-Quality triplet for tag data. (MXA-003, OPC-007)
/// </summary>
public readonly struct Vtq : IEquatable<Vtq>
{
/// <summary>
/// Gets the runtime value returned for the Galaxy attribute.
/// </summary>
public object? Value { get; }
/// <summary>
/// Gets the timestamp associated with the runtime value.
/// </summary>
public DateTime Timestamp { get; }
/// <summary>
/// Gets the quality classification that tells OPC UA clients whether the value is usable.
/// </summary>
public Quality Quality { get; }
/// <summary>
/// Initializes a new instance of the <see cref="Vtq" /> struct for a Galaxy attribute value.
/// </summary>
/// <param name="value">The runtime value returned by MXAccess.</param>
/// <param name="timestamp">The timestamp assigned to the runtime value.</param>
/// <param name="quality">The quality classification for the runtime value.</param>
public Vtq(object? value, DateTime timestamp, Quality quality)
{
Value = value;
Timestamp = timestamp;
Quality = quality;
}
/// <summary>
/// Creates a good-quality VTQ snapshot for a successfully read or subscribed attribute value.
/// </summary>
/// <param name="value">The runtime value to wrap.</param>
/// <returns>A VTQ carrying the provided value with the current UTC timestamp and good quality.</returns>
public static Vtq Good(object? value)
{
return new Vtq(value, DateTime.UtcNow, Quality.Good);
}
/// <summary>
/// Creates a bad-quality VTQ snapshot when no usable runtime value is available.
/// </summary>
/// <param name="quality">The specific bad quality reason to expose to clients.</param>
/// <returns>A VTQ with no value, the current UTC timestamp, and the requested bad quality.</returns>
public static Vtq Bad(Quality quality = Quality.Bad)
{
return new Vtq(null, DateTime.UtcNow, quality);
}
/// <summary>
/// Creates an uncertain VTQ snapshot when the runtime value exists but should be treated cautiously.
/// </summary>
/// <param name="value">The runtime value to wrap.</param>
/// <returns>A VTQ carrying the provided value with the current UTC timestamp and uncertain quality.</returns>
public static Vtq Uncertain(object? value)
{
return new Vtq(value, DateTime.UtcNow, Quality.Uncertain);
}
/// <summary>
/// Compares two VTQ snapshots for exact value, timestamp, and quality equality.
/// </summary>
/// <param name="other">The other VTQ snapshot to compare.</param>
/// <returns><see langword="true" /> when all fields match; otherwise, <see langword="false" />.</returns>
public bool Equals(Vtq other)
{
return Equals(Value, other.Value) && Timestamp == other.Timestamp && Quality == other.Quality;
}
/// <inheritdoc />
public override bool Equals(object? obj)
{
return obj is Vtq other && Equals(other);
}
/// <inheritdoc />
public override int GetHashCode()
{
return HashCode.Combine(Value, Timestamp, Quality);
}
/// <inheritdoc />
public override string ToString()
{
return $"Vtq({Value}, {Timestamp:O}, {Quality})";
}
}
}

View File

@@ -1,7 +0,0 @@
<Weavers xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="FodyWeavers.xsd">
<Costura>
<ExcludeAssemblies>
ArchestrA.MxAccess
</ExcludeAssemblies>
</Costura>
</Weavers>

Some files were not shown because too many files have changed in this diff Show More