Joseph Doherty 9e2b5b330f Phase 3 PR 54 -- Siemens S7 Modbus TCP quirks research document. 485-line doc at docs/v2/s7.md mirroring the docs/v2/dl205.md template for the Siemens SIMATIC S7 family (S7-1200 / S7-1500 / S7-300 / S7-400 / ET 200SP / CP 343-1 / CP 443-1 / CP 343-1 Lean / MODBUSPN). Siemens S7 is fundamentally different from DL260: there is no fixed Modbus memory map baked into firmware -- every deployment runs MB_SERVER (S7-1200/1500/ET 200SP), MODBUSCP (S7-300/400 + CP), or MODBUSPN (S7-300/400 PN) library blocks wired up to user DBs via the MB_HOLD_REG / ADDR parameters. The driver's job is therefore to handle per-site CONFIG rather than per-family QUIRKS, and the doc makes that explicit. Key findings worth flagging for the PR 56+ implementation track: (1) S7 has no fixed memory map -- must accept per-site DriverConfig, cannot assume vendor-standard layout. (2) MB_SERVER requires NON-optimized DBs in TIA Portal; optimized DBs cause the library to return STATUS 0x8383 on every access -- the single most common S7 Modbus deployment bug in the field. (3) Word order is ABCD by default (big-endian bytes + big-endian words) across all Siemens S7 Modbus paths, which is the OPPOSITE of DL260 CDAB -- the Modbus driver's S7 profile default must be ByteOrder.BigEndian, not WordSwap. (4) MB_SERVER listens on ONE port per FB instance; multi-client support requires running MB_SERVER on 502 / 503 / 504 / ... simultaneously -- most clients assume port 502 multiplexes, which is wrong on S7. (5) CP 343-1 Lean is SERVER-ONLY and requires the separate 2XV9450-1MB00 MODBUS TCP CP library license; client mode calls return immediate error on Lean. (6) MB_SERVER does NOT filter Unit ID, accepts any value. Means the driver can't use Unit ID to detect 'direct vs gateway' topology. (7) FC23 Read-Write Multiple, FC22 Mask Write, FC20/21 File Records, FC43 Device Identification all return exception 01 Illegal Function on every S7 variant -- the driver MUST NOT attempt bulk-read optimisation via FC23 when talking to S7. (8) STOP-mode read/write behaviour is non-deterministic across firmware bands: reads may return cached data (library internal buffer), writes may succeed-silently or return exception 04 depending on CPU firmware version -- flagged as 'driver treats both as unavailable, do not distinguish'. Unconfirmed rumours flagged separately: 'V2.0+ reverses float byte order' claim (cited but not reproduced), STOP-mode caching location (folklore, no primary source). Per-model test differentiation section names the tests as S7_<model>_<behavior> matching the DL205 template convention (e.g. S7_1200_MB_SERVER_requires_non_optimized_DB, S7_343_1_Lean_rejects_client_mode, S7_FC23_returns_IllegalFunction). 31 cited references across the Siemens Industry Online Support entry-ID system (68011496 for MB_SERVER FAQ, etc.), TIA Portal library manuals, and three third-party driver vendor release notes (Kepware, Ignition, FactoryTalk). This is a pure documentation PR -- no code, no tests, no csproj changes. Per-quirk implementation lands in PRs 56+. Research conducted 2026-04-18 against latest publicly-available Siemens documentation; STOP-mode behaviour and MB_SERVER versioning specifically cross-checked against Siemens forum answers from 2024-2025.
2026-04-18 22:50:51 -04:00
Phase 3 PR 54 -- Siemens S7 Modbus TCP quirks research document. 485-line doc at docs/v2/s7.md mirroring the docs/v2/dl205.md template for the Siemens SIMATIC S7 family (S7-1200 / S7-1500 / S7-300 / S7-400 / ET 200SP / CP 343-1 / CP 443-1 / CP 343-1 Lean / MODBUSPN). Siemens S7 is fundamentally different from DL260: there is no fixed Modbus memory map baked into firmware -- every deployment runs MB_SERVER (S7-1200/1500/ET 200SP), MODBUSCP (S7-300/400 + CP), or MODBUSPN (S7-300/400 PN) library blocks wired up to user DBs via the MB_HOLD_REG / ADDR parameters. The driver's job is therefore to handle per-site CONFIG rather than per-family QUIRKS, and the doc makes that explicit. Key findings worth flagging for the PR 56+ implementation track: (1) S7 has no fixed memory map -- must accept per-site DriverConfig, cannot assume vendor-standard layout. (2) MB_SERVER requires NON-optimized DBs in TIA Portal; optimized DBs cause the library to return STATUS 0x8383 on every access -- the single most common S7 Modbus deployment bug in the field. (3) Word order is ABCD by default (big-endian bytes + big-endian words) across all Siemens S7 Modbus paths, which is the OPPOSITE of DL260 CDAB -- the Modbus driver's S7 profile default must be ByteOrder.BigEndian, not WordSwap. (4) MB_SERVER listens on ONE port per FB instance; multi-client support requires running MB_SERVER on 502 / 503 / 504 / ... simultaneously -- most clients assume port 502 multiplexes, which is wrong on S7. (5) CP 343-1 Lean is SERVER-ONLY and requires the separate 2XV9450-1MB00 MODBUS TCP CP library license; client mode calls return immediate error on Lean. (6) MB_SERVER does NOT filter Unit ID, accepts any value. Means the driver can't use Unit ID to detect 'direct vs gateway' topology. (7) FC23 Read-Write Multiple, FC22 Mask Write, FC20/21 File Records, FC43 Device Identification all return exception 01 Illegal Function on every S7 variant -- the driver MUST NOT attempt bulk-read optimisation via FC23 when talking to S7. (8) STOP-mode read/write behaviour is non-deterministic across firmware bands: reads may return cached data (library internal buffer), writes may succeed-silently or return exception 04 depending on CPU firmware version -- flagged as 'driver treats both as unavailable, do not distinguish'. Unconfirmed rumours flagged separately: 'V2.0+ reverses float byte order' claim (cited but not reproduced), STOP-mode caching location (folklore, no primary source). Per-model test differentiation section names the tests as S7_<model>_<behavior> matching the DL205 template convention (e.g. S7_1200_MB_SERVER_requires_non_optimized_DB, S7_343_1_Lean_rejects_client_mode, S7_FC23_returns_IllegalFunction). 31 cited references across the Siemens Industry Online Support entry-ID system (68011496 for MB_SERVER FAQ, etc.), TIA Portal library manuals, and three third-party driver vendor release notes (Kepware, Ignition, FactoryTalk). This is a pure documentation PR -- no code, no tests, no csproj changes. Per-quirk implementation lands in PRs 56+. Research conducted 2026-04-18 against latest publicly-available Siemens documentation; STOP-mode behaviour and MB_SERVER versioning specifically cross-checked against Siemens forum answers from 2024-2025.
2026-04-18 22:50:51 -04:00
Phase 2 Stream D progress — non-destructive deliverables: appsettings → DriverConfig migration script, two-service Windows installer scripts, process-spawn cross-FX parity test, Stream D removal procedure doc with both Option A (rewrite 494 v1 tests) and Option B (archive + new v2 E2E suite) spelled out step-by-step. Cannot one-shot the actual legacy-Host deletion in any unattended session — explained in the procedure doc; the parity-defect debug cycle is intrinsically interactive (each iteration requires inspecting a v1↔v2 diff and deciding if it's a legitimate v2 improvement or a regression, then either widening the assertion or fixing the v2 code), and git rm -r src/ZB.MOM.WW.OtOpcUa.Host is destructive enough to need explicit operator authorization on a real PR review. scripts/migration/Migrate-AppSettings-To-DriverConfig.ps1 takes a v1 appsettings.json and emits the v2 DriverInstance.DriverConfig JSON blob (MxAccess/Database/Historian sections) ready to upsert into the central Configuration DB; null-leaf stripping; -DryRun mode; smoke-tested against the dev appsettings.json and produces the expected three-section ordered-dictionary output. scripts/install/Install-Services.ps1 registers the two v2 services with sc.exe — OtOpcUaGalaxyHost first (net48 x86 EXE with OTOPCUA_GALAXY_PIPE/OTOPCUA_ALLOWED_SID/OTOPCUA_GALAXY_SECRET/OTOPCUA_GALAXY_BACKEND/OTOPCUA_GALAXY_ZB_CONN/OTOPCUA_GALAXY_CLIENT_NAME env vars set via HKLM:\SYSTEM\CurrentControlSet\Services\OtOpcUaGalaxyHost\Environment registry), then OtOpcUa with depend=OtOpcUaGalaxyHost; resolves down-level account names to SID for the IPC ACL; generates a fresh 32-byte base64 shared secret per install if not supplied (kept out of registry — operators record offline for service rebinding scenarios); echoes start commands. scripts/install/Uninstall-Services.ps1 stops + removes both services. tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/HostSubprocessParityTests.cs is the production-shape parity test — Proxy (.NET 10) spawns the actual OtOpcUa.Driver.Galaxy.Host.exe (net48 x86) as a subprocess via Process.Start with backend=db env vars, connects via real named pipe, calls Discover, asserts at least one Galaxy gobject comes back. Skipped when running as Administrator (PipeAcl denies admins, same guard as other IPC integration tests), when the Host EXE hasn't been built, or when the ZB SQL endpoint is unreachable. This is the cross-FX integration that the parity suite genuinely needs — the previous IPC tests all ran in-process; this one validates the production deployment topology where Proxy and Host are separate processes communicating only over the named pipe. docs/v2/implementation/stream-d-removal-procedure.md is the next-session playbook: Option A (rewrite 494 v1 tests via a ProxyMxAccessClientAdapter that implements v1's IMxAccessClient by forwarding to GalaxyProxyDriver — Vtq↔DataValueSnapshot, Quality↔StatusCode, OnTagValueChanged↔OnDataChange mapping; 3-5 days, full coverage), Option B (rename OtOpcUa.Tests → OtOpcUa.Tests.v1Archive with [Trait("Category", "v1Archive")] for opt-in CI runs; new OtOpcUa.Driver.Galaxy.E2E test project with 10-20 representative tests via the HostSubprocessParityTests pattern; 1-2 days, accreted coverage); deletion checklist with eight pre-conditions, ten ordered steps, and a rollback path (git revert restores the legacy Host alongside the v2 stack — both topologies remain installable until the downstream consumer cutover). Full solution 964 pass / 1 pre-existing Phase 0 baseline; the 494 v1 IntegrationTests + 6 v1 IntegrationTests-net48 still pass because legacy OtOpcUa.Host stays untouched until an interactive session executes the procedure doc.
2026-04-18 00:38:44 -04:00
Phase 3 PR 53 -- Transport reconnect-on-drop + SO_KEEPALIVE for DL205 no-keepalive quirk. AutomationDirect H2-ECOM100 does NOT send TCP keepalives per docs/v2/dl205.md behavioral-oddities section -- any NAT/firewall device between the gateway and the PLC can silently close an idle socket after 2-5 minutes of inactivity. The PLC itself never notices and the first SendAsync after the drop would previously surface as IOException / EndOfStreamException / SocketException to the caller even though the PLC is perfectly healthy. PR 53 makes ModbusTcpTransport survive mid-session socket drops: SendAsync wraps the previous body as SendOnceAsync; on the first attempt, if the failure is a socket-layer error (IOException, SocketException, EndOfStreamException, ObjectDisposedException) AND autoReconnect is enabled (default true), the transport tears down the dead socket, calls ConnectAsync to re-establish, and resends the PDU exactly once. Deliberately single-retry -- further failures propagate so the driver health surface reflects the real state, no masking a dead PLC. Protocol-layer failures (e.g. ModbusException with exception code 02) are specifically NOT caught by the reconnect path -- they would just come back with the same exception code after the reconnect, so retrying is wasted wire time. Socket-level vs protocol-level is a discriminator inside IsSocketLevelFailure. Also enables SO_KEEPALIVE on the TcpClient with aggressive timing: TcpKeepAliveTime=30s, TcpKeepAliveInterval=10s, TcpKeepAliveRetryCount=3. Total time-to-detect-dead-socket = 30 + 10*3 = 60s, vs the Windows default 2-hour idle + 9 retries = 2h40min. Best-effort: older OSes that don't expose the fine-grained keepalive knobs silently skip them (catch {}). New ModbusDriverOptions.AutoReconnect bool (default true) threads through to the default transport factory in ModbusDriver -- callers wanting the old 'fail loud on drop' behavior can set AutoReconnect=false, or use a custom transportFactory that ignores the option. Unit tests: ModbusTcpReconnectTests boots a FlakeyModbusServer in-process (real TcpListener on loopback) that serves one valid FC03 response then forcibly shuts down the socket. Transport_recovers_from_mid_session_drop_and_retries_successfully issues two consecutive SendAsync calls and asserts both return valid PDUs -- the second must trigger the reconnect path transparently. Transport_without_AutoReconnect_propagates_drop_to_caller asserts the legacy behavior when the opt-out is taken. Validates real socket semantics rather than mocked exceptions. 142/142 Modbus.Tests pass (113 prior + 2 mapper + 2 reconnect + 25 accumulated across PRs 45-52); 11/11 DL205 integration tests still pass with MODBUS_SIM_PROFILE=dl205 -- no regression from the transport change.
2026-04-18 22:32:13 -04:00
Phase 3 PR 53 -- Transport reconnect-on-drop + SO_KEEPALIVE for DL205 no-keepalive quirk. AutomationDirect H2-ECOM100 does NOT send TCP keepalives per docs/v2/dl205.md behavioral-oddities section -- any NAT/firewall device between the gateway and the PLC can silently close an idle socket after 2-5 minutes of inactivity. The PLC itself never notices and the first SendAsync after the drop would previously surface as IOException / EndOfStreamException / SocketException to the caller even though the PLC is perfectly healthy. PR 53 makes ModbusTcpTransport survive mid-session socket drops: SendAsync wraps the previous body as SendOnceAsync; on the first attempt, if the failure is a socket-layer error (IOException, SocketException, EndOfStreamException, ObjectDisposedException) AND autoReconnect is enabled (default true), the transport tears down the dead socket, calls ConnectAsync to re-establish, and resends the PDU exactly once. Deliberately single-retry -- further failures propagate so the driver health surface reflects the real state, no masking a dead PLC. Protocol-layer failures (e.g. ModbusException with exception code 02) are specifically NOT caught by the reconnect path -- they would just come back with the same exception code after the reconnect, so retrying is wasted wire time. Socket-level vs protocol-level is a discriminator inside IsSocketLevelFailure. Also enables SO_KEEPALIVE on the TcpClient with aggressive timing: TcpKeepAliveTime=30s, TcpKeepAliveInterval=10s, TcpKeepAliveRetryCount=3. Total time-to-detect-dead-socket = 30 + 10*3 = 60s, vs the Windows default 2-hour idle + 9 retries = 2h40min. Best-effort: older OSes that don't expose the fine-grained keepalive knobs silently skip them (catch {}). New ModbusDriverOptions.AutoReconnect bool (default true) threads through to the default transport factory in ModbusDriver -- callers wanting the old 'fail loud on drop' behavior can set AutoReconnect=false, or use a custom transportFactory that ignores the option. Unit tests: ModbusTcpReconnectTests boots a FlakeyModbusServer in-process (real TcpListener on loopback) that serves one valid FC03 response then forcibly shuts down the socket. Transport_recovers_from_mid_session_drop_and_retries_successfully issues two consecutive SendAsync calls and asserts both return valid PDUs -- the second must trigger the reconnect path transparently. Transport_without_AutoReconnect_propagates_drop_to_caller asserts the legacy behavior when the opt-out is taken. Validates real socket semantics rather than mocked exceptions. 142/142 Modbus.Tests pass (113 prior + 2 mapper + 2 reconnect + 25 accumulated across PRs 45-52); 11/11 DL205 integration tests still pass with MODBUS_SIM_PROFILE=dl205 -- no regression from the transport change.
2026-04-18 22:32:13 -04:00

LmxOpcUa

OPC UA server and cross-platform client tools for AVEVA System Platform (Wonderware) Galaxy. The server exposes Galaxy tags via MXAccess as an OPC UA address space. The client stack provides a shared library, CLI tool, and Avalonia desktop application for browsing, reading/writing, subscriptions, alarms, and historical data.

Architecture

                                    OPC UA Clients
                              (CLI, Desktop UI, 3rd-party)
                                         |
                                         v
+-----------------+     +------------------+     +-----------------+
| Galaxy Repo DB  |---->|   OPC UA Server  |<--->| MXAccess Client |
|   (SQL Server)  |     | (address space)  |     | (STA + COM)     |
+-----------------+     +------------------+     +-----------------+
                                |                        |
                        +-------+--------+     +---------+---------+
                        | Status Dashboard|     | Historian Runtime |
                        |  (HTTP/JSON)   |     |   (SQL Server)    |
                        +----------------+     +-------------------+

Contained Name vs Tag Name

Browse Path (contained names) Runtime Reference (tag name)
TestMachine_001/DelmiaReceiver/DownloadPath DelmiaReceiver_001.DownloadPath
TestMachine_001/MESReceiver/MoveInBatchID MESReceiver_001.MoveInBatchID

Server

The OPC UA server runs on .NET Framework 4.8 (x86) and bridges the Galaxy runtime to OPC UA clients.

Server Prerequisites

  • .NET Framework 4.8 SDK
  • AVEVA System Platform with ArchestrA Framework installed
  • Galaxy repository database (SQL Server, Windows Auth)
  • MXAccess COM registered (LMXProxy.LMXProxyServer)
  • Wonderware Historian (optional, for historical data access)
  • Windows (required for COM interop and MXAccess)

Build and Run Server

dotnet restore ZB.MOM.WW.LmxOpcUa.slnx
dotnet build src/ZB.MOM.WW.LmxOpcUa.Host
dotnet run --project src/ZB.MOM.WW.LmxOpcUa.Host

The server starts on opc.tcp://localhost:4840/LmxOpcUa with the None security profile by default. Configure Security.Profiles in appsettings.json to enable Basic256Sha256-Sign or Basic256Sha256-SignAndEncrypt for transport security. See Security Guide.

Install as Windows Service

cd src/ZB.MOM.WW.LmxOpcUa.Host/bin/Debug/net48
ZB.MOM.WW.LmxOpcUa.Host.exe install
ZB.MOM.WW.LmxOpcUa.Host.exe start

Service logon requirement: The service must run under a Windows account that has access to the AVEVA Galaxy and Historian. The default LocalSystem account can connect to MXAccess and SQL Server but cannot authenticate with the Historian SDK (HCAP). Configure the service to "Log on as" a domain or local user that is a recognized ArchestrA platform user. This can be set in services.msc or during install with ZB.MOM.WW.LmxOpcUa.Host.exe install -username DOMAIN\user -password ***.

Run Server Tests

dotnet test tests/ZB.MOM.WW.LmxOpcUa.Tests
dotnet test tests/ZB.MOM.WW.LmxOpcUa.IntegrationTests

Client Stack

The client stack is cross-platform (.NET 10) and consists of three projects sharing a common IOpcUaClientService abstraction. No AVEVA software or COM is required — the clients connect to any OPC UA server.

Client Prerequisites

  • .NET 10 SDK
  • No platform-specific dependencies (runs on Windows, macOS, Linux)

Build All Clients

dotnet build src/ZB.MOM.WW.LmxOpcUa.Client.Shared
dotnet build src/ZB.MOM.WW.LmxOpcUa.Client.CLI
dotnet build src/ZB.MOM.WW.LmxOpcUa.Client.UI

Run Client Tests

dotnet test tests/ZB.MOM.WW.LmxOpcUa.Client.Shared.Tests
dotnet test tests/ZB.MOM.WW.LmxOpcUa.Client.CLI.Tests
dotnet test tests/ZB.MOM.WW.LmxOpcUa.Client.UI.Tests

Client CLI

# Connect
dotnet run --project src/ZB.MOM.WW.LmxOpcUa.Client.CLI -- connect -u opc.tcp://localhost:4840/LmxOpcUa

# Browse Galaxy hierarchy
dotnet run --project src/ZB.MOM.WW.LmxOpcUa.Client.CLI -- browse -u opc.tcp://localhost:4840/LmxOpcUa -n "ns=3;s=ZB" -r -d 5

# Read a tag
dotnet run --project src/ZB.MOM.WW.LmxOpcUa.Client.CLI -- read -u opc.tcp://localhost:4840/LmxOpcUa -n "ns=3;s=TestMachine_001.MachineID"

# Write a tag
dotnet run --project src/ZB.MOM.WW.LmxOpcUa.Client.CLI -- write -u opc.tcp://localhost:4840/LmxOpcUa -n "ns=3;s=TestChildObject.TestString" -v "Hello"

# Subscribe to changes
dotnet run --project src/ZB.MOM.WW.LmxOpcUa.Client.CLI -- subscribe -u opc.tcp://localhost:4840/LmxOpcUa -n "ns=3;s=TestChildObject.TestInt" -i 500

# Read historical data
dotnet run --project src/ZB.MOM.WW.LmxOpcUa.Client.CLI -- historyread -u opc.tcp://localhost:4840/LmxOpcUa -n "ns=3;s=TestMachine_001.TestHistoryValue" --start "2026-03-25" --end "2026-03-30"

# Subscribe to alarm events
dotnet run --project src/ZB.MOM.WW.LmxOpcUa.Client.CLI -- alarms -u opc.tcp://localhost:4840/LmxOpcUa -n "ns=3;s=TestMachine_001" --refresh

# Query redundancy state
dotnet run --project src/ZB.MOM.WW.LmxOpcUa.Client.CLI -- redundancy -u opc.tcp://localhost:4840/LmxOpcUa

Client UI

dotnet run --project src/ZB.MOM.WW.LmxOpcUa.Client.UI

The desktop application provides browse tree, subscriptions, alarm monitoring, history reads, and write dialogs. See Client UI Documentation for details.


Project Structure

src/
    ZB.MOM.WW.LmxOpcUa.Host/           OPC UA server (.NET Framework 4.8, x86)
        Configuration/                   Config binding and validation
        Domain/                          Interfaces, DTOs, enums, mappers
        Historian/                       Wonderware Historian data source
        Metrics/                         Performance tracking (rolling P95)
        MxAccess/                        STA thread, COM interop, subscriptions
        GalaxyRepository/                SQL queries, change detection
        OpcUa/                           Server, node manager, address space, alarms, diff
        Status/                          HTTP dashboard, health checks

    ZB.MOM.WW.LmxOpcUa.Client.Shared/   Shared OPC UA client library (.NET 10)
    ZB.MOM.WW.LmxOpcUa.Client.CLI/      Command-line client (.NET 10)
    ZB.MOM.WW.LmxOpcUa.Client.UI/       Avalonia desktop client (.NET 10)

tests/
    ZB.MOM.WW.LmxOpcUa.Tests/           Server unit + integration tests
    ZB.MOM.WW.LmxOpcUa.IntegrationTests/ Server integration tests (live DB)
    ZB.MOM.WW.LmxOpcUa.Client.Shared.Tests/  Shared library tests
    ZB.MOM.WW.LmxOpcUa.Client.CLI.Tests/     CLI command tests
    ZB.MOM.WW.LmxOpcUa.Client.UI.Tests/      UI ViewModel + headless tests

gr/                                      Galaxy repository docs, SQL queries, schema

Documentation

Server

Component Description
OPC UA Server Endpoint, sessions, security policy, server lifecycle
Address Space Hierarchy nodes, variable nodes, primitive grouping, NodeId scheme
Galaxy Repository SQL queries, deployed package chain, change detection
MXAccess Bridge STA thread, COM interop, subscriptions, reconnection
Data Type Mapping Galaxy to OPC UA types, arrays, security classification
Read/Write Operations Value reads, writes, access level enforcement, array element writes
Subscriptions Ref-counted MXAccess subscriptions, data change dispatch
Alarm Tracking AlarmConditionState nodes, InAlarm monitoring, event reporting
Historical Data Access Historian data source, HistoryReadRaw, HistoryReadProcessed
Incremental Sync Diff computation, subtree teardown/rebuild, subscription preservation
Configuration appsettings.json binding, feature flags, validation
Status Dashboard HTTP server, health checks, metrics reporting
Service Hosting TopShelf, startup/shutdown sequence, error handling
Security Transport security profiles, certificate trust, production hardening
Redundancy Non-transparent warm/hot redundancy, ServiceLevel, paired deployment

Client

Component Description
Client CLI Connect, browse, read, write, subscribe, historyread, alarms, redundancy commands
Client UI Avalonia desktop client: browse, subscribe, alarms, history, write values

Reference

License

Internal use only.

Description
No description provided
Readme 12 MiB
Languages
C# 94.6%
TSQL 4.9%
Python 0.3%
Batchfile 0.2%