mxaccess/design/00-overview.md
Joseph Doherty fe2a6db786
Initial project state: .NET reference, design, Rust port (M0+M1), evidence
Layout:
- src/                    .NET 10 x64 reference: MxNativeCodec, MxNativeClient,
                          MxAsbClient, probes, tests, harnesses. Executable spec.
- design/                 Architectural plan for the Rust port (M0–M6), error
                          model, protocol invariants, risks (R1–R16), adversarial
                          review log (review.md).
- rust/                   Rust workspace. M0 skeleton + M1 codec parity.
                          mxaccess-codec: 215 unit tests + 2 cross-implementation
                          parity tests (byte-identical against .NET reference).
                          Other crates are M0 stubs awaiting M2+.
- captures/               Frida + netsh + pcap evidence per CLAUDE.md
                          ("captures are evidence, not throwaway logs").
- analysis/               Decompiled C# (frida/proxy/decompiled-*),
                          Ghidra exports for native DLLs (`exports/` only —
                          working state at `projects/` and AVEVA's input
                          binaries at `input/` are gitignored).
- docs/                   Reverse-engineering reference docs.
- tools/                  Setup-LiveProbeEnv.ps1 (Infisical credential fetcher),
                          Compute-Crc.ps1 (.NET parity helper).
- .github/workflows/      Rust CI: fmt + build + test + clippy on Windows.
- LICENSE                 MIT (Joseph Doherty, 2026).

Verified:
- cargo test --workspace → 217 passed (215 unit + 2 .NET parity), 0 failed
- cargo clippy --workspace -- -D warnings → clean
- cargo fmt --all -- --check → clean
- cargo publish --dry-run -p mxaccess-codec → packages cleanly

Excluded from history (see .gitignore):
- **/bin, **/obj, **/target — build artifacts
- analysis/ghidra/projects/ — Ghidra working state (regenerable)
- analysis/ghidra/input/ — AVEVA proprietary DLLs (vendor IP)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 06:21:00 -04:00


Overview

Mission

Build a native Rust replacement for AVEVA/Wonderware MXAccess that gives Rust applications byte-equivalent access to the AVEVA System Platform without depending on the 32-bit LmxProxy.dll / NmxSvcps.dll interop chain.

The replacement ships in two layers:

  1. Raw layer — a faithful Rust reimplementation of the wire protocol (codec + transport + session). Every byte over the wire matches what native MXAccess sends, validated against Frida-captured baselines. The raw layer's API is unsafe-free and Tokio-aware (it uses Tokio for I/O) but its codec is pure and runtime-agnostic.
  2. Async layer — an idiomatic Tokio façade on top of the raw layer: typed errors, Send + Sync handles, async fn operations, structured subscription Streams, drop and CancellationToken cancellation, tracing instrumentation. This is what most consumers reach for.

Both layers ship in one Cargo workspace; the raw crates are useful on their own for power users who need byte-level control or who are integrating into a non-standard runtime.

Why two layers

Inverting the order would compromise correctness. If the public API is async-first, the protocol behavior gets shaped to fit the API. We saw the alternative work in the .NET reference: every async method bottoms out in a sync codec call (NmxTransferEnvelopeTemplate.Encode — see src/MxNativeCodec/NmxTransferEnvelopeTemplate.cs:33) because the wire format has no "async" — it has bytes. Putting bytes first lets us validate against captures with a pure round-trip test, then layer ergonomics on top.
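
A minimal sketch of what such a pure round-trip test looks like, with no runtime involved. The `Envelope` type, its two-byte little-endian length field, and the sample bytes are hypothetical stand-ins and not the real NMX envelope layout; the point is only that the parsed type keeps the captured buffer, so bytes the parser does not understand survive a decode and re-encode.

```rust
/// Hypothetical envelope: a little-endian u16 length at offset 0, followed
/// by bytes this sketch does not interpret.
#[derive(Debug, Clone, PartialEq)]
pub struct Envelope {
    /// Original bytes exactly as captured; the source of truth on re-encode.
    raw: Vec<u8>,
    /// Parsed view of the one field this sketch understands.
    pub body_len: u16,
}

#[derive(Debug, PartialEq)]
pub enum DecodeError {
    Truncated,
}

impl Envelope {
    pub fn decode(bytes: &[u8]) -> Result<Self, DecodeError> {
        if bytes.len() < 2 {
            return Err(DecodeError::Truncated);
        }
        let body_len = u16::from_le_bytes([bytes[0], bytes[1]]);
        Ok(Self { raw: bytes.to_vec(), body_len })
    }

    /// Change the parsed field; only its two bytes are patched in `raw`.
    pub fn set_body_len(&mut self, len: u16) {
        self.body_len = len;
        self.raw[..2].copy_from_slice(&len.to_le_bytes());
    }

    /// Re-encode from the preserved buffer, unknown bytes included.
    pub fn encode(&self) -> Vec<u8> {
        self.raw.clone()
    }
}

/// Round-trip check for one capture fixture: decode, re-encode, compare.
pub fn round_trips(fixture: &[u8]) -> bool {
    match Envelope::decode(fixture) {
        Ok(env) => env.encode() == fixture,
        Err(_) => false,
    }
}
```

Because nothing here touches I/O or an executor, the same check can run over every fixture in a plain unit test.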

The split also maps cleanly onto the existing .NET tree:

| .NET project | Rust analogue (raw) | Rust analogue (async) |
|---|---|---|
| MxNativeCodec | mxaccess-codec | (codec is shared) |
| MxNativeClient (DCE/RPC + NTLM + IRemUnknown + INmxService2) | mxaccess-rpc, mxaccess-nmx, mxaccess-callback | (transport is shared) |
| MxNativeClient (MxNativeSession, MxNativeCompatibilityServer) | (raw layer ends at transport) | mxaccess (async session, optional mxaccess-compat shim) |
| MxAsbClient | mxaccess-asb (codec+transport) | mxaccess (async ASB session) |

The session-level state in MxNativeSession (subscription registry, correlation-id bookkeeping, recovery state, callback routing — src/MxNativeClient/MxNativeSession.cs:90-125, 312-351, 573) lives in the async mxaccess crate, not in mxaccess-nmx. The raw mxaccess-nmx crate exposes the INmxService2 client + envelope codec + a low-level register/advise/write surface so power users can drive it directly (per the "byte-level control" promise above) — but it does not own correlation or recovery, because those are session-level concerns that span both transports. A consumer using mxaccess-nmx standalone is responsible for its own correlation-id table.
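
As an illustration of what "its own correlation-id table" means for a standalone consumer, here is a small sketch. `CorrelationTable`, the `u32` id width, and the wrapping allocation scheme are all assumptions made for the example; the real id format is whatever the captures show.

```rust
use std::collections::HashMap;

/// Hypothetical pending-request table for a consumer driving the raw crate
/// directly. The `u32` id width and wrapping allocation are assumptions.
pub struct CorrelationTable<T> {
    next_id: u32,
    pending: HashMap<u32, T>,
}

impl<T> CorrelationTable<T> {
    pub fn new() -> Self {
        Self { next_id: 1, pending: HashMap::new() }
    }

    /// Allocate a fresh id and remember the context for the eventual reply.
    pub fn register(&mut self, context: T) -> u32 {
        // Skip ids still in flight so a wrapped counter never collides.
        while self.pending.contains_key(&self.next_id) {
            self.next_id = self.next_id.wrapping_add(1);
        }
        let id = self.next_id;
        self.next_id = self.next_id.wrapping_add(1);
        self.pending.insert(id, context);
        id
    }

    /// Match an incoming reply to its request. `None` means the id is
    /// unknown or was already completed.
    pub fn complete(&mut self, id: u32) -> Option<T> {
        self.pending.remove(&id)
    }

    /// Number of requests still waiting for a reply.
    pub fn in_flight(&self) -> usize {
        self.pending.len()
    }
}
```

In the async `mxaccess` crate this bookkeeping lives inside the session, which is why the raw crate does not provide it.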

Architectural principles

These are non-negotiable. They are informed by what went wrong in the reverse-engineering effort and what the existing tree gets right.

  1. Do not fabricate protocol behavior. Every wire shape in the Rust port must be backed by a Frida capture, a decompiled artifact, or a live probe. When extending, cite the evidence — and capture a new fixture if one does not exist. The native codec deliberately does not zero "unknown" bytes; the Rust port mirrors this.
  2. Round-trip preservation. Encoder and decoder must be bijective on observed traffic. Codec types keep the original byte buffer alongside parsed fields so unknown bytes survive a parse + re-encode. NmxTransferEnvelopeTemplate and ObservedWriteBodyTemplate in the .NET reference exist for this reason — Rust analogues must too.
  3. No unsafe in the public API surface. Public types and trait methods across all crates are safe. Internal unsafe is permitted but confined to mxaccess-rpc, where COM activation / IUnknown calls via the windows crate are unavoidable (see principle 6) — every such call must be wrapped in a safe abstraction at the crate boundary. Codec crates (mxaccess-codec, mxaccess-asb-nettcp) remain #![forbid(unsafe_code)]: no raw pointers, no transmute, multi-byte field access via bytes::Buf / byteorder, memory layout never derived from #[repr(C)].
  4. x64 only. The whole point of the replacement is escaping the 32-bit NmxSvcps.dll proxy/stub. The Rust workspace targets x86_64-pc-windows-msvc (and optionally x86_64-pc-windows-gnu). No 32-bit code paths anywhere; cross-compile to i686-* is unsupported by design.
  5. Windows-first, cross-platform-aware. NTLM, DPAPI, and Galaxy SQL Server are Windows realities for AVEVA deployments. Crate boundaries are drawn so the codec, ASB net.tcp framing (MC-NMF + MC-NBFX/NBFS — not SOAP/XML on the wire; see src/MxAsbClient/MxAsbDataClient.cs:660-685 where NetTcpBinding(SecurityMode.None) selects the default BinaryMessageEncodingBindingElement), and protocol logic compile on Linux even when the platform-bound transports do not. Cross-platform reach is a stretch goal — see 70-risks-and-open-questions.md.
  6. COM via windows-rs when COM types are unavoidable: OBJREF building, IPID/OXID/OID handling, GUID literals. For raw bytes (NDR encoding, NMX envelope, write bodies) we hand-roll — the surface is small enough that a generated stub would obscure the wire and compromise rule 1.
  7. Galaxy access is direct SQL. No LMX. The Rust port queries dbo.gobject / dbo.instance / dbo.dynamic_attribute (and the package-inheritance CTE) the same way GalaxyRepositoryTagResolver.cs does (see src/MxNativeClient/GalaxyRepositoryTagResolver.cs:215, 253, 257), then computes CRC-16/IBM signatures locally to build MxReferenceHandles. Note: CLAUDE.md lists the SQL surface as aa_attribute / aa_object / mx_attribute_category — that is incorrect. Those tables do not exist in the resolver source; the actual tables are dbo.gobject / dbo.instance / dbo.dynamic_attribute as cited above. Treat this design doc as authoritative over CLAUDE.md for SQL surface, and update CLAUDE.md next time it is touched.
  8. One Tokio runtime, multi-thread by default. The async layer assumes #[tokio::main(flavor = "multi_thread")] semantics; current_thread is supported but not the default. No tokio::spawn from inside Drop; no blocking calls inside async fn. Drop-based cancellation is implemented by sending a cleanup request (e.g. UnAdvise for a Subscription, RemoveSubscriberEngine/UnregisterEngine for the last Session clone) over a tokio::sync::mpsc or tokio::sync::oneshot channel to a long-lived connection task that was spawned at session construction time. The connection task's lifetime exceeds any individual Subscription, so Drop itself never spawns and never blocks. This mirrors the .NET reference's synchronous teardown path (MxNativeSession.cs:483-507), where UnAdvise per subscription, RemoveSubscriberEngine per publisher, and UnregisterEngine are all invoked from a single dispose-time loop on a pre-existing service handle.
  9. Two transports, one façade. Session is parameterised over a Transport trait. NmxTransport and AsbTransport are independent implementations; capability is queryable. A Session constructed with a single transport returns Error::Unsupported for operations that transport cannot reach (e.g. Session::activate(item) on an ASB-only Session — ASB has no Activate/Suspend/supervisory-advise surface; see non-goal 5). A Session constructed via the dual-transport builder (Session::builder().with_nmx(...).with_asb(...).build()) routes callback-only operations to NMX automatically and the regular tag data plane to ASB, matching the deployment shape recommended in docs/ASB-Native-Integration-Decision.md. Routing is static at session-build time; Session does not silently activate a fallback transport at runtime.
  10. Status is data, errors are exceptional. A non-Ok MxStatus on a returned data change is data the caller inspects, not a Result::Err. A non-Ok status returned from a synchronous-shaped operation (write, read) is an Err. This split mirrors the .NET reference and is the only sensible mapping; see 50-error-model.md.
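
The split in principle 10 can be sketched in a few lines. The types below are simplified, hypothetical stand-ins (the real `MxStatus` and error type are specified in 50-error-model.md); the sketch only shows where a non-Ok status ends up in each case.

```rust
/// Hypothetical status type; the real MxStatus carries more detail.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MxStatus {
    Ok,
    Bad(u16),
}

/// A delivered data change: the status travels with the value as data.
#[derive(Debug, Clone, PartialEq)]
pub struct DataChange {
    pub value: f64,
    pub status: MxStatus,
}

#[derive(Debug, PartialEq)]
pub enum Error {
    /// A synchronous-shaped operation completed with a non-Ok status.
    Status(MxStatus),
}

/// Write-like operations turn a non-Ok status into `Err`.
pub fn write_result(status: MxStatus) -> Result<(), Error> {
    match status {
        MxStatus::Ok => Ok(()),
        bad => Err(Error::Status(bad)),
    }
}

/// Subscription-like delivery never fails on status; the caller inspects it.
pub fn deliver(value: f64, status: MxStatus) -> DataChange {
    DataChange { value, status }
}
```

A subscriber therefore keeps receiving changes while a tag is in a bad state, whereas a failed write surfaces through `?` like any other error.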

Non-goals (V1)

  • WinSXS-style side-by-side install with the native MXAccess COM proxies.
  • 32-bit clients. The Rust crates do not build for i686-pc-windows-msvc.
  • A drop-in COM-visible LMXProxyServer.LMXProxyServer ProgId. The MXAccess shape is replicated as a Rust API; consumers that want to expose it as COM register a separate shim crate (mxaccess-compat-com, deferred to post-V1).
  • Linux first-class support in V1. Crate boundaries do not preclude Linux later, but Galaxy SQL + DPAPI mean V1 ships Windows-only.
  • ASB feature parity with NMX. ASB cannot reach callback-only semantics (Activate/Suspend, supervisory advise, OperationComplete). The Rust port routes those to NMX; ASB owns the regular tag data plane only. See docs/ASB-Native-Integration-Decision.md.
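
The static routing described in principle 9 and in the ASB non-goal above could look roughly like this. `Op`, `Route`, and `Routing` are hypothetical names for the sketch, and the operation list is a small subset; the actual `Transport` trait and capability query are not defined here.

```rust
/// Operations a session may be asked to perform (hypothetical subset).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Op {
    Write,
    Subscribe,
    Activate,
    Suspend,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Route {
    Nmx,
    Asb,
}

#[derive(Debug, PartialEq)]
pub enum Error {
    Unsupported(Op),
}

/// Which transports were supplied at build time. Routing is fixed from then
/// on; nothing is activated as a fallback at runtime.
#[derive(Debug, Clone, Copy)]
pub struct Routing {
    pub has_nmx: bool,
    pub has_asb: bool,
}

impl Routing {
    /// Callback-only operations need NMX; the regular tag data plane goes to
    /// ASB when it is present, otherwise to NMX.
    pub fn route(&self, op: Op) -> Result<Route, Error> {
        let callback_only = matches!(op, Op::Activate | Op::Suspend);
        match (callback_only, self.has_nmx, self.has_asb) {
            (true, true, _) => Ok(Route::Nmx),
            (true, false, _) => Err(Error::Unsupported(op)),
            (false, _, true) => Ok(Route::Asb),
            (false, true, false) => Ok(Route::Nmx),
            (false, false, false) => Err(Error::Unsupported(op)),
        }
    }
}
```

An ASB-only session thus reports `Unsupported` for `Activate` instead of quietly opening an NMX connection.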

At-a-glance architecture

+----------------------------------------------------------------------+
|                Application (Rust, async)                             |
+----------------------------------------------------------------------+
                              |
                              v  async fn / Stream<Item = DataChange>
+----------------------------------------------------------------------+
|  mxaccess (async layer)                                              |
|  - Session, Subscription, DataChange, MxValue                        |
|  - trait Transport { connect, register, write, advise, ... }         |
|    (read is NOT a transport primitive — it is a session-level helper |
|     composed from subscribe + first-result + drop, mirroring         |
|     MxNativeSession.ReadAsync at src/MxNativeClient/MxNativeSession.cs:312-359) |
|  - Drop-cancellable, tracing-instrumented, typed Error               |
+----------------------------------------------------------------------+
              |                                    |
              | NmxTransport                       | AsbTransport
              v                                    v
+---------------------------------+  +----------------------------------+
| mxaccess-nmx (NMX raw)          |  | mxaccess-asb (ASB raw)           |
| INmxService2 client + envelope  |  | IASBIDataV2 client + variant     |
+---------------------------------+  +----------------------------------+
       |              |                         |
       v              v                         v
+--------------+ +------------------------+ +--------------------+
| mxaccess-rpc | | mxaccess-callback      | | mxaccess-asb-nettcp|
| DCE/RPC PDU  | | INmxSvcCallback server | | MC-NMF framing +   |
| + NTLMv2 SSP | | + IRemUnknown          | | MC-NBFX/NBFS binary|
|              | |                        | | + DH/HMAC/AES      |
+--------------+ +------------------------+ +--------------------+
       |
       v
+----------------------------------------------------------------------+
|  mxaccess-codec (pure, no I/O)                                       |
|  MxReferenceHandle, NmxTransferEnvelope, write/advise/subscribe      |
|  bodies, MxStatus, MxValueKind, MxDataType, ASB Variant              |
+----------------------------------------------------------------------+
       |                                                              |
       v                                                              v
+--------------------+                              +-------------------+
| mxaccess-galaxy    |                              | windows (crate)   |
| SQL tag resolver   |                              | OBJREF/IID/OXID   |
+--------------------+                              +-------------------+
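
The note in the diagram that read is composed from subscribe + first-result + drop, together with the drop-based cancellation in principle 8, can be sketched as follows. This is a synchronous, std-only stand-in: `std::sync::mpsc` plays the role the design assigns to the Tokio channels, and `Command`, `Subscription`, and `read_once` are hypothetical names. The receiving end of `commands` stands for the long-lived connection task.

```rust
use std::sync::mpsc::{Receiver, Sender};

/// Requests sent to the long-lived connection task (hypothetical subset).
#[derive(Debug, PartialEq)]
pub enum Command {
    UnAdvise(u32),
}

/// A live subscription: a source of values plus a handle for cleanup.
pub struct Subscription {
    id: u32,
    values: Receiver<f64>,
    commands: Sender<Command>,
}

impl Subscription {
    pub fn new(id: u32, values: Receiver<f64>, commands: Sender<Command>) -> Self {
        Self { id, values, commands }
    }

    /// Wait for the next value; `None` once the producer side has gone away.
    pub fn next_value(&self) -> Option<f64> {
        self.values.recv().ok()
    }
}

impl Drop for Subscription {
    fn drop(&mut self) {
        // Sending on an unbounded channel neither blocks nor spawns. If the
        // connection task is already gone there is nothing left to clean up,
        // so the send error is ignored.
        let _ = self.commands.send(Command::UnAdvise(self.id));
    }
}

/// Read composed from subscribe + first result + drop.
pub fn read_once(sub: Subscription) -> Option<f64> {
    // `sub` is dropped when this function returns, which queues the UnAdvise.
    sub.next_value()
}
```

The cleanup request is only queued by `Drop`; performing the actual `UnAdvise` call remains the connection task's job.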

Phasing summary

Detailed roadmap in 60-roadmap.md. At a glance:

  • M0: Workspace skeleton, CI, fixture infrastructure.
  • M1: mxaccess-codec complete; round-trips every Frida fixture.
  • M2: mxaccess-rpc + mxaccess-callback, live RegisterEngine2 against NmxSvc.exe.
  • M3: mxaccess-nmx + mxaccess-galaxy, live scalar write/subscribe.
  • M4: mxaccess async façade over NMX. End-to-end consumer-grade API.
  • M5: mxaccess-asb + mxaccess-asb-nettcp, ASB transport plugged into the same Session.
  • M6: mxaccess-compat + production hardening (recovery, perf, observability).

The order is chosen so each milestone's exit criterion is independently observable: codec parity (M1), live RPC (M2), live data (M3), consumer API (M4), alternate transport (M5), shipping (M6).

Adjacent tooling (C:\Users\dohertj2\Desktop\wwtools)

A sibling toolkit at C:\Users\dohertj2\Desktop\wwtools collects WW/AVEVA-adjacent CLIs and reference material. Several are load-bearing for this project: they replace credentials we would otherwise inline, and they provide the comparison harnesses that M2–M5 need. See wwtools/CLAUDE.md for the authoritative index.

| Tool | Path | Used by Rust port for |
|---|---|---|
| secrets/ | wwtools/secrets/ | Credential retrieval. Self-hosted Infisical CLI (infisical.exe) + Get-Secret.ps1 PowerShell helper backed by https://infisical.dohertylan.com. Replaces the DPAPI-only path in mxaccess-asb (R9): live probes and CI fetch the ASB shared secret, NTLM credentials, Galaxy DB connection string, etc. via secret <KEY> instead of inlining plaintext. The AsbCredentials::shared_secret(&[u8]) constructor pairs with this; wire it via secret ASB_SHARED_SECRET in probe scripts. |
| mxaccesscli/ | wwtools/mxaccesscli/ | Parity harness. .NET Framework 4.8 / x86 CLI built on LMXProxyServerClass, i.e. the original 32-bit MxAccess COM proxy. Use as a third comparison point for cross-implementation parity (alongside src/MxNativeClient.Probe). Read/write/subscribe semantics here are the proven ground truth for what consumers expect from the Rust port's compat shim. |
| graccesscli/ | wwtools/graccesscli/ | Galaxy configuration setup. .NET Framework 4.8 / x86 CLI over GRAccess COM. Use to provision test objects/attributes for live probes (M3+) without manual IDE clicks: scriptable galaxy setup for CI and reproducible test fixtures. |
| grdb/ | wwtools/grdb/ | Galaxy SQL schema reference. Cross-check mxaccess-galaxy's tiberius queries against the documented schema, hierarchy queries, and contained-name ↔ tag-name translation rules. M3 schema correctness is verified here before M3 lands. |
| aalogcli/ | wwtools/aalogcli/ | Debugging. Reads System Platform .aaLGX binary logs. Use to correlate Rust-port runtime errors with what NmxSvc.exe / LMX adapters log on the System Platform side. |
| histdb/ | wwtools/histdb/ | Out of scope for V1 but documented here so the Rust port doesn't accidentally re-implement Historian retrieval in mxaccess. The tag data plane (NMX/ASB) and the historical-data plane (INSQL, wwXxx extensions) are distinct subsystems. |
| aot/ | wwtools/aot/ | Reference material (ArchestrA Object Toolkit dev guide, API reference). Background only: the Rust port does not consume AOT primitives directly; the wire shapes are observed end-to-end in captures/. |

Operational note: wwtools/secrets/secret <KEY> is the canonical credential-fetch path on this workstation. The Rust port's live-feature integration tests should source MX_GALAXY_DB, MX_NMX_HOST, MX_ASB_SHARED_SECRET, etc. via secret invocations in the test setup script, not via inline plaintext or .env files committed to the repo. This supersedes the "inline credentials are fine for the maintainer's workstation" stance implied by the M2/M3 live-probe DoDs in 60-roadmap.md.
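
One way the test side of this could look, assuming the setup script exports the fetched values as environment variables before `cargo test` runs (the note above does not say how the values are handed over, so that part is an assumption). `LiveConfig`, `load_with`, and `load_from_env` are hypothetical names for the sketch.

```rust
use std::env;

/// Settings the live tests need; field names follow the variables listed
/// above. `Debug` is deliberately not derived so the secret cannot end up
/// in a log line by accident.
pub struct LiveConfig {
    pub galaxy_db: String,
    pub nmx_host: String,
    pub asb_shared_secret: String,
}

/// Build the config from a lookup function so the logic can be tested
/// without touching the real process environment.
pub fn load_with<F>(lookup: F) -> Result<LiveConfig, String>
where
    F: Fn(&str) -> Option<String>,
{
    // A missing or empty value is an error; there is no plaintext fallback.
    let required = |key: &str| {
        lookup(key)
            .filter(|v| !v.is_empty())
            .ok_or_else(|| format!("{} is not set; run the setup script first", key))
    };
    Ok(LiveConfig {
        galaxy_db: required("MX_GALAXY_DB")?,
        nmx_host: required("MX_NMX_HOST")?,
        asb_shared_secret: required("MX_ASB_SHARED_SECRET")?,
    })
}

/// Entry point for live tests: read from the process environment.
pub fn load_from_env() -> Result<LiveConfig, String> {
    load_with(|key| env::var(key).ok())
}
```

Failing fast on a missing variable keeps a misconfigured workstation from silently skipping the live probes.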