Layout:
- src/ .NET 10 x64 reference: MxNativeCodec, MxNativeClient,
MxAsbClient, probes, tests, harnesses. Executable spec.
- design/ Architectural plan for the Rust port (M0–M6), error
model, protocol invariants, risks (R1–R16), adversarial
review log (review.md).
- rust/ Rust workspace. M0 skeleton + M1 codec parity.
mxaccess-codec: 215 unit tests + 2 cross-implementation
parity tests (byte-identical against .NET reference).
Other crates are M0 stubs awaiting M2+.
- captures/ Frida + netsh + pcap evidence per CLAUDE.md
("captures are evidence, not throwaway logs").
- analysis/ Decompiled C# (frida/proxy/decompiled-*),
Ghidra exports for native DLLs (`exports/` only —
working state at `projects/` and AVEVA's input
binaries at `input/` are gitignored).
- docs/ Reverse-engineering reference docs.
- tools/ Setup-LiveProbeEnv.ps1 (Infisical credential fetcher),
Compute-Crc.ps1 (.NET parity helper).
- .github/workflows/ Rust CI: fmt + build + test + clippy on Windows.
- LICENSE MIT (Joseph Doherty, 2026).
Verified:
- cargo test --workspace → 217 passed (215 unit + 2 .NET parity), 0 failed
- cargo clippy --workspace -- -D warnings → clean
- cargo fmt --all -- --check → clean
- cargo publish --dry-run -p mxaccess-codec → packages cleanly
Excluded from history (see .gitignore):
- **/bin, **/obj, **/target — build artifacts
- analysis/ghidra/projects/ — Ghidra working state (regenerable)
- analysis/ghidra/input/ — AVEVA proprietary DLLs (vendor IP)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Roadmap

The Rust port is staged so each milestone is independently usable: a milestone delivers either a useful crate, a measurable test improvement, or a concrete API surface. Earlier milestones never depend on later ones.

## Phasing

### M0 — Workspace skeleton

- Create `rust/` workspace per `30-crate-topology.md`.
- Pin Rust toolchain via `rust-toolchain.toml`.
- CI: `cargo build --workspace`, `cargo test --workspace`, `cargo clippy --workspace -- -D warnings`, `cargo fmt --check`.
- Test infrastructure: `tests/fixtures/` populated from `captures/` by plain copy. Junctions are Windows-only and don't survive `git clone` on Linux/macOS, and symlinks need Developer Mode on Windows, so a copy is the only cross-platform option. The matching line in `30-crate-topology.md:29` says the same thing; flag any drift to the user.
- `mxaccess-codec` exposes the type stubs (empty bodies returning `unimplemented!()`) so downstream crates compile.
- **Define the `Transport` trait (and a placeholder `Session` shape it returns) in M0** as empty/stub signatures in `mxaccess` so M5 can build against the trait without waiting for M4's NMX implementation. This is what allows M5 to run in parallel with M3/M4; see "Sequencing dependencies" below. M4 fills in the concrete `NmxTransport` impl + recovery policy + `Stream<Item = DataChange>` plumbing; M5 fills in `AsbTransport` against the same trait.
- Update `CLAUDE.md` "Common commands" with cargo invocations.

**Definition of done:** `cargo build --workspace` succeeds; CI is green on a clean commit; the publish-check on the empty crates (`cargo publish --dry-run -p mxaccess-codec`) passes. **Dependency on Q2 (license).** `cargo publish --dry-run` requires a resolved `license` field in `Cargo.toml`; until Q2 in `70-risks-and-open-questions.md` is settled, the publish-check is downgraded to `cargo build --release -p mxaccess-codec`. M0 cannot complete cleanly with the publish-check until Q2 is resolved.

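The M0 trait split described above can be sketched as follows. This is a minimal, synchronous illustration only: the real trait would presumably be async, and every name here other than `Transport`, `Session`, `NmxTransport` and `AsbTransport` (notably `TransportError`, `connect`, `name` and `open`) is a hypothetical placeholder, not the project's API.

```rust
/// Hypothetical error type; the real error model is defined in the design docs.
#[derive(Debug, PartialEq, Eq)]
pub enum TransportError {
    /// Returned by M0 stubs until the owning milestone fills them in.
    NotImplemented(&'static str),
}

/// Placeholder for the session shape a transport hands back.
#[derive(Debug, Default)]
pub struct Session {
    pub transport_name: &'static str,
}

/// M0-level trait: signatures only, so M4 and M5 can implement it independently.
pub trait Transport {
    fn name(&self) -> &'static str;
    fn connect(&mut self) -> Result<Session, TransportError>;
}

/// Stub filled in by M4.
pub struct NmxTransport;
/// Stub filled in by M5.
pub struct AsbTransport;

impl Transport for NmxTransport {
    fn name(&self) -> &'static str {
        "nmx"
    }
    fn connect(&mut self) -> Result<Session, TransportError> {
        Err(TransportError::NotImplemented("NmxTransport::connect (M4)"))
    }
}

impl Transport for AsbTransport {
    fn name(&self) -> &'static str {
        "asb"
    }
    fn connect(&mut self) -> Result<Session, TransportError> {
        Err(TransportError::NotImplemented("AsbTransport::connect (M5)"))
    }
}

/// Code written against the trait compiles before either transport exists.
pub fn open(transport: &mut dyn Transport) -> Result<Session, TransportError> {
    transport.connect()
}
```

Because `open` only sees `dyn Transport`, façade code can be written and type-checked in M0 while both concrete transports still return `NotImplemented`.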
### M1 — Codec parity

Implement every codec type from `src/MxNativeCodec/`:

- `MxReferenceHandle` (CRC-16/IBM, 20-byte layout)
- `NmxTransferEnvelope` + `NmxTransferEnvelopeTemplate`
- `NmxItemControlMessage` (advise / supervisory / unadvise)
- `NmxWriteMessage` (scalar + array, normal + timestamped)
- `NmxSecuredWrite2Message`
- `NmxSubscriptionMessage` (DataUpdate, SubscriptionStatus)
- `NmxReferenceRegistrationMessage` + Result
- `NmxMetadataQueryMessage`
- `NmxOperationStatusMessage`
- `ObservedWriteBodyTemplate`
- ASB Variant + AsbStatus + RuntimeValue
- `MxStatus`, `MxValueKind`, `MxDataType`, `MxValue`

**Definition of done:** every Frida-captured write/advise/subscribe body that the .NET reference encodes today round-trips byte-identical through `mxaccess-codec`, i.e. the proven matrix in `work_remain.md` (scalar/array writes, advise/unadvise, single-record `0x33` DataUpdate, single-record SubscriptionStatus, the 5-byte `00 00 50 80 00` write-complete frame, and the 1-byte completion frames `0x00`/`0x41`/`0xEF` preserved verbatim). Cross-validated against `src/MxNativeCodec.Tests/` outputs (a fixture runner shells out to `dotnet run --project src\MxNativeCodec.Tests` and asserts the same bytes are produced for shared inputs).

Captures whose native behaviour the .NET reference does not yet decode are explicitly out of scope for M1 and are tracked elsewhere:

- `captures/077`, `captures/079-082`, `captures/094`: buffered batch payloads (`work_remain.md:176–181`); deferred to M6 + R2.
- `captures/036`: Activate/Suspend trigger conditions; deferred to R5.
- Single-token `WriteSecured` (returns `0x80004021` before sending the body); deferred to R6.

These captures are still loaded as fixtures so their headers/envelopes round-trip, but the inner unproven payloads are preserved as opaque bytes rather than asserted against a typed decode.

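For orientation on the `MxReferenceHandle` checksum, here is a bitwise CRC-16 sketch. It assumes "CRC-16/IBM" means the common CRC-16/ARC parameterization (reflected polynomial `0xA001`, initial value `0x0000`, no final XOR). The source does not confirm the parameters, so treat `src/MxNativeCodec/` and the parity tests as the authority. The function name `crc16_arc` is illustrative.

```rust
/// Bitwise CRC-16/ARC: polynomial 0x8005 in reflected form (0xA001),
/// initial value 0x0000, no final XOR. Assumed, not confirmed, to be
/// the variant the codec calls "CRC-16/IBM".
pub fn crc16_arc(data: &[u8]) -> u16 {
    let mut crc: u16 = 0x0000;
    for &byte in data {
        crc ^= byte as u16;
        for _ in 0..8 {
            // Shift right one bit; fold in the polynomial when a 1 falls out.
            crc = if crc & 1 != 0 {
                (crc >> 1) ^ 0xA001
            } else {
                crc >> 1
            };
        }
    }
    crc
}
```

The standard check input `b"123456789"` yields `0xBB3D` under this parameterization, which is a quick way to tell it apart from other CRC-16 variants before comparing against `tools/Compute-Crc.ps1`.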
### M2 — DCE/RPC + NTLM + OBJREF + OXID + callback exporter

- `mxaccess-rpc`: NTLMv2 client context, DCE/RPC PDU codec, TCP transport, OBJREF parser, OXID resolution, IRemUnknown::RemQueryInterface.
- `mxaccess-callback`: callback exporter (RPC server with `INmxSvcCallback` + `IRemUnknown`).
- Live probe: connect to local `NmxSvc.exe`, execute `RegisterEngine2` + `GetPartnerVersion` round-trip, register a callback OBJREF, observe a status frame.

**Definition of done:** all three of the following must hold against a running AVEVA install. The `partnerVersion`-only check is insufficient on its own: it does not exercise the callback exporter, which the .NET evidence shows is the hardest part of M2, since `MxNativeClient` had to hand-roll `INmxSvcCallback`/`IRemUnknown`.

1. `cargo run --example connect-nmx` issues `RegisterEngine2` and observes `partnerVersion == 6` in the response (cite `docs/DotNet10-Native-Library-Plan.md:64-73` for the expected value).
2. The Rust callback exporter accepts an inbound `IRemUnknown::RemQueryInterface` from `NmxSvc.exe` and returns the negotiated `INmxSvcCallback` interface pointer, i.e. the server-side handshake against our exported OBJREF completes without an `IRemUnknown` reject.
3. At least one `INmxSvcCallback::StatusReceived` frame is observed end-to-end through the Rust callback exporter (raw frame bytes captured to a fixture under `tests/fixtures/m2-status-frame/`).

NTLMv2 packet-integrity matches the .NET reference's `MakeSignature` outputs on a fixed challenge fixture (i.e. byte-equivalent signature for fixed input).

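As a rough illustration of the "DCE/RPC PDU codec" item, the sketch below parses the 16-byte connection-oriented common header defined by the DCE 1.1 RPC specification. The type and function names (`PduHeader`, `parse_header`) are hypothetical and not taken from `mxaccess-rpc`; authentication trailers and PDU bodies are not handled.

```rust
/// Fields of the connection-oriented DCE/RPC common header (16 bytes).
#[derive(Debug, PartialEq, Eq)]
pub struct PduHeader {
    pub version: (u8, u8), // rpc_vers, rpc_vers_minor
    pub ptype: u8,         // e.g. 0 = request, 2 = response, 11 = bind, 12 = bind_ack
    pub flags: u8,         // pfc_flags
    pub frag_length: u16,  // total fragment length, header included
    pub auth_length: u16,
    pub call_id: u32,
}

/// Parses the common header, honouring the integer byte order declared in
/// the data representation field. Returns None on short input or when the
/// major version is not 5.
pub fn parse_header(buf: &[u8]) -> Option<PduHeader> {
    if buf.len() < 16 || buf[0] != 5 {
        return None;
    }
    // packed_drep[0]: high nibble is the integer format, 1 = little-endian.
    let little_endian = (buf[4] & 0xF0) == 0x10;
    let u16_at = |i: usize| {
        let raw = [buf[i], buf[i + 1]];
        if little_endian { u16::from_le_bytes(raw) } else { u16::from_be_bytes(raw) }
    };
    let u32_at = |i: usize| {
        let raw = [buf[i], buf[i + 1], buf[i + 2], buf[i + 3]];
        if little_endian { u32::from_le_bytes(raw) } else { u32::from_be_bytes(raw) }
    };
    Some(PduHeader {
        version: (buf[0], buf[1]),
        ptype: buf[2],
        flags: buf[3],
        frag_length: u16_at(8),
        auth_length: u16_at(10),
        call_id: u32_at(12),
    })
}
```

A transport built on this would read 16 bytes, parse the header, then read `frag_length - 16` more bytes before handing the fragment to the PDU-specific decoder.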
### M3 — NMX session + Galaxy resolver

- `mxaccess-galaxy`: `tiberius`-based tag resolver, user resolver. Replicates the recursive CTE from `GalaxyRepositoryTagResolver.cs:209–293`.
- `mxaccess-nmx`: `NmxClient` with `register_engine_2`, `transfer_data`, `add_subscriber_engine`, `set_heartbeat_send_interval`. Builds `MxReferenceHandle` from resolver output + `MxReferenceHandle::compute_name_signature`.
- Live probes: write `TestChildObject.TestInt = 123`, subscribe, receive callback.

**Definition of done:** scalar write + scalar subscribe round-trip live, identical bytes to `captures/022-frida-write-test-int-sequence-106-108` and `captures/058-frida-subscribe-testint` (verified against the `captures/` directory listing; `077-frida-suspend-advised-scanstate` is a *suspend* capture and belongs to R5, not M3). Re-implementations of `dotnet run --project src\MxNativeClient.Probe -- --probe-session-write` and `--probe-session-subscribe` succeed when invoked as `cargo run --example session-write` / `--example session-subscribe`.

### M4 — Async Tokio façade (NMX path)

- `mxaccess::Session` over `NmxTransport`.
- `mxaccess::Subscription` as `Stream<Item = Result<DataChange, Error>>`.
- `Session::write`, `write_with_completion`, `write_with_timestamp`, `write_secured`, `write_secured_at`, `read`, `subscribe`, `subscribe_many`.
- Recovery policy + recovery events (mirroring `MxNativeSession.RecoveryAttempt*` events).
- `tracing` instrumentation throughout.
- Examples: `connect-write-read.rs`, `subscribe.rs`, `recovery.rs`, `multi-tag.rs`.

**Definition of done:** the public API is end-to-end usable without referencing `mxaccess-codec` directly. A consumer can write 30 lines of `tokio::main` code and get live data. `cargo doc --workspace --open` produces useful API docs. The `examples/` programs all exit `0` against a live AVEVA install.

**Parity test fixtures** (verified against `wwtools/mxaccesscli/`):

- A bare-array reference (e.g. `Obj.Arr` without brackets) returns `MxStatus { category: CommunicationError, detail: 1003 }`. Source: `wwtools/mxaccesscli/docs/usage.md:215,299`. Add as an `mxaccess` integration test that subscribes to a known bare-array reference and asserts the exact `(category, detail)` tuple.
- Read-as-subscribe parity: a `read(tag, timeout)` against a tag that never publishes returns `Error::Timeout(_)`, with no leaked advise on the server side (verified by issuing a subsequent `subscribe` and confirming no stale-handle error). Source: `wwtools/mxaccesscli/docs/usage.md:24` and `wwtools/mxaccesscli/src/MxAccess.Cli/Commands/ReadCommand.cs:14-78`.
- Verified Write parity: `write_secured(tag, value, current_user_id, verifier_user_id)` with `current_user_id == verifier_user_id` (single-user path) and with two distinct ids (two-person verification path) both succeed against a tag whose security classification permits it. Source: `wwtools/mxaccesscli/src/MxAccess.Cli/Commands/WriteCommand.cs:151-155,196-199`.

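The read-as-subscribe fixture above boils down to "wait for the first update or time out". The sketch below models that with a standard-library channel instead of the Tokio `Stream` the façade is planned to expose, so it only illustrates the control flow. `ReadError` and `read_first` are hypothetical names; in the real API the timeout case would surface as `Error::Timeout(_)`.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical stand-in for the error variants a read can produce.
#[derive(Debug, PartialEq)]
pub enum ReadError {
    /// No update arrived within the allotted time.
    Timeout(Duration),
    /// The producing side went away before publishing anything.
    Closed,
}

/// Waits for the first update on a subscription-like channel.
/// The caller drops the receiver afterwards, which in the real client
/// would correspond to issuing the unadvise so nothing leaks.
pub fn read_first<T>(updates: &Receiver<T>, timeout: Duration) -> Result<T, ReadError> {
    match updates.recv_timeout(timeout) {
        Ok(value) => Ok(value),
        Err(RecvTimeoutError::Timeout) => Err(ReadError::Timeout(timeout)),
        Err(RecvTimeoutError::Disconnected) => Err(ReadError::Closed),
    }
}
```

Keeping the timeout and closed cases distinct matters for the parity test: a tag that never publishes must report a timeout, not a transport failure.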
### M5 — ASB transport

- `mxaccess-asb-nettcp`: [MC-NMF] net.tcp framing + [MC-NBFX]/[MC-NBFS] binary message encoding (the default `NetTcpBinding` encoder, *not* SOAP/XML; see `src/MxAsbClient/MxAsbDataClient.cs:660-685`) + WCF custom-binary inside ASBIData base64.
- `mxaccess-asb`: `AsbClient` with Connect, RegisterItems, Read, Write, CreateSubscription, AddMonitoredItems, Publish, Disconnect.
- `mxaccess::Session` over `AsbTransport`; capabilities reflect ASB limits (no `subscribe_buffered`, no `Activate`/`Suspend`, no `OperationComplete` outside the proven write-completion frame).
- DPAPI shared-secret read on Windows; explicit `AsbCredentials::shared_secret(&[u8])` constructor as escape hatch.

**Definition of done:** `cargo run --example asb-subscribe -- --tag TestChildObject.TestInt` succeeds against a live ASB endpoint. Round-trip parity with `dotnet run --project src\MxAsbClient.Probe`. Type matrix in `mxaccess-asb` covers what `work_remain.md:108–113` documents as proven: scalar Boolean, Int32, Float, Double, String, DateTime, Duration, plus "deployed array tags" (the array shapes actually exercised against the live VM, not all eight scalar arrays). Less-common ASB types and the unexercised scalar array shapes are deferred and added only as needed by real deployed tags, per `work_remain.md:110`.

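One small building block of the net.tcp framing layer is the variable-length size prefix on sized records. The sketch below assumes the encoding is seven payload bits per byte, least-significant group first, with the high bit as a continuation flag and at most five bytes. Verify that against the [MC-NMF] specification and the .NET reference before relying on it. `encode_size` and `decode_size` are illustrative names.

```rust
/// Appends `n` as a variable-length size: 7 bits per byte, low group first,
/// high bit set on every byte except the last.
pub fn encode_size(mut n: u32, out: &mut Vec<u8>) {
    loop {
        let byte = (n & 0x7F) as u8;
        n >>= 7;
        if n == 0 {
            out.push(byte);
            break;
        }
        out.push(byte | 0x80);
    }
}

/// Decodes a size from the front of `buf`.
/// Returns the value and the number of bytes consumed, or None when the
/// input is truncated, longer than five bytes, or overflows 32 bits.
pub fn decode_size(buf: &[u8]) -> Option<(u32, usize)> {
    let mut value: u32 = 0;
    for (i, &byte) in buf.iter().enumerate().take(5) {
        let bits = (byte & 0x7F) as u32;
        // The fifth byte may only contribute the top four bits of a u32.
        if i == 4 && bits > 0x0F {
            return None;
        }
        value |= bits << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```

Returning the consumed length lets the frame reader advance past the prefix and slice out exactly the envelope payload that follows.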
### M6 — Compatibility shim + production hardening

- `mxaccess-compat`: `LMXProxyServer`-shaped methods on top of `Session`.
- `subscribe_buffered` (NMX feature), guarded by `BufferedOptions`; no synthesis if provider returns single-sample batches.
- Performance pass: zero-copy frame parsing where possible (`bytes::Bytes`), pre-allocated `BytesMut` per session, codec allocation count benchmarked.
- Optional `metrics` feature emitting counters / histograms.
- Docs: `cargo doc` published; `cargo public-api` baseline established.
- Release: `cargo publish` all crates.

**Definition of done:** the codec hits the per-write allocation target from R12 (< 5 allocations per write at steady state, measured via `cargo bench` with a counting allocator); live subscribe under churn does not allocate per-message; `cargo public-api` produces a stable surface that clears review.

`cargo bench` latency numbers are reported but **not gating**; this matches the V1 non-goal "we measure but don't gate beyond M6's loose acceptance bar" below. There is no .NET microbench harness to compare against (the .NET reference ships probes and assertion-style runners, not benchmarks); building one is out of scope for V1. If a future milestone adds a comparison harness, document it here and only then can a "comparable to .NET" clause be added without contradicting the non-goal.

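The "counting allocator" in the definition of done could look roughly like the sketch below: a `GlobalAlloc` wrapper around the system allocator that bumps an atomic counter. `CountingAlloc`, `ALLOCS` and `allocations_during` are hypothetical names. The counter is process-wide, so a real benchmark would need to keep other threads quiet while measuring.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Forwards to the system allocator and counts every allocation call.
pub struct CountingAlloc;

static ALLOCS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and reports how many allocations happened while it ran.
/// Counts allocations from all threads, not just the caller.
pub fn allocations_during<R>(f: impl FnOnce() -> R) -> (R, usize) {
    let before = ALLOCS.load(Ordering::Relaxed);
    let result = f();
    let after = ALLOCS.load(Ordering::Relaxed);
    (result, after - before)
}
```

A bench for the R12 target would wrap one encode call in `allocations_during` after a warm-up pass and compare the count against the threshold of 5.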
## Validation strategy

Three lines of defense against regression.

### 1. Round-trip fixtures

Every byte sequence in `captures/0NN-frida-*` and `analysis/frida/*.tsv` is a fixture. Test cases load the bytes, decode them, re-encode, and assert equality. New scenarios add new fixtures and never modify old ones. Fixtures live under `crates/mxaccess-codec/tests/fixtures/` (copied from `captures/`, per M0).

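The decode, re-encode, assert-equal loop can be sketched with a toy frame type. `CompletionFrame` below is a hypothetical stand-in, not a type from `mxaccess-codec`. It carries the completion frames verbatim, without interpreting them, which is the behaviour the M1 definition of done asks for.

```rust
/// Toy model of completion frames that are preserved verbatim.
#[derive(Debug, PartialEq, Clone)]
pub enum CompletionFrame {
    /// One-byte completion frame, e.g. 0x00, 0x41 or 0xEF.
    OneByte(u8),
    /// Five-byte frame such as the write-complete `00 00 50 80 00`.
    FiveByte([u8; 5]),
    /// Anything else is carried as opaque bytes.
    Opaque(Vec<u8>),
}

impl CompletionFrame {
    /// Classifies by length only; no byte is given a meaning.
    pub fn decode(bytes: &[u8]) -> Self {
        match bytes {
            [b] => Self::OneByte(*b),
            [a, b, c, d, e] => Self::FiveByte([*a, *b, *c, *d, *e]),
            other => Self::Opaque(other.to_vec()),
        }
    }

    /// Emits exactly the bytes that were decoded.
    pub fn encode(&self) -> Vec<u8> {
        match self {
            Self::OneByte(b) => vec![*b],
            Self::FiveByte(bytes) => bytes.to_vec(),
            Self::Opaque(bytes) => bytes.clone(),
        }
    }
}

/// The fixture assertion: decoding then encoding reproduces the input.
pub fn round_trips(bytes: &[u8]) -> bool {
    CompletionFrame::decode(bytes).encode() == bytes
}
```

A fixture test would iterate over every file under the fixtures directory and assert `round_trips` on its contents, so adding a capture automatically adds a test case.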
### 2. Live probes

Per-milestone live probes mirror the .NET probes (`MxNativeClient.Probe`, `MxAsbClient.Probe`). They run only when the `MX_LIVE` env var is set, match the .NET command-line surface, and print the same artifacts. The Rust examples are the canonical live tests.

**CI lane status (V1).** Live probes require a Windows runner with AVEVA System Platform installed and a populated Galaxy DB. AVEVA System Platform is a licensed Windows-only install with a SQL Galaxy attached, so a hosted CI runner cannot be spun up from a public image. **V1 ships without a hosted CI lane for live probes**; they run only on the maintainer's workstation, gated by `MX_LIVE=1`. PRs that touch the live-probe surface (anything under `crates/*/examples/` invoked when `MX_LIVE` is set, plus `mxaccess-galaxy` and the NMX/ASB transport crates' integration tests) require a screenshot or capture log from a successful local run attached to the PR. Hosted CI for milestones M2/M3/M4/M5 covers `cargo build`, `cargo test --workspace` (non-live), and `cargo clippy` only. Building a hosted live-probe lane (a pinned AVEVA VM image + Galaxy seed snapshot, owned by the project) is a stretch goal post-V1; it is not a V1 deliverable.

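A minimal sketch of the `MX_LIVE` gate follows. The text only says the variable must be set (with `MX_LIVE=1` as the example), so treating an empty value or `0` as "off" is an assumption of this sketch, as are the function names.

```rust
use std::env;

/// Decides whether live probes should run, given the raw variable value.
/// Assumption: unset, empty, and "0" all mean disabled.
pub fn live_enabled(value: Option<&str>) -> bool {
    matches!(value, Some(v) if !v.is_empty() && v != "0")
}

/// Reads `MX_LIVE` from the process environment.
pub fn live_enabled_from_env() -> bool {
    live_enabled(env::var("MX_LIVE").ok().as_deref())
}
```

Splitting the pure decision from the environment read keeps the gate itself unit-testable in the hosted (non-live) CI lane; a live test would start with an early return when `live_enabled_from_env()` is false.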
### 3. Cross-implementation parity

For each milestone with a `dotnet run` equivalent, the same operation runs through both:

- the .NET reference (`dotnet run --project src\MxNativeClient.Probe -- --probe-session-write ...`)
- the Rust port (`cargo run --example session-write -- ...`)

Wireshark / Frida captures of both runs are diffed; any byte-level divergence is a regression.

**Caveat: parity is not correctness.** Byte-parity tests confirm the Rust port matches the .NET reference; they do **not** confirm correctness in the absolute sense. Specifically, the completion-only frame mappings (`0x00`, `0x41`, `0xEF`, plus `MXSTATUS_PROXY[]` conversion; see `work_remain.md:170-174`) are unmapped in both implementations; both preserve them verbatim. If the .NET reference is wrong about one of these bytes, the Rust port will also be wrong, and the parity test will still pass green. R3 and R4 in `70-risks-and-open-questions.md` track this. These frames are marked "preserved verbatim" rather than "verified correct" in the milestone DoDs that touch them.

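The byte-level diff can be as small as the helper below, which reports the offset of the first differing byte between two captured payloads. `first_divergence` is an illustrative name; extracting the payloads from Wireshark or Frida output is outside this sketch.

```rust
/// Returns the offset of the first byte where `a` and `b` differ,
/// or None when the two payloads are identical.
/// If one payload is a strict prefix of the other, the divergence is
/// reported at the end of the shorter one.
pub fn first_divergence(a: &[u8], b: &[u8]) -> Option<usize> {
    match a.iter().zip(b).position(|(x, y)| x != y) {
        Some(offset) => Some(offset),
        None if a.len() != b.len() => Some(a.len().min(b.len())),
        None => None,
    }
}
```

Reporting an offset rather than a plain boolean makes a failed parity run actionable: the offset can be mapped back to a field in the envelope or body layout.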
## Build & test commands

To be added to `CLAUDE.md` "Common commands" once `rust/` exists:

```powershell
# Workspace-wide
cargo build --workspace
cargo test --workspace
cargo clippy --workspace -- -D warnings
cargo fmt --check

# Single crate
cargo test -p mxaccess-codec

# Live probes (require AVEVA + Galaxy DB). Credentials come from Infisical via
# wwtools/secrets/Get-Secret.ps1; never inline plaintext. The setup script
# fetches the WW_VM_ADMIN_* secrets and (when present) the AVEVA-specific
# ASB_SHARED_SECRET, then exports MX_LIVE, MX_NMX_HOST, MX_GALAXY_DB,
# MX_TEST_USER, MX_TEST_DOMAIN, MX_TEST_PASSWORD, MX_ASB_SHARED_SECRET.
. .\tools\Setup-LiveProbeEnv.ps1   # dot-source so env vars persist
cargo test -p mxaccess --features live -- --ignored

# CI / dev fallback when ASB shared secret is not yet in Infisical:
. .\tools\Setup-LiveProbeEnv.ps1 -SkipAsbSecret
# ...then construct AsbCredentials::shared_secret(&[u8]) explicitly in the test.

# Examples
cargo run --example connect-write-read
cargo run --example subscribe -- --tag TestChildObject.TestInt
cargo run --example asb-subscribe -- --tag TestChildObject.TestInt

# Benchmarks (M6)
cargo bench -p mxaccess-codec
```

## Sequencing dependencies

| Milestone | Depends on | Blocks |
|---|---|---|
| M0 | nothing | M1 |
| M1 | M0 | M2, M3, M5 |
| M2 | M0, M1 | M3 |
| M3 | M1, M2 | M4 |
| M4 | M3 | M6 (NMX) |
| M5 | M0, M1 (codec only; ASB does not need M2/M3) | M6 (ASB) |
| M6 | M4 (NMX consumers) or M5 (ASB consumers) | release |

M5 can be developed in parallel with M3/M4 because ASB does not depend on DCE/RPC **and** because the `Transport` trait (plus the placeholder `Session` shape it returns) is defined in M0, not M4. M0 publishes the empty/stub trait; M4 fills in the concrete NMX-side `Session` recovery policy, `Stream<Item = DataChange>`, and `NmxTransport` impl; M5 fills in `AsbTransport` against the same M0 trait. Without that M0-level trait split, M5 would block on M4 (since the Session/Transport types live in `mxaccess`, which sits below both transports per `30-crate-topology.md`). The two transport paths converge into the same `Session` at M4 (NMX) and M5 (ASB).

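The parallelism claim can be checked mechanically from the table. The sketch below encodes the "Depends on" column and computes transitive dependencies. It treats M6's "M4 or M5" as depending on both, which is a simplification of the table, and the names `DEPS`, `direct` and `transitive_deps` are illustrative.

```rust
use std::collections::BTreeSet;

/// "Depends on" column of the sequencing table.
/// Simplification: M6's "M4 or M5" is modelled as both.
const DEPS: &[(&str, &[&str])] = &[
    ("M0", &[]),
    ("M1", &["M0"]),
    ("M2", &["M0", "M1"]),
    ("M3", &["M1", "M2"]),
    ("M4", &["M3"]),
    ("M5", &["M0", "M1"]),
    ("M6", &["M4", "M5"]),
];

/// Direct dependencies of one milestone; unknown names have none.
fn direct(milestone: &str) -> &'static [&'static str] {
    DEPS.iter()
        .find(|(name, _)| *name == milestone)
        .map(|(_, deps)| *deps)
        .unwrap_or(&[])
}

/// Everything a milestone depends on, directly or indirectly.
pub fn transitive_deps(milestone: &str) -> BTreeSet<&'static str> {
    let mut seen = BTreeSet::new();
    let mut stack: Vec<&'static str> = direct(milestone).to_vec();
    while let Some(next) = stack.pop() {
        // Only expand milestones that have not been visited yet.
        if seen.insert(next) {
            stack.extend_from_slice(direct(next));
        }
    }
    seen
}
```

With this encoding, the transitive dependencies of M5 are M0 and M1 only, so nothing in M2 through M4 sits on its critical path.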
## What this roadmap deliberately does not include (V1)

- `cargo bench` numbers as gating criteria. We measure but don't gate beyond M6's loose acceptance bar.
- Drop-in COM interop. `mxaccess-compat` wraps the API shape, not the COM ABI. A `mxaccess-compat-com` crate is post-V1.
- Multi-runtime support (smol, async-std). Tokio only.
- 32-bit Windows. x64 only by design.
- Linux-first deployment. Linux is a stretch goal sitting behind feature flags; see `70-risks-and-open-questions.md` Q3.
- Full `OperationComplete` parity. Bound to whether the .NET reference can prove the trigger conditions; see R3/R4 in `70-risks-and-open-questions.md`.