docs: F3 cross-domain NTLM provisioning recipe

Self-contained doc at docs/F3-cross-domain-ntlm-recipe.md for whoever picks F3 up on hardware with two AD forests + a forest trust. Covers:

- Lab topology (LAB-A resource forest with AVEVA install + LAB-B account forest with the probe user, bidirectional forest trust).
- DC + DNS + trust + user provisioning steps (Install-ADDSForest, Add-DnsServerConditionalForwarderZone, New-ADTrust, New-ADUser).
- Capture procedure for both the Rust and .NET probes under a `runas /netonly` cross-domain token, with Wireshark NTLMSSP guidance.
- Fixture layout under crates/mxaccess-rpc/tests/fixtures/cross-domain-ntlm/.
- Round-trip test skeleton (replay the captured Type 2 → regenerate Type 3 → assert byte-equality against the captured Type 3).
- Redaction checklist for the captured bytes.
- Why F3 is "evidence work" not "codec work": the AV pair parser is shape-agnostic, so the codec path is already correct; the fixture is a regression net for any future drift.

F3 entry in design/followups.md and R8 in design/70-risks-and-open-questions.md both now point at the recipe so a future contributor doesn't have to reconstruct the lab topology from the followup analysis alone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 02:40:06 -04:00
parent 73e2bd8771
commit ddebab2c2d
3 changed files with 337 additions and 2 deletions
@@ -202,7 +202,7 @@ Captured traffic is single-domain (local AVEVA install). Cross-domain NTLM exerc

 **Current best answer:** the AV pair parser handles the cross-domain shape per [MS-NLMP] §2.2.2.1; document `mxaccess-rpc` as untested across domains in the README. The `mxaccess-rpc::ntlm` round-trip tests cover the single-domain shape; cross-domain round-trips through the same code path (the AV pair parser is shape-agnostic), but no live fixture pins it.

-**Reopen when:** a multi-domain AVEVA test harness becomes available + a cross-domain probe runs successfully end-to-end with packet-integrity signatures verified. Until then, this risk is permanently deferred — same status pattern as F3.
+**Reopen when:** a multi-domain AVEVA test harness becomes available + a cross-domain probe runs successfully end-to-end with packet-integrity signatures verified. Until then, this risk is permanently deferred — same status pattern as F3. Self-contained provisioning recipe (lab topology, DC/DNS/trust setup, capture procedure, fixture layout, round-trip test skeleton) at `docs/F3-cross-domain-ntlm-recipe.md`.

 ### R9 — DPAPI dependency for ASB

@@ -218,7 +218,7 @@ This makes Path A the architecturally correct fix: the callback exporter must be
 **Severity:** P2
 **Status:** Permanently out-of-scope on the current dev host (no second AD domain). Resolution requires external infrastructure not available here.
 **Source:** M2 wave 1, `crates/mxaccess-rpc/src/ntlm.rs`. All current NTLM fixtures are single-domain (the local AVEVA install). Tracked separately in `design/70-risks-and-open-questions.md` R8 (P1 risk) and the open-evidence-gaps table.
-**Concrete next step:** Provision a two-domain Windows lab (e.g. `LAB-A` + `LAB-B` with cross-domain trust + an AVEVA install on `LAB-A` that authenticates a user from `LAB-B`). Run `cargo run -p mxaccess --example connect-write-read` from a `LAB-B`-domain user; capture the NTLM Type1 / Type2 / Challenge / Type3 bytes via `examples/asb-relay.rs` or a Wireshark NTLM filter. Save under `crates/mxaccess-rpc/tests/fixtures/cross-domain-ntlm/`. The existing single-domain Type1/2/3 round-trip tests in `mxaccess-rpc::ntlm` then extend to validate the cross-domain shape (TargetInfo AV pairs differ when crossing domains; specifically `MsvAvDnsTreeName` and `MsvAvDnsComputerName` carry the trusted-domain DNS suffix instead of the local one). Clears R8 in the risks doc.
+**Concrete next step:** See the full provisioning recipe at [`docs/F3-cross-domain-ntlm-recipe.md`](../docs/F3-cross-domain-ntlm-recipe.md). It documents the lab topology (two forests + bidirectional forest trust + a `LAB-B\probe.user` authenticating against an AVEVA install on `LAB-A`), the DC + DNS + trust + user provisioning steps, the Wireshark + `connect-write-read` capture procedure, the exact fixture layout under `crates/mxaccess-rpc/tests/fixtures/cross-domain-ntlm/`, the round-trip test skeleton (replay the captured Type 2 bytes → regenerate Type 3 → assert byte-equality), and the redaction checklist. Clears R8 in the risks doc when the fixture lands.



@@ -0,0 +1,335 @@
+# F3 — Cross-domain NTLM Type1/2/3 fixture: provisioning recipe
+
+This is a self-contained recipe for whoever picks F3 up on hardware that has (or can run) **two Active Directory domains with a forest trust**. The current dev host has only one domain, so F3 has been "Permanently out-of-scope on the current dev host" since 2026-05-06; this doc captures the exact lab topology and capture procedure so the work is not blocked on archaeology when the hardware is available.
+
+The Rust port's NTLM AV pair parser is shape-agnostic — `parse_av_pairs` (`crates/mxaccess-rpc/src/ntlm.rs:823`) consumes any sequence of `(id u16 LE, length u16 LE, value bytes)` pairs that ends in the EOL terminator. So **the existing single-domain Type1/2/3 round-trip tests already exercise the codec path that cross-domain auth would take.** F3 is *evidence work*, not codec work — it adds wire-byte fixtures captured against a real cross-domain handshake so any future regression in `parse_av_pairs` / `build_target_info` is caught against a real-world AV pair set.
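
To make "shape-agnostic" concrete, here is a minimal sketch of a walker over the `(id, length, value)` layout described above. It is not the crate's `parse_av_pairs`: the names `AvPair` and `walk_av_pairs` and the error strings are hypothetical, chosen only for illustration. The point it shows is that nothing in the loop switches on the pair id, apart from recognising the `MsvAvEOL` terminator (id 0).

```rust
/// One attribute/value pair from an NTLM TargetInfo block.
/// Hypothetical type for illustration; the crate's own type may differ.
#[derive(Debug, PartialEq)]
struct AvPair {
    id: u16,
    value: Vec<u8>,
}

/// Walk `(id u16 LE, len u16 LE, value)` records until MsvAvEOL (id 0).
/// No branch depends on which non-zero id is seen, so single-domain and
/// cross-domain pair sets take exactly the same path.
fn walk_av_pairs(mut buf: &[u8]) -> Result<Vec<AvPair>, &'static str> {
    let mut pairs = Vec::new();
    loop {
        if buf.len() < 4 {
            // Ran out of bytes before seeing the terminator.
            return Err("truncated AV pair header");
        }
        let id = u16::from_le_bytes([buf[0], buf[1]]);
        let len = u16::from_le_bytes([buf[2], buf[3]]) as usize;
        buf = &buf[4..];
        if id == 0 {
            return Ok(pairs); // MsvAvEOL terminates the list
        }
        if buf.len() < len {
            return Err("truncated AV pair value");
        }
        pairs.push(AvPair { id, value: buf[..len].to_vec() });
        buf = &buf[len..];
    }
}
```

Feeding it a one-pair buffer such as `[0x04, 0x00, 0x02, 0x00, b'a', 0x00, 0x00, 0x00, 0x00, 0x00]` yields a single pair with id 4 and a two-byte value, regardless of what that id means.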
+
+What changes between single-domain and cross-domain on the wire:
+
+- **Type 2 challenge** carries `MsvAvDnsTreeName` (id=`0x0005`) and `MsvAvDnsDomainName` (id=`0x0004`) AV pairs whose UTF-16LE values are the **trusted (resource) domain's** DNS suffix, not the user's home domain.
+- `MsvAvNbDomainName` (id=`0x0002`, the NetBIOS counterpart of the DNS form at id=`0x0004`) and `MsvAvDnsComputerName` (id=`0x0003`) still carry the **resource server's** identity (the AVEVA host).
+- **Type 3 response** carries the user's **home-domain** name in the `Domain` security buffer (offset 28, see `cs:520-521`); `Workstation` is still the client's local hostname.
+- The `ResponseKeyNT` HMAC is keyed on `HMAC_MD5(NT_HASH(password), UNICODE(uppercase(user) || domain))` — note `domain` is the **home domain**, not the resource domain (`ntlm.rs:459-465`).
+
+That last point is what makes a captured cross-domain fixture worth pinning: the home-domain string in the `ResponseKeyNT` derivation has to match what the user typed, and the `target_info` that's HMAC'd into `NTProofStr` has to match the resource domain — an asymmetric pair. Single-domain fixtures cannot exercise that asymmetry.
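
The asymmetry can be seen in the bytes that go into the HMAC. The sketch below builds only the HMAC *message* for `ResponseKeyNT`, i.e. `UNICODE(uppercase(user) || domain)`; the MD4 and HMAC-MD5 steps are left out because they need non-std crates. The function names are hypothetical and are not the crate's API.

```rust
/// UTF-16LE encode a string, as NTLM's UNICODE() does.
fn utf16le(s: &str) -> Vec<u8> {
    s.encode_utf16().flat_map(|u| u.to_le_bytes()).collect()
}

/// Message fed to HMAC-MD5 when deriving ResponseKeyNT:
/// UNICODE(uppercase(user) || domain).
/// Only the user name is upper-cased; the domain is used as supplied,
/// and it is the user's HOME domain (LAB-B), not the resource domain.
fn response_key_nt_identity(user: &str, home_domain: &str) -> Vec<u8> {
    let mut msg = utf16le(&user.to_uppercase());
    msg.extend(utf16le(home_domain));
    msg
}
```

Calling `response_key_nt_identity("probe.user", "LAB-B")` and `response_key_nt_identity("probe.user", "LAB-A")` produces different byte strings, so a port that accidentally substituted the resource domain here would derive a different key and fail the byte-equality check against the captured Type 3.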
+
+---
+
+## Lab topology
+
+Minimum viable two-domain lab. Names are illustrative; substitute throughout.
+
+```
+                    +-----------------+         +-----------------+
+                    |   LAB-A.LOCAL   |  trust  |   LAB-B.LOCAL   |
+                    |  (resource)     |<------->|  (account)      |
+                    |  domain GUID Ga |         |  domain GUID Gb |
+                    +-----------------+         +-----------------+
+                            |                            |
+                  +---------+---------+        +---------+---------+
+                  | DC-A.LAB-A.LOCAL  |        | DC-B.LAB-B.LOCAL  |
+                  | Win Server 2022   |        | Win Server 2022   |
+                  | DC + DNS          |        | DC + DNS          |
+                  | 10.20.0.10        |        | 10.21.0.10        |
+                  +-------------------+        +-------------------+
+                            |
+                  +---------+---------+
+                  | AVEVA-A.LAB-A.    |        users:
+                  |   LOCAL           |          - lab-a\admin       (DC-A admin)
+                  | Win 10/11 Pro     |          - lab-b\probe.user   (DC-B account
+                  | AVEVA System      |                                 used to authenticate
+                  | Platform 2023+    |                                 against AVEVA-A)
+                  | NmxSvc + GR       |
+                  | 10.20.0.20        |
+                  +-------------------+
+```
+
+The trust must be a **forest trust, either two-way or one-way with LAB-A trusting LAB-B**. Both forests should be at functional level **2008 R2** or higher (forest trusts require 2003+; 2016+ is recommended on current Windows Server). Configure DNS conditional forwarders in both directions so each forest resolves the other's `_msdcs` records.
+
+**Why not a single forest with two child domains.** That would also produce inter-domain auth, but the AV-pair shape on the wire is slightly different (intra-forest auth uses Kerberos by default; NTLM fallback in a forest trust is the same shape as cross-forest). Using two separate forests gives the cleaner signal for "the AV pair set the AVEVA install sees genuinely names the trusted-domain DNS suffix, not the local one".
+
+---
+
+## Provisioning the lab
+
+### 1. Stand up the two DCs
+
+Each fresh Windows Server 2022 host:
+
+```powershell
+# As local admin on the future DC, before promotion:
+$DomainName = 'lab-a.local'         # or 'lab-b.local' for the other one
+$DsrmPassword = ConvertTo-SecureString '<choose-strong>' -AsPlainText -Force
+
+Install-WindowsFeature AD-Domain-Services, DNS -IncludeManagementTools
+
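+# 'WinThreshold' is the Windows Server 2016 functional level. The comment
+# lives here because a backtick must be the last character on a continued line.
+Install-ADDSForest `
+    -DomainName $DomainName `
+    -DomainNetbiosName ($DomainName.Split('.')[0].ToUpper()) `
+    -ForestMode 'WinThreshold' `
+    -DomainMode 'WinThreshold' `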
+    -InstallDns `
+    -SafeModeAdministratorPassword $DsrmPassword `
+    -NoRebootOnCompletion:$false `
+    -Force
+```
+
+Give each DC a static IP and point its DNS client at itself. Reboot once, then log in as `LAB-A\Administrator` / `LAB-B\Administrator`.
+
+### 2. Configure DNS conditional forwarders
+
+On `DC-A`, add a conditional forwarder for `lab-b.local` → `10.21.0.10`. On `DC-B`, the mirror image.
+
+```powershell
+# On DC-A:
+Add-DnsServerConditionalForwarderZone -Name 'lab-b.local' -MasterServers '10.21.0.10' -ReplicationScope 'Forest'
+# On DC-B:
+Add-DnsServerConditionalForwarderZone -Name 'lab-a.local' -MasterServers '10.20.0.10' -ReplicationScope 'Forest'
+```
+
+Verify with `Resolve-DnsName lab-b.local -Server localhost` from `DC-A` (and the reverse).
+
+### 3. Establish the forest trust
+
+On `DC-A` (the resource side):
+
+```powershell
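+# Two-way trust is simplest. A one-way trust in which only B trusts A does
+# NOT work for our scenario: we want B users authenticating against A's
+# AVEVA install, so A must trust B (an outgoing trust from A's side).
+#
+# The ActiveDirectory module ships Get-ADTrust but no cmdlet that creates a
+# trust, so use the .NET Forest API, which sets up both sides in one call.
+$Cred = Get-Credential -Message 'LAB-B\Administrator credentials'
+$RemoteContext = [System.DirectoryServices.ActiveDirectory.DirectoryContext]::new(
+    'Forest', 'lab-b.local', $Cred.UserName, $Cred.GetNetworkCredential().Password)
+$RemoteForest = [System.DirectoryServices.ActiveDirectory.Forest]::GetForest($RemoteContext)
+$LocalForest = [System.DirectoryServices.ActiveDirectory.Forest]::GetCurrentForest()
+# Forest-wide authentication is the default (simpler for the lab than selective).
+$LocalForest.CreateTrustRelationship($RemoteForest, 'Bidirectional')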
+```
+
+Verify: `Get-ADTrust -Filter * | Format-Table Name, Direction, ForestTransitive` on each DC should show the trust as `BiDirectional` with `ForestTransitive` set to `True`.
+
+### 4. Provision the test user on the account domain (`LAB-B`)
+
+```powershell
+# On DC-B:
+# Avoid `$pwd`: it is PowerShell's automatic current-directory variable.
+$ProbePassword = ConvertTo-SecureString '<probe-password>' -AsPlainText -Force
+New-ADUser `
+    -Name 'probe.user' `
+    -SamAccountName 'probe.user' `
+    -UserPrincipalName 'probe.user@lab-b.local' `
+    -AccountPassword $ProbePassword `
+    -Enabled $true `
+    -PasswordNeverExpires $true `
+    -CannotChangePassword $true
+```
+
+### 5. Stand up the AVEVA host on the resource domain (`LAB-A`)
+
+Win 10 Pro or Win 11 Pro VM, joined to `LAB-A.LOCAL`. Install AVEVA System Platform 2023 R2 (or whatever matches the dev host). Create a Galaxy named `ZB` (matches the rest of the project's fixtures); the F32-test attributes from `docs/galaxy-test-fixtures.md` are sufficient.
+
+Grant `LAB-B\probe.user` Galaxy rights:
+
+- ArchestrA IDE → User Roles → add `LAB-B\probe.user` to a role with `Read/Write` on the test objects.
+- Local: add `LAB-B\probe.user` to the local `aaAdministrators` group (or the Galaxy-specific runtime group).
+
+### 6. Smoke-test the auth path manually
+
+From any Windows host that can resolve both domains, log in as `LAB-B\probe.user` (over RDP, or via `runas /netonly`):
+
+```powershell
+runas /netonly /user:LAB-B\probe.user `
+    "powershell -NoProfile -Command `"net use \\AVEVA-A.LAB-A.LOCAL\IPC$ /user:LAB-B\probe.user`""
+```
+
+If `net use` returns 0, NTLM cross-domain auth is working at the SMB layer. Now we capture the same shape against NmxSvc.
+
+---
+
+## Capture procedure
+
+### A. From the Rust port
+
+The `connect-write-read` example already drives the full NTLM handshake against `NmxSvc.exe`. Capture under a `LAB-B\probe.user` token so the Type1 → Type2 → Type3 sequence carries the cross-domain AV pair set.
+
+```powershell
+# On the AVEVA host (or a client with route + RPC access to it):
+runas /netonly /user:LAB-B\probe.user powershell
+
+# Inside the spawned shell:
+$env:MX_RPC_USER = 'probe.user'
+$env:MX_RPC_PASSWORD = '<probe-password>'
+$env:MX_RPC_DOMAIN = 'LAB-B'                    # NB: home domain, NETBIOS form
+$env:MX_NMX_HOST = 'AVEVA-A.LAB-A.LOCAL'
+$env:MX_GALAXY_DB = 'AVEVA-A.LAB-A.LOCAL\SQLEXPRESS'
+$env:MX_TEST_USER = 'probe.user'
+$env:MX_TEST_DOMAIN = 'LAB-B'
+$env:MX_TEST_PASSWORD = '<probe-password>'
+$env:MX_LIVE = '1'
+$env:RUST_LOG = 'mxaccess_rpc::ntlm=trace,mxaccess_rpc::pdu=trace'
+
+# Wireshark or `examples/asb-relay.rs` middleman to intercept the bytes.
+# Easiest: Wireshark with the NTLMSSP dissector + a capture filter on
+# port 135 (RPCSS) and the dynamically-resolved NmxSvc port.
+cargo run -p mxaccess --example connect-write-read -- `
+    --tag TestChildObject.TestInt --value 42 2>&1 | Tee-Object -FilePath connect.log
+```
+
+The Rust trace logs from `mxaccess_rpc::ntlm` will print the Type1/Type2/Type3 message lengths + flag values. Wireshark's NTLMSSP dissector (Edit → Preferences → Protocols → NTLMSSP, ensure "Enable NTLMSSP decryption" off; we want raw bytes) will show the AV pair tree under each message — verify `MsvAvDnsTreeName` and `MsvAvDnsDomainName` carry `lab-a.local` (the resource domain) before saving.
+
+### B. From the .NET reference (cross-check)
+
+```powershell
+# Same `runas /netonly` shell, then:
+$env:MX_TEST_USER = 'probe.user'
+$env:MX_TEST_DOMAIN = 'LAB-B'
+$env:MX_TEST_PASSWORD = '<probe-password>'
+dotnet run --project src\MxNativeClient.Probe\MxNativeClient.Probe.csproj `
+    -c Release -- --probe-session-write `
+    --tag=TestChildObject.TestInt --value=42 --objref-only
+```
+
+If both the Rust and .NET probes succeed end-to-end against the same `LAB-B\probe.user` credential, NTLM is working cross-domain. Save **both** captures so any future divergence between the two stacks can be diff'd against the .NET reference's known-good bytes.
+
+### C. Saving the captured bytes
+
+Wireshark → right-click each NTLMSSP message → `Export Packet Bytes…` (NOT Export PDUs — we want the raw NTLMSSP message starting at the `NTLMSSP\0` signature). Save as:
+
+```
+crates/mxaccess-rpc/tests/fixtures/cross-domain-ntlm/
+├── README.md                     # capture date, lab versions, redacted creds
+├── type1-lab-b-user-vs-aveva-a.bin
+├── type2-challenge-from-aveva-a.bin
+├── type3-lab-b-user-to-aveva-a.bin
+└── target-info-lab-b-user.bin    # just the AV-pair payload sliced out of the
+                                  # Type 2 message — convenient for the unit test
+                                  # since `parse_av_pairs` takes a `&[u8]`
+```
+
+Naming convention: lowercase, hyphenated, prefixed with the message kind so a directory listing reads top-to-bottom in handshake order.
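
The `target-info-*.bin` file can be produced from the saved Type 2 bytes rather than by hand in a hex editor. The following is a sketch under the assumption that the export starts at the `NTLMSSP\0` signature and that the message carries a `TargetInfoFields` header at offset 40, as in the [MS-NLMP] `CHALLENGE_MESSAGE` layout. `slice_target_info` is a hypothetical helper name, not something that exists in the repo.

```rust
/// Slice the TargetInfo AV-pair payload out of a raw Type 2 (CHALLENGE) message.
/// Hypothetical helper; offsets follow the [MS-NLMP] CHALLENGE_MESSAGE layout.
fn slice_target_info(type2: &[u8]) -> Result<&[u8], &'static str> {
    // Need the 8-byte signature plus all fixed fields up to offset 48.
    if type2.len() < 48 || !type2.starts_with(b"NTLMSSP\0") {
        return Err("not an NTLMSSP message with a TargetInfoFields header");
    }
    let msg_type = u32::from_le_bytes([type2[8], type2[9], type2[10], type2[11]]);
    if msg_type != 2 {
        return Err("not a Type 2 message");
    }
    // TargetInfoFields at offset 40: Len (u16), MaxLen (u16), BufferOffset (u32).
    let len = u16::from_le_bytes([type2[40], type2[41]]) as usize;
    let off = u32::from_le_bytes([type2[44], type2[45], type2[46], type2[47]]) as usize;
    let end = off.checked_add(len).ok_or("TargetInfo range overflows")?;
    type2.get(off..end).ok_or("TargetInfo out of bounds")
}
```

A one-off tool could read `type2-challenge-from-aveva-a.bin` with `std::fs::read`, call this function, and write the result with `std::fs::write`, which keeps the sliced file byte-identical to what the server sent.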
+
+### D. Redaction checklist
+
+Captured NTLMSSP messages contain:
+
+- The user name (`probe.user` — fine, lab fixture)
+- The domain name (`LAB-B` — fine)
+- The workstation name (the host you ran the capture from — **redact if it leaks an internal hostname**)
+- The server challenge (8 random bytes — fine)
+- The client challenge (8 random bytes — fine)
+- `NTProofStr` (HMAC-MD5 over the challenges + target_info; it does not contain the password, but together with the Type 2 bytes it lets anyone test password guesses offline, so it is **fine only for a throwaway lab password**)
+- `EncryptedRandomSessionKey` (RC4-encrypted ephemeral key; fine, the session key is single-use)
+
+The captured bytes do **not** contain the password or its NT hash directly. They DO contain enough information to verify a guessed password offline (recompute `ResponseKeyNT` and compare `NTProofStr`), so use a strong lab-only password and don't reuse it elsewhere. Add the captured creds to the `.gitignore`-honoured `tools/Setup-LiveProbeEnv.ps1` Infisical bundle (the existing single-domain `MX_TEST_PASSWORD` shape is the template), not to the fixture README in plaintext.
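
The workstation-name item in the checklist is easy to miss by eye, because the string is normally UTF-16LE on the wire. A small scan like the sketch below can flag a fixture that still contains an internal hostname. `contains_utf16le` is a hypothetical helper, and it assumes the handshake negotiated Unicode strings; an OEM-encoded capture would need a plain byte search instead.

```rust
/// True if `haystack` contains `needle` encoded as UTF-16LE, ignoring ASCII case.
/// Lower-casing is done byte-wise, so it can over-match on non-ASCII data;
/// for a redaction check a false positive is the safe direction.
fn contains_utf16le(haystack: &[u8], needle: &str) -> bool {
    let pat: Vec<u8> = needle
        .to_ascii_lowercase()
        .encode_utf16()
        .flat_map(|u| u.to_le_bytes())
        .collect();
    if pat.is_empty() {
        return false; // `windows(0)` would panic, and an empty needle is meaningless
    }
    let hay: Vec<u8> = haystack.iter().map(|b| b.to_ascii_lowercase()).collect();
    hay.windows(pat.len()).any(|w| w == pat.as_slice())
}
```

Running it over each `.bin` file with the capture host's name (and any other internal hostnames) before committing gives a quick yes/no answer for that checklist item.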
+
+---
+
+## Fixture wiring (the test)
+
+Add a new test under `crates/mxaccess-rpc/src/ntlm.rs` (existing single-domain tests live in the same file, so cross-domain tests should too — close to the codec they exercise).
+
+Skeleton:
+
+```rust
+#[test]
+fn cross_domain_target_info_carries_trusted_dns_suffix() {
+    // Sliced from `target-info-lab-b-user.bin` — the AV-pair payload
+    // from a real LAB-B\probe.user → AVEVA-A.LAB-A.LOCAL handshake.
+    let target_info = include_bytes!(
+        "../tests/fixtures/cross-domain-ntlm/target-info-lab-b-user.bin"
+    );
+    let pairs = parse_av_pairs(target_info).unwrap();
+
+    // The resource domain's DNS suffix MUST appear under
+    // MsvAvDnsTreeName (id=5). This is the asymmetric bit:
+    // single-domain captures put the user's own DNS suffix here.
+    let tree = pairs.iter().find(|p| p.id == 5).expect("MsvAvDnsTreeName");
+    assert_eq!(utf16le_to_string(&tree.value), "lab-a.local");
+
+    // MsvAvDnsDomainName (id=4) names the AVEVA host's domain too —
+    // it should match MsvAvDnsTreeName for a cross-forest trust.
+    let dom = pairs.iter().find(|p| p.id == 4).expect("MsvAvDnsDomainName");
+    assert_eq!(utf16le_to_string(&dom.value), "lab-a.local");
+
+    // MsvAvDnsComputerName (id=3) is the FQDN of the resource server.
+    let host = pairs.iter().find(|p| p.id == 3).expect("MsvAvDnsComputerName");
+    assert!(utf16le_to_string(&host.value).ends_with(".lab-a.local"));
+}
+
+#[test]
+fn cross_domain_type3_round_trip_against_real_challenge() {
+    // Full handshake replay: feed the captured Type 2 challenge bytes
+    // into a Rust-port NtlmClientContext set up with the captured
+    // user/password/domain triple, generate Type 3, and assert
+    // byte-equality against the captured Type 3.
+    //
+    // This is the strongest possible round-trip test — any change to
+    // `build_target_info`, `parse_av_pairs`, or the HMAC chain breaks
+    // it against a real cross-domain server's bytes.
+    let challenge = include_bytes!(
+        "../tests/fixtures/cross-domain-ntlm/type2-challenge-from-aveva-a.bin"
+    );
+    let expected_type3 = include_bytes!(
+        "../tests/fixtures/cross-domain-ntlm/type3-lab-b-user-to-aveva-a.bin"
+    );
+
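+    // The password is the only secret; read it from the environment and
+    // skip the test (with a pointer to the recipe) when it is not set.
+    let Ok(password) = std::env::var("MX_F3_PROBE_PASSWORD") else {
+        eprintln!("MX_F3_PROBE_PASSWORD unset; see docs/F3-cross-domain-ntlm-recipe.md");
+        return;
+    };
+    let mut ctx = NtlmClientContext::new(
+        "probe.user",
+        &password,
+        "LAB-B",
+        Some("<workstation NetBIOS name from the capture>"),
+    );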
+    let _t1 = ctx.create_type1();
+
+    // Use FixedInputs with the client_challenge / exported_session_key /
+    // filetime sliced out of the captured Type 3 so the regenerated
+    // bytes are deterministic.
+    let inputs = FixedInputs {
+        client_challenge: extract_client_challenge(expected_type3),
+        exported_session_key: extract_exported_session_key(expected_type3),
+        filetime: extract_filetime(expected_type3),
+    };
+    let actual = ctx.create_type3(challenge, &mut { inputs }).unwrap();
+    assert_eq!(actual, expected_type3);
+}
+```
+
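+The `extract_*` helpers recover the deterministic inputs from the captured Type 3 so the test is reproducible. The client challenge and filetime can be sliced directly out of the NTLMv2 response; the exported session key travels RC4-encrypted in `EncryptedRandomSessionKey`, so its helper has to decrypt it with the key-exchange key derived from the password. The password is the only secret that has to come from env (`MX_F3_PROBE_PASSWORD`). `#[ignore]` is a compile-time attribute and cannot depend on an environment variable, so the test should instead return early when the variable is unset, with an `eprintln!` pointing at this recipe doc.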
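
As a rough illustration of what two of those helpers could look like, the sketch below pulls the timestamp and client challenge out of a raw Type 3 message. It assumes an NTLMv2 response and the [MS-NLMP] `AUTHENTICATE_MESSAGE` layout (`NtChallengeResponseFields` at offset 20; the response is a 16-byte `NTProofStr` followed by `NTLMv2_CLIENT_CHALLENGE`). The function names and return types are hypothetical and may not match what `FixedInputs` expects. The exported session key is deliberately not covered, since it has to be decrypted rather than sliced.

```rust
/// Read a security buffer header (Len u16, MaxLen u16, Offset u32) at `at`
/// and return the bytes it points to, or None if anything is out of range.
fn security_buffer(msg: &[u8], at: usize) -> Option<&[u8]> {
    let hdr = msg.get(at..at + 8)?;
    let len = u16::from_le_bytes([hdr[0], hdr[1]]) as usize;
    let off = u32::from_le_bytes([hdr[4], hdr[5], hdr[6], hdr[7]]) as usize;
    msg.get(off..off.checked_add(len)?)
}

/// (filetime, client_challenge) from the NTLMv2 response in a Type 3 message.
/// Inside the response: bytes 0..16 are NTProofStr, then the client-challenge
/// structure, whose TimeStamp sits at +8 and ChallengeFromClient at +16.
fn ntlmv2_fixed_inputs(type3: &[u8]) -> Option<(u64, [u8; 8])> {
    let resp = security_buffer(type3, 20)?;
    let mut ts = [0u8; 8];
    ts.copy_from_slice(resp.get(24..32)?);
    let mut cc = [0u8; 8];
    cc.copy_from_slice(resp.get(32..40)?);
    Some((u64::from_le_bytes(ts), cc))
}
```

Returning `Option` keeps the helper panic-free on a truncated or mis-exported capture, which is the most likely failure when the Wireshark export accidentally includes transport framing.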
+
+Helper for the UTF-16LE comparison:
+
+```rust
+fn utf16le_to_string(bytes: &[u8]) -> String {
+    let units: Vec<u16> = bytes
+        .chunks_exact(2)
+        .map(|c| u16::from_le_bytes([c[0], c[1]]))
+        .collect();
+    String::from_utf16(&units).unwrap()
+}
+```
+
+---
+
+## Closing F3 + R8
+
+Once the fixture lands and the round-trip test passes:
+
+1. `design/followups.md` F3 → move to `## Resolved` with the commit hash.
+2. `design/70-risks-and-open-questions.md` R8 → flip from `PERMANENTLY DEFERRED` to `Resolved <date> (commit hash). Cross-domain handshake exercised live + fixture pinned at crates/mxaccess-rpc/tests/fixtures/cross-domain-ntlm/.`
+3. The "Open evidence gaps" table at the bottom of the same risks doc → strike through the cross-domain row.
+
+Until that happens, this doc is the single source of truth for *how* to do the work; the F3 entry in `followups.md` only needs to point here.
+
+---
+
+## Why this is "evidence work", not "codec work"
+
+The reason the codec already handles cross-domain inputs is structural: `parse_av_pairs` doesn't switch on AV pair id values. It walks any `(id, len, value)` sequence. `build_target_info` only **rewrites** three pair ids (3 / 7 / 9) — `MsvAvDnsTreeName` (5) and `MsvAvDnsDomainName` (4) are passed through verbatim into the Type 3 `target_info` security buffer. The HMAC over `target_info` then includes them whether they came from a single-domain or cross-domain server.
+
+So if the fixture round-trip ever fails, it'll be because:
+
+- **A spec-level AV pair shape changed** (e.g. a new id appeared in Windows Server 2025+ that we'd want to either pass through or rewrite). The same recipe applies: capture, drop the new bytes in, and the round-trip test catches the divergence.
+- **The HMAC chain has a bug that's masked by the single-domain fixture.** Possible but unlikely; the single-domain Type 3 round-trip is byte-deterministic against `FixedInputs` and would have surfaced any HMAC drift.
+
+Either way, the fixture is the diagnostic — not a behavioural patch. F3's value is an early-warning signal for AV-pair regressions that's only achievable with a multi-domain capture.