Joseph Doherty ddebab2c2d docs: F3 cross-domain NTLM provisioning recipe

Self-contained doc at docs/F3-cross-domain-ntlm-recipe.md for whoever
picks F3 up on hardware with two AD forests + a forest trust. Covers:

- Lab topology (LAB-A resource forest with AVEVA install + LAB-B
  account forest with the probe user, bidirectional forest trust).
- DC + DNS + trust + user provisioning steps (Install-ADDSForest,
  Add-DnsServerConditionalForwarderZone, New-ADTrust, New-ADUser).
- Capture procedure for both the Rust and .NET probes under a
  `runas /netonly` cross-domain token, with Wireshark NTLMSSP guidance.
- Fixture layout under crates/mxaccess-rpc/tests/fixtures/cross-domain-ntlm/.
- Round-trip test skeleton (replay the captured Type 2 → regenerate
  Type 3 → assert byte-equality against the captured Type 3).
- Redaction checklist for the captured bytes.
- Why F3 is "evidence work" not "codec work" — the AV pair parser
  is shape-agnostic, so the codec path is already correct; the
  fixture is a regression net for any future drift.

F3 entry in design/followups.md and R8 in design/70-risks-and-open-questions.md
both now point at the recipe so a future contributor doesn't have
to reconstruct the lab topology from the followup analysis alone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-07 02:40:06 -04:00


F3 — Cross-domain NTLM Type1/2/3 fixture: provisioning recipe

This is a self-contained recipe for whoever picks F3 up on hardware that has (or can run) two Active Directory domains with a forest trust. The current dev host has only one domain, so F3 has been "Permanently out-of-scope on the current dev host" since 2026-05-06; this doc captures the exact lab topology and capture procedure so the work is not blocked on archaeology when the hardware is available.

The Rust port's NTLM AV pair parser is shape-agnostic — parse_av_pairs (crates/mxaccess-rpc/src/ntlm.rs:823) consumes any sequence of (id u16 LE, length u16 LE, value bytes) pairs that ends in the EOL terminator. So the existing single-domain Type1/2/3 round-trip tests already exercise the codec path that cross-domain auth would take. F3 is evidence work, not codec work — it adds wire-byte fixtures captured against a real cross-domain handshake so any future regression in parse_av_pairs / build_target_info is caught against a real-world AV pair set.
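To make "shape-agnostic" concrete, here is a minimal sketch of a walker over that record layout. It is not the project's `parse_av_pairs`; the names `AvPair` and `walk_av_pairs` are hypothetical, and the only assumption is the `(id, length, value)` layout with an EOL terminator described above.

```rust
/// One AV pair from a TargetInfo payload (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
struct AvPair {
    id: u16,
    value: Vec<u8>,
}

/// Walks `(id u16 LE, length u16 LE, value)` records until the EOL
/// terminator (id 0, length 0). Returns `None` if a record is truncated or
/// the terminator is missing. No branch depends on which ids appear.
fn walk_av_pairs(mut buf: &[u8]) -> Option<Vec<AvPair>> {
    let mut pairs = Vec::new();
    loop {
        let head = buf.get(..4)?; // truncated header or missing terminator
        let id = u16::from_le_bytes([head[0], head[1]]);
        let len = u16::from_le_bytes([head[2], head[3]]) as usize;
        if id == 0 {
            // The EOL record must carry a zero length.
            return if len == 0 { Some(pairs) } else { None };
        }
        let value = buf.get(4..4 + len)?; // truncated value
        pairs.push(AvPair { id, value: value.to_vec() });
        buf = &buf[4 + len..];
    }
}
```

Because the loop never inspects `id` beyond the terminator check, a cross-domain AV pair set takes exactly the same path as a single-domain one.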

What changes between single-domain and cross-domain on the wire:

Type 2 challenge carries MsvAvDnsTreeName (id=0x0005) and MsvAvDnsDomainName (id=0x0004) AV pairs whose UTF-16LE values are the trusted (resource) domain's DNS suffix, not the user's home domain.
MsvAvNbDomainName (id=0x0002; the NetBIOS form is rare, the modern form is the DNS one at id=0x0004) and MsvAvDnsComputerName (id=0x0003) still carry the resource server's identity (the AVEVA host).
Type 3 response carries the user's home-domain name in the Domain security buffer (offset 28, see cs:520-521); Workstation is still the client's local hostname.
ResponseKeyNT is computed as HMAC_MD5(NT_HASH(password), UNICODE(uppercase(user) || domain)). Note that domain here is the home domain, not the resource domain (ntlm.rs:459-465).
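When inspecting a capture, the Domain field mentioned in the list above can be read directly from a Type 3 message. The sketch below assumes the standard security-buffer descriptor (length u16, max length u16, offset u32, all little-endian) at offset 28 and a Unicode-negotiated session. `security_buffer` and `type3_domain` are hypothetical helper names, not functions from the codebase.

```rust
/// Reads the 8-byte security-buffer descriptor at `field_offset`
/// (len u16, max_len u16, offset u32, little-endian) and returns the
/// payload bytes it points at, or `None` if anything is out of range.
fn security_buffer(msg: &[u8], field_offset: usize) -> Option<&[u8]> {
    let desc = msg.get(field_offset..field_offset.checked_add(8)?)?;
    let len = u16::from_le_bytes([desc[0], desc[1]]) as usize;
    let off = u32::from_le_bytes([desc[4], desc[5], desc[6], desc[7]]) as usize;
    msg.get(off..off.checked_add(len)?)
}

/// Domain name carried by a Type 3 message, assuming UTF-16LE strings
/// (i.e. the Unicode flag was negotiated).
fn type3_domain(msg: &[u8]) -> Option<String> {
    // Signature followed by MessageType = 3.
    if !msg.starts_with(b"NTLMSSP\0\x03\x00\x00\x00") {
        return None;
    }
    let raw = security_buffer(msg, 28)?;
    let units: Vec<u16> = raw
        .chunks_exact(2)
        .map(|c| u16::from_le_bytes([c[0], c[1]]))
        .collect();
    String::from_utf16(&units).ok()
}
```

For a cross-domain capture this should yield the home domain (`LAB-B` in this recipe), while the Type 2 AV pairs name the resource domain.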

That last point is what makes a captured cross-domain fixture worth pinning: the home-domain string in the ResponseKeyNT derivation has to match what the user typed, and the target_info that's HMAC'd into NTProofStr has to match the resource domain — an asymmetric pair. Single-domain fixtures cannot exercise that asymmetry.
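The home-domain half of that asymmetry can be sketched as the identity string that feeds the ResponseKeyNT HMAC. This is an illustration only: `response_key_nt_identity` is a hypothetical name, and the sketch assumes the usual NTLMv2 convention that the user name is uppercased while the domain is used exactly as supplied.

```rust
/// Builds UNICODE(uppercase(user) || domain): the message that is HMAC'd
/// with the NT hash to produce ResponseKeyNT. Only the user name is
/// uppercased; the home domain is appended as given.
fn response_key_nt_identity(user: &str, home_domain: &str) -> Vec<u8> {
    let mut identity = user.to_uppercase();
    identity.push_str(home_domain);
    // UNICODE(...) in the NTLM sense means UTF-16LE.
    identity
        .encode_utf16()
        .flat_map(|unit| unit.to_le_bytes())
        .collect()
}
```

With `("probe.user", "LAB-B")` the input is the UTF-16LE encoding of `PROBE.USERLAB-B`. Substituting the resource domain would change the derived key, which is why the fixture has to record the domain the user actually authenticated with.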

Lab topology

Minimum viable two-domain lab. Names are illustrative; substitute throughout.

                    +-----------------+         +-----------------+
                    |   LAB-A.LOCAL   |  trust  |   LAB-B.LOCAL   |
                    |  (resource)     |<------->|  (account)      |
                    |  domain GUID Ga |         |  domain GUID Gb |
                    +-----------------+         +-----------------+
                            |                            |
                  +---------+---------+        +---------+---------+
                  | DC-A.LAB-A.LOCAL  |        | DC-B.LAB-B.LOCAL  |
                  | Win Server 2022   |        | Win Server 2022   |
                  | DC + DNS          |        | DC + DNS          |
                  | 10.20.0.10        |        | 10.21.0.10        |
                  +-------------------+        +-------------------+
                            |
                  +---------+---------+
                  | AVEVA-A.LAB-A.    |        users:
                  |   LOCAL           |          - lab-a\admin       (DC-A admin)
                  | Win 10/11 Pro     |          - lab-b\probe.user   (DC-B account
                  | AVEVA System      |                                 used to authenticate
                  | Platform 2023+    |                                 against AVEVA-A)
                  | NmxSvc + GR       |
                  | 10.20.0.20        |
                  +-------------------+

The trust must be a forest trust, either two-way or one-way with LAB-A trusting LAB-B. Both forests should be at functional level 2008 R2 or higher (forest trusts require 2003+; 2016+ is recommended for current Windows Server). Configure DNS conditional forwarders in both directions so each forest resolves the other's _msdcs records.

Why not a single forest with two child domains. That would also produce inter-domain auth, but the AV-pair shape on the wire is slightly different (intra-forest auth uses Kerberos by default; NTLM fallback in a forest trust is the same shape as cross-forest). Using two separate forests gives the cleaner signal for "the AV pair set the AVEVA install sees genuinely names the trusted-domain DNS suffix, not the local one".

Provisioning the lab

1. Stand up the two DCs

Each fresh Windows Server 2022 host:

# As local admin on the future DC, before promotion:
$DomainName = 'lab-a.local'         # or 'lab-b.local' for the other one
$DsrmPassword = ConvertTo-SecureString '<choose-strong>' -AsPlainText -Force

Install-WindowsFeature AD-Domain-Services, DNS -IncludeManagementTools

# 'WinThreshold' is the Windows Server 2016 functional level. The comment lives
# here because nothing may follow a line-continuation backtick on the same line.
Install-ADDSForest `
    -DomainName $DomainName `
    -DomainNetbiosName ($DomainName.Split('.')[0].ToUpper()) `
    -ForestMode 'WinThreshold' `
    -DomainMode 'WinThreshold' `
    -InstallDns `
    -SafeModeAdministratorPassword $DsrmPassword `
    -NoRebootOnCompletion:$false `
    -Force

Give each DC a static IP and point its DNS client at itself. The host reboots once after promotion; then log in as LAB-A\Administrator / LAB-B\Administrator.

2. Configure DNS conditional forwarders

On DC-A, add a conditional forwarder for lab-b.local → 10.21.0.10. On DC-B, the mirror image.

# On DC-A:
Add-DnsServerConditionalForwarderZone -Name 'lab-b.local' -MasterServers '10.21.0.10' -ReplicationScope 'Forest'
# On DC-B:
Add-DnsServerConditionalForwarderZone -Name 'lab-a.local' -MasterServers '10.20.0.10' -ReplicationScope 'Forest'

Verify with Resolve-DnsName lab-b.local -Server localhost from DC-A (and the reverse).

3. Establish the forest trust

On DC-A (the resource side):

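# Two-way trust is simplest. A one-way trust where only B trusts A (so A users
# can act on B resources) does NOT work for our scenario: we want B users
# authenticating against A's AVEVA install, so A must trust B (outgoing for A).
#
# The stock ActiveDirectory module ships Get-ADTrust but no New-ADTrust cmdlet,
# so the trust is created through the .NET Forest API instead.
$Cred = Get-Credential -Message 'LAB-B\Administrator credentials'
$RemoteContext = [System.DirectoryServices.ActiveDirectory.DirectoryContext]::new(
    'Forest', 'lab-b.local', $Cred.UserName, $Cred.GetNetworkCredential().Password)
$RemoteForest = [System.DirectoryServices.ActiveDirectory.Forest]::GetForest($RemoteContext)
$LocalForest = [System.DirectoryServices.ActiveDirectory.Forest]::GetCurrentForest()
# Creates both sides of the trust. Selective authentication is left
# unconfigured, which should give forest-wide auth (simpler for the lab).
$LocalForest.CreateTrustRelationship($RemoteForest, 'Bidirectional')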

Verify: Get-ADTrust -Filter * | Format-Table Name, Direction, TrustType on each DC should show the trust as Bidirectional / Forest.

4. Provision the test user on the account domain (`LAB-B`)

# On DC-B:
# Avoid the name $pwd: it shadows PowerShell's automatic current-directory variable.
$ProbePassword = ConvertTo-SecureString '<probe-password>' -AsPlainText -Force
New-ADUser `
    -Name 'probe.user' `
    -SamAccountName 'probe.user' `
    -UserPrincipalName 'probe.user@lab-b.local' `
    -AccountPassword $ProbePassword `
    -Enabled $true `
    -PasswordNeverExpires $true `
    -CannotChangePassword $true

5. Stand up the AVEVA host on the resource domain (`LAB-A`)

Win 10 Pro or Win 11 Pro VM, joined to LAB-A.LOCAL. Install AVEVA System Platform 2023 R2 (or whatever matches the dev host). Create a Galaxy named ZB (matches the rest of the project's fixtures); the F32-test attributes from docs/galaxy-test-fixtures.md are sufficient.

Grant LAB-B\probe.user Galaxy rights:

ArchestrA IDE → User Roles → add LAB-B\probe.user to a role with Read/Write on the test objects.
Local: add LAB-B\probe.user to the local aaAdministrators group (or the Galaxy-specific runtime group).

6. Smoke-test the auth path manually

From any Windows host that can resolve both domains, log in as LAB-B\probe.user (over RDP, or via runas /netonly):

runas /netonly /user:LAB-B\probe.user `
    "powershell -NoProfile -Command `"net use \\AVEVA-A.LAB-A.LOCAL\IPC$ /user:LAB-B\probe.user`""

If net use succeeds (exit code 0), cross-domain auth is working at the SMB layer. Now we capture the same shape against NmxSvc.

Capture procedure

A. From the Rust port

The connect-write-read example already drives the full NTLM handshake against NmxSvc.exe. Capture under a LAB-B\probe.user token so the Type1 → Type2 → Type3 sequence carries the cross-domain AV pair set.

# On the AVEVA host (or a client with route + RPC access to it):
runas /netonly /user:LAB-B\probe.user powershell

# Inside the spawned shell:
$env:MX_RPC_USER = 'probe.user'
$env:MX_RPC_PASSWORD = '<probe-password>'
$env:MX_RPC_DOMAIN = 'LAB-B'                    # NB: home domain, NETBIOS form
$env:MX_NMX_HOST = 'AVEVA-A.LAB-A.LOCAL'
$env:MX_GALAXY_DB = 'AVEVA-A.LAB-A.LOCAL\SQLEXPRESS'
$env:MX_TEST_USER = 'probe.user'
$env:MX_TEST_DOMAIN = 'LAB-B'
$env:MX_TEST_PASSWORD = '<probe-password>'
$env:MX_LIVE = '1'
$env:RUST_LOG = 'mxaccess_rpc::ntlm=trace,mxaccess_rpc::pdu=trace'

# Wireshark or `examples/asb-relay.rs` middleman to intercept the bytes.
# Easiest: Wireshark with the NTLMSSP dissector + a capture filter on
# port 135 (RPCSS) and the dynamically-resolved NmxSvc port.
cargo run -p mxaccess --example connect-write-read -- `
    --tag TestChildObject.TestInt --value 42 2>&1 | Tee-Object -FilePath connect.log

The Rust trace logs from mxaccess_rpc::ntlm will print the Type1/Type2/Type3 message lengths + flag values. Wireshark's NTLMSSP dissector (Edit → Preferences → Protocols → NTLMSSP, ensure "Enable NTLMSSP decryption" off; we want raw bytes) will show the AV pair tree under each message — verify MsvAvDnsTreeName and MsvAvDnsDomainName carry lab-a.local (the resource domain) before saving.
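To cross-check the logged flag values against a saved message, the flags can be decoded by hand. The sketch below is not part of the codebase. It assumes the standard NegotiateFlags offsets (12 for Type 1, 20 for Type 2, 60 for Type 3) and names only four well-known flag bits; the constant and function names are hypothetical.

```rust
const NEGOTIATE_UNICODE: u32 = 0x0000_0001;
const NEGOTIATE_EXTENDED_SESSIONSECURITY: u32 = 0x0008_0000;
const NEGOTIATE_TARGET_INFO: u32 = 0x0080_0000;
const NEGOTIATE_KEY_EXCH: u32 = 0x4000_0000;

/// Returns the NegotiateFlags field of a raw NTLMSSP message, picking the
/// field offset from the message type (1, 2 or 3).
fn negotiate_flags(msg: &[u8]) -> Option<u32> {
    if !msg.starts_with(b"NTLMSSP\0") {
        return None;
    }
    let kind = msg.get(8..12)?;
    let off = match u32::from_le_bytes([kind[0], kind[1], kind[2], kind[3]]) {
        1 => 12,
        2 => 20,
        3 => 60,
        _ => return None,
    };
    let raw = msg.get(off..off + 4)?;
    Some(u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]))
}

/// Names the subset of flag bits this sketch knows about.
fn flag_names(flags: u32) -> Vec<&'static str> {
    const TABLE: [(u32, &str); 4] = [
        (NEGOTIATE_UNICODE, "UNICODE"),
        (NEGOTIATE_EXTENDED_SESSIONSECURITY, "EXTENDED_SESSIONSECURITY"),
        (NEGOTIATE_TARGET_INFO, "TARGET_INFO"),
        (NEGOTIATE_KEY_EXCH, "KEY_EXCH"),
    ];
    TABLE
        .iter()
        .filter(|(bit, _)| (flags & *bit) != 0)
        .map(|(_, name)| *name)
        .collect()
}
```

A Type 2 message without `TARGET_INFO` set would carry no AV pair payload at all, so that bit is worth checking before saving a capture.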

B. From the .NET reference (cross-check)

# Same `runas /netonly` shell, then:
$env:MX_TEST_USER = 'probe.user'
$env:MX_TEST_DOMAIN = 'LAB-B'
$env:MX_TEST_PASSWORD = '<probe-password>'
dotnet run --project src\MxNativeClient.Probe\MxNativeClient.Probe.csproj `
    -c Release -- --probe-session-write `
    --tag=TestChildObject.TestInt --value=42 --objref-only

If both the Rust and .NET probes succeed end-to-end against the same LAB-B\probe.user credential, NTLM is working cross-domain. Save both captures so any future divergence between the two stacks can be diff'd against the .NET reference's known-good bytes.

C. Saving the captured bytes

Wireshark → right-click each NTLMSSP message → Export Packet Bytes… (NOT Export PDUs — we want the raw NTLMSSP message starting at the NTLMSSP\0 signature). Save as:

crates/mxaccess-rpc/tests/fixtures/cross-domain-ntlm/
├── README.md                     # capture date, lab versions, redacted creds
├── type1-lab-b-user-vs-aveva-a.bin
├── type2-challenge-from-aveva-a.bin
├── type3-lab-b-user-to-aveva-a.bin
└── target-info-lab-b-user.bin    # just the AV-pair payload sliced out of the
                                  # Type 2 message — convenient for the unit test
                                  # since `parse_av_pairs` takes a `&[u8]`

Naming convention: lowercase, hyphenated, prefixed with the message kind so a directory listing reads top-to-bottom in handshake order.
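The target-info file can be produced from the saved Type 2 bytes rather than by hand in a hex editor. A minimal sketch follows, assuming the standard Type 2 layout in which the TargetInfo security-buffer descriptor sits at offset 40. `slice_target_info` is a hypothetical helper, not something that exists in the repository.

```rust
/// Returns the AV-pair payload (TargetInfo) of a raw Type 2 message.
/// The descriptor at offset 40 is (len u16, max_len u16, offset u32),
/// little-endian; the offset is relative to the start of the message.
fn slice_target_info(type2: &[u8]) -> Option<&[u8]> {
    // Signature followed by MessageType = 2.
    if !type2.starts_with(b"NTLMSSP\0\x02\x00\x00\x00") {
        return None;
    }
    let desc = type2.get(40..48)?;
    let len = u16::from_le_bytes([desc[0], desc[1]]) as usize;
    let off = u32::from_le_bytes([desc[4], desc[5], desc[6], desc[7]]) as usize;
    type2.get(off..off.checked_add(len)?)
}
```

Reading `type2-challenge-from-aveva-a.bin` with `std::fs::read`, passing it through this function, and writing the result with `std::fs::write` would give a file suitable for the unit test's `include_bytes!`.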

D. Redaction checklist

Captured NTLMSSP messages contain:

The user name (probe.user — fine, lab fixture)
The domain name (LAB-B — fine)
The workstation name (the host you ran the capture from — redact if it leaks an internal hostname)
The server challenge (8 random bytes — fine)
The client challenge (8 random bytes — fine)
NTProofStr (HMAC-MD5 over the challenges + target_info; fine for a lab fixture and not directly reversible to the password, though anyone holding the capture can use it to test password guesses offline)
EncryptedRandomSessionKey (RC4-encrypted ephemeral key — fine; the session key is single-use)

The captured bytes do not contain the password or its NT hash directly. They DO contain everything needed to check a password guess offline (ResponseKeyNT and NTProofStr can be recomputed from a candidate password and compared), so pick a strong lab password and don't reuse it elsewhere. Add the captured creds to the .gitignore-honoured tools/Setup-LiveProbeEnv.ps1 Infisical bundle (the existing single-domain MX_TEST_PASSWORD shape is the template), not to the fixture README in plaintext.

Fixture wiring (the test)

Add a new test under crates/mxaccess-rpc/src/ntlm.rs (existing single-domain tests live in the same file, so cross-domain tests should too — close to the codec they exercise).

Skeleton:

#[test]
fn cross_domain_target_info_carries_trusted_dns_suffix() {
    // Sliced from `target-info-lab-b-user.bin` — the AV-pair payload
    // from a real LAB-B\probe.user → AVEVA-A.LAB-A.LOCAL handshake.
    let target_info = include_bytes!(
        "../tests/fixtures/cross-domain-ntlm/target-info-lab-b-user.bin"
    );
    let pairs = parse_av_pairs(target_info).unwrap();

    // The resource domain's DNS suffix MUST appear under
    // MsvAvDnsTreeName (id=5). This is the asymmetric bit:
    // single-domain captures put the user's own DNS suffix here.
    let tree = pairs.iter().find(|p| p.id == 5).expect("MsvAvDnsTreeName");
    assert_eq!(utf16le_to_string(&tree.value), "lab-a.local");

    // MsvAvDnsDomainName (id=4) names the AVEVA host's domain too —
    // it should match MsvAvDnsTreeName for a cross-forest trust.
    let dom = pairs.iter().find(|p| p.id == 4).expect("MsvAvDnsDomainName");
    assert_eq!(utf16le_to_string(&dom.value), "lab-a.local");

    // MsvAvDnsComputerName (id=3) is the FQDN of the resource server.
    let host = pairs.iter().find(|p| p.id == 3).expect("MsvAvDnsComputerName");
    assert!(utf16le_to_string(&host.value).ends_with(".lab-a.local"));
}

#[test]
fn cross_domain_type3_round_trip_against_real_challenge() {
    // Full handshake replay: feed the captured Type 2 challenge bytes
    // into a Rust-port NtlmClientContext set up with the captured
    // user/password/domain triple, generate Type 3, and assert
    // byte-equality against the captured Type 3.
    //
    // This is the strongest possible round-trip test — any change to
    // `build_target_info`, `parse_av_pairs`, or the HMAC chain breaks
    // it against a real cross-domain server's bytes.
    let challenge = include_bytes!(
        "../tests/fixtures/cross-domain-ntlm/type2-challenge-from-aveva-a.bin"
    );
    let expected_type3 = include_bytes!(
        "../tests/fixtures/cross-domain-ntlm/type3-lab-b-user-to-aveva-a.bin"
    );

    let mut ctx = NtlmClientContext::new(
        "probe.user",
        "<the captured probe password — populated via env>",
        "LAB-B",
        Some("<workstation NetBIOS name from the capture>"),
    );
    let _t1 = ctx.create_type1();

    // Use FixedInputs with the client_challenge / exported_session_key /
    // filetime sliced out of the captured Type 3 so the regenerated
    // bytes are deterministic.
    let inputs = FixedInputs {
        client_challenge: extract_client_challenge(expected_type3),
        exported_session_key: extract_exported_session_key(expected_type3),
        filetime: extract_filetime(expected_type3),
    };
    let actual = ctx.create_type3(challenge, &mut { inputs }).unwrap();
    assert_eq!(actual, expected_type3);
}

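The extract_* helpers recover the deterministic inputs from the captured Type 3 so the test is reproducible. The client challenge and filetime can be sliced straight out of the NTLMv2 response; the exported session key is only present in RC4-encrypted form (EncryptedRandomSessionKey), so extract_exported_session_key presumably has to decrypt it with the key-exchange key derived from the password. The password is the only secret that has to come from env (MX_F3_PROBE_PASSWORD). Because #[ignore] is a compile-time attribute, the test should either carry #[ignore] and be run explicitly, or return early when the variable is unset, with an eprintln! pointing at this recipe doc.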
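The two helpers that need no key material can be sketched as below. This is an assumption-laden illustration, not the project's code: it relies on the standard layout in which the NtChallengeResponse descriptor sits at offset 20 of the Type 3 message, and the NTLMv2 response is a 16-byte NTProofStr followed by a client blob with the timestamp at blob offset 8 and the client challenge at blob offset 16. The `sketch_` names are hypothetical and the real helpers may return different types.

```rust
/// NtChallengeResponse payload of a raw Type 3 message.
fn nt_challenge_response(type3: &[u8]) -> Option<&[u8]> {
    if !type3.starts_with(b"NTLMSSP\0\x03\x00\x00\x00") {
        return None;
    }
    let desc = type3.get(20..28)?;
    let len = u16::from_le_bytes([desc[0], desc[1]]) as usize;
    let off = u32::from_le_bytes([desc[4], desc[5], desc[6], desc[7]]) as usize;
    type3.get(off..off.checked_add(len)?)
}

/// FILETIME from the client blob: 16 bytes of NTProofStr, then 8 bytes of
/// RespType/HiRespType/reserved, then the 8-byte timestamp.
fn sketch_extract_filetime(type3: &[u8]) -> Option<u64> {
    let ts = nt_challenge_response(type3)?.get(24..32)?;
    let mut raw = [0u8; 8];
    raw.copy_from_slice(ts);
    Some(u64::from_le_bytes(raw))
}

/// The 8-byte client challenge, which directly follows the timestamp.
fn sketch_extract_client_challenge(type3: &[u8]) -> Option<[u8; 8]> {
    let cc = nt_challenge_response(type3)?.get(32..40)?;
    let mut raw = [0u8; 8];
    raw.copy_from_slice(cc);
    Some(raw)
}
```

Returning `Option` keeps a malformed or truncated fixture from panicking inside the helper; the test can `expect` with a message that names the fixture file.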

Helper for the UTF-16LE comparison:

fn utf16le_to_string(bytes: &[u8]) -> String {
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|c| u16::from_le_bytes([c[0], c[1]]))
        .collect();
    String::from_utf16(&units).unwrap()
}

Closing F3 + R8

Once the fixture lands and the round-trip test passes:

design/followups.md F3 → move to ## Resolved with the commit hash.
design/70-risks-and-open-questions.md R8 → flip from PERMANENTLY DEFERRED to Resolved <date> (commit hash). Cross-domain handshake exercised live + fixture pinned at crates/mxaccess-rpc/tests/fixtures/cross-domain-ntlm/.
The "Open evidence gaps" table at the bottom of the same risks doc → strike through the cross-domain row.

Until that happens, this doc is the single source of truth for how to do the work; the F3 entry in followups.md only needs to point here.

Why this is "evidence work", not "codec work"

The reason the codec already handles cross-domain inputs is structural: parse_av_pairs doesn't switch on AV pair id values. It walks any (id, len, value) sequence. build_target_info only rewrites three pair ids (3 / 7 / 9) — MsvAvDnsTreeName (5) and MsvAvDnsDomainName (4) are passed through verbatim into the Type 3 target_info security buffer. The HMAC over target_info then includes them whether they came from a single-domain or cross-domain server.
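The pass-through behaviour can be illustrated with a small rewrite loop. This is a hypothetical sketch, not the project's `build_target_info`: which ids the real function rewrites, and with what values, is not modelled here. `rewrite_av_pairs` simply replaces the values of caller-chosen ids and copies every other pair byte for byte.

```rust
/// Re-emits an AV-pair payload, replacing the value of any id listed in
/// `overrides` and copying every other pair verbatim. Returns `None` on a
/// truncated record or missing EOL terminator. Override values are assumed
/// to fit in a u16 length.
fn rewrite_av_pairs(input: &[u8], overrides: &[(u16, &[u8])]) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(input.len());
    let mut buf = input;
    loop {
        let head = buf.get(..4)?;
        let id = u16::from_le_bytes([head[0], head[1]]);
        let len = u16::from_le_bytes([head[2], head[3]]) as usize;
        let value = buf.get(4..4 + len)?;
        if id == 0 {
            // Terminator: emit a canonical EOL record and stop.
            out.extend_from_slice(&[0, 0, 0, 0]);
            return Some(out);
        }
        let new_value = overrides
            .iter()
            .find(|(key, _)| *key == id)
            .map_or(value, |(_, replacement)| *replacement);
        out.extend_from_slice(&id.to_le_bytes());
        out.extend_from_slice(&(new_value.len() as u16).to_le_bytes());
        out.extend_from_slice(new_value);
        buf = &buf[4 + len..];
    }
}
```

With an empty override list the output equals the input, which is the property the fixture pins for ids 4 and 5: whatever the server sent is what ends up under the HMAC.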

So if the fixture round-trip ever fails, it'll be because:

A spec-level AV pair shape changed (e.g. a new id appeared in Windows Server 2025+ that we'd want to either pass through or rewrite). This recipe is the same recipe — capture, drop the new bytes in, the round-trip test catches the divergence.
The HMAC chain has a bug that's masked by the single-domain fixture. Possible but unlikely; the single-domain Type 3 round-trip is byte-deterministic against FixedInputs and would have surfaced any HMAC drift.

Either way, the fixture is the diagnostic — not a behavioural patch. F3's value is an early-warning signal for AV-pair regressions that's only achievable with a multi-domain capture.
