# WW VM: GPU Passthrough

NVIDIA Quadro P1000 PCI passthrough to `WW_DEV_VM` (10.100.0.48). **Executed 2026-04-28.** GPU is live in Windows: `nvidia-smi` reports driver 582.41 / CUDA 13.0, 4 GiB VRAM, status WDDM, no Code 43.

## Final state

| Item | Value |
|---|---|
| Quadro P1000 driver in guest | 582.41 / CUDA 13.0 (already installed; claimed the device on first boot) |
| Guest PCI bus address | `00000000:23:00.0` |
| Audio function | High Definition Audio Controller present, status OK |
| ESXi `graphicsInfo.graphicsType` | `direct` (was already set before this task) |
| ESXi `pciPassthruInfo` for `87:00.0` / `87:00.1` | `passthruEnabled=true, passthruActive=true` (flipped on without host reboot) |
| VM `nestedHVEnabled` | `false` |
| VM `memoryHotAddEnabled` | `false` |
| VM memory reservation | 32768 MB / 32768 MB (locked) |
| Other VMs touched during the change | None; host stayed up |

## What `graphicsType: direct` actually means (lesson learned)

`graphicsInfo.graphicsType: direct` and `pciPassthruInfo.passthruEnabled` are **two parallel mechanisms**. Both must be set for direct GPU passthrough:

1. `graphicsType: direct`: the graphics subsystem treats the card as a passthrough device, not vSGA/vGPU. Set in vSphere UI: Host → Configure → Hardware → Graphics.
2. `pciPassthruInfo.passthruEnabled`: the generic per-PCI-device passthrough flag. Set via `host.esxcli hardware pci pcipassthru set -e=true`. Without this, the device doesn't appear in `device.pci.ls -vm`, so VMs can't claim it.

The "no host reboot needed" benefit only kicks in when `graphicsType: direct` is **already** in effect: the runtime activation flag (`-a=true` on the esxcli call) succeeds because the device isn't actively serving as a host graphics device. If `graphicsType` is still `shared` (default), flipping `pcipassthru` requires a host reboot for the activation to land.

## Procedure (the one that worked)

### 1. Finish inside-VM teardown (already done before this task)

WSL2 + VirtualMachinePlatform Windows features were disabled during the Docker→DOCKER migration. The reboot to finalize that disable also serves as the "shut down before passthrough" step.

```powershell
# Get-WindowsOptionalFeature -FeatureName accepts a single name, so list all features and filter
ssh dohertj2@10.100.0.48 'Get-WindowsOptionalFeature -Online | Where-Object FeatureName -in VirtualMachinePlatform,Microsoft-Windows-Subsystem-Linux | Select-Object FeatureName,State'
# Expect: both Disabled
```

### 2. Shut down the VM (graceful)

```bash
export GOVC_URL=https://10.2.0.12/sdk GOVC_USERNAME=govc GOVC_PASSWORD='Tn9.xKw-m4Vp' GOVC_INSECURE=true
govc vm.power -s=true WW_DEV_VM
# govc pads the "Power state:" column, so match any run of whitespace
until govc vm.info WW_DEV_VM | grep -Eq "Power state:[[:space:]]+poweredOff"; do sleep 5; done
```

### 3. Flip the VM hardware flags (VM must be off)

```bash
govc vm.change -vm WW_DEV_VM -nested-hv-enabled=false
govc vm.change -vm WW_DEV_VM -memory-hot-add-enabled=false
govc vm.info -json=true WW_DEV_VM | python3 -c "import json,sys;v=json.load(sys.stdin)['virtualMachines'][0]['config'];print('nestedHV:',v.get('nestedHVEnabled'));print('memHotAdd:',v.get('memoryHotAddEnabled'))"
# Expect: nestedHV: False, memHotAdd: False
```

### 4. Enable `pcipassthru` for both Quadro PCI functions

`graphicsType: direct` was already set, so `-a=true` activates the flag immediately, with no host reboot. (Note: `govc gpu.vm.add` is for **vGPU profiles**, not direct PCI passthrough, and fails on this card with "no vGPU profiles available". Use `device.pci.add` instead.)

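Before enabling `pcipassthru`, it can be useful to confirm which of the two cases from the "lesson learned" section applies. The following is a minimal sketch, not part of the executed procedure: it runs against a hand-written sample shaped like `govc host.info -json=true` output. The layout of the `graphicsInfo` list (a `pciId` and `graphicsType` per entry), the sample values, and the helper name `reboot_needed` are assumptions made for illustration.

```python
import json

# Hypothetical sample; on a real host this would be the output of
# `govc host.info -json=true`. The graphicsInfo entry layout is an assumption.
SAMPLE = json.loads("""
{"hostSystems": [{"config": {
  "graphicsInfo": [
    {"pciId": "0000:87:00.0", "graphicsType": "direct"}
  ],
  "pciPassthruInfo": [
    {"id": "0000:87:00.0", "passthruEnabled": false, "passthruActive": false},
    {"id": "0000:87:00.1", "passthruEnabled": false, "passthruActive": false}
  ]
}}]}
""")


def reboot_needed(host_info, gpu_id):
    """Return True if enabling passthrough is expected to need a host reboot.

    Follows the rule described in this document: runtime activation only
    works when the card's graphicsType is already 'direct'.
    """
    config = host_info["hostSystems"][0]["config"]
    for entry in config.get("graphicsInfo") or []:
        if entry.get("pciId") == gpu_id:
            return entry.get("graphicsType") != "direct"
    # Card not reported by the graphics subsystem: assume the worst case.
    return True


print("reboot needed:", reboot_needed(SAMPLE, "0000:87:00.0"))
```

With the sample above the helper reports that no reboot is needed; changing `graphicsType` to `shared` flips the answer. On a real host, replace `SAMPLE` with `json.load(sys.stdin)` and pipe the govc output in, after checking that the JSON field names match what your govc version emits.
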
```bash
govc host.esxcli hardware pci pcipassthru set -d=0000:87:00.0 -e=true -a=true
govc host.esxcli hardware pci pcipassthru set -d=0000:87:00.1 -e=true -a=true

# Confirm both are active
govc host.info -json=true | python3 -c "
import json,sys
d=json.load(sys.stdin)
for p in d['hostSystems'][0]['config'].get('pciPassthruInfo', []):
    if '87:00' in p.get('id',''):
        print(p)
"
# Expect: passthruEnabled=True, passthruActive=True for both

# Confirm the Quadro now shows up as available for VMs
govc device.pci.ls -vm WW_DEV_VM | grep -i nvidia
# Expect: 0000:87:00.0 and 0000:87:00.1 listed
```

A harmless quirk: the second `pcipassthru set` command may emit `Device owner is already configured to passthru` if the audio function was previously partially configured. Check the post-state with `pciPassthruInfo`; both functions should show `passthruActive=True`.

### 5. Attach the GPU + audio to the VM

```bash
govc device.pci.add -vm WW_DEV_VM 0000:87:00.0
govc device.pci.add -vm WW_DEV_VM 0000:87:00.1

# Verify two VirtualPCIPassthrough devices exist
govc device.info -vm WW_DEV_VM 'pcipassthrough-*'
```

### 6. Power on, verify

```bash
govc vm.power -on=true WW_DEV_VM
until ssh -o ConnectTimeout=3 -o BatchMode=yes dohertj2@10.100.0.48 'hostname' 2>/dev/null; do sleep 5; done

# Confirm the GPU is detected and the driver bound
ssh dohertj2@10.100.0.48 'Get-PnpDevice -Class Display | Where-Object FriendlyName -match "Quadro" | Select-Object FriendlyName,Status'

# Confirm CUDA / driver runtime
ssh dohertj2@10.100.0.48 'nvidia-smi'
```

## Notes for future operators

1. **`gpu.vm.add` vs `device.pci.add`**: govc's `gpu.vm.add` is for vGPU profiles (data-center cards like A40 with NVIDIA vGPU licensing). For a non-vGPU Quadro like the P1000 in direct passthrough mode, use `device.pci.add`. `gpu.host.profile.ls` returns "no vGPU profiles available" on a host whose only NVIDIA card is a non-vGPU Quadro.
2. **Audio function `87:00.1`** must be attached to the same VM as `87:00.0`: they share an IOMMU group via parent bridge `0000:80:03.0`, and ESXi rejects splitting them.
3. **No host reboot was needed** because `graphicsType: direct` was already in effect from earlier vSphere UI work. If you ever swap GPUs, set `graphicsType: direct` first (vSphere UI: Host → Configure → Hardware → Graphics → Edit → Direct) and reboot the host once; from then on, per-VM attach/detach is a runtime operation.
4. **Driver was pre-installed**: the previous Windows install already had NVIDIA driver 582.41, so the GPU appeared with status OK on first boot. A fresh Windows install would need the driver from https://www.nvidia.com/Download/index.aspx (Quadro P1000).
5. **Rollback**: `govc device.pci.remove -vm WW_DEV_VM pcipassthrough-13000 pcipassthrough-13001` → re-enable `nestedHVEnabled` / `memoryHotAddEnabled` → power VM on. Host PCI flags can stay enabled; they don't hurt.

## Inventory

| Field | Value |
|---|---|
| Model | NVIDIA Quadro P1000 (GP107GL) |
| GPU PCI ID (host) | `0000:87:00.0` (vendor `0x10de`, device `0x1cb1`) |
| Audio PCI ID (host) | `0000:87:00.1` (vendor `0x10de`, device `0x0fb9`) |
| Subsystem | Dell (`0x1028:0x11bc`) |
| Parent bridge | `0000:80:03.0` |
| VRAM | 4 GiB |
| Driver in guest | 582.41 (Windows 10 WDDM) |
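
Because the GPU and audio functions have to move together, a post-change check that looks at both host PCI IDs from the inventory at once can catch a half-configured device. The sketch below is illustrative only: it uses the `id`, `passthruEnabled`, and `passthruActive` fields that the step 4 check prints, but the exact `id` format, the sample data, and the helper name `not_ready` are assumptions.

```python
# Hypothetical sample of config.pciPassthruInfo entries; real data would come
# from `govc host.info -json=true`. The exact id format is an assumption.
SAMPLE_PASSTHRU = [
    {"id": "0000:87:00.0", "passthruEnabled": True, "passthruActive": True},
    {"id": "0000:87:00.1", "passthruEnabled": True, "passthruActive": False},
]

# Both functions of the Quadro, as listed in the inventory table.
EXPECTED = ["0000:87:00.0", "0000:87:00.1"]


def not_ready(passthru_info, expected_ids):
    """Return the expected PCI IDs that are missing or not enabled and active."""
    by_id = {p.get("id"): p for p in passthru_info}
    problems = []
    for pci_id in expected_ids:
        entry = by_id.get(pci_id, {})  # a missing entry counts as not ready
        if not (entry.get("passthruEnabled") and entry.get("passthruActive")):
            problems.append(pci_id)
    return problems


print("not ready:", not_ready(SAMPLE_PASSTHRU, EXPECTED))
```

An empty list means both functions are ready to attach. In the sample, the audio function is enabled but not yet active, so it is the only ID reported.
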