PCI(E) Passthrough

PCI Passthrough in Proxmox Virtual Environment
Note

The steps outlined in this article have been tested only on Proxmox Virtual Environment 8.

Note

My systems have Intel CPUs and Nvidia GPUs, so there are no instructions specific to AMD hardware here. AMD hardware is generally easier to handle on Linux, so I expect there shouldn't be much trouble finding instructions elsewhere.

Why Passthrough?

Passthrough gives a VM direct access to physical hardware. GPUs, disks, network interface cards, and any other PCI(E) devices can be passed through to VMs in Proxmox.

Setting up Passthrough

PVE Node Configuration

Intel Nodes

IOMMU and VFIO

IOMMU stands for Input-Output Memory Management Unit. It allows the system to map device-visible virtual memory addresses to physical ones and is required for PCI(E) passthrough. VFIO stands for Virtual Function I/O; it is the Linux kernel subsystem that gives VMs direct access to hardware devices.

There is also a setting called IOMMU passthrough mode, which may be required for better performance. I am not aware of any downsides, so make sure to enable it. Add the following parameters to the kernel command line: intel_iommu=on and iommu=pt.

LINE IN FILE: /etc/default/grub

GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
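
After editing /etc/default/grub, regenerate the GRUB configuration so the new command line takes effect on the next boot:

update-grub

Note that nodes booting via systemd-boot (for example, ZFS-on-root installs) keep the kernel command line in /etc/kernel/cmdline instead; edit that file and run proxmox-boot-tool refresh.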

Now we need to configure PVE to load the VFIO kernel modules on boot. Append the following lines to /etc/modules.

APPEND TO FILE: /etc/modules

vfio
vfio_iommu_type1
vfio_pci

Now we need to rebuild the initramfs. Use the following command:

update-initramfs -u -k all
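
After rebooting the node, you can sanity-check that the IOMMU is active and the VFIO modules are loaded. The exact dmesg wording varies by kernel version, so treat these as rough checks rather than exact-match tests:

dmesg | grep -e DMAR -e IOMMU
lsmod | grep vfio

The dmesg output should include a line along the lines of "DMAR: IOMMU enabled". You can also list the IOMMU groups the kernel has created:

find /sys/kernel/iommu_groups/ -type l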

Other Config Options (Unsafe Interrupts and the Like)
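
Two tweaks that come up when passthrough misbehaves are unsafe interrupts and ignored MSRs. Allowing unsafe interrupts is only relevant on platforms that lack interrupt remapping, and ignoring MSRs mainly works around crashes in some guests (Windows in particular). A sketch of the corresponding modprobe options; only apply these if you actually hit the matching problem:

FILE: /etc/modprobe.d/iommu_unsafe_interrupts.conf

options vfio_iommu_type1 allow_unsafe_interrupts=1

FILE: /etc/modprobe.d/kvm.conf

options kvm ignore_msrs=1

Rebuild the initramfs after adding either file, as described above.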

Blacklisting Drivers

Some hardware devices are much easier to pass through if their driver is blacklisted on the PVE host. This is especially true for GPUs.

Nvidia

To blacklist Nvidia drivers create the file /etc/modprobe.d/nvidia.blacklist.conf with the following contents:

FILE: /etc/modprobe.d/nvidia.blacklist.conf

blacklist nouveau
blacklist nvidia
blacklist nvidiafb
blacklist nvidia_drm

Intel

To blacklist Intel GPU drivers create the file /etc/modprobe.d/intel_gpu.blacklist.conf with the following contents:

FILE: /etc/modprobe.d/intel_gpu.blacklist.conf

blacklist snd_hda_intel
blacklist snd_hda_codec_hdmi
blacklist i915

Make sure to rebuild your initramfs after making changes to /etc/modprobe.d/:

update-initramfs -u -k all

Verify Drivers Are Not Loaded

After rebuilding your initramfs and rebooting the node, run the following command to verify that the device drivers are not being loaded.

lspci -nnk

Each PCI(E) device will be listed in the output. Find the device you want to pass through and make sure its entry either includes the line

Kernel driver in use: vfio-pci

or has no "Kernel driver in use" line at all.
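
For illustration, a GPU that has been claimed correctly might show up like this (the PCI address, IDs, and model here are placeholders; yours will differ):

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3070] [10de:2484] (rev a1)
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau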

PVE Cluster Configuration

Now that our PVE hosts are configured, we can set up cluster-wide Resource Mappings to enable easy passthrough to VMs. These can be configured in the web UI or via the pvesh CLI, which is a bit more involved.
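
If you prefer the CLI, mappings live under the /cluster/mapping/pci API path. The following is an untested sketch; the mapping name, node name, PCI path, and vendor:device ID are placeholders, and you should check pvesh usage /cluster/mapping/pci for the parameters your PVE version accepts:

pvesh create /cluster/mapping/pci --id rtx3070 --map node=pve1,path=0000:01:00,id=10de:2484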

In the web UI, under Datacenter, select Resource Mappings and, under PCI Devices, click Add. This opens the menu for mapping PCI devices. Select the correct node and find the device you intend to pass through.

In this example I am going to pass through my Nvidia RTX 3070. It has two different IOMMU groups, one for the GPU itself, and one for the audio controller. I am going to map these together by selecting the entry for Pass through all functions as one device.

The name of the mapping must be unique per node, but multiple nodes can each have a device mapped under the same name. This is useful if you have two or more identical PVE nodes, since it simplifies migrating VMs from node to node.

VM Configuration

In order to pass through a PCI(E) device to a VM, the VM must be created with the following settings:

  • Machine Type: q35
  • BIOS: OVMF (UEFI)

Once the VM is created, under the Hardware tab, click Add and select PCI Device. Select the correct Device in the Mapped Device drop-down. Check the Advanced box, and make sure PCI-Express is selected.
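
The equivalent can be done from the shell with qm. A sketch, assuming a VM with ID 100 and the hypothetical mapping name from earlier:

qm set 100 --hostpci0 mapping=rtx3070,pcie=1

The Primary GPU checkbox described below corresponds to adding x-vga=1 to the same hostpci entry.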

Important

If passing through a GPU, selecting the Primary GPU checkbox will set the device as the main video adapter. This will cause the built-in VM console to fail. Do not enable this unless you have a physical display connected to the GPU or you have already configured a remote access protocol.

Other Considerations

There is a large variety of options, configurations, and seemingly unexplained behaviors when working with PCI passthrough. A few of my notes are listed below, as well as links to the Proxmox Wiki, which reiterates the instructions above and adds some more situation-specific knowledge.

  • When passing through PCI devices to a VM, the VFIO subsystem needs to allocate and pin all of the VM's memory up front. This means that memory ballooning will not function (you can disable it explicitly; see the sketch after this list). It also means that in certain situations Proxmox will throw an error when starting a VM with a PCI device attached: when the PVE host is running near memory capacity, or when host memory is fragmented, allocating the memory can take long enough that the VM start timeout is reached and an error is thrown. If you wait a few more minutes, the VM will eventually start.

  • https://pve.proxmox.com/wiki/PCI(e)_Passthrough

  • https://pve.proxmox.com/wiki/PCI_Passthrough
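
As mentioned in the first note above, ballooning cannot work alongside a passed-through device, so it is worth disabling it explicitly. A one-liner, assuming the placeholder VM ID 100:

qm set 100 --balloon 0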