GPU Passthrough Guide

This guide covers how to set up GPU passthrough using Arch and Nvidia. I originally wrote this guide on Reddit but decided to put it here in case that one gets removed.

First, I’d like to show you the results of this guide. Here’s a firestrike run using gpu passthrough.

This is meant to be a start-to-finish, holy shit this actually works guide. It’s another lengthy post because there’s a lot to cover, so stick with it and you’ll be happy you did. This post covers UEFI-capable hardware specifically, because every GPU made in the last few years supports it. I believe Nvidia introduced UEFI bios support with the 600 series cards, and some of you may need to flash your card’s bios to get it. If you own a 500 series card and want to pass it through, refer to the previous post instead; you want the q35 bios method. I’m unsure exactly when AMD implemented support, so do your research. I’m also covering Intel/Nvidia exclusively in this post because I don’t own any AMD hardware. Everyone else can continue reading.

My Hardware Config

CPU: i7 4790k

Our primary concern is VT-d support. This is our bread and butter tech that allows us to pass through the GPU.

Mobo: MSI z97a Gaming 7

It’s a bit of an upgrade for me because my g1.sniper h6 was giving me fits when I upgraded to 32GB of ram. Manual doesn’t say the MSI supports VT-d specifically but does say it supports “virtualization technology” which is what we’re after.

GPU: 1 x EVGA GTX 970 FTW, 1 x ASUS STRIX GTX 980ti

I got a fancy 1440p 144hz monitor and the 970 wasn’t cutting it so I picked up a 980ti. You do not need these expensive cards to achieve this. The 980ti supports UEFI. That’s all we’re after in this post. I’ve done this with everything from a gtx 260, 550ti, 970, and now 980ti.

Storage: 1 x Intel 730 240GB SSD, 4 x 1TB WD in Raid 10

I changed my storage up a bit for peace of mind reasons. I’m now running windows off of a qcow2 container rather than a physical hdd. You can do either one.

RAM: 32GB of ddr3 1600

I have 32GB because I run a lot of VMs for work and emulate an enterprise environment. Linux itself doesn’t require much RAM, and you can get by with 8-12GB easily.

Monitors: 2 x 1920×1080, 1 x 1440p

The gist of monitor setups with this is to wire everything twice. You need one cable going out of each GPU to each monitor. If your monitor only has one input you can buy a switch for whatever connection you need. If you get a switch you can either leave both outputs enabled all the time or turn the linux output off with xrandr and the switch should failover automatically to the second input. If you don’t have a switch then I would recommend using xrandr because it gets annoying to manually switch the inputs every time on the monitor.


Setup and Installation

I break things constantly on my system. So I’m using Antergos to install Arch. I put my /home on the RAID-10 and just change /etc/fstab if I need to reinstall so I never lose anything important. Pretty much any offshoot of Arch and vanilla Arch should work just fine for this post. I have also helped people do this on Debian and derivatives. For this guide we will be sticking to Arch. I’m using grub2 as my bootloader with UEFI.

The first thing we need to do is enable VT-d and make sure it works. Edit /etc/default/grub, look for the line that begins with GRUB_CMDLINE_LINUX_DEFAULT=, and append “intel_iommu=on” to what is inside the quotes on that line. Mine looks like this

GRUB_CMDLINE_LINUX_DEFAULT="resume=UUID=6771936b-06b6-493c-b655-6f60122f5228 intel_iommu=on"

Once you’ve done that we need to rebuild our grub.cfg file. Run this to do so

sudo grub-mkconfig -o /boot/grub/grub.cfg
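
If you’d rather script the edit than do it by hand, here’s a rough sketch. The function name add_kernel_param is made up for this example, and it assumes GNU sed and a single GRUB_CMDLINE_LINUX_DEFAULT="..." line in the file. Try it on a copy of /etc/default/grub first.

```bash
#!/bin/bash

# add_kernel_param FILE PARAM  (hypothetical helper, not part of grub)
# Appends PARAM to GRUB_CMDLINE_LINUX_DEFAULT in FILE unless it is already there.
# Assumes the existing value contains no '|', '&' or backslash characters.
add_kernel_param() {
    local file=$1 param=$2 current new
    # Bail out if the file has no such line at all.
    grep -q '^GRUB_CMDLINE_LINUX_DEFAULT="' "$file" || return 1
    # Pull out whatever is currently between the quotes.
    current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$file")
    # Already present as a whole word? Then there is nothing to do.
    case " $current " in
        *" $param "*) return 0 ;;
    esac
    # Append with a separating space only if the line was not empty.
    new="${current:+$current }$param"
    sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=\".*\"\$|GRUB_CMDLINE_LINUX_DEFAULT=\"$new\"|" "$file"
}

# Example (as root):
#   add_kernel_param /etc/default/grub intel_iommu=on
#   grub-mkconfig -o /boot/grub/grub.cfg
```

Running it twice with the same parameter leaves the file unchanged the second time, so it is safe to reuse for the other kernel parameters later in this guide.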

Next we reboot to activate iommu/vt-d. Once you’re back in Arch we need to verify that VT-d is enabled and functioning. First we need to identify the pci-e bus the GPU we’re passing through is on. We run “lspci -nnk” to find this information. Here are the lines important to me.


02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM200 [GeForce GTX 980 Ti] [10de:17c8] (rev a1)
Subsystem: ASUSTeK Computer Inc. Device [1043:8548]
Kernel driver in use: nvidia
Kernel modules: nouveau, nvidia
02:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:0fb0] (rev a1)
Subsystem: ASUSTeK Computer Inc. Device [1043:8548]
Kernel driver in use: snd_hda_intel

My 980ti is in the second pci-e slot on my motherboard so this is correct. 02:00.1 is the audio function of the card and is also important later on. Next we need to see whether the 980ti shares an iommu group with devices on another pci-e bus, because that will cause a conflict. To find out, look in /sys/bus/pci/devices/YOUR_BUS/iommu_group/devices/. If you do not have an iommu_group folder then VT-d was not enabled properly! Here is the output for mine

 [kemmler@arch ~]$ ls -lha /sys/bus/pci/devices/0000\:02\:00.0/iommu_group/devices/
 total 0
 drwxr-xr-x 2 root root 0 Sep 19 18:47 .
 drwxr-xr-x 3 root root 0 Sep 19 18:46 ..
 lrwxrwxrwx 1 root root 0 Sep 19 18:47 0000:00:01.0 -> ../../../../devices/pci0000:00/0000:00:01.0
 lrwxrwxrwx 1 root root 0 Sep 19 18:47 0000:00:01.1 -> ../../../../devices/pci0000:00/0000:00:01.1
 lrwxrwxrwx 1 root root 0 Sep 19 18:47 0000:01:00.0 -> ../../../../devices/pci0000:00/0000:00:01.0/0000:01:00.0
 lrwxrwxrwx 1 root root 0 Sep 19 18:47 0000:01:00.1 -> ../../../../devices/pci0000:00/0000:00:01.0/0000:01:00.1
 lrwxrwxrwx 1 root root 0 Sep 19 18:47 0000:02:00.0 -> ../../../../devices/pci0000:00/0000:00:01.1/0000:02:00.0
 lrwxrwxrwx 1 root root 0 Sep 19 18:47 0000:02:00.1 -> ../../../../devices/pci0000:00/0000:00:01.1/0000:02:00.1

0000:00:01.0 and 0000:00:01.1 can be ignored. The issue is 0000:01:00.0 and 0000:01:00.1. These are my 970, and they will cause GPU passthrough to fail unless I pass both cards through to the VM. If you only see devices of the form 0000:0#:00.# in your output, where # is the bus of the card you’re passing through, then your iommu group is clean and you can skip this next section.
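
The same check can be wrapped in a small function if you end up doing it a lot. This is only a sketch: iommu_group_members is a name I made up, and the optional second argument exists purely so you can point it at a different sysfs root.

```bash
#!/bin/bash

# iommu_group_members DEVICE [DEVICES_ROOT]  (hypothetical helper)
# Prints every PCI address that shares an IOMMU group with DEVICE.
# Returns 1 if the device has no iommu_group folder, which usually
# means VT-d is not enabled.
iommu_group_members() {
    local dev=$1 root=${2:-/sys/bus/pci/devices} dir entry
    dir="$root/$dev/iommu_group/devices"
    if [ ! -d "$dir" ]; then
        echo "no iommu_group for $dev, is VT-d enabled?" >&2
        return 1
    fi
    for entry in "$dir"/*; do
        # Skip the literal pattern if the folder happens to be empty.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        echo "${entry##*/}"
    done
}

# Example:
#   iommu_group_members 0000:02:00.0
```

Anything it prints that isn’t the card itself or one of the 0000:00:01.x entries is something you’d have to pass through along with the GPU.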


Fixing IOMMU Grouping

I installed Antergos with AUR support. This provides a tool called yaourt. A very thoughtful member of the Arch community has been steadily providing a current kernel patched with a few fixes that deal with IOMMU groups and other issues. This package is called “linux-vfio”. Assuming you have yaourt installed you will run this.

yaourt -S linux-vfio

It will ask you if you want to edit a few things, you can say no. It will then ask you if you want to only install linux-vfio. Say NO so it will also install the docs and headers for the kernel. Proceed through the installation with common sense. Once you’ve installed it, rebuild your grub.cfg as above with “sudo grub-mkconfig -o /boot/grub/grub.cfg”.


Skip if using Nouveau

If you installed the “nvidia” package from the arch repo your driver will probably break in the new kernel. I simply installed the binary from nvidia’s website. A simple way to do that against the new kernel is to download the binary, reboot, edit the grub entry by pressing “e” with linux-vfio selected, and then append “nomodeset systemd.unit=multi-user.target” to the end of the long line that begins with “linux …” so you do a one-time edit of the boot parameters. Then navigate to the binary and run “sh NVIDIA-###…sh”. It should disable nouveau if needed and install. Reboot and continue. You may also want to look into nvidia-dkms if you plan on updating your linux-vfio kernel regularly.


Once you’re in the linux-vfio kernel you need to enable the acs_override patch. The easiest way is to just use the downstream method. There are other options, but they should not be necessary. Add “pcie_acs_override=downstream” to the kernel command line in /etc/default/grub, then run “sudo grub-mkconfig -o /boot/grub/grub.cfg” once again to rebuild grub.cfg.

GRUB_CMDLINE_LINUX_DEFAULT="resume=UUID=6771936b-06b6-493c-b655-6f60122f5228 pcie_acs_override=downstream intel_iommu=on"

Reboot and then check your iommu group again.


[kemmler@arch ~]$ ls -lha /sys/bus/pci/devices/0000\:02\:00.0/iommu_group/devices/
total 0
drwxr-xr-x 2 root root 0 Sep 19 19:11 .
drwxr-xr-x 3 root root 0 Sep 19 19:11 ..
lrwxrwxrwx 1 root root 0 Sep 19 19:11 0000:02:00.0 -> ../../../../devices/pci0000:00/0000:00:01.1/0000:02:00.0
lrwxrwxrwx 1 root root 0 Sep 19 19:11 0000:02:00.1 -> ../../../../devices/pci0000:00/0000:00:01.1/0000:02:00.1

 

0000:01:00.0/1 are now missing from the initial output. This is exactly what we want to see. This means that the 980ti is now in its own IOMMU group and can be allocated to a VM by itself. We’re now ready to move on.


Setup Continued

The next thing we need to do is keep the Nvidia driver from grabbing the GPU we’re passing through to the VM. Nvidia is a dick and doesn’t conform to standards properly. You can’t easily unbind a gpu from the Nvidia driver, so we use a module called “pci-stub” to claim the card before nvidia can. We achieve this by putting pci-stub in our initramfs, whose modules are loaded early in the boot process, and passing a kernel parameter through grub that tells pci-stub which devices to claim. So, edit “/etc/mkinitcpio.conf” and add “pci-stub” to the MODULES="" line like so

MODULES="pci-stub"

If you’re running the stock kernel use this to rebuild your initramfs

sudo mkinitcpio -p linux

linux-vfio users run this

sudo mkinitcpio -p linux-vfio

Now we edit grub once again and add the pci-stub.ids option to bind our card to pci-stub. Get your device IDs from “lspci -nnk”. My IDs are “10de:17c8” and “10de:0fb0”, as seen here again

02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM200 [GeForce GTX 980 Ti] [10de:17c8] (rev a1)
Subsystem: ASUSTeK Computer Inc. Device [1043:8548]
Kernel driver in use: nvidia
Kernel modules: nouveau, nvidia
02:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:0fb0] (rev a1)
Subsystem: ASUSTeK Computer Inc. Device [1043:8548]
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel

Edit /etc/default/grub and then run “sudo grub-mkconfig -o /boot/grub/grub.cfg”. Your line should look similar to this. Remember that you need to pass everything that’s in the IOMMU group, so you need both the main card and its audio function if it’s there.


GRUB_CMDLINE_LINUX_DEFAULT="resume=UUID=6771936b-06b6-493c-b655-6f60122f5228 pcie_acs_override=downstream intel_iommu=on pci-stub.ids=10de:17c8,10de:0fb0"
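
If you don’t want to copy the IDs by hand, something like this can build the pci-stub.ids value from lspci output. The helper name stub_ids is hypothetical, and it assumes the usual “lspci -nn” line format shown above, where the vendor:device pair is the only bracketed hex pair on the device line.

```bash
#!/bin/bash

# stub_ids  (hypothetical helper)
# Reads "lspci -nn" style output on stdin and prints the vendor:device
# IDs as a comma-separated list suitable for pci-stub.ids=
stub_ids() {
    # Keep only device lines (they start with a bus address), which drops
    # the indented Subsystem/Kernel lines that "lspci -nnk" adds.
    grep -E '^([0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7] ' \
        | grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' \
        | tr -d '[]' \
        | paste -sd, -
}

# Example, for a card on bus 02:
#   echo "pci-stub.ids=$(lspci -nn -s 02:00 | stub_ids)"
```

Feeding it the two device lines from my lspci output above gives 10de:17c8,10de:0fb0, which is the same value I typed into grub by hand.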

Reboot. If everything goes to plan you should now see “pci-stub” as the module in use from your “lspci -nnk” output for the card.

Kernel driver in use: pci-stub
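
You can also ask sysfs directly instead of reading through lspci output. A minimal sketch, assuming the standard /sys/bus/pci/devices layout; pci_driver is just a name I picked, and the second argument only exists so the function can be pointed at another directory.

```bash
#!/bin/bash

# pci_driver DEVICE [DEVICES_ROOT]  (hypothetical helper)
# Prints the name of the kernel driver bound to DEVICE, or "none".
pci_driver() {
    local dev=$1 root=${2:-/sys/bus/pci/devices} link
    # The "driver" entry is a symlink into /sys/bus/pci/drivers/<name>.
    link=$(readlink "$root/$dev/driver") || { echo "none"; return 0; }
    echo "${link##*/}"
}

# Example:
#   pci_driver 0000:02:00.0
#   pci_driver 0000:02:00.1
```

The same check works later on when you want to confirm the card has moved over to vfio-pci.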

You’re all set. Now we can move on to installing the core software.


Setup KVM/QEMU

We need to install qemu first.

sudo pacman -S qemu

Next we need the UEFI bios called OVMF. Look here and get the edk2.git-ovmf-x64-###.noarch.rpm file. Install rpmextract

sudo pacman -S rpmextract

Next we’ll extract the files and copy them into /usr/share, keeping the same directory layout the package uses.

 
[kemmler@arch ovmf]$ ls
edk2.git-ovmf-x64-0-20150916.b1214.g2f667c5.noarch.rpm
[kemmler@arch ovmf]$ rpmextract.sh edk2.git-ovmf-x64-0-20150916.b1214.g2f667c5.noarch.rpm
[kemmler@arch ovmf]$ ls
edk2.git-ovmf-x64-0-20150916.b1214.g2f667c5.noarch.rpm usr
[kemmler@arch ovmf]$ sudo cp -R usr/share/* /usr/share/
[kemmler@arch ovmf]$ ls /usr/share/edk2.git/ovmf-x64/
OVMF_CODE-pure-efi.fd OVMF_CODE-with-csm.fd OVMF-pure-efi.fd OVMF_VARS-pure-efi.fd OVMF_VARS-with-csm.fd OVMF-with-csm.fd UefiShell.iso
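
Since the qemu scripts further down depend on two of these files, a quick sanity check doesn’t hurt. This is a sketch only; check_ovmf is a made-up name, and the default directory is simply where the files ended up on my system.

```bash
#!/bin/bash

# check_ovmf [DIR]  (hypothetical helper)
# Succeeds only if both OVMF files used by the qemu scripts are readable.
check_ovmf() {
    local dir=${1:-/usr/share/edk2.git/ovmf-x64} f missing=0
    for f in OVMF_CODE-pure-efi.fd OVMF_VARS-pure-efi.fd; do
        if [ ! -r "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example:
#   check_ovmf && echo "OVMF looks good"
```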

Now we need to create our vfio-bind script that will replace our pci-stub placeholder driver. I suggest putting it in /usr/bin/vfio-bind and then running “chmod +x /usr/bin/vfio-bind”


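#!/bin/bash
# Bind each PCI device given on the command line (e.g. 0000:02:00.0) to vfio-pci.
# Needs to run as root.

modprobe vfio-pci

for dev in "$@"; do
    vendor=$(cat "/sys/bus/pci/devices/$dev/vendor")
    device=$(cat "/sys/bus/pci/devices/$dev/device")
    # Release the device from whatever driver currently holds it (pci-stub here).
    if [ -e "/sys/bus/pci/devices/$dev/driver" ]; then
        echo "$dev" > "/sys/bus/pci/devices/$dev/driver/unbind"
    fi
    # Register the vendor/device pair with vfio-pci so it claims the device.
    echo "$vendor" "$device" > /sys/bus/pci/drivers/vfio-pci/new_id
done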

Next we bind our gpu. Remember to bind the whole gpu if necessary, which means both the video and audio functions if present. Replace # with your pci bus number

sudo vfio-bind 0000:0#:00.0 0000:0#:00.1

Verify that vfio-pci is now in control of the gpu with “lspci -nnk”

Kernel driver in use: vfio-pci

Now we can test it out and see if it works! Make sure to verify paths are correct, change your pci bus ID, and remove the second pci bus line if you only have one for your card. Throw this script in a file and run it like you did with the vfio-bind script. Once you do that you should be able to switch the input on your monitor or KVM switch and be greeted with a black terminal with yellow text. This is the UEFI shell and means that everything is working wonderfully!


#!/bin/bash

cp /usr/share/edk2.git/ovmf-x64/OVMF_VARS-pure-efi.fd /tmp/my_vars.fd
qemu-system-x86_64 \
-enable-kvm \
-m 2048 \
-cpu host,kvm=off \
-vga none \
-device vfio-pci,host=02:00.0,multifunction=on \
-device vfio-pci,host=02:00.1 \
-drive if=pflash,format=raw,readonly,file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd \
-drive if=pflash,format=raw,file=/tmp/my_vars.fd
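
If your card only has one function, or you just don’t want to hand-edit the -device lines, you could generate them instead. A rough sketch, with vfio_device_args being a name I invented; it follows the same pattern as my script, where only the first address gets multifunction=on.

```bash
#!/bin/bash

# vfio_device_args ADDR...  (hypothetical helper)
# Prints one qemu argument per line: "-device" followed by its value,
# for every PCI address given. The first address gets multifunction=on.
vfio_device_args() {
    local first=1 addr
    for addr in "$@"; do
        if [ "$first" -eq 1 ]; then
            printf '%s\n' "-device" "vfio-pci,host=$addr,multifunction=on"
            first=0
        else
            printf '%s\n' "-device" "vfio-pci,host=$addr"
        fi
    done
}

# Example: collect the arguments into an array and hand them to qemu.
#   mapfile -t gpu_args < <(vfio_device_args 02:00.0 02:00.1)
#   qemu-system-x86_64 -enable-kvm -m 2048 -vga none "${gpu_args[@]}" ...
```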

Setting Up Windows

Now that we have video out all we need to do next is change our script up and install windows to a disk or to a qcow2 container. I’ll be doing the latter but can help with the former. You need a windows iso. I’m using windows 10 enterprise. You also need to get the VirtIO drivers from redhat. You can download them here. I am using the “Latest virtio-win iso” but stable should also be fine. First let’s create our qcow2 container. I’m using a few params to increase performance, if you have any tips on better methods I’d be glad to hear it. Command below. Change 120G to whatever size you want your container to be.

qemu-img create -f qcow2 -o preallocation=metadata,compat=1.1,lazy_refcounts=on win.img 120G

Next we modify our script to include the windows iso, virtio iso, and our new container. Once again, verify all path names. I’m also passing through my usb keyboard to the guest; that’s the line that begins with -usb. Find your usb device’s ID by using “lsusb”. I’d recommend not passing through your mouse just yet so that if you fuck up you can simply close the black qemu window that pops up to get your keyboard back. Alternatively, just hook up a second keyboard if you have one. I always keep a spare around in case windows hangs. Notice I’m using writeback cache on my qcow2 image. Remove that if you do not want it.


#!/bin/bash

cp /usr/share/edk2.git/ovmf-x64/OVMF_VARS-pure-efi.fd /tmp/my_vars.fd
qemu-system-x86_64 \
-enable-kvm \
-m 2048 \
-cpu host,kvm=off \
-vga none \
-usb -usbdevice host:1b1c:1b09 \
-device vfio-pci,host=02:00.0,multifunction=on \
-device vfio-pci,host=02:00.1 \
-drive if=pflash,format=raw,readonly,file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd \
-drive if=pflash,format=raw,file=/tmp/my_vars.fd \
-device virtio-scsi-pci,id=scsi \
-drive file=/home/kemmler/kvm/win10.iso,id=isocd,format=raw,if=none -device scsi-cd,drive=isocd \
-drive file=/home/kemmler/kvm/win.img,id=disk,format=qcow2,if=none,cache=writeback -device scsi-hd,drive=disk \
-drive file=/home/kemmler/kvm/virt.iso,id=virtiocd,if=none,format=raw -device ide-cd,bus=ide.1,drive=virtiocd
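
The host:1b1c:1b09 value in the script above is just the ID column from lsusb with “host:” in front of it. Here’s a small sketch that pulls it out for you, assuming the normal “Bus ... Device ...: ID xxxx:xxxx Name” lsusb format. usb_host_arg is a made-up name.

```bash
#!/bin/bash

# usb_host_arg PATTERN  (hypothetical helper)
# Reads lsusb output on stdin and prints "host:vendor:product" for the
# first device whose line matches PATTERN (case-insensitive).
usb_host_arg() {
    grep -i -- "$1" \
        | sed -n 's/^Bus [0-9]* Device [0-9]*: ID \([0-9a-f]\{4\}:[0-9a-f]\{4\}\).*/host:\1/p' \
        | head -n 1
}

# Example:
#   lsusb | usb_host_arg keyboard
```

If more than one device matches the pattern you only get the first, so double check the result against the full lsusb output before you hand your only keyboard to the guest.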

Once you boot you should see the “press any key to boot from cd…” prompt. If you miss it you’ll eventually get dumped into a Shell> prompt. Just type “exit”, hit enter, and navigate to “Boot Manager”. Select the first SCSI device and you should get the prompt again. Go through and install windows as you normally would. When you get to the disk selection screen it will prompt you for a driver. Navigate to the virtio iso, virtscsi folder, Windows 8.1, x64, assuming you’re using windows 10 x64 like me. Go through the rest of the install and then shut it down when done. Next we’re going to get sound working, add our mouse to the usb passthrough, and set our monitors to switch automatically with xrandr.

Run “xrandr” to find your output device names. They correspond to your connection types, e.g. DVI-D-0. I’m going to be using two monitors while in windows and leaving one for linux to keep conky running to monitor the system. The two monitors I’ll be switching are called “DVI-I-1” and “DVI-D-0”. I’m also changing the cpu values to match my 4790k and my ram to 8GB. Note the mode and the rate on the xrandr commands. These refer to the resolution and refresh rate respectively.


#!/bin/bash

xrandr --output DVI-I-1 --off
xrandr --output DVI-D-0 --off
cp /usr/share/edk2.git/ovmf-x64/OVMF_VARS-pure-efi.fd /tmp/my_vars.fd
# Exported so the qemu process started below actually sees them.
export QEMU_PA_SAMPLES=128 QEMU_AUDIO_DRV=pa
qemu-system-x86_64 \
-enable-kvm \
-m 8192 \
-smp cores=4,threads=2 \
-cpu host,kvm=off \
-vga none \
-soundhw hda \
-usb -usbdevice host:1b1c:1b09 -usbdevice host:046d:c07d \
-device vfio-pci,host=02:00.0,multifunction=on \
-device vfio-pci,host=02:00.1 \
-drive if=pflash,format=raw,readonly,file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd \
-drive if=pflash,format=raw,file=/tmp/my_vars.fd \
-device virtio-scsi-pci,id=scsi \
-drive file=/home/kemmler/kvm/win10.iso,id=isocd,format=raw,if=none -device scsi-cd,drive=isocd \
-drive file=/home/kemmler/kvm/win.img,id=disk,format=qcow2,if=none,cache=writeback -device scsi-hd,drive=disk \
-drive file=/home/kemmler/kvm/virt.iso,id=virtiocd,if=none,format=raw -device ide-cd,bus=ide.1,drive=virtiocd
xrandr --output DVI-D-0 --mode "2560x1440" --rate 60 --left-of HDMI-0
xrandr --output DVI-I-1 --mode "1920x1080" --rate 144 --left-of DVI-D-0
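
If you want the monitor switching separated from the VM script, here’s one way it could look. It’s a sketch, not what I actually run: with_outputs_off and the XRANDR variable are names I made up, and it re-enables the outputs with “xrandr --auto” instead of the explicit --mode/--rate/--left-of settings above, so swap those back in if you care about layout.

```bash
#!/bin/bash

# Program used to switch outputs; overridable, mainly for dry runs.
XRANDR=${XRANDR:-xrandr}

# with_outputs_off "OUT1 OUT2 ..." COMMAND [ARGS...]  (hypothetical helper)
# Turns the listed outputs off, runs COMMAND, then turns them back on.
# Returns COMMAND's exit status.
with_outputs_off() {
    local outputs=$1 out status
    shift
    # $outputs is left unquoted on purpose so it splits into names.
    for out in $outputs; do
        "$XRANDR" --output "$out" --off
    done
    "$@"
    status=$?
    for out in $outputs; do
        "$XRANDR" --output "$out" --auto
    done
    return "$status"
}

# Example (windows.sh stands in for whatever you named the qemu script):
#   with_outputs_off "DVI-I-1 DVI-D-0" ./windows.sh
```

Setting XRANDR=echo before calling it prints the xrandr arguments instead of running them, which is handy for checking the output names first.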

From here you should be able to install the nvidia driver, steam, etc. You’ll probably need to reboot to get the nvidia driver working. Once that’s done everything should be great.


Common Questions/Problems and Final Thoughts

  1. Can I SLI in the guest VM? Short answer is probably not. Neither of my lga 1150 motherboards will allow me to try.
  2. Can I use two identical GPUs with one being in each OS? You will run into problems binding just one with pci-stub if both gpus share the same identifier found in “lspci -nnk”. pci-stub will bind both of them and the nvidia driver will cease to function in linux. A potential workaround is to use xen’s pciback module. This will allow you to grab based on the pci bus rather than the device id but I haven’t tried this.
  3. Sound isn’t working! I’m using pulseaudio and the hda device. Honestly you’ll have to experiment. There’s a 200+ page forum post with people posting working configs here
  4. The guide is too long or not easy enough. This guide isn’t for you then.
  5. Can I use X device? Generally you can pass through any pci or usb device. If you want a definitive answer, short of finding someone on google who claims it works, the only way would be to donate the part to me so I can figure it out.
  6. My windows ISO won’t boot! Make sure you’re using an unmodified copy. I’m not positive, but a lot of the dual purpose and hand crafted ISOs don’t include bootx64.efi and I believe that’s the cause. I can confirm that a clean version of Windows 10 Enterprise x64 boots just fine.

Hopefully you found this guide useful. I’m sure it seems like a gigantic pain in the ass, but really once you set it up you don’t have to mess with it again. I cannot stress enough how useful it is to simply be able to boot a VM, play what I want, and then turn it off without having to close all my shit by dual booting constantly. So let me know if this guide helped or how I can improve on it!