GPU-accelerated VMs on Proxmox, XCP-ng? Here's what you need to know

Broadcom's acquisition of VMware has sent many scrambling for alternatives. Two of the biggest beneficiaries of Broadcom's price hikes, at least on the free and open source side of things, have been the Proxmox VE and XCP-ng hypervisors. At the same time, interest in enterprise AI has taken off in earnest. With so many making …

  1. Anonymous Coward

    In the meantime on ESXi

    GPU passthrough on ESXi:

    - Find the GPU in the hardware list and click on it

    - Click on the passthrough toggle button to enable passthrough

    - Reboot host

    - Open VM properties and add GPU

    That's it. No manual fiddling with PCI addresses or the command line.

    Considering that Proxmox and XCP-ng are often recommended alternatives to ESXi, the fact that both make a very basic task like passing through a PCIe device so convoluted is disappointing. And that's not just true for PCIe passthrough.

    But then, neither are really replacements for ESXi in larger installations anyways (Proxmox lacks a decent single pane of glass, suffers from occasional reliability issues and doesn't have enterprise grade support behind it, and XCP-ng is quite literally a technological dead end with the zombie that is Xen, while bringing back a lot of the misery from old versions of XenServer).

    1. abend0c4 Silver badge

      Re: In the meantime on ESXi

      "Proxmox and XCP-ng are often recommended alternatives to ESXi,"

      I'm not sure they are. They are a frequent answer to the question "if I find myself unable to use ESXi, what else is available?", but that doesn't make them equivalent to ESXi - or even to each other.

      They're both generally less fussy about hardware, which means they have wider applicability, but that comes from running under a Linux kernel, so there's some unavoidable baggage in getting the Linux kernel to enable the right processor options and ignore the hardware that's going to be used exclusively for virtual machines. But they're serving a somewhat different audience.

      I can run KVM (which basically underpins Proxmox) on my laptop and have VMs with accelerated graphics on the laptop screen. That's simply not an ESXi use case and I have no problem with that. Horses for courses.

      1. t0m5k1

        Re: In the meantime on ESXi

        Since ESXi became subscription based, many cannot budget for it, and so many (like my employer, a large disti and service provider in the IT channel) have moved from ESXi over to Proxmox.

        For us this change was relatively painless, and Proxmox now provides a full migration feature which made the process even faster.

      2. Bebu

        Re: In the meantime on ESXi

        《if I find myself unable to use ESXi, what else is available?》

        Currently running Proxmox - I quite like it as it's pretty obvious and transparent. The Debian foundations mean you can fiddle with it. The out of the box ZFS support is nice. ;) In the process of configuring iSCSI storage - looks a little clunky but I shall see.

        I had a quick play with SmartOS (Illumos based) which supports both kvm and bhyve virtualization - I rather liked it but then I herded Sun boxes and BSD like Unixes for years. :)

        Haven't looked at Nutanix, which I believe is attracting the greatest share of VMware/Broadcom émigrés and refugees.

  2. Grunchy Silver badge

    Works marvelous! A little shrill, perhaps..

    Craft computing on YouTube has several “how to do it” videos for setting up a multi-user gaming server with outdated server cards and Parsec and/or Sunshine/Moonlight remote desktop.

    I got a 2nd hand Datto NAS equipped with a Celeron including onboard GPU, I was able to enable virtualization, install the Proxmox, install a butt-load of servers, and also an Ubuntu container running “eye spy” agent freeware monitoring several network security cameras strategically placed around the compound with full hardware accelerated encoding. The only hiccough arose because I decided to enable NFS for one of the ZFS volumes & apparmor started howling about unprivileged containers. Funny thing, up until that moment everything inexplicably “just works,” so you press your luck “just one more time” and now apparmor howls at me every single login, it really is very shrill. Wish it weren’t such a banshee but I suppose it’s only expressing its true nature.

    1. David 132 Silver badge

      Re: Works marvelous! A little shrill, perhaps..

      Upvoted. Also, other than books written more than 40 years ago, your comment may be the first instance of the spelling variant “hiccough” I have seen in quite a while :)

    2. diodesign (Written by Reg staff) Silver badge

      Re: A little shrill

      Hi -- glad you enjoyed the piece. Genuinely curious so that we can improve our writing: What did you think was shrill?

      C.

  3. Anonymous Coward

    Ovirt and OpenStack

    Need a section

  4. Tubz Silver badge

    Just set up Proxmox + OPNsense + pi-hole + Unbound DNS to replace my router on an old Dell T3420 Xeon 8-Core 64GB mini-desktop that I got off eBay for £86. The old router is now just an AP; wi-fi is perfectly acceptable for my connection needs. Just got sick of buying overpriced ARM based con..sumer rubbish that manufacturers don't support.

  5. Bebu

    "you may want to set up a display via ... the motherboard's integrated graphics."

    Worth trying on an old Z640 as it's got a Quadro M6000 (3000 cores) and I can use the integrated graphics... Oh sod it, it's got a Xeon (E5) so no integrated graphics, and squeezing another card in (the M6000 takes two slots and the SAS controller another) might be too big an ask. :(

    An interesting exercise nonetheless so I might persevere with it, not that I have any desire to run LLM software. ;)

    Just trying these apparently peculiar exercises often does have some practical, if unforeseen, application down the track.

  6. Anonymous Coward

    Looks production ready to me. ;-)

  7. BinkyTheMagicPaperclip Silver badge

    Missing a fair bit of detail

    The article will get you up and running to some extent, but there are a lot of gotchas. This really is something ESXi does much better than the alternatives.

    It's only 'a basic task' as the first poster notes, when a load of code has been written and a strict hardware compatibility list is in use. Although it's hardly insurmountable elsewhere.

    As is noted later in the article, when passing through a GPU be certain to pass through the audio device in addition to the graphics device, the HDMI audio part usually being function 1 of the same device (nnnn:nn:nn.1).
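    To illustrate the point about the .1 audio function, here is a minimal Python sketch that, given a GPU's PCI address, picks out every function sharing the same domain:bus:device. The function name `sibling_functions` is made up for this example, and it assumes addresses in the usual Linux nnnn:nn:nn.n form.

```python
import re

# Matches a full PCI address such as 0000:01:00.0 (domain:bus:device.function).
PCI_ADDRESS = re.compile(r"^([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2})\.([0-7])$")


def sibling_functions(gpu_address, all_addresses):
    """Return every address sharing the GPU's domain:bus:device, sorted.

    Hypothetical helper for illustration: the result includes the GPU itself
    plus any companion functions (typically the HDMI audio device at .1).
    """
    match = PCI_ADDRESS.match(gpu_address.lower())
    if match is None:
        raise ValueError(f"not a PCI address: {gpu_address!r}")
    slot_prefix = match.group(1) + "."
    return sorted(a.lower() for a in all_addresses
                  if a.lower().startswith(slot_prefix))
```

    On a Linux host the second argument could be `os.listdir("/sys/bus/pci/devices")`; everything the function returns is a candidate to hand to the VM alongside the GPU.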

    On average you'll experience far more success with the official closed source Nvidia drivers in the VM than using Nouveau.

    There is nothing stopping you passing multiple GPUs or other PCI-e (or indeed PCI) devices to the same VM. Which card gets the BIOS output on boot, if you're daft enough to use the passed through card as the only graphics output, may be another question though.

    I'd recommend googling 'PCI_passthrough_via_OVMF' for Arch Linux to check IOMMU groups on your hardware. ESXi makes this easy by displaying it graphically. It's enforced/documented to a varying degree on other software. If for some reason you're passing through PCI, all PCI slots are by definition in the same IOMMU group because they don't feature root ports. Note also that any onboard graphics may be PCI. Stick to PCI-e!

    Most modern systems are probably OK, with one device per group. With older systems you may struggle. All devices in the same group must be passed through to the VM at the same time. If they're owned by and written to by different VMs, or by the host and a VM, Very Bad Things Occur. If you're a consumer and want to fiddle with an old motherboard that supports VT-d beware, there are many substandard BIOSes out there. Use modern hardware on a HCL.
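    For anyone wanting to check the groups on a Linux-based host (Proxmox, plain KVM), a rough Python sketch follows. It assumes the kernel exposes groups under /sys/kernel/iommu_groups, which only happens when the IOMMU is enabled; the function names are invented for this example.

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains.

    Returns an empty dict when the directory is missing, which usually
    means the IOMMU is disabled in firmware or on the kernel command line.
    """
    groups = {}
    if not os.path.isdir(root):
        return groups
    for name in os.listdir(root):
        devices_dir = os.path.join(root, name, "devices")
        if name.isdigit() and os.path.isdir(devices_dir):
            groups[int(name)] = sorted(os.listdir(devices_dir))
    return groups


def shared_groups(groups):
    """Return only the groups holding more than one device."""
    return {group: devs for group, devs in groups.items() if len(devs) > 1}


if __name__ == "__main__":
    for group, devices in sorted(iommu_groups().items()):
        print(f"group {group}: {' '.join(devices)}")
```

    A GPU sharing a group with its own audio function is expected. A group that mixes the GPU with unrelated devices is the case to worry about, since the whole group has to go to the VM together.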

    Unless you have a fancy and expensive multi controller USB card, you cannot pass through each port individually to different VMs. Use the USB mapping facilities within your virtualisation product.

    You may also not be able to pass through ports from a discrete USB card if onboard legacy USB is enabled (this is a BIOS setting). Except 'legacy USB' is used by some BMCs to provide virtual media support, so you can't pass through discrete USB cards/ports and have virtual media enabled at the same time.

    Use Nvidia for GPU passthrough. Do not bother with AMD unless it is a workstation card that explicitly supports passthrough (they did make one at some point). Other AMD GPUs will pass through fine *once* (and 'once' means without rebooting the VM). Thereafter, unless you're lucky and using specific (unreliable) workarounds, it may need a cold host reboot to get it working again.

    I've used ESXi, Xen & XCP, KVM, and FreeBSD (bhyve) for passthrough in the past. ESXi is generally a doddle. KVM had GPU passthrough working years ago, Xen was more of a hassle but will have caught up now. bhyve is bleeding edge - it *does* work to some extent but it's a very movable feast and is immature as passthrough on VM systems goes, expect pain.

    1. Anonymous Coward

      Re: Missing a fair bit of detail

      "It's only 'a basic task' as the first poster notes, when a load of code has been written and a strict hardware compatibility list is in use. "

      Wrong. Whether a PCIe device is officially supported and on VMware's/Broadcom's HCL or not is completely irrelevant for PCIe passthrough on ESXi, because ESXi's hardware list shows every PCIe device no matter its support status, and passing through a non-supported device works exactly the same as for one which is supported.

      It is, quite literally, always the basic task as noted in the first post.

      1. BinkyTheMagicPaperclip Silver badge

        Re: Missing a fair bit of detail

        This article is written about alternatives to ESXi. When I say it is not always a basic task, I mean *if you're not using ESXi*. Given the context and the line above where I say 'ESXi does it better' I didn't feel I needed to spell this out.

        It is an easy task for the end user under ESXi because they have implemented the fiddly details and recommend configurations that work. It is not a *basic task* in terms of the operations carried out behind the scenes, the workarounds for various flawed chipsets, and so on.

        If it was so easy every hypervisor out there would support passthrough flawlessly. Instead many halt at the functionality level of providing a guest environment with a very limited set of virtual devices. This is because anything else is difficult, fiddly, and subject to compensating for hardware and BIOS bugs.

        The HCL is a list of configurations where VMware will provide support, and you can be confident the combination will work. It will not necessarily prevent either passthrough or ESXi installation on bare metal (except where VMware have removed driver support for certain devices such as the onboard NIC).

        Take ESXi, an Nvidia GPU on the HCL, and the official Nvidia drivers on several different guest operating systems and it will in general Just Work.

        Take ESXi, an AMD GPU not on the HCL, and the official AMD drivers and it will probably work once until you reboot the VM. If the VM unexpectedly bluescreens, the workarounds included in the Linux kernel/Windows drivers to compensate for AMD GPUs not supporting FLR will not run, and it's necessary to reboot the bare metal. This is much less probable with Nvidia hardware.
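        As an aside for Linux-based hosts rather than ESXi: a quick way to see whether the kernel thinks a device can do FLR is to read its reset_method attribute in sysfs. The sketch below assumes a reasonably recent kernel that exposes /sys/bus/pci/devices/<address>/reset_method (older kernels do not have it), and the helper names are made up for this example.

```python
import os


def reset_methods(address, root="/sys/bus/pci/devices"):
    """Return the reset methods the kernel lists for a PCI device.

    Assumes the sysfs attribute holds a space-separated list such as
    "flr bus". An empty list means the attribute is missing, unreadable,
    or empty, so it is not proof that the device cannot be reset.
    """
    path = os.path.join(root, address, "reset_method")
    try:
        with open(path) as handle:
            return handle.read().split()
    except OSError:
        return []


def supports_flr(address, root="/sys/bus/pci/devices"):
    """True when function level reset appears among the listed methods."""
    return "flr" in reset_methods(address, root)
```

        A card that only lists a bus reset, or nothing at all, is the sort that tends to need the host rebooting after a guest crash.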

        Take bhyve under FreeBSD and it will work to varying degrees on different operating systems from 'quite well' to 'maybe once if you're lucky'.

        That is assuming what's occurring here is vGPU to use the card only as an accelerator, rather than GPU passthrough where the GPU is used for graphics output. Graphics cards are a nightmare in terms of BIOS interaction and other legacy resources used.

        This is why it is not 'a basic task'.
