VMware to kill SD cards and USB drives as vSphere boot options

VMware has warned users it will end support for non-persistent removable storage as a boot medium for its flagship vSphere VM-wrangler. A post last week delivered the news. "ESXi Boot configuration with only SD card or USB drive, without any persistent device, is deprecated with vSphere 7 Update 3," the post states. "In …

  1. elsergiovolador Silver badge

    Nanny

    We are now witnessing boot-related problems more frequently with ESXi 7.x

    How exactly are they witnessing it? Through "telemetry"?

    The nanny syndrome is creeping in...

    1. NoneSuch Silver badge
      FAIL

      Re: Nanny

      Had to roll back twenty-seven hosts to 6.7 after boot issues on 7.0.

      Absolute f****** shamble costing me and my team a full weekend.

      1. TonyJ

        Re: Nanny

        <i>"...Had to roll back twenty-seven hosts to 6.7 after boot issues on 7.0.

        Absolute f****** shamble costing me and my team a full weekend..."</i>

        Pre-deployment testing? Rollback plan? Testing in between deployments?

        I would be genuinely interested how you got to 27 hosts before you noticed any issues. Surely you tested a limited number some time before a mass rollout?

        1. DJohnson

          Re: Nanny

          It's one of those delightful issues that may not show obvious symptoms for days or weeks, so it's actually easy for people to do all the right testing and still miss this. If you're looking at the vmkernel log you *might* notice some extra churn before host issues pop up.

          And yes, they are "noticing" because of people opening support cases.

        2. Anonymous Coward
          Anonymous Coward

          Re: Nanny

          It takes weeks or months to show up, depending upon how good / bad your SD cards are.

          We have many servers running ESXi on SD cards; they have been running ESXi since v4 (they can't update any further as they are no longer supported). They have worked for 10+ years, but for some reason ESXi 7 will kill these SD cards within weeks. They screwed up, logging far too much (no doubt due to the changes for Kubernetes).

          As this is how we have run things for a while and it has worked well, with zero problems in over 10 years, we purchased replacement R7515s from Dell 2 years ago with the SD boot option for ESXi. But now we need to purchase new drives for these servers as they can't run ESXi 7.

    2. Jon 37

      Re: Nanny

      They are probably getting support calls from their customers. They don't need telemetry to see the problem.

      As far as "nanny" goes... If people call VMware for support that costs them money. I mean, one call makes no difference to them, but lots of calls mean they have to hire more support staff. So they are banning a whole category of support calls, which will save VMware money.

      It also gives IT departments a stick to beat the beancounters with: "VMware changed the rules, our old configuration is no longer supported" is a much stronger argument than "I'm worried the SD cards might fail soon". That argument also shifts blame nicely - it's all VMware's fault that they have to spend money and time reconfiguring systems, everyone in the company is blameless. Yes, this paragraph is full of nonsense big company bullsh*t, but that doesn't make it wrong.

      1. elsergiovolador Silver badge

        Re: Nanny

        If people call VMware for support that costs them money.

        Do they offer free support? I would think that support costs their customers money or is it just another victim of overselling?

        1. Jon 37

          Re: Nanny

          Many years ago, you paid annually for "support". I assume it's still the same. So VMware get the same amount of money if you made 50 calls or none. VMware would obviously rather get the money without having to do anything.

    3. DougMac

      Re: Nanny

      I already saw direct evidence of this.

      I gave up on USB boot/SD-card boot of any server (including many VMware hypervisors) after so many failures over time. I experienced this with all other OSs as well.

      Sure, it works at first. And if you only have a few servers, you probably won't notice it that much. But if you have a lot of servers, you will see large #s of failures over time.

      Most "appliance" PCs come with Flash DOM modules, which are a bit more robust, but I have still had to replace many a DOM module as well.

      Full-on SSDs have had a normal small range of failures from a large fleet of servers, well within my expected range. SD-card and USB flash boot failures are well over 50% over enough time in my environment.

      1. Sgt_Oddball

        Re: Nanny

        I second that. Old place of work used to have USB sticks for ESXi instances, but after the second death we ripped out the CD drive and hooked up a small but quality SATA SSD instead.

        No loss of SAS drives or hard disk space, just lose the DVD drive (which can be conveniently replaced with a usb dvd drive in a pinch).

        The biggest issue was getting hold of an adapter cable for HP's combi sata data+power cable.

    4. reeferman
      Mushroom

      Re: Nanny

      Definitely not nanny syndrome. More like customers screaming down the phone and sending SWAT teams to their homes.

      We've had to open support cases with VMware in relation to this. Expensive Dell hardware (FX2 / FC630) that has been rock solid for 3+ years with 6.x started to show problems with 7.x after an unspecified period. The problem was not extant during precursor testing and took weeks to show up. Net result is no rollback, as our entire real estate was over to 7.x before we knew the extent of the issue.

      Problem is down to a change in VMware 7.x which results in many more frequent writes/reads, choking on USB/SD storage, rendering the hypervisor unresponsive and requiring a fool reboot to fix. VMware recently released update 2c after much pressure to address this.

      Short version? Seems VMware 7.x was not adequately tested prior to release to the customer base and this should have been picked up.

      Rather than unpick some of the changes in 7.x the PHBs at VMware have simply decided to use the Unsupported Hardware billy club to make the problem crawl away and die.

      1. Martin M

        Re: Nanny

        A ‘fool reboot’ sounds a bit BOFH.

    5. Anonymous Coward
      Anonymous Coward

      Re: Nanny

      There is a VMware product that auto sends them your logs which allegedly can save them time when trying to diagnose an issue.

      Vrealize may have some insight into the issues, but I’d assume that an sd or usb card issue would be obvious.

      It looks like their change to the multi-partition methodology concentrates reads/writes in a constrained area of those cards, preventing the cards' hardware from doing wear levelling and causing accelerated wear to the most frequently used locations.

      Moving those frequent read/writes off card would be the best answer and looks like what they are proposing.

      Some kind of remote boot may be an answer, like going back to the future.

  2. John Robson Silver badge

    So do they sell magical SSDs that don't wear out?

    1. elsergiovolador Silver badge

      Maybe their customers are not aware of SSDs?

    2. Oneman2Many

      It's relative to the lifespan of the system. 10 years should be enough.

    3. Sgt_Oddball

      If it's a drive...

      Rated at something obscene like 2 full re-writes a day for 3+ years, then it's pretty much a given that using it to manage an ESXi host will mean the drive will outlast the server's usefulness without dying.
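
For a rough sense of what a rating like that means: an endurance figure quoted in drive writes per day (DWPD) converts to total terabytes written (TBW) as capacity × DWPD × days. A minimal Python sketch, assuming a hypothetical 240 GB drive (the capacity is not from the post, and `rated_tbw` is just an illustrative helper name):

```python
def rated_tbw(capacity_gb: float, dwpd: float, years: float) -> float:
    """Total terabytes written implied by a drive-writes-per-day rating."""
    days = years * 365
    # capacity (GB) x full rewrites per day x days, converted to decimal TB
    return capacity_gb * dwpd * days / 1000


# Assumed example: 240 GB drive, 2 full rewrites a day, 3 years
print(rated_tbw(240, 2, 3))  # 525.6
```

Under those assumed numbers the rating works out to a bit over 500 TB of writes across the rated period.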

    4. DavidYorkshire Silver badge

      SSDs will last a lot longer. I had to put them in our two main hosts urgently a while back as it was clear that the SD cards were all failing at once (two SD cards in each host for supposed redundancy).

      The bigger question is why major server manufacturers ever thought it was a good idea to use SD cards as boot devices.

      There is the added advantage that they now boot much faster too.

      1. John Robson Silver badge

        a USB stick as a boot device is fine - it's maintaining a database on it that's the issue.

        Read only boot devices don't have a wear levelling issue.

  3. Alistair
    Windows

    NOooooo....

    The entire point of vmware allowing SD cards/USB sticks for boot was to allow HP and IBM to ship systems with the vmware boot image on the internal sd card, prepped and ready to go. The *idea* was that admins would run a proper install once racked and cabled, but folks is lazy and tend to run off the sd card. Basically, getting around the 'you can't sell our software man' legal rider they had in place. Later, vmware would audit the site and find x unregistered, unlicensed installs, and could oracle all over the client.

    Fishes, lures, hooks and gaffs.

    (And yes, I found over 150 of those when we decided to actually hunt down the underdesk, internal dev, but it works for production systems)

    1. Anonymous Coward
      Anonymous Coward

      Re: NOooooo....

      I (ab)use SD/USB boot routinely on my machines now. Why? Because I pass-through the HBA and run ZFS on the actual drives.

      Taking this away is just stupid. Sure, SD cards die if you write to them excessively. VMware shouldn’t be doing that, VMware are just too lazy to fix their crap. If you have GB of RAM to run VMs, there’s no reason it can’t carve out a few MB to make a RAM disk for the logs. This is exactly what SmartOS does!

      1. spuck

        Re: NOooooo....

        Logs in a RAM disk are all well and good, until you need them after a reboot.

        1. Anonymous Coward
          Anonymous Coward

          Re: NOooooo....

          Send them to a remote syslog server? VMware already supports that, I believe.

          1. Anonymous Coward
            Anonymous Coward

            Re: NOooooo....

            They still get written locally

    2. Androgynous Cow Herd

      Re: NOooooo....

      A lot of that work was pioneered at Dell, back in the day...something code named "VESO" that became the R805...

  4. Nate Amsden

    Almost never liked the thought of usb/sd boot

    I remember back when ESXi first came out and everyone was touting SD card and USB drive booting. Servers started coming with internal(??) SD card slots and stuff. The company I was at at the time deployed some using USB sticks, I think, and had failures pretty quickly (had a failure within 4-6 months). At that point I realized I really didn't like the thought of the boot device for a $10-30k+ server being reliant upon such a cheap piece of crap for a boot drive.

    I looked a few times but could never find reviews or rankings of higher endurance usb drives/sd cards (perhaps that changed in recent years). HP (and probably others too) came out with a dual micro SD(?) USB stick at one point, I inherited 4 servers that ran that, and went through the associated recall (https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-c05369827). Add to that as far as I could tell it was not possible to tell the status of the individual SD cards, you'd only know if both failed. Dell has a BOSS(?) card that sits in a PCI slot I think that uses NVMe drives, sounds pretty neat.

    I realize it worked fine for many people for many years. Past 10 years all of my hosts have been fibre channel boot from SAN. Except my personal esxi hosts which use local SSD storage.

    1. Anonymous Coward
      Anonymous Coward

      Re: Almost never liked the thought of usb/sd boot

      "Dell has a BOSS(?) card that sits in a PCI slot I think that uses NVMe drives, sounds pretty neat."

      That is until it fails and then you need to open it up to replace the drives.

      If it's an ESXi server that is not a node using all its drives in vSAN, then a BOSS card probably should not be used.

      Use normal hot-swap SAS/SATA SSDs: no need to migrate your VMs, shut down the server, open it up and replace the drives; just pull the drive and put in a new one. It's also cheaper and doesn't use up a PCI slot that you may want to use for something else.

  5. K

    Does my head in... I've used USB to boot dozens of ESXi for 10+ years... I've never had a single problem. I even have 5 at home running ESXi 7 with zero issues!

    Rather than dictating, they should simply ask, and suggest it has limited support. This all stems from "Two customers had problems, so we're going to screw you all over."

    I may look to see if PXE is feasible... I ain't wasting any more money on storage.

  6. picturethis
    FAIL

    Poor training....

    We've been running ESXi 5.x, 6.x for close to 10 years w/ the internal SD card booting. NO issues.

    But, on every machine, any tmp/temp files, all log files get redirected to external (Enterprise) storage where possible. This is done to reduce the exact problem they have (finally?) started thinking about: SD card writes.

    Rather than making such an idiotic move (eliminating/not supporting SD boot), they should provide some guidance on how to reduce writes to the SD card.

    Really VMWARE? Stop being lazy.

    An alternative is to copy/replace the SD card every once-in-a-while (every year or two?) during a maintenance cycle.

    Although, I recently moved to SSD boot for RPi4s for the exact same reason, now that this capability is easier to set up and reversible, but there is also no convenience penalty there, unlike on an enterprise system.

    1. Nate Amsden

      Re: Poor training....

      apparently writes aren't the only issue

      https://kb.vmware.com/s/article/2149257

      "High frequency of read operations on VMware Tools image may cause SD card corruption (2149257)"

      dates back to 6.0 and 6.5

      other issues

      https://kb.vmware.com/s/article/83376 Connection to the /bootbank partition intermittently breaks when you use USB or SD devices

      (note applies to 6.7 too with no resolution available)

      https://kb.vmware.com/s/article/83963 Bootbank cannot be found at path '/bootbank' errors being seen after upgrading to ESXi 7.0 U2

      probably others too just 3 that I saw in a recent thread elsewhere.

      Also vmware seems to be suggesting, perhaps requiring over 100GB of disk space for the boot disk, which probably factors into their decision to stop SD/USB support:

      https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.esxi.install.doc/GUID-DEB8086A-306B-4239-BF76-E354679202FC.html

      * A local disk of 138 GB or larger. The disk contains the boot partition, ESX-OSData volume and a VMFS datastore.

      * A device that supports the minimum of 128 Terabytes Written (TBW).

      * A device that delivers at least 100 MB/s of sequential write speed.

      * To provide resiliency in case of device failure, a RAID 1 mirrored device is recommended.

      Not sure if there are any USB or SD cards that have 128TBW life spans and 100MB/sec sequential write speed.
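
To put those two figures in perspective, here is a minimal Python sketch that checks a device's datasheet numbers against the 128 TBW and 100 MB/s minimums quoted above, and estimates how long a given daily write volume would take to consume a TBW rating. The device names, their numbers, and the 50 GB/day write volume are made up for illustration; they are not measurements of any real product or of ESXi.

```python
from dataclasses import dataclass

# Minimums taken from the requirements list quoted above
MIN_TBW = 128              # terabytes written
MIN_SEQ_WRITE_MBPS = 100   # MB/s sequential write


@dataclass
class BootDevice:
    name: str
    tbw: float              # endurance rating from the datasheet, in TB
    seq_write_mbps: float   # sequential write speed from the datasheet


def meets_quoted_minimums(dev: BootDevice) -> bool:
    """True if the device meets both quoted endurance and speed minimums."""
    return dev.tbw >= MIN_TBW and dev.seq_write_mbps >= MIN_SEQ_WRITE_MBPS


def years_to_reach_tbw(tbw: float, gb_written_per_day: float) -> float:
    """Years of writing at a constant daily volume before the TBW rating is used up."""
    return tbw * 1000 / gb_written_per_day / 365


# Hypothetical devices with invented datasheet figures
sd_card = BootDevice("example SD card", tbw=20, seq_write_mbps=60)
sata_ssd = BootDevice("example SATA SSD", tbw=300, seq_write_mbps=450)

for dev in (sd_card, sata_ssd):
    print(dev.name, meets_quoted_minimums(dev))

# At an assumed 50 GB/day, a 128 TBW rating lasts roughly 7 years
print(round(years_to_reach_tbw(MIN_TBW, 50), 1))
```

The point of the second helper is only that a TBW minimum means little without knowing how much the host actually writes per day, which the thread does not establish.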

      I have yet to touch vsphere 7 myself, maybe next year.

      1. Anonymous Coward
        Anonymous Coward

        Re: Poor training....

        If you are wanting a supported version of esxi you will need to be moving to 7 next year. 6.7 was going to be EOL this year, but extended due to covid.

    2. Anonymous Coward
      Anonymous Coward

      Re: Poor training....

      Anon because my job is supporting ESXi for one of the Big Name OEMs....

      "They wear out" is not the whole story. It's the easy cop-out answer, the "face saving" move if you will. From where I sit the problem is mainly that VMware doesn't want to put any effort into making their USB-storage driver reliable. This has very, VERY little to do with how much I/O you push or with the type or 'grade' of flash involved. It looks like their _driver_ chokes and then they blame the hardware.

      On the flip side, as the hardware distributor we have no serious way to check or validate the health of the SD card subsystem. We can boot to a Linux live ISO and hammer it with 'dd'...usually works fine! But that doesn't really solve anything. The customer wants it to _work_, so saying "the hardware is fine" just sets up a finger-pointing blame game.

      I don't think this is purely apathy....probably VMware Engineering got word from TPTB to focus on some new shiny and let USB/SD die. I think it's a good move long-term, but horribly communicated to everyone involved.

  7. chivo243 Silver badge

    Hardware makers

    I'm sure they are completely ready for this, and will have a nice $olution...

  8. Piro Silver badge

    Great

    Just great. Thanks for the pile of crap, VMware!

  9. ObsidianAura

    Something makes me think this is a reaction to the massive f**k up VMware had on V7.0 U2

    Anyone else have their data centres go unresponsive as their systems stopped talking to the Host's SD cards?

    They're just scrapping that functionality now rather than deal with it. As my datacentres have no installed storage other than the SD card and the SANs I have the empty drive bays to spare.

    Was always a bit dubious when the salesperson recommended using an SD card in the first place though tbh.

    How easy is it to migrate from SD card to a conventional drive anyway?

    1. Anonymous Coward
      Anonymous Coward

      Re: Something makes me think this is a reaction to the massive f**k up VMware had on V7.0 U2

      That would depend upon whether you also got your servers with RAID controllers. If you are purchasing without drives, there's usually no reason to get a controller either.

      Otherwise it's just install ESXi again on the host and add it back to the cluster.

  10. Anonymous Coward
    Anonymous Coward

    SD cards and USB drives

    when is a usb drive a non-usb drive? Is my 2.5 ssd plugged via usb 3 cable a usb or ssd? How about a usb oversized stick that holds nvme?

    1. Anonymous Coward
      Anonymous Coward

      Re: SD cards and USB drives

      It's a USB drive as it's using the USB driver. It's their driver that is the biggest problem here.

  11. TommyKTheDJ

    Just use something like this, no?

    https://buy.hpe.com/us/en/options/boot-devices/os-boot-devices/boot-cards-devices-drives/hpe-os-boot-devices/hpe-ns204i-p-x2-lanes-nvme-pcie3-x8-os-boot-device/p/P12965-B21
