AWS is fed up with tech that wasn’t built for clouds because it has a big 'blast radius' when things go awry

Amazon Web Services is tired of tech that wasn’t purpose built for clouds and hopes that the stuff it’s now building from scratch will be more appealing to you, too. That’s The Register’s takeaway from today’s “Infrastructure Keynote” at the cloud giant’s elongated re:invent conference, which featured veep for global …

  1. john.jones.name
    WTF?

    chip looks like the Annapurna Labs

    This is general-purpose, mid-range ARM stuff that is cheap, hence why they are putting it into a NIC. Yes, it can run Linux and do offload for the main general-purpose processor.

    Really, this is a general-purpose processor being pushed into places to replace the application-specific integrated circuits that used to do offload - basically the complete opposite of what the exec is talking about...

    Don't let the details get in the way of a good story for the executive...

    1. Dave 126 Silver badge

      Re: chip looks like the Annapurna Labs

      Nitro refers to a card, not an individual processor (the ambiguous term 'silicon' was used by the article author, not AWS's executive). Most of the chips on the card are designed by Annapurna, which Amazon bought in 2015, with this application in mind.

      https://community.cadence.com/cadence_blogs_8/b/breakfast-bytes/posts/the-aws-nitro-project

  2. Jim Mitchell
    Paris Hilton

    "Among the nuggets he revealed was that AWS has designed its own uninterruptible power supplies (UPS) and that there’s now one in each of its racks. AWS decided on that approach because the UPSes it needs are so big they need a dedicated room to handle the sheer quantity of lead-acid batteries. "

    I'm having trouble reconciling that the UPSes Amazon requires "need a dedicated room" but there is also "one in each of its racks".

    1. Anonymous Coward
      Anonymous Coward

      I think the one in each rack is the alternative solution to having a big room full of UPSs.

      Instead of one big bang on major failure, you'd get a small bang. (Which would probably cascade around the server room, but hey.)

      1. big_D Silver badge
        Mushroom

        That's how I read it as well. Instead of one big explosion that blows the walls down, it is like a string of firecrackers.

      2. disk iops

        the top-of-rack UPS needs to have enough power for about 30 seconds' worth of load - they might have as much as 2 minutes. Buffers can be flushed and checkpoints written before everything goes black. The Gensets fire immediately on loss of mains. But if one were to fail, and the N+1 also, then yes, part of the DC goes magically silent.

        1. Crypto Monad Silver badge

          the top-of-rack UPS needs to have enough power for about 30 seconds' worth of load

          You'll do your back less damage if you put the UPS at the bottom of the rack.

          1. Claptrap314 Silver badge

            Also, which is more likely to damage which if it is on top?

            If (and I do mean IF) per-rack UPS is a good idea, the UPS goes on the bottom...

        2. Gene Cash Silver badge

          The Gensets fire immediately on loss of mains

          Rather optimistic in my experience...

      3. Steve 53

        We've seen many occasions when a large DC-scale UPS fails to live up to its name and downs a whole DC. Equinix and BA, to name ones in recent memory.

        A few servers going down is a lot more tolerable than a whole AZ, and with a larger number of devices UPS failures become a routine problem - much better than a very large one-off problem.

        So, plenty of merit, until one of them catches fire or somesuch anyway...

    2. Anonymous Coward
      Anonymous Coward

      "I'm having trouble reconciling that the UPSes Amazon requires "need a dedicated room" but there is also "one in each of its racks"."

      Well, yes, it raised a question for me as well. I can see why they want one UPS per rack, no issue, given the size of their infra and the horizontal scaling.

      The only explanation I could think of for keeping the batteries in a separate room is the risk of acid leaks, but TBH I've yet to hear of a single incident with top-line commercial products here.

      1. Anonymous Coward
        Anonymous Coward

        "given the size of their infra and the horizontal scaling"

        The largest AWS datacentres are US-East-X and weigh in at around 65MW each.

        AWS engineers have publicly said that the faults they see on those DC's continue to surprise them but allow them to build almost bullet proof 30MW DC's... If the US East coast wasn't so popular and latency wasn't such a big issue, they would downsize the larger DC's.

        Next time you see a major AWS outage, chances are that it will be in US-East

        1. Roland6 Silver badge

          >The largest AWS datacentres are US-East-X and weigh in at around 65MW each.

          AWS engineers have publicly said that the faults they see on those DC's continue to surprise them but allow them to build almost bullet proof 30MW DC's...

          That is probably because of the poor approach taken to scaling; an example was discussed here:

          AWS reveals it broke itself by exceeding OS thread limits, sysadmins weren’t familiar with some workarounds.

      2. Richard 12 Silver badge
        Boffin

        Modern sealed lead-acids will only vent hydrogen if you abuse them, and don't leak liquids - even if you physically stab them repeatedly.

        The former is trivially resolved by charging them properly, monitoring and replacing them before they become a problem. That's also really easy to do as the battery chemistry is very well understood.

        Wet cell battery rooms belong dead, before they kill more operators (and datacentres too)

    3. Anonymous Coward
      Anonymous Coward

      The problem AWS have had with large UPS's is that they have a tendency to detect a fault when switching from mains power to battery, and that has left AWS with dead DC's, either through the UPS's not failing over to battery (bad, but it could be worse...) or through only some UPS's failing over to battery and creating overloaded conditions before failing completely (see... I told you it could be worse).

      The fixes are a case of balancing safety against availability - disabling some of the ground fault detection mechanisms and replacing them with physical safety mechanisms to protect people instead was AWS's preferred route versus accepting the defaults and $100m of dead DC. (Reference:

      https://perspectives.mvdirona.com/2017/04/at-scale-rare-events-arent-rare/)

      It looks like AWS now view in-rack UPS, with a failure radius of the local rack in 99.x% of cases, as preferable to larger UPS's where the failure radius is potentially much larger and much more expensive when they hit unexpected corner cases....

    4. Peter2 Silver badge

      The typical approach from the retail manufacturers of UPS's is to provide small UPS's that have shitty inverters in them and batteries with single-digit amp-hour capacity, or decent inverters with batteries supplied at the rate of an additional rack per extra few minutes of runtime.

      This is not actually electrically required; you could easily build yourself a UPS system more tailored to IT requirements for a lot less. (high power draw, relatively short runtime)

      Going googling for cheap stuff, £286.12 gets you a 10kW inverter designed for a solar panel application, sans batteries.

      Bought from Euro Car parts at retail prices (with the discount code applied when you try and leave the site having viewed something, but not bought it) you could buy a car battery for £38.99 that has a 40 amp capacity. Ten of these gives a ~400amp capacity of the system so this would cost £380.99 + £286.12 = £667.11 for ~9 minutes runtime at 10,000 watts.

      The APC alternative costs £7704 (but has 18 mins runtime at 10kW).

      My version doesn't include cables or connectors and would certainly need an electrician to sign it off under Part P of the electrical safety regs, all of which scale. (you'd just buy a drum of cable and a box of connectors, and the cost of an electrician certifying the units falls with doing job lots of the same design compared to a single installation...)

      If I were speccing the thing, I'd probably actually want to minimise space used in each rack, so perhaps 3 batteries plus the inverter taking up what, 4U? in the rack, with a runtime of ~2-3 mins. This with the reasoning that all the UPS should be doing is providing continuity of power until the generator picks up the load, and if it's not done that within 2 minutes then the response time for the engineer + fix to the genset is going to be wayyy more than I'm willing to spec batteries for.

      This then eliminates the requirement for a room full of batteries, and eliminates the UPS as a single point of failure. Heck, there's no particular reason you couldn't stick in a pair of inverters running off the same batteries in the rack if you had custom units made; then you'd have both PSUs in each server effectively running off different UPS's. The chance of both failing is tiny, and with three batteries the failure of one would just knock a third off your runtime.

      So yeah, you'd have to be at datacentre sort of scale to think about bothering with something like this, but it's obviously easily viable on a quick "back of the envelope" sort of estimate.

      1. katrinab Silver badge
        Meh

        Isn’t it possible to put the battery between the power supply and the motherboard, then you don’t need to worry about inverters?

        Basically like a laptop power supply with 15-20x more watts.

        1. Richard 12 Silver badge

          Yes. There are a lot of good arguments for single-rack 12VDC busbar power supplies.

          That said, you don't want to run directly from the battery, as the terminal and float charge voltages vary a lot - far more than the tiny SMPSes on motherboards are designed to handle.

          Decent DC-DC boost/buck converters are a little more expensive than the really awful square-wave "inverters" in some UPSes, but cost the same as or less than a good sine inverter.
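
          (To put rough numbers on that gap - a minimal Python sketch using textbook lead-acid voltage figures and the ATX ±5% tolerance on a 12V rail; illustrative values only, not anything published by AWS or a UPS vendor.)

            # Rough, illustrative figures: a nominal "12V" lead-acid pack vs a 12V SMPS rail.
            LEAD_ACID_V = (10.5, 14.4)   # volts, from deep-discharge cut-off up to absorption charge
            ATX_12V_V   = (11.4, 12.6)   # volts, ATX spec of 12V +/- 5%

            def source_fits_rail(source, rail):
                """True if the source voltage range stays inside the rail's tolerance band."""
                return source[0] >= rail[0] and source[1] <= rail[1]

            print(source_fits_rail(LEAD_ACID_V, ATX_12V_V))  # False - hence the buck/boost stage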

        2. Jon 37

          Yes. At one point, Google was fitting a small battery to each of their servers. Don't know if they still do that.

      2. Anonymous Coward
        Anonymous Coward

        you could easily build yourself a UPS system more tailored to IT requirements for a lot less.

        Sorry for the rant (hence anon!), but I'd dearly love to find a high-spec UPS for not much money - or to build it for myself - but although the "big name" manufacturers undoubtedly charge over-the-odds, there is a lot more to a UPS than battery plus inverter.

        Maybe you could, but not doing it the way you describe I think - unless you can show us real links?

        I'm not convinced you could get a decent 10kW inverter for that cost. If you can, it'll be square-wave or stepped-square wave, which to be honest is probably ok for the switchmode power supplies found in IT kit, but there are lots of things that complain at that, and you were just moaning about "shitty inverters".

        Oh, and how would you go about switching from mains to inverter? If you are looking at grid-tie inverters, most of them are designed to cut off the power completely on the loss of mains, to avoid back-feeding the grid or trying to power the cooker when the mains fails. So you need to factor in the switch.

        Car batteries are not ideally suited to data centre use. Although many of them are "sealed" these days, they still use liquid electrolytes - so must be used upright - and can vent gas. Stab them - as someone suggested you could do with an SLA battery - and a car battery will leak acid all over your underfloor cabling.

        And how would you wire them up? If you really are looking at a 10kW system, at the 12V of a car battery you would need over 830A of current at the input to the inverter; with ten in parallel that's 83A per battery. Any battery slightly out of condition compared to its siblings is going to suffer. If you are only going to specify three, each one will need to supply 277A for the whole period the inverter is running, and if you really are going to survive a battery failure (and have the necessary circuit to isolate that battery automatically), the remaining two will need to provide 415A each.

        Big UPSes have banks of batteries in series - to give more voltage, so a lower current per power output required. Our simple APC 3kW SmartUPSes at work have two banks of four batteries for an input to the inverter of 48V at 62½A (i.e. 31A per string). Of course, a failed battery in this scenario takes out a whole string.

        Let's assume you run all of your ten batteries in series, giving a nominal 120V. 10kW at 120V is still going to produce 83A, which will need cables of somewhere around 25mm² - this is the size of "meter tails" fitted to houses in the UK for the last 20 years or so. A car battery clamp should easily accommodate those.
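
        (A quick check of those currents in Python, using the same ideal, lossless arithmetic as above - inverter efficiency and voltage sag under load are ignored.)

          # Current each battery has to supply for a 10kW load, ignoring losses.
          POWER_W = 10_000

          def amps_per_battery(power_w, string_volts, parallel_strings=1):
              return power_w / string_volts / parallel_strings

          print(amps_per_battery(POWER_W, 12, 10))   # ~83 A each: ten 12V batteries in parallel
          print(amps_per_battery(POWER_W, 12, 3))    # ~278 A each with only three
          print(amps_per_battery(POWER_W, 12, 2))    # ~417 A each if one of the three drops out
          print(amps_per_battery(POWER_W, 120))      # ~83 A total: ten batteries in one 120V series string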

        I'll assume that your inverter can handle cables of that size, but what about charging? Car batteries are pretty robust, but charging ten in series is fraught with technical difficulties, particularly with regard to monitoring temperatures and suchlike. Your back-of-a-fag-packet UPS doesn't seem to include any consideration for charging. And as for your idea of hanging two inverters off the same set of batteries: don't forget the control logic you would need to make sure they didn't both try to come on at once, and how would your kit cope if one inverter came on and instantly failed (I've seen it), meaning a double dip on the power, since there would of necessity be some small delay while this is detected and the second inverter wakes up?

        I'm also a little confused about the batteries themselves. Assuming you mean 40Ah batteries, not 40A batteries (car batteries are usually designed for high starting currents, typically 200 - 400A), a naïve calculation would imply the ability to supply 83A for somewhere around 28 minutes. But one thing I've come to understand about lead-acid batteries is that the "amp-hour" rating is calculated in all sorts of exciting ways that never tally with your use-case: a battery rated at 40Ah will only have that capacity at a certain discharge rate, and different manufacturers use different cut-off points for terminal voltage, which can be as low as 10V. For a constant power output the current must go up as the voltage reduces, which has to be fed into the assumptions.
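
        (Peukert's law is one way to see why the naïve 28 minutes evaporates at an 83A draw. A sketch in Python, assuming a Peukert exponent of about 1.3 - typical of a flooded starter battery - and a capacity quoted at the 20-hour rate; real batteries and cut-off voltages vary.)

          # Peukert's law: effective runtime shrinks sharply at high discharge currents.
          def peukert_runtime_hours(capacity_ah, rating_hours, load_amps, exponent):
              return rating_hours * (capacity_ah / (load_amps * rating_hours)) ** exponent

          # 40Ah battery rated at the 20-hour rate (i.e. 2A), asked for 83A instead:
          print(peukert_runtime_hours(40, 20, 83, 1.3) * 60)  # roughly 9-10 minutes
          print(peukert_runtime_hours(40, 20, 2, 1.3))        # 20.0 hours at its rated 2A, as expected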

        A "400A" car battery is only designed to deliver that amount of current for the few seconds it takes to start the engine. In fact, it's even less than that because the big hit of current is starting the thing rotating in the first place. Once the engine is turning it takes less power to crank. As anyone with a dead car will tell you though, keeping the thing turning and turning and turning eventually leads not just to a flat battery but to "nasty niffs" from under the bonnet as the cable insulation begins to melt and oil and grime on the terminals begins to burn off. Chunky though the cables are, they warm up quite considerably when you pass a couple of hundred Amps through them.

        Personally, IT-type UPSes have always confused me, and AWS's comment about "general purpose processors" is very relevant. One thing common to IT kit is that none of it requires 240V at the motherboard. In fact, most IT kit will not be using anything higher than 12V on the board, so why go to all the bother of converting 12V from a battery to 230V and back to 12V again? I think someone else pointed this out.

        In fact, some kit already has this possibility. I met some Cisco switches which have both a 230V input and a 12V (I think) "backup" input. Just plug a battery straight in. You could do the same with most computer PSUs - imagine a typical server with a pair of modular power supplies, but where one of them takes 12V, not 230V.

        Like I said, sorry about the rant.

        P.S. £38.99 * 10 = £389.90, not £380.99 ;-)

    5. Roland6 Silver badge

      >I'm having trouble reconciling that the UPSes Amazon requires "need a dedicated room" but there is also "one in each of its racks".

      From memory, one of the issues with commercially available UPS's is that they have gone for the simple approach, namely sitting in line with the mains. The result: 240V AC is converted into DC for the batteries; on discharge the DC from the batteries is converted to AC for distribution to devices, which in turn convert it back to DC for consumption.

      Amazon (and probably Facebook) determined that at scale this use of converters was daft: place the UPS local to a device and plug it directly into the device's DC rail. In this configuration, a smaller battery is needed to provide the same level of protection as that remote room of batteries and inverters.

      What is probably surprising is that now, some years later, rackmount servers (i.e. servers intended for the data center) are still supplied with an AC power connection rather than a DC one with an external rack-sized inverter etc. Mind you, you get a slightly different problem in the comms racks, where the kit can require 12V 2A supplies - but can you get a UPS to deliver this as an output and thus bypass the DC/AC/DC conversion...
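
      (To illustrate the point about stacked conversions - a minimal Python sketch with assumed per-stage efficiencies of around 93-95%; the numbers are illustrative, not anything Amazon or Facebook have published.)

        # Why every extra AC/DC conversion stage hurts at datacentre scale.
        def chain_efficiency(*stage_efficiencies):
            total = 1.0
            for eff in stage_efficiencies:
                total *= eff
            return total

        # Conventional path: AC->DC charger, DC->AC inverter, then the server PSU's own AC->DC stage.
        print(chain_efficiency(0.95, 0.95, 0.93))  # ~0.84, i.e. roughly 16% of the power becomes heat

        # Battery on (or near) the DC rail: one conversion stage to the rail voltage.
        print(chain_efficiency(0.95))              # ~0.95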

    6. Anonymous Coward
      Anonymous Coward

      @Jim Mitchell

      Tripp Lite makes a UPS system for a rack.

      It will give you 15 mins or so... depending on what's in the rack, so it's more than enough time to move things elsewhere if the power is hit.

      At the same time... Amazon could build out their own substation, depending on their electrical needs.

      Here's the issue. Cost.

      Buying rack-based UPSes is going to be cheaper than building out the big-bang systems.

      1. Roland6 Silver badge

        Re: @Jim Mitchell

        >Tripp Lite makes a UPS system for a rack.

        And it's just like all the other offerings - it has a 230V sine wave (aka AC) output.

        Obviously, Amazon has full control of the contents of its racks and will have adapted the server blades to support the external feed of low voltage. I've not seen a vendor offer on the open market a server blade without an internal ATX/LPX power supply, i.e. one with sockets for 3.3V, 5V & 12V power connections instead.

        1. Crypto Monad Silver badge

          Re: @Jim Mitchell

          One rack of 2.5kW, supplied directly at 5VDC, would be 500 amps. That needs one hell of a busbar to keep voltage drops within reasonable bounds, and very inflexible power cabling.

          I believe 48VDC racks are (or were) common in the telco world. But each device still needs a DC-to-DC converter, so you might as well feed in AC.
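
          (The same arithmetic for a few common distribution voltages - a quick Python sketch using the 2.5kW rack figure above; busbar sizing and voltage-drop maths are left out.)

            # Busbar current for a 2.5kW rack at different distribution voltages.
            RACK_POWER_W = 2500

            for volts in (5, 12, 48, 230):
                print(f"{volts:>3} V -> {RACK_POWER_W / volts:6.1f} A")
            # 5V -> 500A, 12V -> ~208A, 48V -> ~52A, 230V AC -> ~11A (ignoring power factor)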

          1. David Pearce

            Raw battery

            No, you have a 48V battery and rectifier, and generate all the server's internal supplies from the 48V DC.

            This DC-DC converter can be a tiny bit more efficient than most AC-input supplies, which are nearly all designed to deal with an input of 110-240V and handle brownouts.

            The energy savings of not going back to AC can be significant

          2. madphysics

            Re: @Jim Mitchell

            Most recent open compute specs are moving towards 12VDC or 48VDC busbars. All the sensible folks are doing it.

            Efficiency for DC/DC is in the very high 90s, and the converters are also very small; look at some of the Analog Devices DC/DC modules and you can do 100A @ 1.2V from something a quarter the size of a matchbox with a 12V input.

            Busbars are pretty easy - substantial, but still not a problem. I've seen 100kW-per-rack prototype systems with 3x 48V busbars. There are EU-funded supercomputer research projects pushing that level of density.
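
            (Working that point-of-load example through in Python, with an assumed 95% conversion efficiency - a figure picked for illustration, not taken from any particular datasheet.)

              # A point-of-load DC/DC stage: 12V busbar in, 1.2V at high current out.
              V_OUT, I_OUT = 1.2, 100.0       # 100A at 1.2V for the chip
              V_IN, EFFICIENCY = 12.0, 0.95   # 12V input, assumed high-90s efficiency

              p_out = V_OUT * I_OUT           # 120 W delivered at the point of load
              p_in = p_out / EFFICIENCY       # ~126 W drawn from the 12V busbar
              print(p_in / V_IN)              # ~10.5 A of 12V input for 100A at the core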

    7. DaveFlagAndTenDigits

      My interpretation of this was that there's switchgear/firmware in each of the racks and a fairly big battery array elsewhere, which does or does not make sense depending on the power supply architecture as a whole.

  3. jake Silver badge

    "“Software you don’t own in your infrastructure is a risk,” DeSantis said, outlining a scenario in which notifying a vendor of a firmware problem in a device commences a process of attempting to replicate the issue, followed by developing a fix and then deployment.

    "“It can take a year to fix an issue,” he said."

    Yep. Now ask me why I don't use or recommend clouds. If I own the software, and indeed the infrastructure, I can fix it today, not next Thursday when the AWS techs get around to it. If they ever get around to it.

    1. Rob Daglish

      Yeah - I know what you're saying - but I suspect that if you found an issue in APC's UPS firmware, you'd be SOL getting a fix out of them at all... If Amazon are talking about a year to get a fix from their vendors, with the scale of purchases they make, the rest of us can probably wait until hell freezes over before anything gets done about it by our vendors. The argument isn't about cloud vs on-prem - it's about how good vendor support is, and we all know how variable that can be!

      1. jake Silver badge

        I suspect that if I found an issue with APC's firmware, it would have been in acceptance testing, and it wouldn't have made it into production. Twice. There was no third time; I took APC off the approved vendor list.

        1. Disgusted Of Tunbridge Wells Silver badge

          At my old employer, a tiny firm, we bought an APC IP PDU.

          It was amazing. It decided to switch everything off for no reason at 2am one morning.

          The web interface was also so incredibly slow. Like I couldn't think how it could be that slow without it having a large call to sleep() as part of the web server's request handling.

          Never again APC.

          1. logicalextreme

            I've got an APC home UPS that switches itself off when it pleases, too. Considering I've had two actual power cuts in six years, I recently plugged my NAS directly into the mains to protect it from the UPS.

    2. big_D Silver badge

      "“Software you don’t own in your infrastructure is a risk,” DeSantis said, "but only when it comes to us, for you, our AWS service is the best!"

      Next time he should engage his brain, before opening his mouth.

    3. Dave 126 Silver badge

      That's all groovy, but what if you needed to provide a service to people across the world? To get comparable latency performance you'd have to build and maintain multiple data centres of your own across several continents.

      You would need a damned good crystal ball too, in order to predict the demand for your business's services months in advance so as to commission and build the required capacity.

      So, if you have faster-than-light interconnects and you can see the future... then you're right, there's no need for any business to use cloud services.

      1. jake Silver badge

        Let's face it ...

        ... who really needs the infrastructure you describe? I'll bet you a plugged nickel that all of them own their own infrastructure and wouldn't touch AWS (or any other so-called "cloud") with a ten foot pole.

        1. Dave 126 Silver badge

          Re: Let's face it ...

          Anyone who isn't in a position to know their infrastructure requirements sufficiently in advance to commission their own, such as a growing company. Anyone who has only intermittent need for a lot of processing, such as an engineering company.

          I'm not saying cloud is suitable for everything, just that it is suitable for some things. Or at least it is the least bad option for some things.

          If this were not true then we wouldn't have the likes of Boeing exploring ways of mitigating the security concerns of using the cloud. Mitigations include splitting up the data, and performing calculations on encrypted data.

          Of course it isn't a one-size-fits-all situation. One step along from building your own infrastructure might be for a company to rent space in a data centre where the power and physical security are managed by another party, but only your staff install the servers and hold the keys.

          1. Claptrap314 Silver badge

            Re: Let's face it ...

            I was with you until you (by implication) used Boeing as an example of presumed best practices. That idea went down in flames.

            1. jake Silver badge

              Re: Let's face it ...

              I rather suspect the Boeing move has more to do with hoping to put bonuses in the Board's pockets than it has anything at all to do with compute capability. Boeing is one of the few companies that measures internal compute power in Acres, a unit that is usually reserved for government TLAs.

      2. Doctor Syntax Silver badge

        "there's no need for any business to use cloud services."

        Quite true. We ran for years without such a thing existing but I don't suppose that's what you meant.

      3. Wellyboot Silver badge

        I see cloud as a useful option for smoothing out the bumps in a long term internal infrastructure growth strategy and providing capacity to cover short term events.

        It's when 'cloud available = company operating' that I get a bit twitchy.

      4. Anonymous Coward
        Anonymous Coward

        "You would need a damned good crystal ball too, in order to predict the demand for your business's services months in advance so as to commission and build the required capacity."

        While AWS/Amazon gets a lot of crap about tax avoidance, they see themselves as still in the AWS buildout phase.

        The estimate is that 120-150 DC's should be enough to keep the world happy, depending on demand from less developed countries. Given 3+ DC's per availability zone, that's 40 regions.

        They are currently at 24 regions/77 availability zones.

        While that may not cover everything you had in mind, it may make it easier to relocate your business than to worry about high latency. And for managing capacity they will likely use cost - do you really need your services hosted in region A that is close to capacity when region B is half the cost?

    4. Wellyboot Silver badge

      Somebody has just won a bet that they could get a senior exec to publicly give a good reason as to why using cloud products for mission-critical work isn't smart.

      I'd give this a '2' on the Ratner* scale

      *If you don't know Ratner, search for 'ratner prawn' for a level '10'

    5. Roland6 Silver badge

      >“Software you don’t own in your infrastructure is a risk,” DeSantis said

      I wonder if this is something he/Amazon have rediscovered, or whether it is something they know because of their background.

      Back in the 1980s, working on mass transport control systems, it was a requirement that the company had ownership and control of all the source code used in the system and had the means to maintain it for the design life of the system (20+ years); escrow arrangements weren't sufficient.

      1. Nick Ryan Silver badge

        Unfortunately it now feels that the choice too often made is to have a third party cobble things together out of third-party components, with support for the system effectively finishing the moment of deployment. At which point testing can commence.

  4. Ozan

    I half expected him to say they had developed a software-defined UPS.

  5. Headley_Grange Silver badge

    "Software you don’t own in your infrastructure is a risk"

    Oh, the irony.

  6. Anonymous Coward
    Anonymous Coward

    "AWS, he said, is perfectly clear that its data centres are a disaster-proof distance from one another, but less than a millisecond of latency apart."

    And a hurricane that traipses thru Herndon and Sterling and the, oh, at least 9 datacenters that sit next to each other cheek by jowl is what, exactly? Sure, the other cluster for us-east-1 is in Manassas off the 234 bypass, which is about 24 miles away as the crow flies. Then there is the tiny matter of the big power lines that follow VA Rt 7 west of Leesburg. Topple a couple of those with a brick of explosive, and also the ones that run north/south across I-66, and you've wiped out Manassas to some extent. The power for Micron Technologies (VA Rt 28) has major power infra, but I think that comes in from points further south.

    US-east-1 is grossly over-built and over-concentrated. It really needs to be diffused, bigly.

    1. Dave 126 Silver badge

      I think you have the makings of a strategy video game, like Sim City but with more Godzillas!

      Define disaster. Hurricanes, earthquakes, wildfires, sure. War, meteorite strikes, zombie plagues... not forgetting squirrels, whose suicidal attacks on America's power grid have caused more outages than terrorist action has to date. Effects of disasters can cascade, such as a tsunami taking out the primary and backup power supplies to the pumps that cool a nuclear reactor. Or all of your storage suppliers have their factories located in the same flood plain (and they still do).

      The point is, 100.0000000% resiliency can only ever be an ideal.

      1. Anonymous Coward
        Anonymous Coward

        There are 4 DC's that are a 1-minute walk from each other. That's disaster-proof? Sure, a raging inferno resulting in total outage of a single facility won't take down the region, or even an entire AZ depending on the service. We also get 'twisters' thru the river corridor, though generally nothing big enough to be more than an inconvenience on secondary roads and a local power outage.

        The interesting exercise is the bidding war that erupts when 30-odd DC's all need diesel shipped in and there aren't enough trucks to make it happen, even if the Dulles fuel dump has the fuel reserves stockpiled. Logistics can be a cruel mistress.

        There's no good reason why they couldn't have put some facilities out toward Winchester, or a new region down by Richmond/Petersburg or Lynchburg.

        Dropping a nuke on DC or NYC is a worthless act. Put one on Dulles airport and Hoboken NJ and you do some useful damage.

        1. Dave 126 Silver badge

          Maybe that would make for a better video game... Instead of building and maintaining like Sim City, you get to bomb a continent with nukes... and squirrels!

          High score is for causing the most disruption with the least bombardment - as you say, taking out a single fuel depot rather than multiple data centres.

          1. Anonymous Coward
            Anonymous Coward

            Put a point 2 miles due north of runway 1L. Draw a circle with a radius of 3 miles. That's "half the internet" within the boundary.

    2. Anonymous Coward
      Anonymous Coward

      "US-east-1 is grossly over-built and over-concentrated. It really needs to be diffused, bigly."

      AWS engineers agree with you...the US-East-1 customers don't

  7. BeefEater

    Strange terminology

    What are all these "switch gears" turning?

    In 40 years of producing control systems for power distribution I've only ever heard it called "switchgear".

    1. Dave 126 Silver badge

      Re: Strange terminology

      Well how do *you* level the output of your hamsters if not with gears?

      1. Korev Silver badge
        Coat

        Re: Strange terminology

        > Well how do *you* level the output of your hamsters if not with gears?

        I guess wheel never know...

        1. Richocet

          Re: Strange terminology

          You round the numbers

      2. jake Silver badge

        Re: Strange terminology

        "Well how do *you* level the output of your hamsters if not with gears?"

        With a very small manually operated muck fork, of course. The newfangled geared versions only exist to separate Millennials and Hipsters from their money.

    2. diodesign (Written by Reg staff) Silver badge

      Re: Strange terminology

      It's switchgear. Don't forget to email corrections@theregister.com if you spot something wrong.

      C.

  8. DougMac

    Don't most datacenters have separate battery rooms?

    All the ones I've been in do - anything bigger than some enterprise with 5-6 racks in the basement, anyway.

    Although I can certainly sympathize with the horrible firmware on just about any management systems dealing with power.

    Typically I have to firewall them off completely from anything else on the management network, because things like APC transfer switches respond to any stray SNMP scans no matter what, and start sending email alerts out. (APC/Raritan/Tripp Lite/etc/etc/etc) managed rack PDUs are horrendous security nightmares. Up to a certain age, they had open SMTP, FTP, etc. etc. Why do you need to manage your rack PDU with FTP? Because you CAN!

    1. dgeb

      Re: Don't most datacenters have separate battery rooms?

      At a previous site, one of the DC operators we use did have a fire in the battery room - so I am certainly grateful for them being physically separate.

      I've never seen a rack UPS be as dramatic as that, but I have seen more than one blow a whole row of transistors in a big bang/flash/escape of magic smoke, and on one occasion it also tripped the downstream ATS (i.e., its safety cutoff engaged, rather than transferring load to raw mains). It does make me nervous, and I would definitely prefer to have multiple mid-sized UPSes across a few small rooms, doing roughly row-level power.

    2. J. Cook Silver badge

      Re: Don't most datacenters have separate battery rooms?

      Up to a certain age, they had open SMTP, FTP, etc. etc. Why do you need to manage your rack PDU with FTP? Because you CAN!

      Two reasons: default configuration out of the box, and firmware upgrades.

      While I don't have any experience with the APC 'smart' PDUs, I do have enough experience with the Raritans, and when they are working correctly, they are.... adequate. I've also seen them just lock up and refuse to obey commands from the controlling KVM; they tend to drop off the network if you try to assign them a DHCP reservation or statically assign them an address; and the direct UI via web, telnet, and serial is... not good.

      I still recommend them, because there are not many products that do what they do as well as they do. (Example: I had to remote power-cycle a server not too long ago, because it was wedged pretty badly. A couple of clicks in the KVM interface, and I watched the console of said server as it went black and, 45 seconds later, started its POST - the KVM telling the PDUs to power-cycle the outlets that were assigned to that server.)

  9. DS999 Silver badge
    Facepalm

    I thought the whole point of the cloud

    Was not having any single point of failure? If Amazon's cloud software works properly, then any individual server, rack or even datacenter should be able to fail without missing a beat.

    If they think they are eliminating those modes of failure by fixing the hardware/firmware, they haven't been properly introduced to by far the biggest enemy of five 9s - the fact that sysadmins are human!

    1. Anonymous Coward
      Anonymous Coward

      Re: I thought the whole point of the cloud

      That single-point-of-failure removal is up to you to fix in your software, not AWS. They don't really tell people that.

  10. Claptrap314 Silver badge

    Still not learning their own lessons.

    Six years ago, I saw a video from a Netflix leader talking about DR. His default disaster was the failure of an entire datacenter.

    There are LOTS of ways to take out an entire DC at once. The solution is to have sufficient active capacity that no single DC going down takes you down.

    Amazon: one of your biggest customers has been preaching this for a long time. You might want to listen to them.

  11. Kevin McMurtrie Silver badge

    Why still lead acid?

    I don't understand why lead acid batteries still exist. UPSes even come with AGM lead acid batteries - the special type of safe battery that's down to 60% efficiency at just 1C power rates and maybe 15% efficiency (if you're lucky) at 10C. Yeah, it has a cheap price tag before you factor in shipping, storage, disposal, hazmat, needing a lot more of them than you thought, and frequently replacing them. I would never use them in a personal project of any size.

    1. Nick Ryan Silver badge

      Re: Why still lead acid?

      I'm not a battery specialist, but my understanding of lead-acid batteries is that, compared to the alternatives, they are chemically quite stable, relatively safe by way of the chemicals used, and cope well with trickle charging. Another of the major differences between lead-acid and other batteries is that they can cope with a very high starting output - a trait that is useful for their automotive usage - and with being continually charged and discharged.

      The continual charge and discharge factor is important because of the two key types of UPSes: those where the output runs through the batteries all the time, and ones where the battery is only switched in on a power event. The latter is cheaper, but there is always a momentary drop in power during a switch event, and the power passed through is not guaranteed to be clean, particularly during a power event. Running everything through the battery ensures clean power goes through all the time, which can be an important factor for sensitive equipment.

  12. martinusher Silver badge

    Old tech with a purpose

    The huge amounts of low-voltage DC current needed by old-school mainframes were often generated by motor/generator sets; it was not just appropriate technology for the era, but you also had a flywheel that would store energy in case of a mains drop.

    Battery banks were the go-to power supply for the phone system -- a room full of accumulators for the basic 48v supply, not only a nice smooth DC supply but enough power for several hours of operation should the mains fail.

  13. TeeCee Gold badge
    Facepalm

    Hmm.

    Who was it recently who found that their in-house designed, purpose-built-for-cloud middleware turned out to have a hard scalability limit? For added "blast radius" fun and games, when they hit it, it turned out to utterly bugger their services until they rolled back their server deployments - and to be unfixable, so they'll have to chuck pricey hardware at the problem.

    Yes, I wonder who that could have been?

    See also: Pots, kettles, stones and glass houses.

  14. Bruce Ordway

    “Software you don’t own in your infrastructure is a risk,” DeSantis said,

    No shit?

  15. Anonymous Coward
    Anonymous Coward

    Gee, then maybe Jeff should start developing his OWN technology instead of pilfering open source projects and giving back virtually NOTHING to the community despite the billions he rakes in from their hard work. :(

  16. trist

    Passing off

    I am surprised the author didn't reference https://forums.theregister.com/forum/all/2020/10/16/aws_headless_recorder/ where Amazon decides to "own" software and erase the original author from history.

  17. Richocet

    Any of the proposed solutions are better than the IT company next door that switches to a diesel generator every Friday morning when they "run backups". The exhaust of this points into my back yard. Any washing I have on the line is toast and if they run it for more than an hour my house starts to fill with diesel exhaust fumes. These do not smell nice FYI.

  18. Stuart Castle Silver badge

    Pot/Kettle..

    “Software you don’t own in your infrastructure is a risk,” says the man who works for a company running infrastructure for others that they don't own, and expects us to believe it's perfectly safe.
