Inside Microsoft's Autopilot: Nadella's secret cloud weapon

Satya Nadella may have just taken the reins as Microsoft's chief executive, but he's already intimately familiar with one of the company's key internal tools to let it compete with Amazon and Google: a complex software system named Autopilot. Autopilot is the system that lets Microsoft knit together millions of servers and …


This topic is closed for new posts.
  1. Dave 126

    > "the keys to a multi-billion dollar car."

    My first thoughts on reading that were of the car that Homer Simpson designed:

    (I didn't know some crazy soul had built a real life version of it until I went looking for an image a moment ago! Heck!)

    1. Anonymous Coward
      Anonymous Coward

      Sounds exactly like Microsoft's System Center suite to me which already does pretty much everything mentioned. Most likely Autopilot is just what they call the configuration of that to Microsoft's needs...

      1. MIc

        I've worked with the System Center stack for years and I can tell you that is not the case. System Center markets itself as being the private cloud solution, but it doesn't actually pan out that way.

        1. Trevor_Pott Gold badge

          Autopilot is far closer to System Center + Puppet. Except instead of Puppet everything is a bunch of PowerShell, some imaging tools, a whole lot of sensor+trigger packs and this massive distributed scheduler.

          You can make System Center + Puppet do what Autopilot does. It would cost you about $5 billion and take you three years, but it's possible. Or, you could just wait until Microsoft has found a place where they feel it's safe enough to freeze the code and work it back into their server offerings and sell it to you. Which they will do, as soon as they feel that doing so won't give away a competitive advantage to anyone that might threaten them.

          Microsoft are very big believers in DevOps. Religious about it, almost. The difference between "us" and "them" is that for us DevOps is Puppet, Chef or Saltstack. For them, it's Autopilot.
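The sensor+trigger idea described above can be sketched in a few lines. This is a loose illustration of the pattern, not Autopilot's actual code; names like `sensor_disk_free_pct` and `reimage_node` are invented:

```python
# Hypothetical sketch of the "sensor + trigger" pattern mentioned above:
# sensors report node state, triggers map bad readings to repair actions.
# All names here are illustrative guesses, not Autopilot's actual API.

def sensor_disk_free_pct():
    """Pretend sensor: percent of disk still free on a node."""
    return 12

# Each trigger pairs a predicate over a sensor reading with a repair action.
TRIGGERS = [
    (lambda pct: pct < 20, "reimage_node"),
    (lambda pct: pct < 5, "take_node_out_of_rotation"),
]

def evaluate(reading):
    """Return the repair actions whose predicates fire for this reading."""
    return [action for predicate, action in TRIGGERS if predicate(reading)]

print(evaluate(sensor_disk_free_pct()))  # a reading of 12 fires only the first trigger
```

A distributed scheduler then fans such evaluations out across millions of nodes; the per-node logic stays this simple.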

          1. TheVogon

            "You can make System Center + Puppet do what Autopilot does"

            But no one in their right mind would want to when System Centre already does the workflow / automation and Powershell deals with any customisation - both of which are easier to use and less labour-intensive than Puppet...

        2. Anonymous Coward
          Anonymous Coward

          "System Center markets itself as being the private cloud solution but it doesn't actually pan out that way."

          Having set up several such systems - most recently with fully working deployment and automation across both public and private cloud - I can assure you that it certainly can....

  2. hplasm

    Microsoft Autopilot

    As fitted to Hotblack Desiato's stuntship.

    Full speed ahead!

    1. Dave 126

      Re: Microsoft Autopilot

      Indeed - I'd expect my multi-billion car to fly, in atmosphere at least if not in space!

      1. Anonymous Coward
        Anonymous Coward

        Re: Microsoft Autopilot

        Well it is a 747 car, I believe. Though even then rather expensive!

        What few know, though, is Mars used to have a third moon - we didn't know about it because it really was tiny. I don't know if it had a name, though presume it did. It went missing - in the '70's? - and what I heard was it was eroded, like with a pneumatic drill, and they found at the centre a single, giant crystal of diamond! Which this unnamed ne'er-do-well had cut into the shape of a steering wheel!

        My source had no idea where it is now. But I gather it made a REALLY uncomfortable steering wheel!

        1. Dave 126

          Re: Microsoft Autopilot

          Was your source Cab Calloway? "The diamond car with the platinum wheel"

          1. Anonymous Coward
            Anonymous Coward

            Re: Microsoft Autopilot

            Maybe. I did see the guy back in the '80's. With 'The Blues Brothers Band'. Although Belushi was gone by then...

  3. dogged

    ITT -

    people who either

    a) know nothing about massive infrastructure management

    b) are genuinely stupid

    c) have been to the pub

    d) any combination of a), b) and/or c).

    1. 's water music

      Re: ITT -

      b and c here. but mostly b

  4. RaidOne

    As expected...

    Ok, so it does autodeployment. It has some auto-healing capabilities. It has been run a million times. What else?

  5. Anonymous Coward
    Anonymous Coward

    LOL???
    We make & sell a very similar system; the biggest banks across the globe are using it, along with many Fortune 500 companies ... so what exactly is so cool about Autopilot? We have a [private on-premise] cloud solution, as in, you pay per job, where a job is one shell script, web service, VBS, or ABAP for example. Or a datacenter solution with pay per server that runs jobs; monitored systems come free of charge. We support all major UNIX/Windows (w2k to w8)/VMS/MVS/OS400 platforms on most hardware - you name it. We handle most ERP platforms as well, like SAP, Oracle, MS, and IBM, and almost any database - and we scale pretty well, too.

    SAAS customers of ours manage to squeeze every little cent out of their hardware investment with real-time startup/shutdown/switch.

    Another managed to divide the yearly book-close time required by 3.

    Oh, and the central server runs almost anywhere; all it needs is a JVM and a database. So this Autopilot abortion can handle millions of systems - cool - but of one provider only? Crap!

    Anon, for obvious reasons ...

    1. Anonymous Coward
      Anonymous Coward

      Re: LOL???

      "all it needs is a JVM"

      You can count your crap off my shopping list then. I'm never buying another product that runs on or requires Java if I can possibly help it. Insecure, inefficient and clunky garbage that it is.

      1. Anonymous Coward
        Anonymous Coward

        Re: LOL???

        So you'll use .NET instead then? Which is practically the same thing - bytecode running on a virtual machine.

        1. Anonymous Coward
          Anonymous Coward

          Re: LOL???

          Given the choice, yes absolutely - .Net all the way. It's in no way 'practically the same thing' - it's a far better product, with better performance and scalability, no version compatibility issues, and a couple of orders of magnitude fewer security holes...

          1. Robert Grant

            Re: LOL???

            No version compatibility issues? .NET just makes you rewrite everything on a major version change. Biggest version compatibility issue I've ever seen.

            1. dogged

              Re: LOL???

              You might have "seen" but you haven't actually done it, have you? In most cases, a recompile suffices.

              Also, most people don't delete an older framework when a newer one comes along. To do so would break all the rules of APIs.

              You're not telling us that you do that, are you?

            2. Anonymous Coward
              Anonymous Coward

              Re: LOL???

              ".NET just makes you rewrite everything on a major version change"

              No it doesn't - multiple versions of the .Net Framework can co-exist without conflict, meaning that you don't have to update your code unless you want to. And even if you do, it's normally just a recompile with a different target. Unlike with Java...

  6. Anonymous Coward
    Anonymous Coward

    Worked really well didn't it

    1. Anonymous Coward
      Anonymous Coward

      Re: Worked really well didn't it

      Sometimes things go down. Does the private datacentre have 100% uptime? No, because the only thing that can be predicted with 100% accuracy about a system, is that the system will fail.

      The real question is: do you understand the service level that you've got with a cloud provider? And is it better or worse value for money than running it in house?

      Cloud providers like MS, IBM, Google, etc, have the advantage of massive scale and more, and likely more highly skilled, engineers than an in-house IT infrastructure operation. An in-house IT operation has the advantage that you can shout at your IT guys until they fix it. They'll still get tired though, and shout all you like, they'll still make mistakes.
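For a sense of scale in these uptime arguments, the "nines" arithmetic behind such SLAs is easy to sketch in plain Python (nothing vendor-specific here):

```python
# Allowed downtime per year for an availability of N nines.
# e.g. 5 nines (99.999%) permits only about 5.3 minutes of downtime per year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525600

def downtime_minutes_per_year(nines):
    """Minutes of downtime per year permitted by an N-nines SLA."""
    availability = 1 - 10 ** -nines
    return MINUTES_PER_YEAR * (1 - availability)

for n in (2, 3, 4, 5):
    print(f"{n} nines -> {downtime_minutes_per_year(n):.1f} min/yr")
```

By this arithmetic, a claim of literally 100% uptime over five years means not a single minute lost in over 2.6 million minutes.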

      1. Anonymous Coward
        Anonymous Coward

        Re: Worked really well didn't it

        "Does the private datacentre have 100% uptime?" Mine has for the last 5 years.

        1. Anonymous Coward
          Anonymous Coward

          Re: Worked really well didn't it

          Yeah, I worked with a mainframe that had never had any downtime in about 20 years.

          Then one day it threw a catastrophic failure and blew through decades' worth of 5x9s downtime budget in a single day.

          Of course you don't say what sort of system you're running, how big it is, what sort of workloads or user volumes you're supporting, but I still find 100% uptime for 5 years hard to believe.

          1. Anonymous Coward
            Anonymous Coward

            Re: Worked really well didn't it

            If you find 5 years hard to believe, you're mixing with inexperienced sysadmins.

            1. Anonymous Coward
              Anonymous Coward

              Re: Worked really well didn't it

              Or maybe you hang around with sysadmins who work in tiny, simplistic datacentres that they think are massive and complex due to inexperience.

              1. Anonymous Coward
                Anonymous Coward

                Re: Worked really well didn't it

                "Or maybe you hang around with sysadmins who work is tiny, simplistic datacentres that they actually think are massive and complex due to inexperience."

                Sounds to me like you hang around with people whose idea of a "datacentre" is a bunch of blades running Windows shoved in a rack. In which case 5 years of uptime would be a dream. However, some companies run serious OS's on serious hardware, and 5 years' uptime for the centre is no big deal at all.

    2. Anonymous Coward
      Anonymous Coward

      Re: Worked really well didn't it

      "Worked really well didn't it "

      Historically it has been more reliable with fewer major outages than say Amazon S3....

  7. vmcreator

    The Real Secret Weapon

    The Real Secret Weapon is the rumour that Google is working on an advanced home entertainment centre built from modular colour cubes (Google likes colours). The cubes are added as you expand and will be Nexus-type slab compute (cube), games console, iTunes-type media (audio and video), Freeview etc, with an interface for an advanced phone to plug in to the cubes for portability on your belt or in your Google in-car entertainment - all cloud-accessible, of course.

    All the information you need in one system.

    Bye bye Xbox, PS2, iTunes, iPad, Surface, Bose, it was nice knowing yer :-)

  8. Anonymous Coward
    Anonymous Coward

    "compares Autopilot to a 747 jet"

    1970's technology then?

    1. Albert Stienstra

      Re: "compares Autopilot to a 747 jet"

      Not much wrong with a 747

  9. Anonymous Coward
    Anonymous Coward

    Seriously?
    A bunch of douches wrote a batch file, named it AutoPilot, and won an award?

    1. Adam 1

      Re: Seriously?

      No! I won't have that! It was a PowerShell script

      1. Anonymous Coward
        Anonymous Coward

        Re: Seriously?

        "It was a PowerShell script"

        For the uninformed - that's like UNIX shell scripts but more powerful and more secure - with features like object orientation and code signing....

        1. Denarius

          Re: Seriously?

          @AC Powershell; seriously, more powerful? Compared to a 1974 Bourne shell, maybe. Compared to ksh93 using builtins, no way - even on an MVS system (unix hell). And as soon as the word "object" appears, quadruple the memory and CPU requirements, drop to 20% of the speed, and gain absolutely SFA in any language. Ironically MS licenced the MKS tools, which were a tolerable subset of the unix shell tools, then did little with them. A pity, as they made coding in mixed environments much easier than fudging around with Cygwin. The UWIN tools were better, but never seemed to get mind share.

          Usable Powershell scripts, IMHO, are a testament to the persistence and skill of sysadmins who do the best with what they have.

          1. Anonymous Coward
            Anonymous Coward

            Re: Seriously?

            "Powershell; Seriously, more powerful "

            Unquestionably so. Powershell passes objects, not just text, for a start: you don't have to parse text to get your data back. With Powershell you won't need the analogues of Perl, sed, and awk when the built-in shell functions reveal their limitations.

            Here are some more detailed advantages:

            The PowerShell object pipes allow for in-memory objects to be piped without the need for serialization/deserialization. With bash the tools must serialize to text and deserialize from text at each step in the pipeline.

            Object pipes allow for filters and selections to be formulated using property syntax instead of tools referencing columns, field numbers etc.

            Object pipes in PowerShell are strongly typed. Datetimes are handed over from one command to the next without text serialization. In bash, text serialization is brittle; e.g. dates, numbers etc may be subject to different interpretation by commands depending on system locale.

            PowerShell object-orientation allows for more terse commands, each concentrating on a single purpose, i.e. finding or producing objects and not offering control of output format. Output formatting in PowerShell is handled by separate commands (format cmdlets). Many commands in bash have extensive options for controlling output formatting, rarely coordinated between the commands.

            PowerShell includes integrated remoting. Remote control in bash is handled using SSH.

            PowerShell has been designed as a "hostable" engine - i.e. an application can host PowerShell and use its scripting capabilities to manipulate/automate its own in-memory objects. Several general and special purpose applications do this; the Visual Studio NuGet package manager, for instance, runs inside Visual Studio. Bash runs as a monolithic process; an application can *talk to* bash but not use it to manipulate objects of its own process.

            PowerShell commands (cmdlets), functions and scripts declare parameters and other properties which can be used by the shell to discover metadata about the commands. In bash, commands are "opaque" and no metadata can be gleaned from them.

            PowerShell uses command metadata to automatically support completions and (in the ISE) even autosuggestions. Bash uses externally defined completions, i.e. completions only work for those commands for which metadata has been defined.

            PowerShell uses command metadata to discover type information for parameters, validations etc. PowerShell performs validation and type coercion *prior* to invoking the command. This allows command authors to *declare* validations instead of *coding* them. In bash, validation metadata does not exist and types do not exist (everything is text). Bash always invokes the commands and leaves it to each command to perform validation.

            PowerShell has a module concept allowing for self-contained and isolated modules to be distributed. Bash does not have a module concept; in bash you will use script files/libraries instead of modules. Script libraries do not offer the same isolation (e.g. private global variables) as modules.

            PowerShell has integrated risk management. Commands that may alter system state can be invoked with -WhatIf or -Confirm for simulated execution and confirmed execution, respectively. If you build a PowerShell script and invoke it with -WhatIf this "simulation" mode is set for the scope of the script as if every command was invoked with -WhatIf. Bash does not have shell-defined risk management. Individual commands may allow something similar to simulated or confirmed execution (it's rare).

            PowerShell has resilient sessions, you can disconnect from a session and reconnect later or from another computer and continue with the same session, same variable state, same jobs etc. Bash does not have resilient sessions.

            PowerShell has workflows: scripts that save their execution state regularly and can be suspended and resumed, even across system restarts. When the script is resumed that state of every variable is restored and the script continues as if it was never suspended, *even* on another machine. Bash scripts cannot automatically save their state to a durable medium and cannot continue if they were stopped or killed.

            PowerShell scripts can branch out and execute in parallel. Bash scripts cannot branch out. Bash can use an external command such as GNU parallel to execute commands in parallel, but it cannot execute *the script* in parallel.
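The text-serialization brittleness described above is easy to demonstrate in a small bash sketch (the scratch file is hypothetical, just for the example): every stage must re-parse positional text, so the pipeline silently depends on another tool's column layout.

```shell
# A tiny demo of text-pipeline brittleness: extracting a file's size.
printf 'hello' > /tmp/pipe_demo.txt

# Fragile: depends on ls -l's column layout (locale or extra fields can shift it)
size_ls=$(ls -l /tmp/pipe_demo.txt | awk '{print $5}')

# Sturdier: ask one tool for the one number we want
size_wc=$(wc -c < /tmp/pipe_demo.txt | tr -d ' ')

echo "ls says $size_ls bytes, wc says $size_wc bytes"
```

In an object pipeline the size would travel as a typed property rather than as whatever happens to sit in column five.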

          2. Michael Wojcik Silver badge

            Re: Seriously?

            "Ironically MS licenced MKS tools which were a tolerable subset of unix shell tools"

            The Windows Interix POSIX subsystem and Microsoft Services for Unix were based on the OpenNT implementation from, guess what, Interix (formerly Softway Systems). Not MKS.

            (UWIN came from AT&T, of course. Personally I found it inferior in many respects to SFU; there were a number of integration issues, and while tricks like mounting the Registry as a filesystem were clever, performance issues meant you had to be careful with run-of-the-mill commands like "find / ...".)

            As for the rest of your rant ... I'm not overly fond of Powershell myself - I work almost exclusively in bash when I'm on a Windows system, and either bash or ksh on UNIX/Linux. (TSO on MVS, but that's neither here nor there.) But Powershell does have plenty of capabilities that are not part of bash or ksh per se. Of course that fits UNIX's original goal of many small, limited-purpose tools wired together in pipelines - it's a design decision, not evidence that one approach is inherently superior. And as for OO languages: rejecting a language paradigm out of hand is just small-minded ignorance, and not worth arguing.

  10. Adus


    It all sounds very impressive, and I am sure it is. I love hearing about these kinds of systems.

    But then I remember that it didn't stop the fairly major problem of Azure's HTTPS certificate expiring in their production environment.

    The problem with these huge monolithic management solutions is that you have to be pretty sure it's all working, all of the time, otherwise you end up with a huge problem on your hands.

  11. amanfromMars 1 Silver badge

    Wild Wacky Western Delight or Exotic Erotic Eastern Pleasure that Leads with SMARTR Drivers?

    Gaining access to Autopilot, Windows Azure's general manager Mike Neil told The Reg, is like being handed "the keys to a multi-billion dollar car.”

    Here be major, miner leading questions for Messrs Neil and Nadella to consider and not ignore and/or avoid, for there be plenty of competing contemporaries with a notion to supply trusted seed feed bodies with autonomous automatic autobotic provision of future lead role play events.

    The grok art though, and a/the most vital elemental that insures and assures and ensures astronomically increasing vehicular value to Autopilot for Microsoft, [and that be surely to be figured in the trillions if Microsoft Azurista have got/are finally getting their ACT together] is in being suitably fit and sublimely able to driver and pilot the motor to deliver ITs passengers to desired key destinations. Anything less and it is just a phormerly attractive hunk of future junk and a mouth-wateringly expensive depreciating asset.

    The power Autopilot gives Microsoft is vast, as it helps increase the efficiency with which the company harnesses its billions of dollars worth of computers. As Microsoft shifts to being a "devices and services" company under cloud-expert Nadella, the importance of Autopilot will only grow over time as Redmond seeks to lash more of its digital universe together. With Autopilot, Neil thinks Microsoft has "the operating system for this new cloud world.”

    And from that new cloud world wielding the New Microsoft Macro OS, can Autopilot remotely pilot AI New Orderly World Orders with a New World Order of Great Game Players …. Virtual ARGonauts of Extreme Means and au fait with Alien Memes.

    And if that fucks up the present future plans of olde analogue world establishment players with their simple control of fiat currency, then they can easily buy into the New Controllers with wise working seed capital spend investment in them for AI Proxy Remote Virtual Command of Future Eventing, and for which they do not have the Keys and Key Proprietary Intellectual Property.

    And yes, there are no questions there to answer or avoid other than the perfectly direct one[s] here now to Microsoft head office …… Is their Use of Autopilot for Command and Control of Olde World Orders, and that which is Perceived and Presented to Humans as Reality, with the Creation of a New Orderly Virtual World Order Supply Chain, Provided by and Providing SMARTR Machines and Global Operating Devices?

    Or will Others take Command and Control with all of that?

    I Kid U Not, El Reg, and that be Real Red Hot Breaking News which I sincerely hope and wish causes you an enigmatic dilemma …… “To pursue and lead, or not to pursue and lead, and to just trail behind and meekly follow, that is the question to answer with solution for the existence of being and meaning for life and memes”

    It is pretty plain to all who think and contemplate on such enigmatic dilemmas, methinks, which is the nobler course of action and proaction to pursue HyperRadioProActively with IT Command and Critically Creative Cyber Control.:-)

  12. John Smith 19 Gold badge

    Lots of AC's out today.

    Of course that's Windows only.

    Whenever I think of cross-platform scheduling (i.e. Windows, OS400, z/OS etc) I think of what used to be the Canadian company Cybermation.

    Got bought out by CA.

    But they were pretty impressive back in the day.

    1. Anonymous Coward
      Anonymous Coward

      Re: Lots of AC's out today.

      1 - Azure isn't Windows only.

      2 - We're talking about process scheduling not batch scheduling.

      3 - re: ac comments, who really cares?

  13. John Sanders

    Lets see...

    Big company with lots of resources builds big expensive system that generates lots of revenue, and has a DevOps system to take care of the mundane tasks so you do not require so many bodies to administer the big expensive system.

    Being an in-house system it is not available for customer purchase, but Mike Neil, a manager who has never seen such a large system in detail anywhere else, was impressed: "it's magical," he said.

    Well, I find it hard to believe that MS has produced anything resembling 100% "reliable", never-fails software, considering that their flagship product doesn't have a stellar record in that field.

    Is it possible that they have a wonderful management system for their cloud platform? Sure, like the one Google uses, the one Amazon uses, and so on and so on.

  14. Christian Berger

    Rule of thumb

    Whenever someone tries to sell you a new technology which is "complex", it either means that it doesn't work or that it will not work reliably.

    Good technical solutions rarely are complex. The Internet is simpler than the X.25 networks it replaced. UNIX was simpler than MULTICS.

    1. Anonymous Coward
      Anonymous Coward

      Re: Rule of thumb

      That's an argument which hinges on the meaning of "complex". The international postal system is extremely complex at the detail level, but conceptually very simple indeed. Unix was simpler than Multics because it was designed to do a lot less, it wasn't intended to replace Multics.

      Running a huge cloud system with many server farms in many locations connected to the Internet at many different points with many simultaneous users loading and unloading compute instances which handle many different workloads, while dealing with failures - is always going to be hugely complex at the detail level even if each component of the service is easy to understand.

      As an example of a simple solution which has become complex, look at the IC engine. The early versions had natural aspiration, crude carburation and ignition, and were rather unreliable. Ordinary car engines nowadays may have superchargers, turbochargers, variable lift and timing valve systems, EGR, catalytic converters, CVT or automated gearboxes, and a range of technologies found once only on exotic F1 engines, like cermet cylinder liners. They all have electronic control and fuel injection. Yet the complex solutions are more reliable, have higher specific output, use less fuel, produce less pollution and weigh and cost less per kW than the crude engines.

      And then there's LCD versus CRT, condensing domestic water heaters versus the old back boiler, eukaryotic versus prokaryotic cells, and pressure diecasting versus gravity casting.

      I can probably think of a lot more cases where complex works a lot better than simpler, but I really can't be bothered.

    2. Michael Wojcik Silver badge

      Re: Rule of thumb

      "Good technical solutions rarely are complex."

      Yes, that's why no one uses RDBMSes, automobile engines, CAT scanners, power plants, or those troublesome computer things.

      "The Internet is simpler than the X.25 networks it replaced."

      Not when you include the application-protocol layer, it isn't. Hell, even tunneled-transport protocols like SSL/TLS far surpass X.25. Just because IP and TCP deferred some[1] complexity to higher layers doesn't mean the complexity went away.

      "UNIX was simpler than MULTICS"

      There are plenty of people who think that was a mistake. They may be wrong, but simply contradicting them doesn't constitute an argument.

      [1] And only some. With things like congestion control, Path MTU discovery, window scaling, PAWS, and so on, TCP itself gets pretty complicated. Look at the historical arguments over how to properly implement Nagle on comp.protocols.tcp-ip some time, for example. And it's under-specified in some key areas, which is why we have inconsistencies like how OOB data is handled, or how to deal with a full listen queue (contrast BSD and Windows behavior).

  15. PeteA

    Do one thing, and do it well

    AutoPilot sounds like the exact opposite of that philosophy. Hardly surprising, given the MS track record of building software that "does everything, badly or with mediocrity". Of course, it _might_ be composed of smaller modules, because they do seem to be slowly grokking that concept, but my guess would be that it's yet another monolith. If that's the case, it'll be interesting to see how it pans out the next time one of its fundamental assumptions changes...

  16. Magnus_Pym


    You can tell nobody really knows what it does by the sheer number of vague and ill-defined metaphors used to describe it. 747s, multi-billion dollar cars, puppets. WTF?

  17. davefellows

    Autopilot design

    If you're interested in learning more about Autopilot, I stumbled upon this ppt some time last year - it goes into the design of Autopilot as well as wider Microsoft DC design. Pretty interesting stuff. The numbers on power consumption and other associated costs are pretty good reading. It's a few years old and doesn't mention Azure, but I don't believe the underlying technology has changed much, based on what I know of Azure in the data centers.
