Google Cloud takes a gap year. It may come back with very different ideas

Taking a look at the latest financial results from Google/Alphabet made some of us do a double-take ... and not because of the $40bn+ in ad revenue. If you read closely, you'll see that Google Cloud has lessened its habitual loss by extending the operational lifespan of its cloud servers by a year, and stretching out some of …

  1. jake Silver badge

    Or perhaps ...

    "Maybe Google's gap year is an indication that business will not resume as usual."

    Or perhaps it's because people are starting to notice that all "cloud" means is that your business is being run on someone else's computer(s), and as such you have no control whatsoever over that very important part of modern business.

    I think the clouds they are walking on are wearing thin. Snake-oil is snake-oil.

    1. J.Teodor

      Re: Or perhaps ...

      As someone who had to deal with on-prem servers for a long time, I never want to go back.

      To beg BOFH for another test server instance to be deployed, or get the god-wannabe DBA to increase the size of your production data store. God forbid the request would be after three o'clock, because that is when the "gods" would be ready to go home, and you would get immediately denied.

      No, never again.

      My applications run on servers that get automatically patched, I can resize my DB as needed, I can spin up a new test server in a minute or a few... and the management is happy, because the costs are way, way down, while stability and reliability are up.

      This may be a different story if your $JOB has hundreds or thousands of servers, with a dedicated team of data center professionals supporting those servers. But this is perhaps 1-2% of companies. For the rest, cloud is a godsend.

      1. Doctor Syntax Silver badge

        Re: Or perhaps ...

        What you describe is probably a people problem. The problem may lie with the wrong people administering the systems. Equally it might lie with the people holding the purse strings and your sysadmins & DBAs may have equivalent but different problems in getting the sign-offs to provide you with your test server. But ultimately you're using somebody else's computer as a sticking plaster to cover the self-inflicted wounds of a dysfunctional company.

        1. J.Teodor

          Re: Or perhaps ...

          Partially, it is a people problem, but partially it is a resource and cost issue.

          Up to a certain number of servers/resources, AWS, Azure and GCloud are a lot cheaper to use than on-prem. I do not know where the break-even point is - it may depend on service type, e.g. if you run mainly simple web APIs or processing jobs, the break-even point may be at tens of thousands of servers, but for databases (including licensing costs), it may be way below a hundred, especially if you are one of those unfortunate souls still using Oracle.

          And I am not just talking about the cost of the hardware, licenses, and running the service - the environmental footprint of a highly optimized cloud data center is probably a tenth of what those servers would have in an average "cubicle next to toilet" on-prem setup. Then come the maintenance costs, and failover, and backups, and regionality, and the cost of setting up a new dev server, and... you get the point.

          There are hybrid models (we actually use Azure like that) and private clouds, both of which may offer better benefits than relying on a big cloud provider. But YMMV with that.

      2. Nate Amsden Silver badge

        Re: Or perhaps ...

        As someone who has dealt with on-prem servers since ~1998, I never want to go back either (back to cloud, that is). I was exposed to 100% cloud infrastructure for about two years across two different companies. I moved the second (current) company out almost exactly 10 years ago (the company was "born" in cloud from day 1). I think Feb 23 2012 was the day we moved our first infrastructure out, and it was about July 2012 when we moved the last of it out.

        DBAs have always been on my team, and generally always reasonable when requesting resources. We have tons of metrics so can generally 100% agree upon when something needs to change(I can even add CPU or memory or disk live to my systems with zero impact to the DBs). My team has always controlled administrative access to every system in the environment(all linux systems managed by Chef configuration management). Developers and others had no issues with that(except one guy, I told him he had to host his stuff on the IT side of the house if he wanted a windows server and have admin rights to it). Developers praised us on occasion for providing such a robust environment that just worked. Not a single server has had to be rebuilt as a result of hardware failure in a decade of operation. Prior to the ops team forming the developers (with IT's help) ran their own VMs that they managed and even they knew it was a shit show.

        Speaking of databases I remember to this DAY (this was over 10 years ago!!) a phone call with amazon support about TERRIBLE database performance on our RDS at the time, I even took a screenshot and kept it all these years:

        http://elreg.nateamsden.com/rds-cloudwatch.png

        I remember a comment the support person said, they said oh we are getting great performance look at those IOPS, oh yeah 3,000 IOPS is good but look how much data was transferred... 200 KILOBYTES? Write latency over 150ms ? CPU usage maybe 5%? WTF IS GOING ON.

        We were(at previous company) a "beta" tester for amazon's early "performance" EBS system. I forgot the technical term this was back in 2010, basically you get more IOPS with more space. The idea was interesting but the implementation(at the time) didn't work. I'm sure they've fixed that since though.

        Back to cloud - the lack of reliability, the lack of in-depth monitoring, the endless list of small failures, the forced reboots, the head-scratching moments of WTF is going on and why? The variability and constant manipulation of the infrastructure drove me mad. The lack of ability to precisely size systems, the lack of ability to oversubscribe. The lack of control, the INSANE COSTS.

        My former manager was talking with Google Cloud last year, and the cost for hosting our production databases (about 30 systems) was about as much as it cost us to run our entire datacenter operations (about 750 systems). That is according to him; I never spoke with the Google people. That wasn't even taking into account the extra capacity we have waiting to be used (which could easily run another 500 systems). It's comical. 2-3 years ago we had a VP who wanted to go cloud (no reason other than to help his resume, I think), and we told him it was too expensive. He said "he had a guy" who could make the numbers work. Well, ~6 months later the VP was gone and we didn't really hear about the concept again.

        I have seen many people who have loved cloud stuff; those people also don't seem to care about the costs. Many others don't believe cloud is (generally) so much more expensive than running it yourself.

        The last company I was at I hit a wall in convincing the board of directors to move out of cloud, despite having CTO/CEO onboard, and the rest of the company really with a $1.6M savings in the first year of operations. But I left shortly after that, my (original) hiring manager at that company hired me at the next(current) company. Previous company collapsed a couple years later. Their cloud spend was upwards of $500k/mo for a tiny startup(maybe 100 employees?) at peak, more common was in the $200-250k/mo realm. Current company was about $80k/mo when we moved out.

        Current company I'd say conservative savings has been $10M, more practical savings of over $15M over the past 10 years, that is with a peak of ~5 racks of equipment. Currently about 3.5 racks. Not talking super scale here.

        I remember hosting a load balancing software called Zeus(at the time) in Amazon cloud because the ELBs were such pieces of shit. The cost of running Zeus (as an appliance distributed by the amazon store thing), which was CRIPPLED because it could only have a single IP address was huge. It alone would come to about $10-20k/year for a single system I think? That could pay for a real hardware load balancer very quickly(my current load balancers ran upwards of about 450 IP addresses(at peak) on several networks for various workloads and fail over within 1 second, Zeus as it used Elastic IPs took something like 20 seconds).

        I'd be all in on cloud if it provided a superior experience (or at least an equivalent one - control of my network, connectivity, end-to-end storage metrics down to the links and disks/SSDs etc). That is, offering the level of control and availability that on-prem can offer (that includes data center facilities where everything is N+1 power/cooling). It would help if cost were never a concern, too.

        Oracle said it pretty well at one point in the last year or so, they want their customers to go cloud because it makes Oracle so much more money.

        One area where cloud can make sense though is SaaS. Abstracting most of the failings of the major cloud providers behind an application that is hopefully robust. But as we've seen with recent cloud outages even that can fall apart.

        I've been told by multiple people over the years nobody can run a data center operation like I can(at least in their experience). So I am somewhat unique in that ability. It is sad that companies have yet to realize that even if they need 3-5 people on the team to do stuff it's likely going to be done far cheaper and better on prem.

        The cloud marketing bullshit hype cycle is deafening though. People talk about "hybrid cloud" a lot. Current app stack I manage has nearly 20 micro services. There had been talk in the past about running some of those in public cloud to provide more scalability. Despite the fact we had no lack of capacity on site, people were clueless when it came to things like latency between services. Whether you are distributing an application across two different data centers or a data center and a cloud there will be a huge latency hit regardless unless that 2nd location is very close(say within 50 miles). But many people(for some reason) don't realize that. Some apps wouldn't care but most transactional ones would care a ton.
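As a rough illustration of the latency point above, here is a minimal sketch of the best-case round-trip time between two sites. It assumes signals travel through fiber at roughly 200,000 km/s (about two thirds of the speed of light), a straight-line cable, and no switching, queuing, or processing delay, so real numbers will be worse. The function names and the example distances other than the 50 miles mentioned above are hypothetical.

```python
# Lower bound on network round-trip time imposed by distance alone.
# Assumption: propagation speed in fiber is ~200,000 km/s (about 2/3 of c).
FIBER_KM_PER_S = 200_000.0
KM_PER_MILE = 1.609344


def min_rtt_ms(distance_miles: float) -> float:
    """Best-case round-trip time in milliseconds over a straight fiber run."""
    distance_km = distance_miles * KM_PER_MILE
    return 2 * distance_km / FIBER_KM_PER_S * 1000


def chained_calls_ms(distance_miles: float, sequential_calls: int) -> float:
    """Added delay when one request triggers several calls made one after another."""
    return min_rtt_ms(distance_miles) * sequential_calls


if __name__ == "__main__":
    for miles in (50, 500, 2500):
        print(
            f"{miles:>5} miles: {min_rtt_ms(miles):6.2f} ms per round trip, "
            f"{chained_calls_ms(miles, 20):7.2f} ms for 20 sequential calls"
        )
```

Under these assumptions a site 50 miles away adds under a millisecond per round trip, while the penalty grows linearly with distance and with the number of calls a transaction makes in sequence, which is why chatty transactional services suffer most.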

        Most of the failings of the major cloud providers are BY DESIGN, and haven't changed in the past decade (not betting on them changing in the near future either).

        But as with anything, you can do things on prem very poorly, and you can do things in cloud very poorly. You can do things in both very well on rare occasions.

        1. Anonymous Coward

          Re: Or perhaps ...

          Thanks hugely for that - I'm not at all in the business but found it a fascinating and apparently well-reasoned discussion, hope it continues well for you!

  2. andy 103

    We're beyond needing to upgrade anything

    "sometimes even letting you pick the [CPU] you want"

    Here's the thing. Most people deploying and running software they develop don't really know or care what underlying hardware it's running on. The old days of dedicated web servers were a great example of this. What's the difference for my use case between CPU "A" and CPU "B"? Unless you're doing something really specific and know the underlying differences of how that could affect your code, it's just not something that's high on anyone's list of priorities.

    In the same way as using a mobile phone from 3 - 4 years ago is probably still good enough for most people's needs, the cloud market has got to a point where upgrading all the hardware all of the time is simply counterproductive.

    One of the things I've learnt over the years is that when it comes to infrastructure there is a rule: Boring = Good. In other words things running predictably isn't a bad thing at all, and frankly if it works...it works. Personally I'd prefer to use a cloud provider that stays clear of pointless upgrades.

    1. Nate Amsden Silver badge

      Re: We're beyond needing to upgrade anything

      Boring = good, I like that.

      My production ethernet network hasn't had a state change since Oct 2019 (maybe the last time I did a software update on the devices; many of them are 8-10 years old now and the software is super stable/boring). Boring, good, no issues. If you asked the vendor how to deploy their equipment, they wouldn't suggest my method because it's not sexy, it's boring (the core relies on a unique protocol they haven't touted in 15 years).

      My primary all flash storage array has had 0 hardware or software failures since it was put in service in Nov 2014. Boring, good no issues.

      I can really probably count on one hand the number of different hardware server failures that I've had in the past 3 years. I can probably count on one hand the number of VMware support cases I've had in the past 5 years. I don't even need to count the number of full or partial power outages of our primary data center in the past 10 years (we had 2 partial power outages at a 2nd facility in Europe maybe back in 2017? I had two network devices that were single power supply, of course lost one of them each time the power went out but they were redundant so no impact in the end).

      Last time we had a major internet connectivity issue was either the first big DDOS against Dyn DNS years ago(cloud provider...), or perhaps DDOS against our ISP (targeting other customers mainly gaming customers I think and it made news here on el reg at the time) maybe that was in 2016 or 2017 I don't recall. I'm excluding CDN outages(cloud again..) as that only impacted our website not the entire "datacenter". Have had probably 2-3 hrs of CDN downtime over the past several years.

      What vmware am I running ? ESXi 6.5 + vCenter 6.7. BORING = good. v7 seems like it needs more time to mature. I ran ESX 4.1 past EOL, and ran ESXi 5.5 past EOL, I'm thinking 6.5 past EOL will be fine too. Upgrading even to 6.7 doesn't get me anything. If you think I'm worried about running EOL vSphere, well I'm really not given the track record, and given the number of other software products internally that are even much more EOL than that(EOL years ago) that I don't have control over.

  3. Mellipop

    Waiting for ARM Server processors

    Or Alphabet is waiting to replace their servers with lower power ARM chips. Think of the marketing coup that would be. It'll save millions in Lx bills and stop heating the planet so much.

    Oh, you didn't mean to link the two stories together?

  4. HildyJ Silver badge

    Waiting for Google server chips

    Google has already designed a custom chip (the Tensor) for its phones and has announced it is designing one (presumably Tensor based) for its Chromebooks.

    I would not be surprised if they are also designing a server focused variant.

  5. xyz123 Bronze badge

    Google said Stadia was "taking a break from updates".

    Google said Nest was "taking a break from updates"

    Both things are in the middle of cancellation (layoffs of all support and technical staff and management first - announced cancellation sometime in Q2 2022).

    Now Google cloud is "taking a break from updates" - hmmmm

  6. elsergiovolador Silver badge

    Catch-22

    If people don't start to run services on premise, the monopolies will not break.

    There is also a myth that you need to have plenty of hardware, coming from the days when servers weren't powerful but were pretty expensive and cloud was indeed helpful to mitigate that.

    Most services corporations run could be served from a single modern dedicated server without breaking a sweat.

    And for the same money they are spending on the cloud, they could afford a full-time devops team and redundant servers.

    The difference is that when something goes wrong, you are not at the mercy of some nameless workers on the other side of the globe.



Biting the hand that feeds IT © 1998–2022