Boffins: cloud computing's on-demand biz model is failing us

Cloud vendors’ commercial models poorly serve scientists, forcing them to struggle for value amid tightening budgets, according to research. Modern science - from bioinformatics to astrophysics - depends heavily on sophisticated computer modeling, yet cloud providers' business-focused models clash with how scientific projects …

  1. ParlezVousFranglais Silver badge
    Boffin

So form a global collaborative organisation to negotiate with the cloud providers on provision of resources for academic/scientific institutions. While individual scientists and laboratories may only need resources sporadically, the global need at any given time must be absolutely colossal. Such an organisation could also gather and disseminate usage patterns and peak/low-demand times, providing more certainty for occasional very high-usage requirements.

Coming up with a solution to this problem doesn't seem to be rocket science, given they are supposed to be boffins...

    1. elsergiovolador Silver badge

given they are supposed to be boffins...

Don't want to sound cynical, but given the wages in academia, I'd say the people who remain are those who couldn't find a job elsewhere (or they have a trust fund and don't have to rely on scraps).

      1. that one in the corner Silver badge

        > I'd say the people who remain are those who

You know, it is actually possible to go and talk to academics and researchers. To find out why they do the job they do.

        You may even find out that there are more things in life than the size of your wage packet and if you think that your wages will allow you to buy them all you are in for a great disappointment.

        1. elsergiovolador Silver badge

          I wasn’t criticising individuals - many brilliant people stay in academia out of passion, not comfort. My point is structural: when wages and conditions are so poor, the system filters for people who can afford to stay (financially or psychologically). That inevitably narrows the talent pool and distorts incentives. Passion alone doesn’t make a strong research ecosystem - stability and fair pay do.

          The romantic idea that “true scientists don’t care about money” is precisely what allows institutions to justify exploiting them. A vocation shouldn’t require martyrdom.

          1. sketharaman

            Well said!

        2. elsergiovolador Silver badge

          An Associate Lecturer in Mathematics, on his fourth fixed-term contract, has his boiler break down. The flat is freezing, so he calls an emergency plumber.

          The engineer arrives, eyes the “Landlord Special” boiler, and gives it one sharp tap with a spanner. The flame kicks in.

          The mathematician is relieved - until the engineer’s tablet buzzes.

          “Right, that’s £380 for the call-out, mate.”

          “Three hundred and eighty pounds?!” the academic sputters. “That’s half my monthly take-home pay!”

          “Jesus. Where do you work?”

          “At the university.”

          “Tough gig. Listen, mate,” says the engineer, “come work for us. We handle service contracts for all the big suppliers. You just drive about, hit a few pipes - you’ll clear six grand a month, easy. Just get your ACS cert, it’s a week’s course. Honestly, it’s a joke. Oh, and don’t mention the PhD - they’ll think you’re a flight risk.”

          The mathematician does it. Within a year he’s paid off his student loan, bought a house, and almost forgotten what a Fourier transform is.

          But then the firm gets bought by a private-equity group in Dubai. New management announces a mandatory Efficiency and Compliance Day to “standardise best practice.”

          The entire workforce is herded into a hotel conference room off the M1.

          The first session: Core Numeracy for Service, run by a 24-year-old management consultant.

          “Right, team!” she chirps. “Let’s touch base on the fundamentals. To maximise routing efficiency, we need to understand flow rates. So let’s start simple. Perhaps you, sir,” she says, pointing to the mathematician, “could come and write the formula for the area of a circle?”

          He gets up, goes to the whiteboard, and his mind goes blank. He begins deriving it from first principles.

          He fills the board with integrals. Wipes it. Fills it again. Finally arrives at minus pi r squared.

          The minus sign bothers him. He wipes it clean and starts over - this time in polar coordinates. Again: -πr².

          He turns to the room, defeated.

          The hungover Gas Safe engineers, all scrolling through booking.com for their next holiday, whisper in perfect unison:

          “Change the limits of integration…”
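(For anyone whose own integrals are rusty, the punchline works because swapping the limits of integration flips the sign; a sketch, in polar coordinates:)

```latex
A = \int_{0}^{2\pi}\!\!\int_{0}^{r} r'\,\mathrm{d}r'\,\mathrm{d}\theta = \pi r^2,
\qquad\text{but}\qquad
\int_{r}^{0} r'\,\mathrm{d}r' = -\frac{r^2}{2}
\;\Rightarrow\; A = -\pi r^2 .
```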

    2. that one in the corner Silver badge

> given they are supposed to be boffins...

The boffins themselves are no doubt well aware of the possibilities of your suggestion. But the creation of a collaborative organisation would fall into the purview not of the many research groups but the funding bodies, hopefully with the participation of the University management. That is to say, not the boffins but the other b-word: beancounters.

      Individual researchers and their research groups can - should, must even - work together to point out the value of such a global resource, by their usual methods of analysis and presentation of results, and hope to persuade the money men to fund it. Guess what TFA is all about.

    3. Anonymous Coward
      Anonymous Coward

      boffins

      "Grant-funded researchers ... typically can't ... influence how their institutions spend on computing infrastructure "

      "research groups simply don't wield enough influence over institutional procurement"

      Institutional problems, not the fault of the cloud providers. This is the sort of issue that Pro-Vice Chancellors should be chairing committees on.

      There has always been a tension between IT as a utility, like libraries and toilets ("Broadband in every bedroom!"), and IT as a significant element of some specific academic activity ("Physics needs a giant computer!"). But it's up to the boffins and their management accountants to decide what compromise they can afford.

    4. Anonymous Coward
      Anonymous Coward

      Or buy them the modern equivalent of some good old fashioned Sun Workstations

      https://www.theregister.com/2025/10/14/dgx_spark_review/

      1. Anonymous Coward
        Anonymous Coward

That really won't touch the sides for most serious academic compute work. It's more aimed at small businesses wanting to fine-tune a small quantized LLM. Slowly.

  2. elsergiovolador Silver badge

    Remember

I am old enough to remember that:

> In contrast, scientific runs are typically short and infrequent. A scientist might need a cluster with specialized

It was the main selling point of the cloud - you can add resources quickly in the event of a surge of traffic and scale down when you don't need them.

Fast forward a few years, and it is now much cheaper to use your own hardware in the long term. The "surge of traffic" never comes; it is just fear created to extract money from customers. Current machines are more than capable of serving existing demand.

That said, for scientists the reality is: bite the bullet and buy the hardware needed for research.

    1. Anonymous Coward
      Anonymous Coward

      Re: bite the bullet - and buy hardware needed for research.

      Almost .... "bite the bullet - and somehow obtain funding to buy the hardware needed for research." :-/

      /ob not-even-enough-budget-to-buy-an-8T-hard-drive-here

      1. Snake Silver badge

        Re: bite the bullet

        Sub-headline: "Boffins realize that adding a middleman to your resource requirements increases costs. Surprised Pikachu faces ensue."

    2. that one in the corner Silver badge

      Re: Remember

      For scientists, the reality is to use whatever is physically available during the period of their research funding.

      Past funding has been available for - and used to build - kit that supports both specific research goals (e.g. simulations of galactic formation at Durham) and for general shared use (e.g. University computing centres and the occasional "national computing facility").

      However, as you point out, the Cloud Clods promised access to variable scaling, the beancounters fell for it[1] and, right now, the reality is that they have left a hole in the availability of the sharable resources required by, in particular, research groups that are smaller than, say, the LHC collaboration.

      > it is now much cheaper to use own hardware in the long term.

      *Now*? Hasn't it always been? In the long term, that is.

But the long term is not a match for the funding regimes of the scientists; that is the area where the funding bodies should be working. They are the ones that operate long term and should be working to get the best bang for their buck, across the multitude of groups that they fund.

[1] not that the researchers themselves are all innocent here; I recall a dinner conversation a few years back, when the evening's speaker was adamant that buying time on someone else's compute was an amazing *new* idea, without which he couldn't have done his work. The antiquity of timeshare fell on deaf ears.

      1. elsergiovolador Silver badge

        Re: Remember

> *Now*? Hasn't it always been?

I remember working at an outfit that had its own little data centre. They actually experienced a surge in traffic that their existing servers couldn't manage. To scale up, they had to find and order a couple of extra servers; the cost was massive and the lead time quite long. By the time we had installed the servers, the hype for their service was long gone and they sat idle for the remainder of the year.

They moved to the cloud the next year and I heard it was cheaper.

        A little anecdote - they didn’t label the servers at all. The guy who installed the first batch left, and no one knew which physical machine ran the database, which were task runners, etc. To identify them, we had to ping each IP and power down servers one by one until the ping stopped responding. Rinse and repeat. We couldn’t trace them by unplugging network cables - it was a rat’s nest, and I’m pretty sure there were knots previously unknown to humankind.

    3. BBRush

      Re: Remember

      "It was the main selling point of the cloud - you can add resources quickly in the event of surge of traffic and scale down when you don't need it."

The harsh reality is that science is not a key market for the larger providers and, as such, it is niche and expensive, meaning the cloud provider will charge way more for this kind of flexibility - assuming they make it easy for the research team to even scale down to zero or up to 11 when needed.

      Cloud computing is market driven and exists for the masses, their generic workloads and the profits they make for the business, not the small, specific, low margin/cheap use cases.

    4. doublelayer Silver badge

      Re: Remember

      Except for the types of projects covered in the article, it's not cheaper to use your own hardware in the long term because it would be idle for a lot of the time. Buying your own means obtaining a lot of expensive hardware and facilities to install it in just to have it powered down most of the time. Bigger universities often build a computing resource which is shared between project teams in some way where they can express the amount of compute they need and get put in a queue*, but individual teams are not going to have the funding to do that, so if they can't do it altogether, then they have fewer options.

      The problems they are having are for very spiky use cases which are a strong point of the cloud but, as the article points out, not infinitely so. The cloud providers aren't too eager to have lots of idle expensive kit either, so they don't overprovision enough that lots of instances can be set up simultaneously. Nobody wants to have lots of idle expensive kit which is why it is hard no matter who you get your equipment from. Running your own hardware is not a magic solution to this unless you started with the magic massive grant, and if you did, the cloud bill might still be the cheaper option leaving more of the magic grant for other expensive things.

      * The shared computer has its costs as well. When I was a student, I had access to that and the ability to run jobs. You had to schedule those well in advance and occasionally coordinate with other researchers to make sure you weren't going to cause problems for one another by running too long or interrupting something before it was done.

      1. Anonymous Coward
        Anonymous Coward

        Re: Remember

        https://www.theregister.com/2025/10/14/dgx_spark_review/

        1. doublelayer Silver badge

          Re: Remember

          Your point being that you can buy boxes with GPUs in them? Because if they can do their research on one of those, great, although depending on how long they need it, renting a similar cloud machine would probably still be cheaper. The problem being that most of the kinds of things they're running can't be run on one of those. Depending on the work, they'll either need a very large number of those or they'll need something completely different, for example something with a lot more CPU power since that is only optimized for GPU.

Or, if they're using GPUs, they might want faster performance from them. That box is as cheap as it is because it doesn't use typical VRAM with its GPUs: in order to fit large models in RAM, they've gone for LPDDR5 shared with the CPU, which is cheaper but much slower than what you'd find on a normal GPU or on their other accelerators. The cluster I had access to had a lot of real GPUs and was no doubt very expensive to create and operate. It only made sense because the cost could be shared among the biology, physics, and astronomy departments, as well as little extra users like me, and people still had to queue and negotiate for sufficient access. A $4000 box whose nice performance numbers come from Nvidia's marketing figures for FP4, which most research cannot use, is really not the same thing.

  3. Korev Silver badge
    Boffin

    Modern science - from bioinformatics to astrophysics - depends heavily on sophisticated computer modeling, yet cloud providers' business-focused models clash with how scientific projects consume computing resources, argue Vanessa Sochat and Daniel Milroy, both post-doctoral researchers in computing at Lawrence Livermore National Laboratory.

    I don't know what the Astrophysics boffins get up to, but a lot of bioinformatics workflows are embarrassingly parallel[0] and work well with AWS Spot instances[1]

    [0] An example would be the Fastq files from a sequencing run which need to be QCed and aligned

    [1] Other clouds are available
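To illustrate what "embarrassingly parallel" means here (a minimal sketch only - the file names and the `qc_and_align` stub are hypothetical stand-ins, not any real pipeline): each Fastq file can be processed independently of the others, so the work maps cleanly onto a pool of workers, or, in the cloud case, a fleet of Spot instances that each grab one file.

```python
from multiprocessing import Pool

def qc_and_align(fastq):
    # Stand-in for a real QC + alignment step (e.g. fastqc then bwa).
    # Each input file is independent, so no coordination is needed
    # between workers - the hallmark of an embarrassingly parallel job.
    return f"{fastq}.bam"

if __name__ == "__main__":
    samples = [f"run1_S{i}.fastq.gz" for i in range(1, 5)]
    with Pool() as pool:                     # one worker per CPU core
        results = pool.map(qc_and_align, samples)
    print(results)
```

Because no worker ever waits on another, losing a Spot instance mid-run costs you only the files it was holding, which is why this class of workload tolerates preemptible capacity so well.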

    1. Paul Kinsler

      I don't know what the [...] boffins get up to,

It's likely to vary. For some, it will indeed be analysing a giant chunk of just-harvested data, or running very specific simulations, and so might well be occasional and compute-intensive. But at the other extreme, there is continuous processing of incoming data in real time (or near real time)... such as those in space weather, e.g. doing CME modelling/forecasting based on satellite or radio-telescope outputs.

  4. Tron Silver badge

    People go to specific universities for a reason.

    They have the resources and staff for the work they intend to do.

    If they do not, either go somewhere that has those resources, or do something else.

    The world does not revolve around you because you are a scientist. Like everyone else, you have budget limits and time limits. So pick something to research that you have the facilities available to do.

  5. Nate Amsden Silver badge

    IaaS was always flawed to begin with

It's bad by design, whether you look at cost, efficiency, availability, or complexity. A bunch of shiny marketing BS selling it to people is really the only thing it has going for it. I realized this myself fifteen years ago (won't link to my blog post yet again..); the situation is unchanged.

People are shocked: "why isn't XYZ multi-region or multi-cloud?" The obvious answer is cost and complexity. You should have little doubt most orgs are well aware of the risks; they just choose not to do anything about it (looking at you, Amazon Alexa). Look at South Korea and their lost data: they said "there was too much data to back up", meaning nothing more than "we weren't given the budget to make it right".

I'm sure scientists were all giddy about getting access to a cloud dashboard thing and being able to spin up things whenever they want. They didn't think or care (initially) about the costs associated with it (nor do many orgs), but now that they are being forced to face it, they cry.

Look at companies like Geico (which apparently spent a decade moving into public cloud, spending $300M/year, realized it was 2.5X more costly, and is now moving back) and SAP (burning untold billions, probably), which last month announced a $20 billion investment in its own facilities over the next decade - both filled with people who flat out didn't care about the costs...

    until they were forced to care.

If I can save a small company $1M+/year ($10M+ over a decade, probably closer to $12M really), and Geico can save $120M a year, and who knows how much SAP will save, then you can save a ton of money getting out of public cloud IaaS whether you are small or super huge.

    I actually had someone ask me recently whether the money saved included "extra costs for staff" for operating equipment. I sort of laughed. I told them the truth: the same people that ran the cloud stuff ran the infrastructure - no staff changes. Don't trust me? Look at the well-documented 37signals move out of AWS over the past year, where their CEO said exactly the same thing (on LinkedIn anyway; unsure if he mentioned it in his blog posts). Even if it didn't include extra employee costs, the savings at my small company were in excess of $1M per year - you could hire a few more people if you really needed to and still be saving a ton.

Me? I care about the costs a bit - it's the easiest way to justify things to non-technical folks. But for me, at the end of the day, moving out of IaaS was more about control, availability, and peace of mind: systems running for months and years without issue, and not having to rebuild a server due to some failure in over a decade of operation (a semi-regular process while using AWS).

    My oldest flash storage array still in production just entered its TWELFTH YEAR OF CONTINUOUS AVAILABILITY. 12 #$#@$ years - that is insane. Being that it's flash, and has 4 storage controllers, it's still damn fast and works perfectly fine (I added 3 more refurb quad-controller arrays to distribute the risk two years ago), and of course I have four-hour on-site hardware (no software) support. So far beyond my expectations, and only a single component failure in that time. I have network switches that just passed FIVE THOUSAND DAYS OF OPERATION, no faults (technically I have had replacements for them on site for 2 years, just haven't gone on site to deploy them yet; planning on next year - last time I was on site I ran out of time).

    1. Throatwarbler Mangrove Silver badge
      Devil

      Re: IaaS was always flawed to begin with

Counterpoint: it depends on your usage model. For a company that wants to spin up resources and iterate quickly, IaaS/public cloud is a godsend. There's no capital expenditure and much less time expenditure required to get going, and there's a robust ecosystem of canned applications and services which can be adapted to suit one's needs on top of whatever other development one needs to do. IaaS also makes scaling wide and across regions much simpler. The cloud provider, in principle, also takes care of ongoing maintenance tasks like hardware replacement, data center operations, etc., allowing the customer to offload those concerns, which can be especially helpful in regions where that expertise is hard to come by. Accountants also, for some reason, prefer operational cost over capital expenditure, which the cloud makes possible, even if the longer-term costs wind up being higher (in fact, at a previous role, our beancounters eschewed multi-year support contracts, even with a discount, because it was easier for them to manage per annum renewals).

      As you correctly point out, the benefits are less pronounced for a stable enterprise, and it certainly makes sense for more static organizations to crunch the numbers and determine whether they're actually getting value from their cloud costs vs. bringing their infrastructure back on prem or operating in a hybrid mode.

      One of my pet peeves about some El Reg commenters is the tendency to generalize from personal experience: "This works for me, therefore it's good for everyone!" No. No, it's not.

  6. Anonymous Coward
    Anonymous Coward

    Sounds like a good use for spot instances ..

    At least with AWS

    1. doublelayer Silver badge

      Re: Sounds like a good use for spot instances ..

      The article covered that. They can't do their computing in little chunks that can get interrupted. They need a bunch of instances all running at the same time that stay running until they're done. With some modification of the software they're running, they could probably make it more fault-tolerant so it can restore itself when instances become unavailable, but that won't fix the biggest problem of needing a lot of capacity at once. It just means it will be stalled rather than broken.

Unfortunately, no matter how you go about getting that capacity, it is expensive. You can wait until the cloud provider you're using has that much free, or you can buy the hardware for the few times you'll use it; but any way you manage it, it comes at a high price.
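One way to get the fault tolerance mentioned above - a minimal sketch only, assuming a local checkpoint file; a real Spot workflow would persist to durable storage such as S3 and react to the provider's interruption notice - is to checkpoint after each unit of work, so a reclaimed instance resumes where it stopped instead of restarting from scratch:

```python
import os
import pickle

CKPT = "state.pkl"  # hypothetical checkpoint path

def load_state():
    # Resume from the last checkpoint if a previous instance was reclaimed.
    if os.path.exists(CKPT):
        with open(CKPT, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "total": 0}

def run(steps=10):
    state = load_state()
    while state["step"] < steps:
        state["total"] += state["step"]   # stand-in for real work
        state["step"] += 1
        with open(CKPT, "wb") as f:       # persist after every unit of work
            pickle.dump(state, f)
    return state["total"]
```

As the comment notes, this only turns "broken" into "stalled": if capacity vanishes, the job survives, but it still cannot finish until enough instances come back.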

      1. Ken Shabby Silver badge
        Alert

        Re: Sounds like a good use for spot instances ..

        I’ve heard of companies that power down their apps overnight to save money.

  7. Anonymous Coward
    Anonymous Coward

    Rubbish

Never heard so much rubbish as a headline. As if on-prem is without cost! It's just fixed, and half of it hidden. Not that I'm saying cloud is always cheaper. So, researchers can spend a little time contemplating costs and how to manage them, which in turn can enable them to do work they might otherwise not be able to afford. If it's a big project, it might be worth hiring someone to manage cost.

    1. David Hicklin Silver badge

      Re: Rubbish

I think you have misread the article: it's not so much about cost per se, but that the much-vaunted promise of the cloud spinning up more capacity as you need it is not being kept, and the part of it they do get is not enough to run whatever they are doing, so it has to sit idle (costing money) until more becomes available.

      Sometimes it doesn't and all the efforts (and cash) have been wasted.
