back to article NASA to launch 247 petabytes of data into AWS – but forgot about eye-watering cloudy egress costs before lift-off

NASA needs 215 more petabytes of storage by the year 2025, and expects Amazon Web Services to provide the bulk of that capacity. However, the space agency's budget didn't include download charges, an omission that has left the project and NASA's cloud strategy in peril. The data in question will come from NASA’s Earth Science …

  1. Pascal Monett Silver badge

    Just wondering

    I wonder if anybody at NASA made a comparative cost evaluation of how much it would be for NASA to upgrade its DAACs to meet the 240+ PB storage mark vs properly costing AWS to get the job done.

    With the download costs added to the mix, I really wonder if it wouldn't be better to go and upgrade the DAACs.

    1. A Non e-mouse Silver badge

      Re: Just wondering

      I've seen numerous reports which say that cloud is great for bursty/on-demand workloads, but for a constant load, on-prem usually works out cheaper.

      1. lglethal Silver badge
        Angel

        Re: Just wondering

        But, but, CLOUD!!!

        1. Wellyboot Silver badge
          Mushroom

          Re: Just wondering

          Clouds only look good from a distance!

          1. BebopWeBop

            Re: Just wondering

            Yes, frequently white fluffy and inherently cuddly.

            1. DwarfPants

              Re: Just wondering

              I have been in a cloud; it was cold and wet.

          2. Anonymous Coward
            Anonymous Coward

            Re: Just wondering

            Clouds are known to hide perils...

          3. Gene Cash Silver badge

            Re: Just wondering

            I've looked at clouds from both sides now.

            1. zuckzuckgo

              Re: Just wondering

              >I've looked at clouds from both sides now.

              From up and down and still somehow

              It's download costs I don't recall

              I really don't know clouds at all

        2. Terry 6 Silver badge

          Re: Just wondering

          That's the thing. I know nothing. But from the depths of my ignorance it's hard not to think that maybe, just maybe, someone put Using the Latest Thing ahead of Due Diligence.

          Just a thought.

        3. Warm Braw

          Re: Just wondering

          So many things I would have done

          But clouds got in my way

          1. David 132 Silver badge
            Thumb Up

            Re: Just wondering

            Upvote for the Joni Mitchell reference. And by staggering coincidence, I was listening to the Johnstons' "Give a Damn" album (which contains a rather good cover of the song) when I read your comment. Spooky.

        4. vtcodger Silver badge

          Re: Just wondering

          "But, but, CLOUD!!!"

          You've looked at clouds from both sides now, And now ...

      2. Anonymous Coward
        Anonymous Coward

        Re: Just wondering

        >I've seen numerous reports which say that cloud is great for bursty/on-demand workloads, but for a constant load, on-prem usually works out cheaper.

        This is largely correct. The reality is the boundary between "bursty" and "constant" is much lower than people realise. Start breaking 20-30% utilisation and you're almost certainly better doing something CAPEX-y rather than OPEX-y. Of course AWS are now attempting to serve that market with large pre-pay deals, but that's basically just outsourcing at that point.

        The one true exception to this rule is raw storage. S3 is comparable to a respectable-performance SAN with dedicated networking and guaranteed QoS to every host in the environment. Object storage on public cloud is overwhelmingly cheaper than pretty much any comparable alternative, and if you don't need your data accessible then you can make it even cheaper with offline storage like Glacier.

        Which identifies the other category of workload the cloud is a good fit for - anything using a small but varying subset of a large, mostly cold data archive. Scientific computing tends to meet these criteria. They're not doing daily batches, they're doing on-demand experiments on small subsets. That's a perfect fit for object storage.
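The 20-30% utilisation rule of thumb a couple of paragraphs up can be sketched with a toy cost model. Every price below is an invented placeholder, not a real AWS or hardware quote; with these particular numbers the crossover happens to land between 20% and 30%:

```python
# Back-of-envelope sketch of the cloud vs on-prem break-even point.
# All figures are hypothetical placeholders for illustration only.

def monthly_cost_cloud(avg_utilisation, peak_vcpus, price_per_vcpu_hour=0.05):
    """On-demand: you pay only for the hours you actually use."""
    hours_used = 730 * avg_utilisation  # ~730 hours in a month
    return peak_vcpus * hours_used * price_per_vcpu_hour

def monthly_cost_onprem(peak_vcpus, capex_per_vcpu=400.0,
                        lifetime_months=60, opex_factor=1.5):
    """CAPEX amortised over the hardware lifetime; opex_factor covers
    power, space and staff. Flat cost regardless of utilisation."""
    return peak_vcpus * capex_per_vcpu / lifetime_months * opex_factor

for util in (0.1, 0.2, 0.3, 0.5):
    cloud = monthly_cost_cloud(util, 1000)
    onprem = monthly_cost_onprem(1000)
    winner = "cloud" if cloud < onprem else "on-prem"
    print(f"utilisation {util:.0%}: cloud ${cloud:,.0f} vs "
          f"on-prem ${onprem:,.0f} per month -> {winner} cheaper")
```

The flat on-prem line against the utilisation-proportional cloud line is the whole argument: below the crossover, bursty wins in the cloud; above it, constant load wins on-prem.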

        1. AndyD 8-)₹

          Re: Just wondering

          "if you don't need your data accessible then you can make it even cheaper with ... my bargain special offer write-only storage system"

      3. spireite Silver badge

        Re: Just wondering

        Yep, if you want something 24 hours a day, then it's likely to be an order of magnitude more. First thought: "Great, I can get rid of my techs - it'll manage itself! And for less." Six months after you have, that becomes "I need some techs to manage this thing", followed six months later by ...... "I've been sold a pup; we need to bring it back in-house into a traditional datacentre"

        1. Dave 13

          Re: Just wondering

          I've seen this cycle in customers already. Existing CIO had things humming nicely, gets lured away by better offer. New CIO hired with a "cloud first" mentality. Dumps on-prem gear because "it's really expensive to run at $1.5m per year" and moves all data and computing to AWS. 6-9 months pass and the economics of the AWS deal become apparent and the CIO flees in disgrace. Next CIO buys new on-prem gear and gets everything humming.

          Lather, rinse, repeat..

          1. Alan Brown Silver badge

            Re: Just wondering

            "6-9 months pass and the economics of the AWS deal become apparent and the CIO flees in disgrace"

            But he got paid, got a bonus when he reduced costs and no penalties ensued when this was proven to be wrong.

            In all likelihood he got a _GLOWING_ reference to get rid of him, and some sucker will hire him for the cycle to repeat.

            Such individuals need to be named and shamed, but they have the misuse of Employment and Data Protection laws to hide their misdeeds down to a fine art

    2. Brewster's Angle Grinder Silver badge
      Mushroom

      Re: Just wondering

      Damn space scientists - always with their heads stuck in the clouds.

      CLOUD ICON ---->

      1. Jimmy2Cows Silver badge
        Boffin

        Re: Damn space scientists...

        Heads above the clouds, Shirley...

    3. oiseau
      Facepalm

      Re: Just wondering

      ... wonder if it wouldn't be better to go and upgrade the DAACs.

      Wonder?

      Hmmm ...

      I'd put my money on upgrading the DAACs, which has the distinct advantage of NASA having full control over the stored data instead of it being under Amazon's/Bezos' control.

      O.

    4. The Nazz

      Re: Just wondering

      It's alright everyone quoting Joni Mitchell, but won't someone spare a thought for Carly Simon whilst she's drinking the caffeinated brew of her choice?

    5. 0laf Silver badge

      Re: Just wondering

      They will have a "cloud first" policy like every other muppet. It took me months of persuasion to get our strategy to be "Cloud where appropriate".

      I'm sure people in high places are trying to make Amazon realise the PR opportunity from cutting NASA a break for this particular SNAFU.

    6. MachDiamond Silver badge

      Re: Just wondering

      Pascal, I expect you are correct. NASA is generating AND using so much data that outsourcing its management doesn't make much sense. Perhaps having AWS as a backup could be useful for files that aren't super-sensitive.

      NASA would obviously be a major customer and would get priority support, but an in-house team and storage array means even more direct service. All of this information IS the value of NASA. You don't outsource your key asset.

  2. SJA

    What if the Cloud also catches Corona?

    We all know that computers are at risk from viruses/ii... what if corona spreads into cyberspace? Mutations are not uncommon among viruses/ii

    1. DavCrav

      Re: What if the Cloud also catches Corona?

      "Mutations are not uncommon among viruses/ii"

      It's viruses. Viri are men. (Virus in Latin means venom or slimy liquid, and is neuter. It didn't have a plural in Latin, but if it did, it would have been vira.)

      1. KarMann Silver badge
        Headmaster

        Re: What if the Cloud also catches Corona?

        Actually, in classical Latin, it's a fourth declension noun, not the usual second declension, so the plural would have been 'vīrūs'. But, in checking my reply, I did find that there's a Neo-Latin form which does use standard second declension inflexions outside of the nominative/accusative, so then it's 'vīra', as you said. OK, I'll allow it, with that Neo-Latin caveat.

        1. Jonathon Desmond

          Re: What if the Cloud also catches Corona?

          Romani ite domum

          1. KarMann Silver badge
            Joke

            Re: What if the Cloud also catches Corona?

            Actually, in classical Latin, it's 'Romanes eunt domus.'

            1. OssianScotland

              Re: What if the Cloud also catches Corona?

              Now write it out a C times

              1. David 132 Silver badge

                Re: What if the Cloud also catches Corona?

                "...and if it's not done by sunrise, I'll cut yer balls off"

        2. BebopWeBop
          Thumb Up

          Re: What if the Cloud also catches Corona?

          Ahh - a better and more educated class of pedant here

          1. Anonymous Coward
            Anonymous Coward

            Re: What if the Cloud also catches Corona?

            With such knowledge, he should be PM...

        3. dajames
          Headmaster

          Re: What if the Cloud also catches Corona?

          Actually, in classical Latin, it's a fourth declension noun, not the usual second declension, so the plural would have been 'vīrūs'.

          It appears not: In usage by classical writers (Vergil, Cicero, and that crowd) virus is seen to be second declension. There is no known plural form in classical writing but, because its gender is neuter, its plural -- had it one -- would be vira.

          At least according to Dr. Smith's dictionary and Kennedy's primer.

      2. Andy The Hat Silver badge

        Re: What if the Cloud also catches Corona?

        Hang on, my red-lead brush is running out ...

        Now, if can you conjugate that verb again please ...

        1. dajames

          Re: What if the Cloud also catches Corona?

          Now, if can you conjugate that verb again please ...

          Actually, it's a noun -- and I decline to!

          1. IGotOut Silver badge

            Re: What if the Cloud also catches Corona?

            Ah, the joys of using a static dead language to define "new" discoveries.

          2. John H Woods

            Re: What if the Cloud also catches Corona?

            As an antediluvian biologist, I prefer viridae

  3. Anonymous Coward
    Joke

    Cloud costing...

    Cloud costing ... if only it were as easy as rocket science...

    1. wolf29

      Re: Cloud costing...

      Cloud-costing seems to be designed to produce the loudest exclamation of dismay from your company's CFO every month. Since download quantity cannot easily be estimated (nobody in the Financial office ever having had to think this way before), there is a tendency to under-estimate it wildly.

      1. Alan Brown Silver badge

        Re: Cloud costing...

        "there is a tendency to under-estimate it wildly."

        There is also a massive vulnerability if your competitors want to play dirty by arranging massive levels of low-level access.

        I'm not suggesting that would be a way of taking out dodgy marketing outfits. That might be illegal.

    2. Strahd Ivarius Silver badge

      Re: Cloud costing...

      According to SLS reports it is as easy as rocket science...

  4. Steve K

    Hang on...

    I may well be missing something here but:

    * If scientists are going to download so much data locally that went to the cloud first, then surely that data should have gone directly to a NASA data centre first anyway (either for use or for pre-processing prior to AWS ingress)

    * They still have to store that data locally (and back it up since you don't want to incur the egress charges again if it gets lost), so you still need some kind of significant storage capability within NASA

    * This could be mitigated by pre-processing/sifting data sets in AWS (incurring CPU/GPU compute charges of course) - assuming that the workloads don't require specialised hardware unavailable in AWS - so that there is less to download. You still have to store that reduced data set locally though

    * Assuming that there is no specialised hardware required then can they sidestep this by working on the data in the cloud entirely (i.e. no need to download most of the data) using cloud compute/virtual workstations? I imagine that the CPU/GPU compute charges would be eye-watering BUT they would have burst capacity on demand (although they would soon learn about marginal costs of instances spinning all year long...)

    1. diodesign (Written by Reg staff) Silver badge

      Re: Hang on...

      "surely that data should have gone directly to a NASA data centre first anyway"

      That's the rub. NASA didn't want to run its own data centers: it opted to upload all the stuff gradually to the cloud. According to the audit report, it didn't realize that people can't download this stuff "for free" from the cloud – someone has to pay for the bandwidth. NASA, in this case.

      C.

      1. Anonymous Coward
        Anonymous Coward

        Re: Hang on...

        “Collectively, this presents potential risks that scientific data may become less available to end users if NASA imposes limitations on the amount of data egress for cost control reasons.”

        And that's why cloud deals shouldn't be decided only by freaking accountants who don't have a clue that data goes both ways, in whatever pattern the business needs.

        They failed to understand that, and that's what it's gonna cost NASA.

        Next time, NASA, understand that IT pros are useful ...

      2. Anonymous Coward
        Anonymous Coward

        Re: Hang on...

        > someone has to pay for the bandwidth. NASA, in this case.

        So this is partially true. AWS supports a mode called "requester pays", which you can deploy to reverse the charges. The problem is that egress charges tend to be the dominant part of the cost of large-scale S3 storage (to encourage you to keep everything in AWS), so NASA would then be externalising the cost of downloading their datasets to the institutions that want to use them. Not impossible, but those kind of costs need to be carefully managed, and it means the institutions needs to be AWS-savvy enough to engage with requester pays.

        arXiv and a couple of other places are set up like this. Downloading something small/routine? They bear the cost. Downloading the full archive? Get your wallet out.

        1. Anonymous Coward
          Anonymous Coward

          Re: Hang on...

          " so NASA would then be externalising the cost of downloading their datasets to the institutions that want to use them. "

          Yup. NASA tried exactly that and was told to FUCK OFF in no uncertain terms by everybody involved - up to and including threats to remove critical instruments from the missions.

      3. mrcrilly

        Re: Hang on...

        "According to the audit report, it didn't realize that people can't download this stuff "for free" from the cloud"

        Where does it say that?

      4. Anonymous Coward
        Anonymous Coward

        Re: Hang on...

        Anonymous because I have to work with these people.

        I have to work with NASA (amongst other space agencies) and this kind of "oversight" isn't particularly surprising.

        They also take the attitude that we should bork OUR site security for THEIR convenience when it comes to data transfers, instead of actually doing some work to fix their underlying problems, because the same admins behind this kind of story refuse to allow anyone inside NASA to sort things out. It's much like it was before my dealings with "Cancer Omega" (a senior NASA data admin) in the 1990s led to the discovery that hackers had broken into, and had free rein on, the same computers being used for command-and-control of Mars Pathfinder along with a slew of orbiting satellites(*). Yes, this really happened and is documented. It led to a shakeup of NASA security and data policies which apparently has been undone again...

        (*) At that point they were mostly being used for IRC denial of service attacks and running bots, but those same NASA computers participated in various large-scale late 1990s attacks on commercial network service providers - in at least two cases driving those providers out of business.

    2. rg287 Silver badge

      Re: Hang on...

      If scientists are going to download so much data locally that went to the cloud first, then surely that data should have gone directly to a NASA data centre first anyway (either for use or for pre-processing prior to AWS ingress)

      I was likewise confused at first. Surely the likes of NASA could just negotiate a couple of fat peering connections with AWS and bump off egress charges (in the same way that Bandwidth Alliance deals allow customers to discount egress costs from Azure if they are using something like Cloudflare, because Azure and CF peer and it costs MS nothing to send data to CF compared with sending it over transit).

      I think the point is that this is the cost derived from arbitrary users - people at academic institutions around the world. I wouldn't be surprised if this is where the oversight occurred: NASA will likely be able to get at their data in AWS for free, and it was forgotten that most of their data users are outside NASA, so ESDIS will get stung for the egress to them.

      I think.

      1. Anonymous Coward
        Anonymous Coward

        Re: Hang on...

        "Surely the likes of NASA could just negotiate a couple of fat peering connections with AWS"

        NASA has limited egress speeds from its public archives to about 100Mb/s _total_ for YEARS in order to try and discourage use - and that bandwidth includes feeds to the institutions providing the instruments producing the data.

        This has been going on for a long time. The people behind it have an agenda that data storage and IT in general are "not part of NASA's core business" and therefore must be farmed off to market forces. They've also been starving NASA's IT budgets on the same basis.

    3. vtcodger Silver badge

      Re: Hang on...

      "then surely that data should have gone directly to a NASA data centre first anyway "

      Well yes ... sort of. Back in the day, that was called a ground station. It took the data, unpacked the multiplexed telemetry data into structures that were/are a bit more conventional and added appropriate metadata. Again, back in the day, that was necessary because vehicle bandwidth was very costly and therefore data streams were as compact as possible. And data rates were often very high because the link could only be up during short time windows. So the ground station equipment was specialized, expensive, and often project unique. Then the data went to a project command center for analysis before being archived.

      Assuming things are still much like that, the AWS part is presumably just the archiving.

  5. Blergh

    Just charge the users

    OK, the users might not be too happy to be incurring a new cost, but presumably, spread out among all the users, it might not be so expensive for each of them. This is assuming it is possible to set up a system to pass on these costs.

    1. Hawkeye Pierce

      Re: Just charge the users

      Assuming that we're talking about data held in AWS S3 buckets, then that already supports "Requester Pays" buckets. Yes, it means the requester has to have an AWS account, but that's not unreasonable.

    2. A Non e-mouse Silver badge

      Re: Just charge the users

      Isn't there some rule in America that all research data paid for by the tax payer has to be freely available?

    3. Anonymous Coward
      Anonymous Coward

      Re: Just charge the users

      It is indeed possible to charge users by passing on the cost to them, but they need to have an AWS account in order for the billing etc to be handled.

  6. tallenglish

    Someone got a good kickback

    This cloud business seems very similar to the energy Ponzi scheme that brought down Enron: buy cheap and sell at extortionate rates, leaving people in California with large bills and power outages.

    I guess with AWS, that will mean high rates to get your data back, and lost data, because do we really think Amazon will have the redundancy to keep data securely across multiple DCs? Just look at how they treat their staff, cheap and nasty, so why do we think the staff will care much about the data they look after?

    1. Cederic Silver badge

      Re: Someone got a good kickback

      Yes, we do really think Amazon will have redundancy to keep data over multiple DC securely.

      Mainly because there's plenty of evidence that they already do.

      So no, no lost data. How do Amazon pay for this remarkable service? By charging for things like data transfer out of their cloud. It's almost as though it's a transparently priced self sustaining model that's been in operation for a couple of decades.

      1. eldakka

        Re: Someone got a good kickback

        But isn't that a specific, extra-charge tier of service?

        I seem to remember some online services losing all their data on AWS and having to close up shop because they assumed cloud meant redundant data. It doesn't unless you pay for it as an extra service?

        1. Cederic Silver badge

          Re: Someone got a good kickback

          Users choose and pay for the level of resilience they require. You're saving on capital expenditure and data centre overheads, not on architects and developers.

        2. Alan Brown Silver badge

          Re: Someone got a good kickback

          "But isn't that a specific, extra-charge tier of service?"

          It is, and NASA haven't paid for that either (check the reports)

        3. Alan Brown Silver badge

          Re: Someone got a good kickback

          "I seem to remember some online services losing all their data on AWS and having to close up shop because they assumed cloud meant redundant data."

          Even worse than that:

          Some outfits thought they'd gotten redundancy by distributing their data across multiple cloud providers.

          What they discovered was that the "multiple providers" were sock puppets (or resellers) of ONE service provider with all their data resting in the same place in the same cloud which went "phutt"

  7. John Robson Silver badge

    At that scale

    You're already going to be getting most of the benefits of scale associated with being "cloudy" anyway, even if you run it in house.

    What's the annual storage budget for that? I bet most of us could plan some hardware (including power costs and disk replacements) that would cope for 5 years and pay a healthy salary for a (small) team to look after it.

    1. Yet Another Anonymous coward Silver badge

      Re: At that scale

      Ah, I see you have the machine that goes 'ping!'.

      This is my favourite. You see, we lease this back from the company we sold it to - that way it comes under the monthly current budget and not the capital account.

    2. Anonymous Coward
      Anonymous Coward

      Re: At that scale

      Once you go past a couple of hundred TB (well before that figure, actually), cloud becomes wildly uneconomic.

      We've let academic staff rant about cloud being cheaper and let them go get quotes for equivalent services from AWS, Google, etc.

      EVERY SINGLE ONE who did so has come back and apologised.

      The one who ignored us and pressed on regardless discovered he was paying five times as much for his data storage/access and tried to charge that back to central IT. That didn't go down well with the "powers that be" when he claimed nobody had told him about the extra charges, and printed documents were pulled out of filing cabinets showing the emails giving exactly those warnings.

  8. BebopWeBop
    Devil

    So a decent accountant might have (been of) value after all

  9. Jason Bloomberg Silver badge
    Coat

    Outsourcing 101

    "Oooh; look at all the savings we can make!" - With hardly a thought given to the hidden costs and downsides.

    Of course, in most cases, the downsides are borne by the users so they can usually be disregarded as they rub their hands with glee.

    The one with the convincing advert in the pocket.

    1. Anonymous Coward
      Anonymous Coward

      Re: Outsourcing 101

      Government procurement rules.

      We developed a bunch of analysis software for a previous mission around NAG libraries because we got a government site-licence and had no budget to pay for a developer to integrate a free alternative.

      But we were then forbidden from sharing any models or source code with researchers outside the boundary of the site-licence. Collaborators would have to come to us to use the data.

      But somebody's budget shows a saving from buying a site-licence over using free software.

  10. PrcStinator

    Someone definitely protested.

    But was summarily ignored to oblivion, probably punished too.

  11. johnnyblaze

    Bad move

    Watch this go bang. Once NASA factor in the ridiculous Internet pipes they'll need, with umpteen levels of resilience and bandwidth in the tens of GB/s range, they'll be paying AWS bazillions forever more. I can't quite believe how they think this will save them money. In all reality, NASA engineers will be twiddling their thumbs while things upload/download, or AWS goes down, or they suddenly find they're trying to access data from a much slower region. Trust me, it won't work, and NASA will pay the price. Nobody saves money moving to the cloud!

    1. oiseau
      Pint

      Re: Bad move

      Nobody saves money moving to the cloud!

      Finally ...

      Some common sense.

      Have a beer --->

    2. Anonymous Coward
      Anonymous Coward

      Re: Bad move

      Senior people inside NASA have been issuing exactly those warnings for years - and gagged or fired for their troubles.

      The people who came up with this harebrained scam get their bonuses and skedaddle before the hammer comes down. Their successors get the shower of shit and the blame.

  12. Anonymous Coward
    Anonymous Coward

    Because taxpayers' teat.

    Need more money? Just go back to Congress.

    1. cschneid

      Absolutely. They should just outsource the whole mess to the lowest bidder and inject adverts into the data streams.

      1. EhWhat
        Joke

        'inject adverts into the data streams'

        It's a shame this is about new data and not SETI ...

  13. Anonymous Coward
    Joke

    "how an agency capable of sending stuff "

    The keyword is "sending" - after the Shuttle they have nothing getting back, so they just calculated the costs of sending data to the cloud, not the cost of data returning back from the cloud...

    1. Richard Pennington 1
      FAIL

      ... thus achieving ...

      ... the ultimate Write-Only Memory.

  14. spireite Silver badge

    Save money?

    Even on our current projects, the financial pigeons are coming home to roost.... but these pigeons have become gold-plated....

    Cloud: likely a more expensive datacentre than your existing one, whose bills frighten the beancounters post-POC, and then it's too late!!

  15. Brush

    NASA, flying into cloud, should have checked they had a valid IFR rating....

  16. HammerOn1024

    Nitwits-r-Us!

    You have GOT to be kidding me. NASA... get your SHIT together.

  17. steviebuk Silver badge

    So many....

    ...PHBs insist on "cloud, cloud, cloud" and "Infrastructure free" yet don't realise how fucking expensive it ends up being.

    1. Terry 6 Silver badge

      Re: So many....

      Outside of IT (and in it too, I guess) a good rule of thumb is that using external services costs the same as using internal ones, minus infrastructure savings but plus the supplier's costs and profits.

      So the only advantage is (infrastructure savings-supplier cost).

      Beancounters, who should be able to create and apply formulae, seem to struggle to manage this one. Because they distinguish between capital investment and budgeted costs. And don't seem to realise that spending on the former saves money on the latter.

      i.e. short term v long term thinking.

      And it's not a new problem. In my sphere it's been school buildings that were cheap to put up a few years before, but then are unaffordable to maintain. Costs kicked down the road.

      1. Alan Brown Silver badge

        Re: So many....

        "In my sphere it's been school buildings that were cheap to put up a few years before, but then are unaffordable to maintain. Costs kicked down the road."

        In Scotland it was school buildings put up cheaply and then had to be emergency closed as they started falling down.

        But nobody prosecuted the administrators responsible for this, at most they pointed fingers at the substandard building contractors those administrators hired.

        Guess who then gets to merrily skip along to another government department or organisation, rinse and repeat?

  18. Anonymous Coward
    Anonymous Coward

    250PB sounds like a lot but it's only a half dozen racks. We've deployed over 9PB in an office of 200 so NASA should be able to manage that in house without too many issues.

    1. Degenerate Scumbag

      I'd be curious to know how you achieve that density. My back-of-a-fag-packet calculations made it 24 racks, at a cost of around $10mil. Still, the point stands that this would be far cheaper for them to keep in house.

      1. John H Woods

        me too

        I'm thinking 10 4U units in a full-height rack, 60 x 16TB HDDs in each; that's a 10PB rack weighing about a tonne and taking, what, on the order of 5-10kW?

        So yes, two dozen racks, not half a dozen.

        1. Degenerate Scumbag

          Re: me too

          Yeah, that's along the lines I was thinking, except racks of up to 48U are available, so you can get 11 of those 4U 60-drive enclosures per rack (you could actually fit 12, but then you'd have nowhere to mount the switch). Also, Seagate are promising to release 18TB drives in the first half of this year, which ups the potential density a bit. Power-wise, you could _just_ manage with 5kW per rack constant power, but only if you carefully manage the spin-up of the drives - if they all start at once, the initial surge would draw 15kW.
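The back-of-a-fag-packet maths in this sub-thread can be checked in a couple of lines, using the figures quoted above (raw capacity only, no redundancy):

```python
# Rack-count check using the drive/enclosure figures from the thread.
import math

DRIVE_TB = 16             # 16TB HDDs
DRIVES_PER_4U = 60        # 60-bay 4U enclosures
ENCLOSURES_PER_RACK = 10  # 11 fits in a 48U rack, per the reply
TARGET_PB = 247

rack_pb = DRIVE_TB * DRIVES_PER_4U * ENCLOSURES_PER_RACK / 1000
racks_needed = math.ceil(TARGET_PB / rack_pb)
print(f"{rack_pb:.1f} PB per rack -> {racks_needed} racks of raw storage")
# 9.6 PB per rack -> 26 racks of raw storage
```

So roughly two dozen racks at 16TB drives, before any redundancy or hot spares, matching the correction to the "half dozen racks" claim upthread.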

        2. Alan Brown Silver badge

          Re: me too

          "60 x 16TB HDDs"

          Try 90 to 100+:

          https://www.supermicro.com/en/products/chassis/4U/946/SC946ED-R2KJBOD

          http://www.chenbro.com/en-global/products/RackmountChassis/4U_Chassis/RM43596

          http://www.chenbro.com/en-global/products/RackmountChassis/4U_Chassis/RM43699

          https://www.westerndigital.com/products/storage-platforms/ultrastar-data102-hybrid-platform

          "Up to 1.4PB raw capacity in a 4U enclosure"

          I've seen 120-drive prototypes too.

          If you don't feed a rack like this with 10-15kW capabilities you're not going to fare well.

          The 9-14W per unit IDLE power consumption of these very large drives is just that - sitting there spinning with the heads parked. As soon as you actually start seeking the power goes way higher than that. Count on a 30% increase in power draw simply by having the heads flying, let alone seeking.
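Those idle and uplift figures put a dense rack's drives alone near that 10-15kW band. A quick sketch, where the drive count and wattage are illustrative midpoints rather than measurements:

```python
# Rough drives-only power budget for one dense rack, using the figures
# above: 9-14W idle per large drive, roughly 30% more with heads active.
DRIVES_PER_ENCLOSURE = 102  # e.g. a 102-bay 4U JBOD
ENCLOSURES_PER_RACK = 10
IDLE_WATTS = 11             # a midpoint within the quoted 9-14W range
ACTIVE_UPLIFT = 1.30        # ~30% increase once heads are flying/seeking

drives = DRIVES_PER_ENCLOSURE * ENCLOSURES_PER_RACK
busy_kw = drives * IDLE_WATTS * ACTIVE_UPLIFT / 1000
print(f"{drives} drives: ~{busy_kw:.1f} kW when busy, drives alone")
```

And that is before controllers, switches, fans and spin-up surge are counted.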

  19. Ken Moorhouse Silver badge

    What Goes Up, Must Come Down

    Most people are familiar with this adage. However, NASA are arguably more interested in seeing things go up, rather than coming down, so it is understandable they haven't factored this into their costings.

    Civilisation has seen abandoned supermarket trolleys, abandoned e-commerce trolleys, can we soon expect to hear about abandoned datasets?

  20. MatthewSt Silver badge

    Public Dataset?

    Could have saved themselves a fortune! https://aws.amazon.com/opendata/public-datasets/

    1. nagyeger

      Re: Public Dataset?

      "we will cover the costs of storage and data transfer for a period of two years," sayeth the small print.

      Hopefully someone asks: What happens after two years?

      1. MatthewSt Silver badge

        Re: Public Dataset?

        Someone else's problem by that point

  21. Anonymous Coward

    NASA should be above the cloud.

  22. Anonymous Coward

    Conflict of interest ?

    I seem to remember that Bezos is also in the space game.

  23. Mike 137 Silver badge

    "cloud egress charges"

    It's quite wonderful how jargon gets invented to hide obvious problems. I suppose "cloud egress charges" sounds less expensive than "you have to fork out continuously to get at your data".

  24. Sirius Lee

    Maybe auditor have not considered all AWS options

    The data has to be stored somewhere, and wherever that is there's going to be a cost. As for access to the data, on the AWS S3 pricing page (https://aws.amazon.com/s3/pricing/?nc=sn&loc=4) on the 'Data Transfer' tab is this little nugget which may be relevant:

    You pay for all bandwidth into and out of Amazon S3, except for the following:

    • Data transferred in from the internet.

    • Data transferred out to an Amazon Elastic Compute Cloud (Amazon EC2) instance, when the instance is in the same AWS Region as the S3 bucket.

    • Data transferred out to Amazon CloudFront (CloudFront).

    (my emphasis) So providing 'data scientists' are working from an EC2 instance in the same region as the NASA bucket, there will be no charge to NASA. Of course there will be a charge to the 'data scientist' to run the EC2 instance in the same region and to add sufficient storage, but that's not NASA's concern. In this scenario the data transfer costs will be $zero. Of course this assumes NASA staff configure the bucket correctly and only allow access from IP addresses in the same region.
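As a rough illustration of why the same-region exemption matters, here is a sketch using an assumed ballpark internet-egress rate of about $0.09/GB; the real S3 rate is tiered and region-dependent, so treat the function as illustrative only:

```python
def egress_cost_usd(tb_transferred, same_region_ec2=False,
                    internet_rate_per_gb=0.09):
    """Rough S3 data-transfer cost. The $/GB rate is an assumed ballpark;
    same-region S3 -> EC2 transfer is free per the pricing page quoted."""
    if same_region_ec2:
        return 0.0
    return tb_transferred * 1000 * internet_rate_per_gb

# Pulling 100 TB of data a month out of S3:
print(f"to the internet: ~${egress_cost_usd(100):,.0f}/month")
print(f"to same-region EC2: ${egress_cost_usd(100, same_region_ec2=True):,.0f}/month")
```

At that assumed rate, 100 TB/month to the open internet runs to around $9,000/month, versus nothing for same-region EC2 access.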

    1. Alan Brown Silver badge

      Re: Maybe auditor have not considered all AWS options

      "So providing 'data scientists' are working from an EC2 instance in the same region as the NASA bucket there will be no charge to NASA"

      They're not - and that's all I'm going to mention.

  25. Fenton

    Egress? Surely this is ingress!

    You pay for egress but generally not for ingress.

  26. jelabarre59

    Stacked?

    But wait, isn't NASA one of the two originators of OpenStack, the *on-premises* cloud? Couldn't figure out how to maintain one of their OWN projects?

  27. The Vociferous Time Waster

    wrong model

    If you want to get your data out then AWS is the wrong model. It's great for ingesting data, processing it and then returning targeted information based on that data (reporting and visualisation etc) but just as somewhere to store it for repeated access it sucks ass.

  28. Kevin McMurtrie Silver badge

    Public distributed filesystem

    If this was a smart startup, they'd create a distributed filesystem app and give people free trinkets from the gift shop or a cool screensaver if they can run it correctly.

    1. MatthewSt Silver badge

      Re: Public distributed filesystem

      https://tardigrade.io/ - similar to your idea but people get paid to run nodes

      1. Alan Brown Silver badge

        Re: Public distributed filesystem

        The idea of using torrents for the data was raised - and thrown out pretty violently....

  29. pwihmag

    Given Google's mission statement, they should offer to host the data

  30. Anonymous Coward

    I wouldn't want to be in the room when Trump finds out that NASA is paying Amazon/Bezos.

  31. sbardong@deepstream.dec

    This story sounds a bit odd. I design AWS architectures for a living, and whilst 250PB is a lot of data, I see very little reason for large downloads that would incur these charges. The one reason to download the data would be to access on-premises compute resources, which would make little sense for an organization doing research. AWS's compute offering is super cost-effective and incredibly scalable if designed correctly. It would be interesting to see more about their use case and the need to download so much data. For analytics, AWS supports all the major languages, ML notebooking, and more.

    1) I found this case study for NASA, which built this library (https://images.nasa.gov/) based on an architecture for delivering image content to the public. (https://aws.amazon.com/partners/success/nasa-image-library/)

    This CDN implementation would incur cost with content getting stored at edge locations.

    2) Open Data on AWS, stored in the S3 object store, which is available in us-west-2

    A collection of Earth science datasets maintained by NASA, including climate change projections and satellite images of the Earth's surface.

    https://docs.opendata.aws/nasa-nex/readme.html

    It would be interesting to dig into this one in detail and figure out how they have things configured....

    1. diodesign (Written by Reg staff) Silver badge

      "This story sounds a bit odd"

      FWIW we're reporting what the auditor said - so if something looks odd, you mean, the auditor's findings are odd.

      C.

    2. Alan Brown Silver badge

      "I see very little reason for large downloads that would incur these charges. "

      You don't do planetary or space science data.

      I do and these kinds of transfers are NORMAL.

  32. aqk
    Big Brother

    Cloudy days ahead? Demand your own SSDs

    Well, I just wanna know when Amazon will sell me a hard disk with one or two Exabytes on it.

    Gonna need it for my modest Pr0n stash....

    SSD preferably.

    Please- do not suggest this "cloud" nonsense to me!

  33. TuckJo

    NASA's 225PB should be much better managed for 2025.

    225 PB of storage is not a super large amount, especially by 2025. Today individual spinning disk drives and NVMe flash drives can reach 16TB for a single drive. Storage subsystems deployed in a single 42U rack can support up to 25 PB, so today that could be serviced in ~10 racks. By 2025 much better density should be expected. NASA should store its 225 PB both on and off prem. Access to the data on prem would not be encumbered by the high AWS (or Azure) charges for access by all the offsite customers. NASA should also negotiate with the cloud providers for nearly free access for the world's scientific communities, or choose another cloud.
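The "~10 racks" claim follows directly from the 25 PB-per-rack figure. The 2025 projection below simply assumes density doubles, which is a guess rather than any vendor roadmap:

```python
import math

total_pb = 225
rack_pb_today = 25   # dense-subsystem figure from the comment above

print(f"today: {math.ceil(total_pb / rack_pb_today)} racks")
# Assume density roughly doubles by 2025 (an assumption, not a roadmap):
print(f"2025 at 2x density: {math.ceil(total_pb / (2 * rack_pb_today))} racks")
```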
