Amazon tries to freeze out tape with cheap 'n' cloudy Glacier

Amazon is digging deeper into the enterprise with a data back-up and archival service designed to help kill off tape. The cloud provider has just launched Glacier, which it says takes the headache out of digital archiving and delivers “extremely low” cost storage. Glacier has been built on the Amazon storage, management and …


This topic is closed for new posts.
  1. alikhajeh

    It's cheap as chips

    We just ran a quick cost forecast in PlanForCloud and it's interesting: If you start with 100GB then add 10GB/month, it would cost $102.60 after 3 years on AWS Glacier vs $1,282.50 on AWS S3!
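
    The forecast above can be roughly reproduced with a few lines of Python. This is only a sketch, not PlanForCloud's actual method: the function name is made up, and both per-GB prices are assumptions. $0.01 per GB-month for Glacier matches the "$120/yr for a terabyte" figure quoted elsewhere in this thread, and $0.125 per GB-month for S3 is simply the rate that reproduces the quoted total. It also assumes each month's 10GB arrives before that month is billed.

```python
def cumulative_storage_cost(initial_gb, added_gb_per_month, months, price_per_gb_month):
    """Sum the monthly storage bills for a data set that grows every month.

    Assumption: each month's growth is added before that month is billed.
    """
    total_gb_months = 0
    stored_gb = initial_gb
    for _ in range(months):
        stored_gb += added_gb_per_month
        total_gb_months += stored_gb
    return total_gb_months * price_per_gb_month


# Assumed prices in $ per GB-month (not stated in the post itself).
GLACIER_PRICE = 0.01
S3_PRICE = 0.125

glacier_total = cumulative_storage_cost(100, 10, 36, GLACIER_PRICE)
s3_total = cumulative_storage_cost(100, 10, 36, S3_PRICE)

print(f"Glacier over 3 years: ${glacier_total:,.2f}")
print(f"S3 over 3 years:      ${s3_total:,.2f}")
```

    Under those assumptions the totals come out at the $102.60 and $1,282.50 quoted above, so the forecast appears to cover storage only, with no retrieval or transfer fees.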

    1. Skoorb

      Re: It's cheap as chips

      Well, it's sort of cheap.

      I've just had a good look around their site (and the AWS blog) and have found out a few things.

      First, the data is stored redundantly (specifically, it can cope with the failure of two stores simultaneously), and you can choose if you want it in the US, EU (Ireland, 10% more expensive) or APAC (Singapore, 12% more than the US).

      You store data in 'archives'. Once you have uploaded an archive, you cannot change it (though you can add to it and delete the whole thing), you are charged for three months of storage as a minimum, and if you want to download it, you have to get the whole thing. So make sure you split your data up - each archive needs to be a file!

      After requesting an 'archive' for download, you have to wait 3-5 hours before you can start to download it. You then have 24 hours to get it.

      You need to know what you have stored. A list of the description (if you provide one), creation date and size of each archive is available, but is only updated once per day; if you need any more info you have to download the thing.

      You can only download 5% of your stored data per month *pro rated daily* for free. After that, prices go up very fast! As an example, if you stored 1TB of data and wanted to get the whole thing back, you would be charged about $369.80 (excluding taxes). Again, that is 10% more for the EU and 12% more for APAC.

      So, only good for archiving if you are pretty sure you're not going to want to get most of it back.

      Working for the download charge:

      Peak hourly retrieval for the month = 36GB per hour (80Mbps)

      Billable peak hourly retrieval = Peak hourly retrieval (36GB) - Free retrieval hourly allowance (about 1.71GB) = 34.29GB

      Retrieval fee = Billable peak hourly retrieval (34.29GB) x Hours in the month (720) x Retrieval price ($0.01) = about $246.92

      Then you add the data download fee at $0.120 per GB: 1024 x $0.12 = $122.88. So $122.88 + $246.92 = $369.80.
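
      For anyone who wants to try other link speeds or archive sizes, the working above can be sketched in Python. The function and parameter names are made up for illustration, and the formula simply follows this post (including treating the pro-rated daily free allowance as the hourly allowance); it is not taken from Amazon's billing documentation.

```python
def retrieval_cost(stored_gb, link_mbps, hours_in_month=720,
                   retrieval_price=0.01, transfer_price=0.12,
                   free_fraction=0.05, days_in_month=30):
    """Return (retrieval_fee, transfer_fee) in dollars, following the post's working."""
    # Mbps -> MB/s -> MB/hour -> GB/hour (decimal units, as in the post)
    peak_gb_per_hour = link_mbps / 8 * 3600 / 1000
    # 5% of stored data per month, pro rated daily
    free_allowance_gb = stored_gb * free_fraction / days_in_month
    billable_peak_gb = max(peak_gb_per_hour - free_allowance_gb, 0.0)
    retrieval_fee = billable_peak_gb * hours_in_month * retrieval_price
    transfer_fee = stored_gb * transfer_price
    return retrieval_fee, transfer_fee


retrieval_fee, transfer_fee = retrieval_cost(stored_gb=1024, link_mbps=80)
total_fee = retrieval_fee + transfer_fee

print(f"Retrieval fee: ${retrieval_fee:.2f}")
print(f"Transfer fee:  ${transfer_fee:.2f}")
print(f"Total:         ${total_fee:.2f}")
```

      With 1TB stored and an 80Mbps link this lands within a cent or so of the figures above; the small difference comes from rounding the free allowance.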

      1. Skoorb

        Re: It's cheap as chips

        Oh, and as well as the $369.80 fee for a 1TB download at 80Mbps, it's probably good to know that you can't assign file names to archives (Object Keys in AWS speak). So have fun with that one when it comes to download.

      2. Anonymous Coward

        Re: It's cheap as chips

        I may be wrong, but it would seem like you could use the Amazon Import/Export to get your full TB (or multiple TBs) back for much cheaper than transferring the whole thing over the internet. As long as you didn't need the data *quickly* that would make sense.

      3. Ken 16 Silver badge

        Re: It's cheap as chips

        "After requesting an 'archive' for download, you have to wait 3-5 hours before you can start to download it. You then have 24 hours to get it."

        By my reckoning if peak recovery rate is 36GB/hour you're never going to get that TB back within your download window. Am I missing something?

        1. Skoorb

          Re: It's cheap as chips

          That's a very good point. I naively assumed that it meant you had 24 hours to *start* downloading the job, but after having a look at the actual API reference it looks like at some random time after 24 hours it may just reset the TCP connection and return a 404 for any attempts to resume. That's just plain stupid.

          Which unfortunately means that it's essentially unusable if the amount of data you store on it is greater than the maximum you can pull down your internet connection in 24 hours. That is unless you fancy doing a lot of maths to request multiple jobs about 12 hours apart and you can guarantee that you can maintain a constant download rate over the whole period.
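
          A quick sketch of that maths in Python, assuming the 36GB per hour rate and the 24 hour window discussed above. The function names are made up, and the job count assumes the data is already split into archives small enough to be fetched separately.

```python
import math


def hours_to_download(size_gb, rate_gb_per_hour):
    """Hours needed to pull size_gb at a constant rate."""
    return size_gb / rate_gb_per_hour


def max_gb_in_window(rate_gb_per_hour, window_hours=24):
    """Most data that fits through the link before the window closes."""
    return rate_gb_per_hour * window_hours


def jobs_needed(size_gb, rate_gb_per_hour, window_hours=24):
    """Minimum number of back-to-back retrieval jobs (assumes splittable data)."""
    return math.ceil(size_gb / max_gb_in_window(rate_gb_per_hour, window_hours))


needed_hours = hours_to_download(1024, 36)
window_limit_gb = max_gb_in_window(36)
job_count = jobs_needed(1024, 36)

print(f"Hours needed for 1TB: {needed_hours:.1f}")
print(f"Most you can fetch in 24 hours: {window_limit_gb}GB")
print(f"Jobs needed: {job_count}")
```

          At 36GB per hour a full terabyte needs a little over 28 hours, so it cannot fit in a single 24 hour window and has to be split across at least two jobs.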

          1. Ken 16 Silver badge

            Re: It's cheap as chips

            Either way, you're not going to get it back within 2 hours as you would with a tape drive (uncompressed).

  2. GrumpyJoe

    I tried S3...

    it was glacially slow - was it just me? I've heard of others with the same kind of problem - and I was using the EU node for my instance.

    What kind of transfer rates are we talking here? If they've fixed that it may be just what I was looking for for my Synology NAS cloud backup.

    1. Code Monkey

      Re: I tried S3...

      I guess they've managed expectations well with the name Glacier. Think how angry you'd have been had it been called "TurboNutterFastBackup"

  3. 0laf

    Outlook Cloudy

    Just remember, when your CEO/CIO/CFO gets in your face waving the savings that Amazon has promised them, to tell them that the big fat pipe you'll need to use this doesn't come free, and neither does the redundant one you might want to back it up.

    Then you might want to look at the Article 29 working group report into the Cloud.

    Then you might want to order some more disks for your SAN.

  4. Yet Another Anonymous coward Silver badge

    tasked with regulatory compliance

    And their SLAs guarantee that the data on this life insurance policy or land deed will be available in 99 years' time?

    That all my data won't disappear if the US suspects that somebody on Amazon is hosting a pirate movie?

    And there is no price rise when I suddenly want to move all my data off their platform to a competitor?

  5. James 100

    Very slow retrieval

    At first, I thought this was a slower, low-cost variant of S3: same concept, but bigger, cheaper SATA disks and more use of RAID than straight duplication. The multi-hour retrieval times quoted would be consistent with tape, but they denied in interviews that it's tape based - some kind of disk library, perhaps, where the disk is stored powered down in a vault somewhere, then spun up when you request your data back? That could explain a few hours - spin up and mount a RAID set, then copy the data off to a staging S3 bucket for you to read from. Throw in some smart placement (keep all your stuff together, destaging it from S3 in big batches) and they should avoid the worst case scenarios (lots of little requests for different archived objects, spread out in time).

    A dozen 4TB or two dozen 2TB drives in a pod, with double or triple parity protection, would fit with their 40TB maximum object size plus a bit of overhead - and they've set up infrastructure for hooking up big external drives to S3 already for the Import/Export stuff.

    I like the price compared to S3 - but it's $120/yr for a terabyte. Probably about what you'd expect to pay to rent a pair of 1TB SATA drives for the year, sitting in quiet corners of two different Amazon sheds, plus a small share of a couple of shared drives for parity protection?

    1. Androgynous Cupboard Silver badge

      Re: Very slow retrieval

      Wouldn't be surprised if the disks are online already, and the multi-hour retrieval is artificially added to stop people dumping the much-more-expensive S3...

      1. Yet Another Anonymous coward Silver badge

        Re: Very slow retrieval

        You might spin them down to reduce cooling costs. But you would need customers with a LOT of data, otherwise you would be constantly spinning up a 3TB disk because one of the 3000 customers with 1GB on it wanted a file.
