Facebook warehousing 180 PETABYTES of data a year

Facebook’s data warehouses grow by “Over half a petabyte … every 24 hours”, according to an explanatory note The Social Network’s Engineering team has issued to explain a new release of open source code. The note says the warehouse performs "ad-hoc queries, data pipelines, and custom MapReduce jobs process this raw data around …


  1. Anonymous Coward

    Didn't realise there was any privacy left to steal..

    Didn't they nick most people's personal data already? How come there's any left?

  3. Khaptain Silver badge

    We know that they have lots of data but

    Who are the people buying this shit?

    On Tuesday morning at 08:00, George, a 20 year old student from London, stated that he liked eating lemons whilst watching morning television

    On Tuesday morning at 08:02, Michael, one of George's friends, a 23 year old unemployed bricklayer from Berkleyshire, clicked on a button named "Like" in response to George's amazing revelation.

    What kind of people buy this kind of information? George and Michael are probably representative of "Nothing".

    Do FB just try and hope that their data mining techniques will eventually pull up statistics that might be useful to a possible startup that is attempting to bring another useless product into the market? Are the major companies also buying into this shit?

    Honestly, who is paying good money to analyse what Facebook users are doing?

    It is so f/****cking sad that FB even exists, never mind people buying data from Zuck and Co.

    It makes my skin crawl when I look at what society is becoming. The governments don't even need to try and herd the sheep into a corner; they are doing it all on their own.

    Why is there no Angry icon?

    1. Ian Michael Gumby

      Re: We know that they have lots of data but

      It's a bit more than just that.

      You now see newspapers and other sites using FB to authenticate their users. This would imply that there are tags embedded in the articles that track the user. And then there are other sites. Based on the scale, if you use FB, they can essentially see everything or almost everything you do online.

      Now do you start to get a picture of the scope of things?

    2. Synonymous Howard

      Re: We know that they have lots of data but

      It brings a deeper meaning to "garbage in, garbage out".

  4. Anonymous Coward

    New world record

    The biggest pile of worthless shite ever accumulated under the name "data".

  5. Wanda Lust


    BIG DATA baybeeee!

  6. Ian Michael Gumby

    Meh! YARN, Mesos, Spark...

    Just another cluster fsck manager to extend Hadoop.

    Trying to stay relevant in the Cloudera / Hortonworks smackdown coming to a cable station near you.

    Mine's the coat with all of the bookie slips in the pocket....

  7. Dexter

    How is Corona different from YARN? Or better?

