Google open sources MapReduce compression

Google has open sourced the compression library used across its backend infrastructure, including MapReduce, its distributed number-crunching platform, and BigTable, its distributed database. Available at Google Code under an Apache 2.0 license, the library is called Snappy, but Google says this is the same library that was …


This topic is closed for new posts.
  1. JeffyPooh


    Fr sm rsn, I fnd tht rtcl vry ntrstng.

  2. Anonymous Coward

    Does this mean?..

    ..that there is something snappier, bigger, and more reductive under development in the Chocolate Factory? Companies as big and subtle as Google don't just open source key parts of their infrastructure like that.

    An act of 'kindness' from Google means that there is some deeply evil malevolence afoot somewhere, probably involving crappy ads and privacy losses.

    ..mine's the one with the click thru banners in the pockets.

  3. Ryan Barrett


    ...for the engineers and scientists who've designed and implemented these algorithms.

    Totally understandable for a company like Google: keep the talent happy by letting them give away their old toys when they've got something shiny, new and much better to play with.

  4. Paul Shirley

    old as the hills

    So basically Google have rediscovered what game developers were doing 2 decades ago? And network engineers probably 2 decades before that ;)

    Back in the early '90s, faced with unpacking graphics from ROM to graphics RAM in real time, I used a variety of hand-tuned, entropy-coding-free codecs and achieved similar compression rates. It ain't rocket science, especially if it has to run on an original Game Boy or Sega's 8-bit consoles.

    That's why they're giving it away, where Microsoft would BS the US PTO into issuing patents.

    Still, having a standard scheme across vendors might be useful for net traffic... a bit like the compression modes modems used to use!

    1. Charles 9

      It may be old hat...

      ...but try cranking the on-the-fly throughput to what's being advertised and you get a feel for what Google's been doing: trying to send as little data down the pipe as possible while at the same time not bottlenecking the whole works.

    2. Andy 73 Silver badge

      I would imagine..

      ..that the difference is that Google have analysed the last digit of performance out of their algorithms, whereas you just got yours to work.

      There really aren't that many computational processes that haven't been done before, and before and before. The difference usually comes in optimising the latest implementation to suit the environment in which it runs.

      1. Paul Shirley

        @Andy 73: you're assuming I (and the other programmers whose implementations I know of) didn't tune our compression codecs for speed and compression. Mine got continuously tweaked, measured and improved for >4 years. I didn't throw in 'similar compression rates' for no reason.

        What is different is that Google need to tune the compression end, not just decompression. I also did that, but just because it was fun!

        Faced with the same problem, Google chose the same old solution. Well-known pattern recognition algorithms are blindingly fast; entropy coding never is without hardware assistance.

        1. Anonymous Coward

          Is that *the* Paul Shirley?

          The genius behind Spindizzy, one of my favourite games of all time?

          If so ... very many belated thanks!

          If not ... move on, nothing to see here.

    3. c3

      And they probably use FOR and IF in their code

      Just like I did in a HelloWorld application years ago. This must clearly mean I invented their algorithm and they have no merit whatsoever.

  5. JaitcH

    Another public donation from the allegedly 'evil' Google.

    Apart from the question of why, it is good to see that some of the spirit of the early Internet days, when people helped each other out, is still active on SourceForge, etc., especially coming from a large entity like Google.

    Likely it will help their competition, too, which is a sign of real philanthropy.

  6. Anonymous Coward

    Sucks faster

    I look at those compression ratios and think it's not worth the added software complexity. Throw in a few worst-case data patterns here and there and the compression advantage could vanish.
