Dotcom's Mega smacks back: Our crypto's not crap

Kim Dotcom's comeback cloud storage service, Mega, has responded to criticism about its approach to cryptography and password security after security researcher Steve Thomas (@Sc00bz) released his MegaCracker tool, which cracks hashes embedded in emailed password confirmation links. In a blog post designed to reassure users, …


This topic is closed for new posts.


  1. NoneSuch Silver badge

    There is a major difference between Dropbox and Mega.

    I am a Dropbox customer.

    1. Anonymous Coward
    2. Andy Gates

      So you trust Dropbox not to read your stuff because... they're just not an obnoxious wide-boy?

      1. Chris Harden
      2. Anonymous Coward

        No, I trust them because I KNOW they can't see it after having it encrypted myself (transparently and automatically) before uploading it to them, by a software that has nothing to do with them. Try looking up "encfs" some day...

    3. Psyx

      Great spin

      "Mega was already catching up with Dropbox in daily usage."

      That's.... so stupid it's not even worth mentioning. That's like saying increasing a number from 0 to 1 makes it closer to infinite.

      If I open a new shop and it sells a single spork in the first week of trade, then I'm 'catching up' with Amazon.

      1. Brewster's Angle Grinder Silver badge

        Re: Great spin @Psyx

        "That's like saying increasing a number from 0 to 1 makes it closer to infinite."

        I hate to point this out, but 1 is exactly the same distance from "infinity" as 0 - for a given choice of "infinity".

        1. Jolyon Smith

          Re: Great spin @Psyx

          "I hate to point this out..."

          You didn't NEED to point it out. That was the whole point.

          1. Brewster's Angle Grinder Silver badge

            Re: Great spin @Jolyon

            Well, if you're right about Psyx's point then the second example conflicts with the first.

            Sales of 1 are effectively zero when compared to Amazon. But they're still a finite distance away, and if I keep increasing my sales by one I will eventually get on par. Adding 1 to 0 makes me absolutely no closer to infinity.

            So Psyx's first example is about absolutes. The second about practicalities. They're very, very different things and make very different points. So either sloppiness or hyperbole.

    4. Levente Szileszky

      My hovercraft... full of eels.(*)

      (*: yes, I can actually speak Hungarian.)

  2. Anonymous Coward

    Love to hate

    It seems to me that this is an excuse for security bods to try and make themselves look good by throwing around accusations.

    Personally, if Mega are telling the truth, their security looks better than most.

    1. FartingHippo
      Thumb Up

      Re: Love to hate

      Agreed. It might not have the same levels of security as a Swiss bank, but then Mega is not intended for the storage of billions and billions and billions of Francs/Pounds/Euros/Dollars.

      As long as the resources required to break in are several orders of magnitude more than the value of the encrypted data, it's good enough. And remember, even if there is some juicy stuff on Mega, it's still swamped by crap by a very large ratio (needle in a haystack, etc).

    2. Anonymous Coward

      Re: Love to hate

      Exactly what I thought. I also don't think anyone peddling "Convergence SSL" (a pretty piss poor, half-baked attempt at crudely patching up a sinking rust-bucket) is in any position to be slinging mud around.

  3. Fuzz


    I don't get it, how do you dedupe encrypted files when they're encrypted with different keys?

    1. Matt 21

      Re: dedupe

      Calculate a hash key on the client before encrypting?

      1. FartingHippo

        Re: dedupe

        Or chunkify(TM) the data such that, given enough input data, several chunks will match but will decrypt into different data depending on the key. Reassemble files on the fly from a chunk index or chindex(TM).

      2. jubtastic1

        Re: dedupe

        Well, I got burned on the last post for suggesting they simply chunked the encrypted files and looked for matches, but it seems to me that the other suggestion, that they hash the files before they're encrypted, isn't going to work either: Mega are saying they're not keeping any keys, and user B's key isn't going to decrypt user A's file even if the files were the same before uploading.

        Here's my optimistic thinking: The majority of the files that get uploaded to a service like this are already tightly compressed, and that the encrypted versions of these files are going to be bigger and a sliver more compressible, and that given the scale they expect to be working with, any dedupe, no matter how minor is going to reduce costs.

        Here's my realistic thinking: The announcement that they will be able to change your password and re-encrypt your files suggests bullshitting. If all files can be decrypted with a user's private key and Mega's master key, then everything they've said they're doing is possible except the lie about not having access to users' files.

        1. 01508

          Re: dedupe

          I believe dedupe is for encrypted files copied between users on the server. It would also allow easy takedown of all copies of a file identified as infringing.

      3. Kevin McMurtrie Silver badge

        Re: dedupe

        Hash before encryption is it. Nobody will know what is in your original and personally created data but the hash matches will allow for reverse lookup of known files. Very small files could be brute-force decoded. It's not great privacy.

        Big hashes do create false positives sometimes, so there can be data loss. Sure, it's a chance of 1 in a nearly infinitely big number, but the amount of data in the world is nearly infinite too. Math says that a smaller number of bits can't represent all the patterns of a larger number of bits.
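
As a rough illustration of the hash-before-encryption idea discussed in this thread, here is a minimal Python sketch. The `DedupeStore` class, its fields and the user names are hypothetical, and nothing here is taken from Mega's actual code. It only assumes that the client sends a SHA-256 of the plaintext alongside the ciphertext.

```python
import hashlib


class DedupeStore:
    """Toy server-side store keyed by the hash of the plaintext (hypothetical design)."""

    def __init__(self):
        self.blobs = {}   # plaintext hash -> the one stored ciphertext
        self.owners = {}  # plaintext hash -> set of users who uploaded it

    def upload(self, user, plaintext_hash, ciphertext):
        """Store the ciphertext only if this plaintext hash is new; return True if stored."""
        is_new = plaintext_hash not in self.blobs
        if is_new:
            self.blobs[plaintext_hash] = ciphertext
        self.owners.setdefault(plaintext_hash, set()).add(user)
        return is_new


def client_fingerprint(data: bytes) -> str:
    """Computed on the client before encryption; the server never sees `data`."""
    return hashlib.sha256(data).hexdigest()


store = DedupeStore()
film = b"popular film bytes" * 1000

# The ciphertexts are placeholders: each user would encrypt with their own key.
first = store.upload("alice", client_fingerprint(film), b"<alice ciphertext>")
second = store.upload("bob", client_fingerprint(film), b"<bob ciphertext>")

# Anyone who already has the file can compute its hash and look up who holds it.
known_file_owners = store.owners[client_fingerprint(film)]
```

After both uploads only one blob is stored, and anyone holding the index can list every account that uploaded a file whose hash they already know, which is the reverse lookup described above. In this sketch Bob's ciphertext is discarded, so Bob would be left with a copy encrypted under Alice's key.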

      4. Anonymous Coward

        Re: dedupe

        In which case Mega know what you've uploaded, thus invalidating their assertion that it's impossible for people to find out what files you're sharing.

    2. Steve the Cynic

      Re: dedupe

      You dedupe the encrypted data, so if you and I both upload a 1MB file, and the encrypted versions are exactly the same (significantly less likely than the cleartext versions being identical, I know), they get deduped. And you can raise the chances of successful deduping by doing it on chunks rather than whole files - now our files are different when encrypted, but the third page of (say) 4KB of mine matches the 100th page of yours, so those pages get deduped.

      1. frank ly

        @Steve Re: dedupe

        With 4KByte chunks of what is essentially random (encrypted) data, the probability of finding an identical chunk is even less than 1/2^(8*4000) in theory. This is vanishingly (and ridiculously) small. That can't be the technique that Mega uses.

        (I realise that in practice, no chunk will be all zeros or all ones, etc; but it will still be a tiny probability.)
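
The figure above can be put into numbers with a short Python sketch. It assumes encrypted chunks behave like independent, uniformly random bit strings and uses the simple birthday (union) bound. The chunk count of 2^60 is an arbitrary, deliberately generous assumption and not a figure from Mega.

```python
import math


def log2_collision_bound(num_chunks: int, chunk_bits: int) -> float:
    """log2 of the upper bound (number of pairs) / 2^chunk_bits on any two chunks matching."""
    pairs = num_chunks * (num_chunks - 1) // 2
    return math.log2(pairs) - chunk_bits


chunk_bits = 8 * 4000   # the 4KByte chunk size used in the comment above
num_chunks = 2 ** 60    # assumed: far more chunks than any real service would hold

bound = log2_collision_bound(num_chunks, chunk_bits)
print(f"P(any two chunks identical) <= 2^{bound:.0f}")
```

With these assumed figures the bound comes out at about 2^-31881, so matches between independently encrypted 4KB chunks are not a realistic source of savings.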

        1. Annihilator

          Re: @Steve dedupe

          A chunk has just as much chance of being all zeroes or ones as it has being any other combination :-)

          However, I'm not convinced dedupe works on such random data (which in effect it will be). The key for reconstituting the deduped data would end up being the same size as the original data to begin with.

          I've recently deduped 3TB of storage down to 2 bits, just a 1 and a 0. Reindexing it all is going to be a bit of a chore though...

        2. Anonymous Coward

          Re: @Steve dedupe

          I think that's what the flatulent herbivore was getting at with "Or chunkify(TM) the data such that..." i.e. make the blocksize sufficiently small (a byte or word perhaps!?) and there'll be a good spread of matching blocks in even rather small datasets. As long as the chaining method isn't TOO clever you should be able to use simple heuristics to match the patterns even though the actual data forming those patterns wouldn't match. Bit of a crypto disaster under normal conditions but perhaps advantageous here? Trivial to defeat though... compression or, better, pre-encryption with password as filename leap to mind.

          Personally I think it's all a Dotcom get-out-of-DMCA bluff though. In that I doubt he's done anything like that at all... I expect the statement is just there for the lawyers and means something like "if you can demonstrate a match, I'll happily delete offending files (but you never will, 'cos your keys will differ :P)"

      2. Eddie Edwards

        Re: dedupe

        "the third page of (say) 4KB of mine matches the 100th page of yours"

        This doesn't work. The chances of getting any matches are astronomically small even for 16B chunks. For 4KiB chunks it's as close to zero as you will ever get in any practical measurement situation.

        My money is on the impossibility of doing what he claims they're doing. If the encryption method obeys certain constraints it is possible, but those constraints *seem* to imply a trivial plaintext attack revealing the key. Strong crypto algorithms don't have trivial plaintext attacks revealing the key. I look forward to a real cryptanalysis of the claim, but most cryptographers appear to have the same gut feeling as I do. *If* he has cracked this problem, someone in his organization is a genius, and that seems less likely than that he's simply lying his ass off.

    3. bonkers

      Re: dedupe

      The way to de-dupe as suggested above is to run a hash of the original, say SHA-256, and store it "in the clear". This does allow an identical file to be matched to yours, and copyright owners could make a rainbow table of all their stuff and detect copies, but it only takes one bit to be different, say in the metadata, and the hash will be greatly different. This same property will also cause any attempt at de-dupe to fail, since it needs bit-identical duplicate files.

      Perhaps a method a little like Shazam's - taking a "fingerprint" of the file before encryption, so things that sound similar measure similarly. A corresponding process for video might look at the overall structure of the compressed video, its entropy versus time or something - again allowing similar fingerprints to be matched.

      Of course, these approaches allow the rights-holders to trawl through the hashes (that would be extracted under court order) and identify stuff that looks a bit like theirs. Proving it is another matter - for that you need to be given a key, so for it to work as a file-sharer service then Mega must never own the keys, you have to ask the folder owner each time. So what if big copyright set up a load of shill accounts?
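
The claim that a single differing bit gives a completely different hash can be checked with a few lines of Python. This is only a sketch: the sample data is made up, and SHA-256 is assumed as the hash.

```python
import hashlib


def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


original = bytearray(b"track01 " * 4096)  # made-up stand-in for a media file
tweaked = bytearray(original)
tweaked[0] ^= 0x01                        # flip a single bit, e.g. in the metadata

digest_original = hashlib.sha256(original).digest()
digest_tweaked = hashlib.sha256(tweaked).digest()

differing_bits = hamming_distance(digest_original, digest_tweaked)
print(f"{differing_bits} of 256 hash bits differ")
```

For a good hash function roughly half of the output bits are expected to change, so the two files look entirely unrelated to anything that matches on exact hashes. That is why a fingerprint that tolerates small differences would have to be a different kind of function altogether.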

      1. Androgynous Cupboard Silver badge

        Re: dedupe hash-before-upload

        You can't hash before encryption for dedupe. That will allow you to identify identical blocks, yes. But if your block is deleted as a dupe, you can't decrypt the other copy as it's encrypted with someone else's key.

        Either there's no deduplication and it's in the license agreement just-in-case, or there's a per-block master key which is accessible to multiple users. And if that's the case, your data is no more secure than on dropbox.

        Either way Mr Schmitz, convicted fraudster, is talking shit. Which shouldn't come as much of a surprise.

    4. elsonroa

      Re: dedupe

      @Fuzz The trick here is that the files are _not_ encrypted with different keys. Each file has a per-file symmetric key which is generated when the file is first uploaded. When the uploader wants to share the file, they share this key using PKI to protect it. Since the PKI transaction is all done client side, Mega have no way of intercepting the per-file key and decrypting the files - but do end up with two files on their system which have the same contents and the same key which can therefore be deduped.

      As for the no password recovery - the whole point about this system is that Mega _never see_ the password to a user's master key because it is all generated client side. The fact that they can't do password recovery is actually a good sign here (modulo the entropy issues).

      Whatever you might think of Kim Dotcom, I can't help thinking that he's got some smarter people working for him than many of the self-appointed security experts who seem incapable of understanding these basic points...

    5. Gordan

      Re: dedupe

      The only way there could be meaningful deduplication is to use a scheme broadly similar to Freenet and Entropy. You split the file into blocks of equal size, compute the hash of the block, and encrypt the contents of the block with that hash. You end up with a bunch of encrypted blocks, and an equal number of hashes of plaintext that can be used to decrypt those blocks. You take those hashes and you encrypt them with the user-provided symmetric key.

      So each "file" consists of a number of encrypted blocks and a key chain to decrypt the blocks and glue them back into the original file, and the key chain is encrypted with the password that only the user knows.

      The problem, of course, is that it is not beyond a powerful attacker to enumerate all of the files they believe they own copyright to, chunk the files in exactly the same way, compute the hashes, and encrypt each block with the hash. There is a possibility that they could then persuade a judge somewhere to produce a court order that demands that the following specific blocks of cyphertext and all files referencing those blocks be deleted and the owners of the accounts containing the files identified. In other words, a sufficiently well resourced entity could relatively easily identify the files and still issue takedown notices, it would just take more computing resources to do so compared to simply searching the metadata for file names.

      Of course, if it were properly encrypted, this couldn't be done - but the data would also be completely undedupeable and uncompressible. If Mega really does use deduplication, I rather expect they might regret it.

      Of course, the reality is somewhere in between. Mega are unable to decrypt the content, and they definitely don't own the rights to the content, which means that they would have to engage in piracy in order to police the content - so in theory, they might be off the hook for not policing said content. OTOH, if the well resourced rights owners check the contents and hashes of most of the versions of their content that is pirated, they can provide enough identifying information on the file blocks to issue takedown notices. It shifts the policing burden toward the copyright owners, which is probably all the goal was in the first place. From there on the copyright owners can go after the users as they could traditionally - business as usual.

      Thinking about it, Mega would have probably done better if they just kept quiet about the deduplication features.
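
The block scheme described in this comment (often called convergent encryption) can be sketched in Python as follows. Everything here is an assumption made for illustration: the 4096-byte block size, the function names, and above all the toy XOR keystream built from SHA-256, which stands in for a real cipher and must not be used for actual security. It is not a description of what Mega does.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with SHA-256(key || offset). Illustration only, not secure."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(x ^ y for x, y in zip(data[offset:offset + 32], pad))
    return bytes(out)


def convergent_encrypt(data: bytes):
    """Encrypt each block under the hash of its own plaintext; return (blocks, keys)."""
    blocks, keys = [], []
    for i in range(0, len(data), BLOCK_SIZE):
        plain = data[i:i + BLOCK_SIZE]
        key = hashlib.sha256(plain).digest()
        blocks.append(keystream_xor(key, plain))
        keys.append(key)
    return blocks, keys


def convergent_decrypt(blocks, keys) -> bytes:
    """XOR is its own inverse, so decryption reuses keystream_xor."""
    return b"".join(keystream_xor(k, b) for k, b in zip(keys, blocks))


def wrap_key_chain(keys, user_secret: bytes) -> bytes:
    """Encrypt the list of block keys under a key derived from the user's secret."""
    return keystream_xor(hashlib.sha256(user_secret).digest(), b"".join(keys))


film = bytes(range(256)) * 64  # 16384 bytes of sample data, i.e. four blocks

alice_blocks, alice_keys = convergent_encrypt(film)
bob_blocks, bob_keys = convergent_encrypt(film)
alice_chain = wrap_key_chain(alice_keys, b"alice password")
bob_chain = wrap_key_chain(bob_keys, b"bob password")
```

Alice and Bob end up with byte-identical encrypted blocks, so the server can keep one copy, while their key chains differ and stay private to each of them. The same property is the weakness pointed out above: anyone who holds the plaintext can run the same steps and learn exactly which stored blocks to ask about.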

      1. Justin S.

        Re: dedupe

        @Gordan That's one way. The other possibility-- perhaps mentioned in someone else's comment; quite a lot of chaff has been posted with the wheat-- is that deduplication is enabled but effectively applied on a per-user basis.

        That is, if we accept that user data is being encrypted with the user's master key, and that only that single instance of the encrypted data is being stored by Mega (e.g. a second copy, encrypted with a Mega-owned key, is not also being stored), then the only *likely* instances of duplication the system will see will come from the user him/herself, either in the form of entire duplicate files or identical data chunks within those files (assuming the data chunks are encrypted independently of each other).

        Data savings might be large enough to justify this, if we consider that there is a possibility for users to maintain multiple copies of the same music file (for example), either as identical tracks from different albums or as part of playlists. Yes, I know it is much more efficient to maintain playlists as text files pointing to member tracks, but it's often more convenient to copy the playlist tracks to their own directory. Of course, metadata for the tracks will probably be different-- different album names, publish dates, etc-- so deduplication is only likely if independent encryption of data chunks is performed.

    6. Anonymous Coward

      Re: dedupe

      It shouldn't be possible to de-dupe properly encrypted data. It's equivalent to compressing a load of random bits, which can't be done. De-dupe will only work if there's a security hole and the bits aren't random.

  4. The FunkeyGibbon
    Thumb Down

    Yawn. Bored with 'Mega' now.

    Not only is it a shit name for a service but frankly if I wanted cloud I'd go with somebody who doesn't have an ego the size of North America. When the guy running the business is bigger news than the products that's a worry.

    1. Dave 126 Silver badge

      Re: Yawn. Bored with 'Mega' now.

      >it a shit name

      You must have been a SNES owner : D

      1. Giles Jones Gold badge

        Re: Yawn. Bored with 'Mega' now.

        Or he watched Megaforce as a kid (the inspiration for Team America).

  5. Jason Bloomberg Silver badge


    Such deduplication ought to be impossible if Mega truly didn't know the contents of uploaded content, according to critics.

    If A+B => X and C+D => X there seems no reason they cannot say X is the same and deduplicate without knowing anything about A, B, C or D.

    "Knowing that two files are the same, even without knowing the content, nevertheless leaks information about the data".

    Does it leak any useful or usable information though? I suspect not. If it does then surely the fact I have an encrypted file already means I can theoretically know every other file that could encrypt to the same end result.

    1. Amonynous

      Re: Deduplication

      Given that the main use case that got the predecessor service shut down was sharing big content's precious assets, it is reasonable to assume that is the main use case of the all new service. If it isn't, why so much effort aimed at saying to the law "we don't know what's in the files"?

      So you can guarantee a high level of de-dupe efficiency because everybody is uploading the same stuff and knowing what it is, or you can hope for some lesser degree of de-dupe based on a chunk/block level process and remaining ignorant. A smart person would go for the latter, but the thrust of the article is how incredibly naive/dumb/reckless these guys are (no password recovery process, really?). It wouldn't be much of a stretch to think they may be doing something far stupider that does allow one to de-dupe based on the unencrypted content in the interests of saving costs to pay the bail money.

      Even if it is de-dupe after encryption, any decent forensic investigator would be able to join the dots by looking at patterns of usage of shared folders/keys stitched together with IP address logs to track the *really* popular stuff being uploaded, downloaded and re-uploaded again. I suspect it would not take too long to provide sufficient evidence for the big content lawyers to have a once-more-round the block with this guy.

      Really, the whole thing is the cloudified equivalent of a two-year old covering their eyes and thinking they are invisible because they can't see you. It'd be funny if it wasn't so tragic. No wait, it is just funny.

    2. Valeyard

      Re: Deduplication

      Does it leak any useful or usable information though? I suspect not. If it does then surely the fact I have an encrypted file already means I can theoretically know every other file that could encrypt to the same end result.

      If I made a film and I want to see everyone who has it, surely I just upload a popular torrent version of my own film and let the de-duping software flag up everyone else who uploaded the same file, to give the feds a basis to start on.

      or something like that anyway..

  6. Pat 11

    Could it genuinely not be for file sharing?

    Lots of people want to assume the worst of them, but how would a pirate share files through Mega? They'd have to give away the password to the storage account.

    1. DJ Smiley

      Re: Could it genuinely not be for file sharing?

      You have sharing keys - keys which decrypt just whatever you've shared with whomever you give the key to.

      However it doesn't seem to specify anywhere if you can give the same key out multiple times (print it on a website) or if it's on a per user basis (so someone writes a script to do it for you).

    2. Pie

      Re: Could it genuinely not be for file sharing?

      Once a file is uploaded, clicking on it gives you the option of getting a link to that file. Visiting the URL offers you the chance of downloading the file or importing it (I presume to your own Mega account).

  7. Jeff 11

    You can certainly dedupe encrypted data if it's a copy of the same file uploaded into the same account, but the recurrence of an encrypted block of data of any appreciable size is infinitesimally likely. So either Mega's using encryption that's somehow dedupe-friendly (i.e. insecure), their dedupe feature is just crap, or they know more about your data than they should.

    It's little wonder people are deriding Mega's marketing as disingenuous, at best.

    1. Ole Juul


      I think that was indeed a marketing mistake to mention that. It's probably not even important in the overall scheme.

      1. Anonymous Coward

        Re: dedupe

        "It's probably not even important in the overall scheme."

        Their business model relies on a third party uploading files that neither they, nor Mega, have the rights to and then selling Mega users access to those infringing files by the MB. Their previous business got raided. If they want to attract pirates to their new business they need to make them feel secure in the knowledge that they won't be caught should the new business get raided as well. They also need to convince the feds that this time they really don't know if a file infringes somebody's copyright. Hence the 'ZOMG, we have encryption!' spiel.

    2. David Neil

      They won't de-dupe the whole file

      Split each file into 4k blocks say - even after encryption you're going to hit some duplicates

      1. Chris Harden

        Re: They won't de-dupe the whole file

        Assuming they actually de-dupe the data right now of course, it might just be in there to give them the opportunity to dedupe in the future (however they decided to do it) without getting everyone to re-agree to the T&Cs.

        If I were going to be doing a file hosting service of that size, I'd certainly want the opportunity to save space at some point in the future.

      2. Tom Wood

        Re: They won't de-dupe the whole file

        even after encryption you're going to hit some duplicates

        Not any time soon you're not.

        4kB = 4*1024*8 = 32768 bits. Not 32768 possible values, 32768 bits. So basically you're flipping a coin 32768 times, repeatedly, and hoping you get the same pattern of heads or tails on multiple attempts.

    3. Jason Bloomberg Silver badge

      Marketing hype?

      the recurrence of an encrypted block of data of any appreciable size is infinitesimally likely

      I was thinking that, but if you've got enough data in small enough blocks the odds get better. I guess someone better at maths than me can work out those odds. They might be able to dynamically apply an additional level of encoding to make a file/chunk more likely the same, carry that around as metadata, which could improve the chance of a match.

      Dedupe or not; it doesn't make much difference to me as I really don't care how much disk space Mega are using or saving. Maybe they've got it and maybe it doesn't work very well in saving disk space. Not my problem.

  8. P_0

    The second line of concern arises from Mega's terms of service. These explain that the service "may automatically delete a piece of data you upload or give someone else access to where it determines that that data is an exact duplicate of original data already on our service". Such deduplication ought to be impossible if Mega truly didn't know the contents of uploaded content, according to critics.

    This doesn't seem right to me. AFAICS the concern would only be legitimate if Mega is talking about different users uploading the same file. But the same user, using the same encryption key, would generate the same message digest on encryption, meaning Mega could compare message digests of files from the same user and delete one if it is a duplicate. A sensible rule, possibly.

    1. Psyx

      What I think it might mean in real terms is that if copyright holders can be bothered, people can easily get caught out. Simply upload an unshared but commonly-pirated version of the file and wait to see how many people get shared user rights because they also uploaded it.

  9. amanfromMars 1 Silver badge

    The world is changing ...... get used to it. It is only natural

    It is most odd that there is all this fuss about possible dodgy content being securely uploaded and stored in Mega vaults and yet there is no hassle at all and no media and security attention paid to the physical equivalent which has possible dodgy goods and ill-gotten riches stored in secretive safety deposit facilities which banks offer to customers with no questions asked.

