Reply to post:

This storage startup dedupes what to do what? How?

Vic

it is reasonable to assume that when two blocks have the same checksum, they are in fact the same block.

That is not reasonable.

An m-bit hash may take 2^m different forms. An n-bit block may take 2^n different forms. So if n > m, there are 2^(n-m) different blocks that would generate the same hash. Nothe that n<=m is an entirely useless operation, so we have this risk.

This is the Counting Argument. It's used to disprove "magic" lossless compression algorithms that will store anything in nothing...

Vic.

POST COMMENT House rules

Not a member of The Register? Create a new account here.

  • Enter your comment

  • Add an icon

Anonymous cowards cannot choose their icon

SUBSCRIBE TO OUR WEEKLY TECH NEWSLETTER

Biting the hand that feeds IT © 1998–2020