So basically it becomes a Catch-22. Information is useless if it cannot be accessed, and if it can be accessed, it can be accessed improperly, as there's no way to distinguish the real user from a well-disguised impostor: the ultimate "analog hole". And therein lies the hardest problem of information security: namely, that raw information can't distinguish between users, much as a piece of paper can equally be picked up by any human with applicable digits.
Developers of encrypted databases and security researchers are at loggerheads – and it's over a study that claims property-preserving encrypted databases may be vulnerable to attack. The researchers – Muhammad Naveed of the University of Illinois at Urbana-Champaign, Charles Wright of Portland State University, and Seny Kamara …
Tuesday 15th September 2015 17:28 GMT Michael Wojcik
That's ... not really the point of any of this work.
We have protocols for distinguishing between authorized and unauthorized users, but that's really not relevant to EDBs.
The whole point of PPE schemes and other techniques used by EDBs is that they allow some operations on the data while hiding some of the information in it. And, in fact, they demonstrate why vague, high-level mutterings about what "raw information can't distinguish" aren't very useful.
Tuesday 15th September 2015 03:39 GMT a_yank_lurker
Tuesday 15th September 2015 17:36 GMT Michael Wojcik
Apparently the commentariat couldn't be bothered to read the article, much less the original paper. Well, that's hardly surprising.
The root problem that Naveed et al demonstrate is a real one. EDBs depend on PPE (DTE, etc, is really just a PPE) schemes to encrypt in a way that leaves one or more attributes - such as the relative order of records - unencrypted. By definition and design, PPEs leak some information.
Naveed & co are saying, look, in the way you'd use these EDBs for a real use case (medical records), you're going to end up leaking enough information that an attacker can recover some of the original data with unacceptably high probability (primarily by correlating it with data from other sources).
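To make the order leak concrete, here is a rough Python sketch of a sorting-style attack on an order-preserving column. It is an illustration only, not code from the paper or from CryptDB: `make_ope` is a hypothetical toy stand-in for an order-preserving scheme (a secret increasing map), and the attack assumes the column is dense, meaning every value in a known domain (ages 0 to 99 here) appears at least once.

```python
import random

def make_ope(domain, seed):
    """Toy order-preserving 'encryption' (hypothetical): a secret, strictly
    increasing map from plaintext values to random integer codes."""
    rng = random.Random(seed)
    codes = sorted(rng.sample(range(10**6), len(domain)))
    return dict(zip(sorted(domain), codes))

def sorting_attack(ciphertexts, domain):
    """Recover plaintexts from order alone, assuming a dense column."""
    unique_sorted = sorted(set(ciphertexts))
    if len(unique_sorted) != len(domain):
        raise ValueError("column is not dense; this simple attack does not apply")
    # The i-th smallest ciphertext must correspond to the i-th smallest plaintext.
    return dict(zip(unique_sorted, sorted(domain)))

ages = list(range(100))            # publicly known domain (assumption)
ope = make_ope(ages, seed=2015)    # the secret map; the attacker never sees it

column_plain = ages * 3            # every age appears, so the column is dense
random.Random(7).shuffle(column_plain)
encrypted = [ope[a] for a in column_plain]

mapping = sorting_attack(encrypted, ages)
recovered = [mapping[c] for c in encrypted]
print(recovered == column_plain)
```

The attacker uses nothing but the ciphertexts and knowledge of the domain. Real columns are often not dense, which is where correlating with outside data comes in.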
Popa says "you're holding it wrong", and she has a point too; the CryptDB design makes some accommodation for that problem. Now the two sides are arguing over whether you can lock an EDB down enough to meet the requirements of your threat model, while still getting some benefit from the EDB features.
Of course, that is going to vary by use case and threat model. And it's not really a surprise. What's good about this new work is that Naveed et al have shown examples of practical attacks and their results, so people can weigh those against their threat model and decide how much tightening they need to do, and whether there's any point to continuing to use an EDB once they've done that.
Sunday 20th September 2015 13:05 GMT Anonymous Coward
Re: Reading comprehension
Of course I can't be bothered to read it. This reeks of academics coming up with clever non-solutions. If they'd ever been out in the real world they'd have known the answer is no, locking the db down sufficiently is not feasible, and they'd have found another pet project for their dissertation or whatever.
Data security in the medical industry is a joke. Nobody outside of IT has a clue. The only way to keep records private is to leave them out of EHR databases.
Monday 21st September 2015 16:03 GMT cryptoguy
The attack is not new. Already shown in March 2014
Please note that the attack was already described in http://www.ijiss.org/ijiss/index.php/ijiss/article/view/58/pdf_17 in "Section 4 Security & Efficiency of CryptDB" in March 2014. It basically says that "First of all, CryptDB is open to frequency attack where the adversary knows the frequency of the plaintext. Namely, if the RND layer is decrypted to the DET layer in EQ onion, then the frequency attack is possible to apply because of deterministic encryption in the DET layer. In this attack, the adversary that observes the queries can determine the ciphertext simply by looking at the results’ row count. This attack can only be fixed by the RND layer, which has no usable functionality in practice. For example, assume that we have left part of Table 11, and its encrypted form is the right part in Table 11. By using the knowledge of the frequency, one can learn the corresponding plaintexts from the right encrypted part in Table 11. This issue can be solved easily by using random IV based symmetric encryption, however, this will prevent executing all queries."
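For anyone who wants to see the frequency attack in action, here is a minimal Python sketch. It makes some loud assumptions: `det_encrypt` is a hypothetical stand-in built on HMAC-SHA256 (it is not CryptDB's DET layer, only a keyed function where equal plaintexts give equal outputs), and the diagnosis column and the auxiliary counts are made-up illustration data.

```python
import hashlib
import hmac
from collections import Counter

def det_encrypt(key: bytes, value: str) -> str:
    """Hypothetical deterministic scheme: equal plaintexts give equal outputs."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:16]

def frequency_attack(ciphertexts, aux_counts):
    """Pair ciphertexts with plaintexts by frequency rank (most common first)."""
    ct_ranked = [ct for ct, _ in Counter(ciphertexts).most_common()]
    pt_ranked = [pt for pt, _ in sorted(aux_counts.items(), key=lambda kv: -kv[1])]
    return dict(zip(ct_ranked, pt_ranked))

key = b"secret key the attacker never sees"
column = ["flu"] * 6 + ["asthma"] * 3 + ["diabetes"] * 1   # made-up data
encrypted = [det_encrypt(key, v) for v in column]

# Auxiliary knowledge (assumed): rough plaintext frequencies from another source.
aux = {"flu": 600, "asthma": 310, "diabetes": 90}

guess = frequency_attack(encrypted, aux)
recovered = [guess[ct] for ct in encrypted]
print(recovered == column)
```

The key never enters into it: the attacker only counts repeated ciphertexts and lines them up against known frequencies. Randomized encryption with a fresh IV per value removes the repeats, which is exactly why it also removes the ability to run equality queries.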