Yes, security is an issue.
Yes, depending on your Hadoop vendor, the difficulty of implementing security varies.
But some of the comments, including the CSA section, are pure bollix.
Posted anon for various reasons.
Size isn't everything. Big data may be about storing terabytes or petabytes of information but it is also about complexity, and complexity often brings security challenges. Are you ready to handle them? Right now, someone in a marketing or finance role somewhere in your organisation is probably putting together a big data …
I agree that “they probably have no idea of the security burden it will bring” and will end up with a lot of sensitive data that will lead to a security crisis:
1. I think a big data security crisis is likely to occur very soon and few organizations have the ability to deal with it.
2. We have little knowledge about data loss or theft in big data environments.
3. I imagine it is happening today but has not been disclosed to the public.
There is unfortunately a shortage in Big Data skills and an industry-wide shortage in data security personnel, so many organizations don’t even know they are doing anything wrong from a security and compliance perspective.
So we need to take a data-centric approach to Big Data security, and I agree with encrypting “data to help protect it from attack.”
But unfortunately Hadoop only offers file-layer encryption. This coarse-grained approach is old-school security and will not provide the needed balance between security, regulatory compliance and data insights: the whole data file is either encrypted, and so unusable for analysis, or decrypted, and so wide open to attackers.
We also know that “homomorphic encryption” is a very interesting research area, but unfortunately it will not be a viable solution any time soon.
I agree with the CSA, which “advises wrapping NoSQL databases in a secure middleware layer to shield direct access to the data”, since most Big Data platforms lack the security that we find in traditional database environments.
I think that new, practical security approaches that provide fine-grained encryption or data tokenization are required. Today, vendors such as Teradata, Hortonworks and Cloudera have partnered with data security vendors to help fill the security gap. What they’re seeking is advanced functionality equal to the task of balancing security and regulatory compliance with data insights and “big answers”.
Ulf Mattsson, CTO Protegrity
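To make the contrast between file-layer encryption and fine-grained protection in the comment above concrete, here is a minimal sketch of field-level tokenization in Python. It is purely illustrative and does not represent any vendor's product: the names `TokenVault` and `protect_record`, the `tok_` prefix and the in-memory lookup table are all assumptions made for the example. A real deployment would need a hardened token store, key management and access control.

```python
import hashlib
import hmac


class TokenVault:
    """Hypothetical in-memory token vault (illustration only, not production-grade)."""

    def __init__(self, key: bytes):
        self._key = key
        self._reverse = {}  # token -> original value; a real vault would be a secured store

    def tokenize(self, value: str) -> str:
        # Deterministic token: the same input always maps to the same token,
        # so joins and group-bys on the tokenized column still work.
        digest = hmac.new(self._key, value.encode("utf-8"), hashlib.sha256).hexdigest()
        token = "tok_" + digest[:16]
        self._reverse[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # Only callers with access to the vault can recover the original value.
        return self._reverse[token]


def protect_record(record: dict, sensitive_fields: set, vault: TokenVault) -> dict:
    """Replace only the sensitive fields with tokens; leave the rest readable."""
    return {
        field: vault.tokenize(value) if field in sensitive_fields else value
        for field, value in record.items()
    }
```

Calling `protect_record({"name": "Alice", "ssn": "123-45-6789", "region": "EU"}, {"ssn"}, vault)` would leave `name` and `region` available for analysis while `ssn` is replaced by a token, which is the balance between protection and insight that whole-file encryption cannot offer.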
Nice article. I sit on a Research Ethics Committee, and we are getting an increasing number of applications from researchers wanting to mine health-related data. So many come along with proposals that have not considered how easy it can be to reconstruct "anonymous" data so that the PII can be derived. I'll use those entropy figures the next time someone comes up and says "but there is no way our data can reveal the subject!"
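For anyone wanting to check a dataset against the re-identification concern above, here is a rough sketch in Python. It counts how many records are unique on a chosen set of quasi-identifiers and reports the Shannon entropy of those combinations. The function name `reidentification_risk` and the example field names are assumptions made for illustration, and this is a first-pass screen rather than a full privacy assessment.

```python
import math
from collections import Counter


def reidentification_risk(records, quasi_identifiers):
    """Summarise how identifying a set of quasi-identifier columns is.

    Returns the smallest group size (the "k" in k-anonymity), the fraction
    of records that are unique on the chosen columns, and the Shannon
    entropy in bits of the distribution of column combinations.
    """
    if not records:
        raise ValueError("records must not be empty")

    # One key per record, built from the quasi-identifier values.
    keys = [tuple(record[q] for q in quasi_identifiers) for record in records]
    counts = Counter(keys)
    total = len(keys)

    # A record is unique if nobody else shares its combination of values.
    unique = sum(1 for key in keys if counts[key] == 1)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())

    return {
        "k": min(counts.values()),
        "unique_fraction": unique / total,
        "entropy_bits": entropy,
    }
```

A result with `k` equal to 1 means at least one subject can be singled out from those columns alone (for example year of birth, partial postcode and sex), even though no name or ID number appears in the data.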