Re: Better defaults would be nice
"Better defaults would be nice"
You are absolutely right. Unfortunately, in Hadoop land it's challenging for a number of reasons:
0) Hadoop is inherently a distributed system made of dozens of bits of software with no real single-node analogue, so you can't just lock everything down to a single interface and be done with it.
1) Security was bolted on long after the core product, and has taken something of a Unix approach of letting other (older, better-written) systems do the heavy lifting. Authentication requires a Kerberos environment; proper authorization requires a user/group resolution system (e.g. SSSD to AD); encryption is arcane black magic configured through impenetrable XML (backed by an external keystore); and TLS is a clusterfuck of settings for the 20+ pieces of software and thousands of daemons that constitute a Hadoop deployment. You can't just "switch it on". There are efforts underway in all the major distros to kerberise by default (which would eliminate most of these attacks) but it's nontrivial, and ...
2) There's little commercial drive for the vendors to fix this. As with ES and Mongo, there's little interest in one-man-and-his-dog 1-3 node setups on AWS for as-good-as-personal use, and everyone who is or could be a customer will be doing this properly already.
On the plus side, because the data in HDFS are immutable and the encryption system requires additional daemons and utterly incomprehensible configuration, there's no chance of this evolving past vandalism into ransomware.
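If you want a quick sanity check on whether your own cluster is still running the wide-open defaults from point 1, something like the rough sketch below might help. It assumes the two stock core-site.xml properties `hadoop.security.authentication` (default `simple`) and `hadoop.security.authorization` (default `false`); the function names and the sample XML are made up for illustration, and it only reads the config, it doesn't prove Kerberos actually works.

```python
import xml.etree.ElementTree as ET

# Stock Hadoop defaults when core-site.xml does not override them.
DEFAULTS = {
    "hadoop.security.authentication": "simple",
    "hadoop.security.authorization": "false",
}


def read_security_settings(core_site_xml: str) -> dict:
    """Return the two security properties, falling back to the defaults."""
    root = ET.fromstring(core_site_xml)
    settings = dict(DEFAULTS)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in settings:
            settings[name] = value
    return settings


def looks_kerberised(settings: dict) -> bool:
    """Hypothetical helper: True only if Kerberos auth and authorization are both on."""
    return (
        settings["hadoop.security.authentication"].lower() == "kerberos"
        and settings["hadoop.security.authorization"].lower() == "true"
    )


# Illustrative config only, not taken from a real cluster.
SAMPLE = """
<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    print(looks_kerberised(read_security_settings(SAMPLE)))
    print(looks_kerberised(read_security_settings("<configuration/>")))
```

An empty `<configuration/>` falls back to `simple`/`false`, which is exactly the state most of the vandalised clusters were in.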