Re: Too complicated is not the whole problem
The same can be said for Kubernetes: too complicated for most orgs, though that hasn't stopped the hype around that tech (yet, anyway). I felt Hadoop was similar back when it was at its peak hype. I actually had a VP of my group at the time suggest that HDFS could be our VMware storage; we didn't need a SAN, just run the VMs from HDFS.

The company built a 100+ node Hadoop cluster back in 2010 after I left (using hardware from a vendor I was against). I was told it took them over a year just to get the hardware stable, after suffering a ~30% failure rate in the cluster over an extended period, which at one point left ~50% of the hardware offline (due to quorum requirements?). A new VP came in and decided on different hardware. They still struggled with writing correct jobs, I was told: several people were complaining about why it was so slow, and it turned out in some cases the jobs were written in a way that prevented them from being run on more than one node. But at least they had the data, probably 15TB of new data a day. One of the folks I knew at the time was at a company which deployed Hadoop as well, but they had something like 500GB of data total. WTF, why are you deploying that? He said they wanted it.
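I never saw their code, but it's easy to write a "distributed" job that really runs on one node: feed it a single non-splittable gzip file and you get one mapper, or key everything on a constant and you get one reducer. Something like this hypothetical streaming mapper (not their actual code) is all it takes:

    # mapper.py -- hypothetical Hadoop Streaming mapper, for illustration only.
    # Emitting one constant key means every record hashes to the same partition,
    # so the entire reduce phase funnels through a single reducer on one node,
    # no matter how many nodes the cluster has.
    import sys

    for line in sys.stdin:
        print("all\t" + line.rstrip("\n"))

Setting mapreduce.job.reduces=1 gets you to the same place. Either way the cluster mostly sits idle while one node grinds, and everyone wonders why the 100-node cluster is slow.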
Some forces at my current org wanted Kubernetes. Not because it was a good tool for us, but because it was cool. VMs were old school; they wanted the fancy auto scale up and scale down. I knew it wouldn't work the way they expected. They spent at least 3 years on the project, I think, and even got some services to production. All of it was shut down last year when the main person working on the project left.

It had tons of problems, one of which they spent 6+ months trying to resolve (it ended up being an MTU problem on the containers that were built). Auto scaling didn't work as advertised (perhaps due to a lack of performance testing, something I warned about many times but was ignored). There were lots of Kubernetes alerts saying hey, I'm low on CPU, or I'm low on memory, I can't start new containers. Look at the host and it had TONS of CPU and memory, in some cases 10GB+ of available memory. But because of bullshit like this bug, open since 2017 (https://github.com/kubernetes/kubernetes/issues/43916), the systems complained regularly.

We also had a problem with Datadog monitoring where it would consume large amounts of disk I/O (upwards of 10k+ IOPS). That again took months to track down; they eventually found the cause was the agent running out of memory in its container (not sure why that would cause the I/O, as there was no swap on the host), but increasing memory on the container fixed it. Datadog could not tell us how much memory was needed for monitoring X metrics, so we just had to watch it and keep increasing memory over time.
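Going back to that 2017 bug for a second: as I understand it, kubelet's memory.available eviction signal is derived from the node's cgroup working set, and the working set still counts active page cache as used, so a box doing a lot of file I/O can look nearly out of memory to kubelet while free says there's plenty. A rough sketch with made-up numbers (my reading of the issue, not actual kubelet code):

    # Illustrative sketch only. kubelet's memory.available is roughly
    # capacity - working_set, where working_set = cgroup usage - inactive
    # file cache. Active file cache is NOT subtracted, so it counts as
    # "used" even though the kernel could reclaim it.

    GIB = 1024 ** 3

    def kubelet_available(capacity, usage, inactive_file):
        working_set = usage - inactive_file          # active page cache stays in
        return capacity - working_set

    def free_style_available(capacity, usage, inactive_file, active_file):
        # roughly what `free` calls "available": reclaimable cache counts as free
        return capacity - (usage - inactive_file - active_file)

    # Hypothetical node doing heavy file I/O: 32 GiB total, ~5 GiB of actual
    # process memory, the rest mostly page cache.
    capacity      = 32 * GIB
    usage         = 31 * GIB   # includes page cache
    inactive_file =  1 * GIB
    active_file   = 25 * GIB

    print(kubelet_available(capacity, usage, inactive_file) // GIB)                  # 2  -> "low on memory"
    print(free_style_available(capacity, usage, inactive_file, active_file) // GIB)  # 27 -> tons free

That would explain hosts that look perfectly fine in free but still trip kubelet's thresholds.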
The complexity of the system grew even more when they wanted to try to do upgrades without downtime. The people behind the internal project eventually acknowledged what I had been saying since before they even started - it's WAY too complicated for an org our size, and offers features we do not need. So they gave up.
The container solution I deployed for our previous app stack (LAMP) was LXC on bare metal hardware, back in 2014. It took me probably 6 weeks to go from not even knowing LXC existed to being fully in production running our most critical e-commerce app. It ran for 5 years pretty much flawlessly, saved a ton of $$, and really accelerated our application. I proposed the same solution, even just as an interim, for our newer Ruby app stack, but they didn't want it. Wasn't cool enough for them. I said fine, you can build your Kubernetes shit and when it's ready just switch over; I could be ready with LXC for this app in a couple of weeks and we already had the hardware. But nope, they wanted to stick with VMs until Kubernetes was ready. And of course it never got ready.