Re: Or perhaps ...
As someone who has dealt with on prem servers since ~1998 I never want to go back either (back to cloud that is). I was exposed to 100% cloud infrastructure for about two years across two different companies. I moved the second (current) company out almost exactly 10 years ago (the company was "born" in cloud from day 1). I think Feb 23 2012 was the day we moved our first infrastructure out, and it was about July 2012 when we moved the last of it out.
DBAs have always been on my team, and have generally been reasonable when requesting resources. We have tons of metrics, so we can generally agree 100% on when something needs to change (I can even add CPU or memory or disk live to my systems with zero impact to the DBs). My team has always controlled administrative access to every system in the environment (all Linux systems managed by Chef configuration management). Developers and others had no issues with that (except one guy; I told him he had to host his stuff on the IT side of the house if he wanted a Windows server with admin rights to it). Developers praised us on occasion for providing such a robust environment that just worked. Not a single server has had to be rebuilt as a result of hardware failure in a decade of operation. Prior to the ops team forming, the developers (with IT's help) ran their own VMs that they managed, and even they knew it was a shit show.
Speaking of databases I remember to this DAY (this was over 10 years ago!!) a phone call with amazon support about TERRIBLE database performance on our RDS at the time, I even took a screenshot and kept it all these years:
I remember what the support person said: oh, we are getting great performance, look at those IOPS. Oh yeah, 3,000 IOPS is good, but look how much data was transferred... 200 KILOBYTES? Write latency over 150ms? CPU usage maybe 5%? WTF IS GOING ON.
We were (at the previous company) a "beta" tester for Amazon's early "performance" EBS system. I forget the technical term, this was back in 2010, but basically you got more IOPS with more space. The idea was interesting but the implementation (at the time) didn't work. I'm sure they've fixed that since though.
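For what it's worth, here is a rough sanity check on those numbers as a minimal Python sketch. It assumes the 200 KB figure was throughput per second (the units on the original graph are not stated here), and the function name is purely illustrative.

```python
def avg_io_size_bytes(iops: float, throughput_bytes_per_sec: float) -> float:
    """Average bytes moved per I/O operation: throughput divided by IOPS."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return throughput_bytes_per_sec / iops


# Assumption: 200 KB transferred per second while the volume reported 3,000 IOPS.
size = avg_io_size_bytes(iops=3_000, throughput_bytes_per_sec=200 * 1024)
print(f"average I/O size: {size:.0f} bytes")  # roughly 68 bytes per operation
```

Under that assumption each operation moves well under a single 512-byte sector, which is why a big IOPS number on its own says little about whether the database is getting any real work done.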
Back to cloud - the lack of reliability, the lack of in depth monitoring, the endless list of small failures, the forced reboots, the head scratching "WTF is going on and why?" moments. The variability and constant manipulation of the infrastructure drove me mad. The lack of ability to precisely size systems, the lack of ability to oversubscribe. The lack of control, the INSANE COSTS.
My former manager was talking with Google Cloud last year, and the cost for hosting our production databases (about 30 systems) was about as much as it cost us to run our entire datacenter operations (about 750 systems). That is according to him, I never spoke with the Google people. That wasn't even taking into account the extra capacity we have waiting to be used (which could easily run another 500 systems). It's comical. 2-3 years ago we had a VP who wanted to go cloud (no reason other than to help his resume, I think), and we told him it was too expensive. He said "he had a guy" who could make the numbers work. Well, ~6 months later the VP was gone and I didn't really hear about the concept again.
I have seen many people who have loved cloud stuff, and those people also don't seem to care about the costs. Many others don't believe cloud is (generally) so much more expensive than running it yourself.
At the last company I was at I hit a wall in convincing the board of directors to move out of cloud, despite having the CTO/CEO on board, and really the rest of the company too, with a $1.6M savings in the first year of operations. But I left shortly after that; my (original) hiring manager at that company hired me at the next (current) company. The previous company collapsed a couple of years later. Their cloud spend was upwards of $500k/mo at peak for a tiny startup (maybe 100 employees?), more commonly in the $200-250k/mo realm. The current company was at about $80k/mo when we moved out.
At the current company I'd say a conservative estimate of savings is $10M, and a more realistic one is over $15M, over the past 10 years. That is with a peak of ~5 racks of equipment. Currently about 3.5 racks. Not talking super scale here.
I remember hosting load balancing software called Zeus (at the time) in Amazon cloud because the ELBs were such pieces of shit. The cost of running Zeus (as an appliance distributed by the Amazon store thing), which was CRIPPLED because it could only have a single IP address, was huge. It alone would come to about $10-20k/year for a single system, I think? That could pay for a real hardware load balancer very quickly (my current load balancers ran upwards of about 450 IP addresses at peak on several networks for various workloads and fail over within 1 second; Zeus, as it used Elastic IPs, took something like 20 seconds).
I'd be all in on cloud if it provided a superior experience (or at least an equivalent one: control of my network, connectivity, end to end storage metrics down to the links and disks/SSDs, etc). That is, offering the level of control and availability that on prem can offer (that includes data center facilities where everything is N+1 power/cooling). It would help too if cost was never a concern.
Oracle said it pretty well at one point in the last year or so, they want their customers to go cloud because it makes Oracle so much more money.
One area where cloud can make sense though is SaaS. Abstracting most of the failings of the major cloud providers behind an application that is hopefully robust. But as we've seen with recent cloud outages even that can fall apart.
I've been told by multiple people over the years nobody can run a data center operation like I can(at least in their experience). So I am somewhat unique in that ability. It is sad that companies have yet to realize that even if they need 3-5 people on the team to do stuff it's likely going to be done far cheaper and better on prem.
The cloud marketing bullshit hype cycle is deafening though. People talk about "hybrid cloud" a lot. Current app stack I manage has nearly 20 micro services. There had been talk in the past about running some of those in public cloud to provide more scalability. Despite the fact we had no lack of capacity on site, people were clueless when it came to things like latency between services. Whether you are distributing an application across two different data centers or a data center and a cloud there will be a huge latency hit regardless unless that 2nd location is very close(say within 50 miles). But many people(for some reason) don't realize that. Some apps wouldn't care but most transactional ones would care a ton.
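To put rough numbers on the latency point, here is a minimal Python sketch. It assumes light travels through fiber at about 200,000 km/s (roughly two thirds of its speed in a vacuum) and ignores routing detours, switching and queuing, so real round trips will be worse. The 2,000 km distance and the 20 sequential cross-site calls are hypothetical figures picked for illustration, not measurements from my environment.

```python
FIBER_KM_PER_SEC = 200_000.0  # assumed propagation speed in optical fiber
KM_PER_MILE = 1.609344


def min_round_trip_ms(distance_km: float) -> float:
    """Best-case round trip time from propagation delay alone."""
    return 2.0 * distance_km / FIBER_KM_PER_SEC * 1000.0


def added_latency_ms(distance_km: float, sequential_calls: int) -> float:
    """Extra latency when a request makes N back-to-back cross-site calls."""
    return sequential_calls * min_round_trip_ms(distance_km)


near = min_round_trip_ms(50 * KM_PER_MILE)   # second site about 50 miles away
far = min_round_trip_ms(2_000)               # hypothetical distant cloud region
print(f"50 miles: {near:.2f} ms per round trip")   # about 0.80 ms
print(f"2,000 km: {far:.1f} ms per round trip")    # 20.0 ms
print(f"20 sequential calls at 2,000 km: {added_latency_ms(2_000, 20):.0f} ms")  # 400 ms
```

A chatty transactional request that crosses sites many times pays that round trip every time, which is why the nearby site is tolerable and the distant one is not.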
Most of the failings of the major cloud providers are BY DESIGN, and haven't changed in the past decade (not betting on them changing in the near future either).
But as with anything, you can do things on prem very poorly, and you can do things in cloud very poorly. You can do things in both very well on rare occasions.