Someone explain: What is the point of these new "features"? Just to differentiate CISC from RISC by making things more complex?
Sure, AVX, AVX2, and AVX-512 work on big data, chunks we used to iterate over. How much everyday stuff NEEDS this?
Cryptography? How often do I log in or share secrets? On a server, how often does login/handshake traffic even approach data-exchange traffic?
Once the compiler is written, compile time is cheap. Far cheaper than buy-time checkbox-checking or, as said, trying to get your cloud provider to give you a specific instruction set. Make the compiler emit both dumb-i386 code and snappy AVX-whatever code. At start-up time, poke the CPU itself (don't trust BIOS flags too far) and flag code junctures to use one or the other path.
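The start-up dispatch described above can be sketched in a few lines of C. This is a minimal sketch, assuming GCC/Clang's `__builtin_cpu_supports()` builtin (which queries CPUID at run time, not BIOS flags); the function names `sum_scalar` and `sum_avx2` are hypothetical stand-ins for the two code paths.

```c
#include <stddef.h>

/* Dumb-i386 path: plain loop, runs anywhere. */
static long sum_scalar(const int *a, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++) s += a[i];
    return s;
}

/* In a real build this would be compiled with -mavx2 and use intrinsics;
   here it just stands in for the snappy path. */
static long sum_avx2(const int *a, size_t n) {
    return sum_scalar(a, n);
}

static long (*sum_impl)(const int *, size_t) = 0;

long sum(const int *a, size_t n) {
    if (!sum_impl) {
        /* Poke the CPU itself at the first call. */
        sum_impl = __builtin_cpu_supports("avx2") ? sum_avx2 : sum_scalar;
    }
    return sum_impl(a, n);
}
```

GCC's "function multi-versioning" (`__attribute__((target_clones("avx2","default")))`) automates exactly this pattern, picking the variant once at program start.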
And what about when the snazziest new "feature" turns out to be open to hackers? Then a first fix can be to turn off that feature.
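With a dispatcher in place, the "turn it off" first-fix can be as small as an environment-variable kill switch. A sketch, assuming the same GCC/Clang `__builtin_cpu_supports()` builtin; the variable name `FORCE_BASELINE` is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical kill switch: if FORCE_BASELINE is set (to anything but "0"),
   route everything through the dumb-i386 path even on a capable CPU --
   e.g. as a stopgap while a vulnerable fast path gets patched. */
static int use_fast_path(void) {
    const char *kill = getenv("FORCE_BASELINE");
    if (kill && strcmp(kill, "0") != 0)
        return 0;                              /* operator said no */
    return __builtin_cpu_supports("avx2");     /* otherwise ask the CPU */
}
```

Real-world precedent: the Linux kernel's `clearcpuid=` and `noxsave`-style boot parameters do the same job one layer down.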
I ran an 8088 back in the day when "big math" (log, trig) wanted an 8087 math coprocessor. The average 8088 program optionally (but often by default) compiled in an 8087 emulator. Working like long division, it could chew on a LOG for seconds. The 8087 would do it in milliseconds but was a $400 chip; most of us did not do that much math. I finally did, for electronics problems. But 10+X the speed on 10% of instructions did not make the result a lot faster.
Yes, the 386 does seem "slow" today. I have seen an early '386 struggle for a minute to render a simple 1995 web page. But sometimes the job is not about speed. And there are still legacy CPU cores in ASIC libraries.
Don't tell me it bloats the code! Windows, bah. Linux is growing opaque, with humongous init and Flatpak bloat. A little bloat to run code on ANY 386+ processor is more than justified today.