MIPS has a lot of oddities in its design that were baked in from the early architecture. Branch delay slots and register read hazards are what I remember from the blog post quoted below.
Apple could likely have bought MIPS when they were ready to go 64-bit, instead of using AArch64. Their M1 now beats Intel by several metrics.
Also, the top supercomputer is AArch64.
It looks like ARM really put some thought into enterprise performance, and removed comparable scalability problems that were present in Furber & Wilson's original ARM design.
"MIPS is the worst offender. It deliberately omits a feature which is so fundamental to CPU architecture that software people don't even think about it. The architecture leaves out the mechanism in the CPU pipeline which would otherwise stall execution until the data was ready. A register access which would have created a minor inefficiency on any other architecture instead creates a "hazard" on MIPS. You can read from a register before that register is ready. If you are writing or debugging MIPS code, you have to know how this works...
"Both SPARC and MIPS share another horrid feature - delayed branches. These create a dependency between instructions, in which the branch takes effect after the next instruction, rather than immediately. When using assembly code, you have to know which instructions have a delayed effect, and what rules apply to the instruction (or, sometimes, instructions) in the "delay slot" following it. The delay slot is restricted in various ways: for instance, you can't put another delayed branch there."
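To make the two quoted complaints concrete, here is a toy Python model of both behaviors. It is my own sketch, not real MIPS semantics and not anything from the blog post: the tiny instruction set (`li`, `move`, `lw`, `j`), the `run` function, and the `delayed` flag are all made up for illustration. With `delayed=True`, a load's result only becomes visible after the following instruction, and a jump only takes effect after the following instruction has run.

```python
# Toy model (hypothetical, not a real MIPS emulator) of a load delay
# hazard and a branch delay slot.

def run(program, memory, delayed=True):
    """Run a list of (op, *args) tuples and return the final registers."""
    regs = {}
    pc = 0
    pending_branch = None  # jump target that takes effect one instruction late
    pending_load = None    # (register, value) that lands one instruction late

    while pc < len(program):
        op, *args = program[pc]
        next_pc = pc + 1

        # A jump issued by the previous instruction redirects us only now,
        # after the current (delay slot) instruction has been fetched.
        if pending_branch is not None:
            next_pc, pending_branch = pending_branch, None

        # A load issued by the previous instruction is not visible yet;
        # it is committed after the current instruction has read its inputs.
        commit, pending_load = pending_load, None

        if op == "li":        # li reg, constant
            regs[args[0]] = args[1]
        elif op == "move":    # move dst, src
            regs[args[0]] = regs.get(args[1], 0)
        elif op == "lw":      # lw reg, memory_key
            value = memory[args[1]]
            if delayed:
                pending_load = (args[0], value)
            else:
                regs[args[0]] = value
        elif op == "j":       # j target_index
            if delayed:
                pending_branch = args[0]
            else:
                next_pc = args[0]

        if commit is not None:
            regs[commit[0]] = commit[1]
        pc = next_pc

    # A load issued by the very last instruction still completes.
    if pending_load is not None:
        regs[pending_load[0]] = pending_load[1]
    return regs


PROGRAM = [
    ("li", "t0", 7),      # 0: t0 = 7
    ("lw", "t0", "x"),    # 1: t0 = memory["x"]
    ("move", "t1", "t0"), # 2: load delay slot, reads t0 too early
    ("move", "t2", "t0"), # 3: reads t0 after the load has landed
    ("j", 7),             # 4: jump to instruction 7
    ("li", "t3", 1),      # 5: branch delay slot
    ("li", "t4", 1),      # 6: skipped in both modes
    ("li", "t5", 1),      # 7: jump target
]
MEMORY = {"x": 42}

with_delays = run(PROGRAM, MEMORY, delayed=True)
without_delays = run(PROGRAM, MEMORY, delayed=False)
```

In the delayed run, `t1` ends up with the stale value 7 and the instruction after the jump still executes, so `t3` gets set. In the non-delayed run, `t1` is 42 and `t3` is never written. Same program, different results, which is why you have to know the rules when writing or debugging this kind of assembly by hand.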