Re: Server simplification, really?
Nonsense. If you look at how (since we're talking about servers) AMD's Epyc SoCs do I/O, you will see that for our purposes there are two collections of high-ish speed I/O lanes: those that can be configured as either a SATA lane or a PCIe lane, and those that can only be PCIe lanes (I'm ignoring xGMI because either you need xGMI and will have it or you don't need it). That is, there is a 1-for-1 trade between a lane of SATA and a lane of PCIe, and the PCIe lane can be used as an NVMe transport, to connect a chip-down end device, or to provide a standard (or nonstandard) card-edge connector. There is no difference in the cost of the SoC, the socket, or the traces.
That leaves the end device connectors, of which there are many different kinds, each with its own performance, cost, service life, lane count, mechanical attachment, and board space consumption tradeoffs. It's true that connectors can be fairly expensive, but if you want the most apples-to-apples comparison you might look at the M.2 connector you mention, because it comes in several different flavours that can support up to x4 PCIe, or x1 SATA, or either of those in the same connector (keys B and M are what you're after here). Those connectors are literally the same part and there is no price difference between keying types. As a system designer building around an SP3 socket, you can provide an M.2 connector that supports only SATA, an M.2 connector that supports only PCIe, or an M.2 connector that supports either one, for the exact same price. Note that supporting both requires either consuming both PCIe and SATA lanes at the SoC or doing some very unusual contortions to route the same lane(s) to both sets of pins and then letting the operator choose how the serdes are configured.
The reasons PCIe is better than SATA are numerous: the author of this piece has stated that it's faster, but has implied that PCIe is limited to 20 Gb/s, which it certainly is not; assuming he's thinking about a 4-lane interface, that hasn't been true for over a decade. Currently shipping devices support PCIe gen4, which is 16 GT/s each direction *per lane*, and typical NVMe devices on the market support 4 lanes, meaning any gen4 x4 end device (NVMe or otherwise) can, after the encoding overhead, handle about 63 Gbit/s, or just shy of 8 GB/s. PCIe isn't 4x as fast as SATA; it's more than 12x faster, and it will be more than 24x faster once PCIe gen5 devices become available in the next couple of years. This ignores the overhead of HBAs and software drivers, which only adds to PCIe's advantages...
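If you want to check those numbers yourself, here's a quick back-of-envelope sketch in Python (my own illustration, not something from the article): it assumes 128b/130b encoding for PCIe gen4/gen5 and 8b/10b for SATA III, and it ignores link-layer and protocol overhead, so real devices land a little lower.

    # Back-of-envelope bandwidth math (illustrative only).
    # Assumes 128b/130b encoding for PCIe gen4/gen5 and 8b/10b for SATA III;
    # ignores link-layer/protocol overhead, so real devices land a bit lower.

    def pcie_effective_gbps(gt_per_lane: float, lanes: int) -> float:
        """Usable bandwidth per direction in Gbit/s after 128b/130b encoding."""
        return gt_per_lane * (128 / 130) * lanes

    gen4_x4 = pcie_effective_gbps(16.0, 4)   # ~63 Gbit/s, ~7.9 GB/s
    gen5_x4 = pcie_effective_gbps(32.0, 4)   # ~126 Gbit/s, ~15.8 GB/s
    sata3   = 6.0 * (8 / 10)                 # 6 Gbit/s line rate -> 4.8 Gbit/s usable

    print(f"gen4 x4: {gen4_x4:.1f} Gbit/s ({gen4_x4 / 8:.1f} GB/s), {gen4_x4 / sata3:.0f}x SATA III")
    print(f"gen5 x4: {gen5_x4:.1f} Gbit/s ({gen5_x4 / 8:.1f} GB/s), {gen5_x4 / sata3:.0f}x SATA III")

Run that and you get roughly 63 and 126 Gbit/s for gen4 and gen5 x4 respectively, or about 13x and 26x a SATA III port, which is where the 12x and 24x figures above come from.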
But there's much more to this than performance, especially since even multi-actuator spinning disks are still, compared with SSDs, sloths wallowing in molasses. PCIe doesn't require an additional HBA (the SATA HBA is embedded in Epyc and most current PCHs/SBs, but it still requires transistors and software to drive it), it can scale up to 16 lanes per port, and it provides a mostly reasonable collection of error-handling and hotplug mechanisms. SATA is basically still just plain old IDE from 1990, designed from the start for low-cost desktops. It has never belonged in servers and still doesn't. The real competing interface for rotating media isn't SATA, it's SAS. SAS, like SATA, requires an HBA and software to drive it (and, sadly, firmware which is invariably buggy), but it provides far more throughput than SATA and a much more robust and flexible collection of data protocols, including support for PCIe-like switched fabrics, reliable hotplug, dual-porting (like NVMe), and much more. For a lot of good reasons, PCIe is winning that battle, but SATA isn't fit to carry water for either one.