Re: M:N was already there a while back
Tide in, tide out.
Popular modern programming languages, most notably Go, lean heavily on M:N models.
On the other hand, C++ is looking to, ahem, standardize this model, so maybe it is doomed.
Really it is all about applicability and avoiding kernel calls. The trap latency of modern processors is obscene. The architecture manuals mumble about a few cycles, but that figure carefully leaves out making the write FIFO visible, cancelling all speculative results, discarding branch predictor state and, thanks to some benchmark-embarrassed CPU designers, a bit of cache flushing. So after the trap, the processor trundles along like something from the 80s for a bit until it can get its funk on, only to go through the same throttling on the way back to useful work. To paraphrase a brilliant processor designer: "I shed a tear every time I see a syscall".
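If you want a rough feel for the gap, here is a minimal sketch in Go that times a trivial user-space call against a trivial kernel call. The helper names (userSpaceCall, timePerOp) are made up for illustration, and I am assuming syscall.Getpid really enters the kernel on your platform (it should on Linux). It only captures the direct round trip, not the cold-predictor hangover afterwards, so treat the numbers as a lower bound at best.

```go
package main

import (
	"fmt"
	"syscall"
	"time"
)

// userSpaceCall is a hypothetical stand-in for "work that never leaves
// user space". noinline keeps the compiler from optimizing the call away.
//
//go:noinline
func userSpaceCall(x int) int { return x + 1 }

// timePerOp runs f n times and returns the average wall-clock time per call.
// The closure call overhead is included in both measurements.
func timePerOp(n int, f func()) time.Duration {
	start := time.Now()
	for i := 0; i < n; i++ {
		f()
	}
	return time.Since(start) / time.Duration(n)
}

func main() {
	const n = 1_000_000
	acc := 0

	user := timePerOp(n, func() { acc = userSpaceCall(acc) })
	// Assumption: Getpid is not cached and traps into the kernel each time.
	kernel := timePerOp(n, func() { _ = syscall.Getpid() })

	fmt.Println("user-space call:", user)
	fmt.Println("getpid syscall: ", kernel)
}
```

Results vary a lot with the CPU, the kernel and whatever mitigations are switched on, so no expected numbers are given here.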
So, instead, adopt a virtual CPU model, with rapid switching between local threads on internal interlocks, and fall back to a set of parked system threads for handling the laborious system calls. If you look at the Google mutex handling code(*), it is as brilliant as it is frightening, especially when you realize how much of it is there just to avoid system calls.
(*): https://github.com/abseil/abseil-cpp/blob/master/absl/synchronization/mutex.cc
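For what the virtual CPU idea looks like from the outside, here is a small Go sketch: many goroutines (the M side) multiplexed onto a scheduler capped at a couple of virtual CPUs (the N side), handing values over a channel. The function name sumOnFewThreads and the counts are my own picks for illustration, not anything from the Go runtime or abseil, and this only shows the model in use, not how the scheduler is built.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// sumOnFewThreads starts n goroutines while capping the Go scheduler at
// procs virtual CPUs, then sums the values 1..n that they send back.
// The goroutine handoffs on the channel are arranged by the Go runtime
// scheduler rather than by creating one OS thread per goroutine.
func sumOnFewThreads(n, procs int) int {
	prev := runtime.GOMAXPROCS(procs) // limit the "N" side
	defer runtime.GOMAXPROCS(prev)    // restore the previous setting

	results := make(chan int)
	var wg sync.WaitGroup
	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			results <- v // blocks this goroutine until the receiver is ready
		}(i)
	}

	// Close the channel once every sender has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	total := 0
	for v := range results {
		total += v
	}
	return total
}

func main() {
	// 10000 goroutines on 2 virtual CPUs.
	fmt.Println(sumOnFewThreads(10000, 2)) // prints 50005000
}
```

Ten thousand goroutines is routine here; ten thousand kernel threads would be a much heavier proposition, which is rather the point of the model.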