Winding paths of history...
I recall being intrigued, back in the 90s(?), when reading the GO Corp PenOS doco, that the application's text was executed in place on the device's non-volatile storage, i.e. it wasn't copied into RAM.
I don't imagine these devices had demand paging or copy-on-write, or indeed an MMU at all, but I assume either the non-volatile storage was roughly as fast as the processor needed or there was a fairly large RAM cache in front of it.
An early Australian 8-bit Z80 PC, the Applied Tech. Microbee, used 6116 CMOS static RAM, which could be battery-backed to provide non-volatile storage.
I always thought the Multics approach of making everything a segment was an idea worth doing properly: 64 KB segments are just plain silly, but 2^64-byte segments could fly. I imagine mmap(2) took part of its inspiration from this.
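To make the mmap(2) comparison a bit more concrete, here is a minimal sketch using Python's standard mmap module (my choice of language; the helper name map_and_patch and the temp file contents are made up for illustration). It maps a file into the process's address space and edits it with plain slice assignment rather than read/write calls, which is about as close as a stock Unix gets to treating a file as a segment.

```python
import mmap
import os
import tempfile


def map_and_patch(path):
    """Map the whole file into memory, read and overwrite its first 5 bytes."""
    with open(path, "r+b") as f:
        # Length 0 means "map the entire file".
        with mmap.mmap(f.fileno(), 0) as m:
            first = bytes(m[:5])   # read through memory, no read() call
            m[0:5] = b"HELLO"      # write through memory, no write() call
            m.flush()              # push the dirty pages back to the file
    return first


# Hypothetical demo file, created only for this example.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"hello segment world\n")
os.close(fd)

before = map_and_patch(demo_path)
with open(demo_path, "rb") as f:
    after = f.read()
os.remove(demo_path)
```

The mapped object behaves like a bytearray backed by the file, so the program never copies the file into a buffer of its own; the kernel pages it in on demand.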
I suspect flat address spaces are pretty much the rule partly because it's the Unix way and partly because of the complete dog's breakfast the 80286 made of segmentation. Although I think at one stage on x86_32 hardware you could run the Linux kernel in one segment and user space in another. I assume the intersegment calls were slow, so it was a space vs. time trade-off.