Re: Very good points!
At least the viability of embedding a discrete memory die into the CPU package has been proven, e.g. with eDRAM. SRAM running at core speed is just too costly at that kind of capacity (in terms of die area, which translates directly into cost). I believe some Haswell and *Lake parts had as much as 128MB of eDRAM. While it's nowhere near as fast as core-speed SRAM (I'm seeing bandwidth numbers in the tens of GB/s), it's still a huge improvement over going out to external memory: it keeps the latency penalty for a miss in the on-die caches far more reasonable (perhaps half the DRAM latency), as can be seen from the graphs here:
https://www.anandtech.com/show/6993/intel-iris-pro-5200-graphics-review-core-i74950hq-tested/3
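To put a rough number on the latency argument, here's a back-of-the-envelope sketch of average access time with and without an eDRAM level between the on-die caches and DRAM. The function name, the latencies, and the hit rates are all made up for illustration (they are not taken from the AnandTech graphs); the only thing carried over from above is the assumption that an eDRAM hit costs about half a trip to DRAM.

```python
def average_access_ns(levels, dram_ns):
    """Average access latency for a simple inclusive hierarchy.

    levels  -- list of (hit_latency_ns, hit_rate), nearest level first;
               hit_rate is the fraction of accesses reaching that level
               that it satisfies.
    dram_ns -- latency of an access that misses every level.
    """
    total = 0.0
    reaching = 1.0  # fraction of all accesses that get this far
    for latency_ns, hit_rate in levels:
        total += reaching * hit_rate * latency_ns
        reaching *= 1.0 - hit_rate
    return total + reaching * dram_ns


# Hypothetical numbers, chosen only to illustrate the shape of the effect.
ON_DIE = (10.0, 0.90)  # on-die caches lumped together: 10 ns, 90% hit rate
EDRAM = (40.0, 0.70)   # assumed: half the DRAM latency, catches 70% of misses
DRAM_NS = 80.0

without_edram = average_access_ns([ON_DIE], DRAM_NS)
with_edram = average_access_ns([ON_DIE, EDRAM], DRAM_NS)
print(f"without eDRAM: {without_edram:.1f} ns, with eDRAM: {with_edram:.1f} ns")
```

The average doesn't move much when the on-die hit rate is already high, but the cost of each individual miss drops a lot whenever the eDRAM catches it, which is the effect I was getting at.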