Um, if there's little I/O, then you can compute Mandelbrot sets, but not do much useful work.
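A Mandelbrot computation really is the extreme compute-to-I/O case: thousands of multiply-adds per point, one small number out. A minimal sketch (function name and iteration cap are mine):

```python
def mandelbrot_iters(c: complex, max_iter: int = 100) -> int:
    """Escape-time iteration count for point c; max_iter means 'in the set'."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:      # escaped: c is not in the Mandelbrot set
            return _
    return max_iter

# Almost pure compute, almost no I/O: one integer out per point.
print(mandelbrot_iters(0j))       # 0 never escapes -> 100
print(mandelbrot_iters(2 + 2j))   # escapes on the first step -> 0
```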
You can't fix all that with cache - the bigger a cache gets, the slower its lookups become (address decode and tag matching take longer), which is exactly why there are multiple layers already, each one larger and slower than the last.
The glory of a general-purpose computer is basically if() - it can do something different based on the input. That's why there's branch prediction, translation lookaside buffers, and all that. And no, if we use a computer as a computer, and not as a glorified single-purpose machine (where all the programming is in hardware, like a fully pipelined FPGA), then that cache slowdown-with-size issue comes right back. As the original poster (Alan) said, it all comes down to bandwidth outside the CPU being the limit - which has been true for quite a long time now. And trying to overcome that by keeping more inside the CPU - predictions, multiple-choice pipelines that precompute results just in case - IS THE VERY REASON for the speculative-execution security issues (Spectre, Meltdown, and friends). None of it would ever have been bothered with if the latency to main RAM and the other off-chip speed parameters weren't the limit.
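To make the if() point concrete, here's a toy sketch (all names mine): a fixed-function pipeline always does the same work, but a general-purpose loop branches on its input, so the hardware can't know the path ahead of time - it can only predict it:

```python
# The same loop does completely different work depending on the input
# stream; this data-dependent branching is what branch predictors guess at.
def run(program: list[tuple[str, int]]) -> int:
    acc = 0
    for op, arg in program:
        if op == "add":        # each iteration branches on data...
            acc += arg
        elif op == "mul":      # ...loaded at run time, not fixed in hardware
            acc *= arg
        elif op == "jz_set":   # even control flow depends on computed values
            if acc == 0:
                acc = arg
    return acc

print(run([("add", 3), ("mul", 4)]))      # (0 + 3) * 4 -> 12
print(run([("mul", 4), ("jz_set", 7)]))   # acc stays 0, then set -> 7
```

A fully pipelined FPGA bakes one such program into silicon; a CPU has to handle any program list thrown at it, which is where prediction and speculation come in.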
Duh, as Drs mentioned, words do mean something.
What's the use of a computer that produces no output?
Go watch, or better yet, edit a video...it's nearly ALL I/O other than encoding/decoding, which in all modern CPUs and GPUs has its own bit of fixed-function logic - and which, in Intel's and NVidia's case, has embarrassingly not actually gotten faster in a long time. A GTX 1050 is about as fast at that job as the newest shiny that costs many times as much...