"...the cloud provider charges for the shaded area only,"
This isn't strictly true, though, is it? Amazon do not charge you based on the compute cycles that you use. They charge you based on the number of instances you ask for and how long you keep them running.
For massively scaled applications that are integrated with EC2 tightly enough to provision and release instances in real time as demand changes, one might be able to realise something a bit closer to 'pay for what you use'. However, to get the best pricing you must predict and reserve the number of instances you require, which detracts from this dynamism.
There are undoubtedly widely varying types of workload living in Amazon's cloud. For some, it will be possible and appropriate to scale on demand. For others, it will be necessary to reserve a baseline of instances. There will always be a trade-off between the better price on a reserved instance and the risk of under-utilising it.
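To put rough numbers on that trade-off, here is a minimal sketch in Python. It assumes a simplified model in which a reserved instance is paid for every hour of the year whether it is used or not (with any upfront fee already folded into its effective hourly rate), while an on-demand instance is paid for only while it runs. The function names and the prices are hypothetical, not Amazon's actual rates.

```python
HOURS_PER_YEAR = 24 * 365


def break_even_utilisation(on_demand_hourly, reserved_hourly):
    """Fraction of the year an instance must run before reserving is cheaper.

    Assumes the reserved rate is an effective hourly cost paid for every hour,
    used or not (a simplification of real reserved-instance pricing).
    """
    return reserved_hourly / on_demand_hourly


def cheaper_option(on_demand_hourly, reserved_hourly, utilisation):
    """Return which option costs less over a year at the given utilisation (0..1)."""
    on_demand_cost = on_demand_hourly * HOURS_PER_YEAR * utilisation
    reserved_cost = reserved_hourly * HOURS_PER_YEAR  # paid regardless of use
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


if __name__ == "__main__":
    # Hypothetical prices in dollars per instance-hour.
    print(break_even_utilisation(0.10, 0.06))
    print(cheaper_option(0.10, 0.06, utilisation=0.3))
    print(cheaper_option(0.10, 0.06, utilisation=0.9))
```

With these made-up prices the break-even point is about 60% utilisation: an instance that runs less than that is cheaper on demand, and one that runs more is cheaper reserved.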
For many workloads that serve interactive business users, though, particularly at the small-to-mid scale, a fairly fixed number of instances will be needed to provide a particular service. It might be possible to allow for known peaks in demand at certain times, e.g. the log-in rush, but there are workloads that really don't lend themselves to dynamic scaling. It's important to point out that 99.9% of businesses in the UK are SMEs, and their adoption of cloud will be the benchmark by which we measure how mainstream the technology is becoming.
For a cloud offering to be truly "elastic", the user would pay only spot prices for the cycles and memory that they use at any given instant. These prices would probably vary over time as global demand changes. Of course, there are a number of challenges in doing such a thing, which is why none of the major players has managed to offer it as yet.
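The gap between the two billing models can be illustrated with a small Python sketch. It compares billing by whole instance-hours, even with perfect hour-by-hour autoscaling, against an idealised metered bill where each hour's actual usage is charged at that hour's unit price. The load figures, capacity, and prices are all made up for illustration, and the metered model is the hypothetical one described above, not something any provider is known to offer.

```python
import math


def instance_hour_bill(hourly_load, capacity_per_instance, price_per_instance_hour):
    """Bill when each hour is charged in whole instances.

    Assumes perfect autoscaling: each hour runs just enough instances
    to cover that hour's load, and every instance is billed for the full hour.
    """
    return sum(
        math.ceil(load / capacity_per_instance) * price_per_instance_hour
        for load in hourly_load
    )


def metered_bill(hourly_load, hourly_unit_price):
    """Idealised bill: actual usage each hour times that hour's unit price."""
    return sum(load * price for load, price in zip(hourly_load, hourly_unit_price))


if __name__ == "__main__":
    load = [0.5, 1.5, 3.0]          # hypothetical compute units used per hour
    unit_prices = [0.5, 0.5, 0.5]   # hypothetical price per unit-hour
    print(instance_hour_bill(load, capacity_per_instance=2, price_per_instance_hour=1.0))
    print(metered_bill(load, unit_prices))
```

With these figures the instance-hour bill comes to 4.0 and the metered bill to 2.5, even though the unit price is the same in both cases. The difference is the unused capacity on partly loaded instances.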