Good dedupe can solve the IOPS problem...
of all your VDs booting at the same time, as well as cutting down on the space you need for them. You just need a very good CPU doing the deduplication. Store user files and similar data somewhere else, so that the bulk of each system disk is common across all VDs, even if they have different patches and applications installed.
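To make the idea concrete, here is a rough sketch of block-level dedupe in Python. It is only an illustration, not how any particular product does it: the 4 KB block size, the SHA-256 fingerprint and the names `DedupeStore` and `build_demo_store` are all assumptions made up for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, purely illustrative


class DedupeStore:
    """Toy content-addressed store: identical blocks are kept only once."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> block bytes (stored once)
        self.images = {}  # image name -> ordered list of fingerprints

    def write_image(self, name, data):
        fingerprints = []
        for off in range(0, len(data), BLOCK_SIZE):
            block = data[off:off + BLOCK_SIZE]
            fp = hashlib.sha256(block).digest()
            # Only store the block if this content has not been seen before.
            self.blocks.setdefault(fp, block)
            fingerprints.append(fp)
        self.images[name] = fingerprints

    def read_image(self, name):
        return b"".join(self.blocks[fp] for fp in self.images[name])

    def logical_bytes(self):
        # What the VDs think they occupy.
        return sum(len(self.blocks[fp])
                   for fps in self.images.values() for fp in fps)

    def physical_bytes(self):
        # What actually has to be stored (and cached).
        return sum(len(b) for b in self.blocks.values())


def build_demo_store(num_vds=10, common_blocks=100, unique_blocks=2):
    """Hypothetical VDs: a shared base image plus a few per-VD blocks."""
    base = b"".join(i.to_bytes(4, "big") * (BLOCK_SIZE // 4)
                    for i in range(common_blocks))
    store = DedupeStore()
    for n in range(num_vds):
        extra = b"".join(f"vd{n}-patch{k}".encode().ljust(BLOCK_SIZE, b"\0")
                         for k in range(unique_blocks))
        store.write_image(f"vd{n}", base + extra)
    return store


store = build_demo_store()
print(store.logical_bytes(), store.physical_bytes())
```

With these made-up numbers, ten VDs of 102 blocks each need only 120 blocks of real storage, which is why the hot common data can fit in a RAM cache.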
Here's one way to do it...
1) You need a fast network for iSCSI between hosts and storage. Let's say 10GbE to each host and multiple 40GbE links to the storage server
2) Storage server needs plenty of PCIe bandwidth, fast CPUs, and lots of fast RAM
3) Storage server runs an iSCSI target that supports inline deduplication - e.g. Starwind. This uses system RAM as a cache, so your most important 40GB (say) of common data is in RAM
4) Your disks are a RAID of SLC SSDs, or maybe use them as cache for hard drives (e.g. LSI CacheCade)
In my small tests my network (just 1 x 10GbE) ran out of road before I could stress the CPU. I was booting up to 10 Windows VMs in the same time it took to boot just one off a non-deduped target - about 20 secs. Starwind's dedupe is still an experimental feature but should RTM around Q1.
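As a back-of-envelope check on why the network is the first thing to give out, here is the arithmetic for a single 10GbE link over a 20 second boot window. This is only a ceiling: it assumes the link is fully saturated, uses decimal gigabytes, and ignores iSCSI/TCP overhead. The helper name `wire_ceiling_gb` is made up for the example.

```python
def wire_ceiling_gb(link_gbit_per_s, seconds, efficiency=1.0):
    """Upper bound on data moved over the link, in decimal GB.

    efficiency is a hypothetical fudge factor for protocol overhead;
    1.0 means a perfectly saturated link.
    """
    return link_gbit_per_s / 8 * seconds * efficiency


total_gb = wire_ceiling_gb(10, 20)  # one 10GbE link, 20 second boot
per_vm_gb = total_gb / 10           # shared across 10 VMs booting together

print(total_gb, per_vm_gb)  # 25.0 2.5
```

So each of the 10 VMs can pull at most about 2.5 GB during the boot, no matter how fast the storage server behind the link is.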