Amdahl's law
Back when I was a puppy, and Kevlin was just a kitten, we used to attempt to shave time off tasks using Occam's razor.
Then Gene Amdahl explained that the speed gain from parallelisation is capped by whatever part of the task still has to run serially, and the overhead of breaking the task into little bits and reassembling the answers eats into that gain as well.
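For anyone who wants the arithmetic, here is a minimal sketch of the formula usually quoted as Amdahl's law. The function and parameter names are my own invention for illustration, and the model assumes the parallel part divides perfectly among the workers with no overhead at all, which real code never quite manages.

```cpp
#include <cassert>

// Amdahl's law: speedup = 1 / ((1 - p) + p / n), where
//   p = fraction of the run time that can be parallelised (0..1)
//   n = number of workers sharing the parallel part
// Names are hypothetical; this is the idealised formula with zero overhead.
double amdahl_speedup(double parallel_fraction, double workers) {
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(workers >= 1.0);
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers);
}

// Upper bound on the speedup as the number of workers grows without limit.
// Only defined while some serial work remains (parallel_fraction < 1).
double amdahl_limit(double parallel_fraction) {
    assert(parallel_fraction >= 0.0 && parallel_fraction < 1.0);
    return 1.0 / (1.0 - parallel_fraction);
}
```

So if 90% of a task can be parallelised, `amdahl_limit(0.9)` says no number of workers will make it more than about ten times faster.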
I suspect this is the reason behind the apparently rather hefty recommended grainsize in a parallel_for loop.
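To make that suspicion concrete, here is a toy cost model, not anything taken from a real scheduler. It assumes every chunk pays a fixed overhead, that workers process chunks in lock-step rounds, and that every chunk is a full grainsize (so it slightly overestimates when the last chunk is short). All the names and numbers are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Toy model (an assumption, not how any real library schedules work):
// the loop is cut into chunks of `grainsize` items, each chunk costs
// grainsize * work_per_item + overhead_per_chunk, and `workers` chunks
// run at once in rounds.
double modelled_time(std::size_t items, std::size_t grainsize,
                     std::size_t workers, double work_per_item,
                     double overhead_per_chunk) {
    assert(grainsize > 0 && workers > 0);
    // Round up: a partial chunk still counts as a chunk.
    const std::size_t chunks = (items + grainsize - 1) / grainsize;
    // Round up: a partly filled round still takes a full round.
    const std::size_t rounds = (chunks + workers - 1) / workers;
    const double time_per_chunk =
        static_cast<double>(grainsize) * work_per_item + overhead_per_chunk;
    return static_cast<double>(rounds) * time_per_chunk;
}
```

With 1000 items of unit cost, 4 workers and a per-chunk overhead of 100, a grainsize of 1 gives a modelled time of 25250, far worse than the 1000 it would take serially, while a grainsize of 250 gives 350. Under these assumptions, at least, the grain has to be hefty enough to dwarf the overhead before parallelism pays.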