@Kenny Swan
Kenny,
There is an interesting article here: http://en.wikipedia.org/wiki/Transputer
To paraphrase, it describes an early attempt (in the 1980s) at parallel computers and gives a good comparison with what we have now.
I've no idea what your developer skills are, so please forgive me if I sound patronising; it's not intended :)
Take a problem like applying a filter to an in-memory bitmap.
For simplicity's sake, let's say that the filter is really crap: all it does is read the previous pixel along the X axis and the current pixel, add the two values together and write the result back to the current pixel. If the current pixel is at X = 0, it does nothing.
You could write something like:
(I'm not very good at pseudo code, sorry)
For Y = 0 to Bitmap.Y.Length - 1
For X = 0 to Bitmap.X.Length - 1
.
do filter calculation on Bitmap[X,Y] and write new value to Bitmap[X,Y]
.
next X
next Y
(end pseudo code)
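For anyone who prefers real code, here is a rough Python sketch of the same loop. It assumes the bitmap is just a list of rows of integers and that the filter reads the already-updated pixel to its left, since it writes back in place. The name apply_filter is made up for illustration.

```python
def apply_filter(bitmap):
    """Apply the toy filter in place; bitmap is a list of rows of integers."""
    for row in bitmap:                # one row per Y value
        for x in range(1, len(row)):  # x = 0 is skipped: nothing to its left
            row[x] = row[x - 1] + row[x]
    return bitmap


print(apply_filter([[1, 2, 3], [10, 20, 30]]))
# prints [[1, 3, 6], [10, 30, 60]]
```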
The CPU core cannot understand that your code is stepping through a two-dimensional array, reading two values and writing out a new one.
From its point of view, it is reading two memory locations, performing a sum and writing the result back to one of those locations. It does not even know how many times this is going to happen (I think).
Now let's say that you're not happy with the performance of this filter and it looks like a great task to multi-thread.
So you ask the OS how many cores you have and split up the bitmap into chunks to be processed by these cores. Let's say that you divide the bitmap into quarters by bisecting it along both the X and the Y coordinates (wrong).
You then write a routine that can be used by multiple threads:
Pseudo code again.
MyMethod(bitmap, startX, endX, startY, endY)
/* My method is passed a reference to the bitmap and told the pixels to start processing and where to stop */
For Y = startY to endY
For X = startX to endX
.
do filter calculation on Bitmap[X,Y] and write new value to Bitmap[X,Y]
.
next X
next Y
/* Spawn four threads, each running the method with different coordinates */
(end pseudo code)
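Again as a rough Python sketch rather than anything definitive: the routine below mirrors MyMethod, and filter_in_quarters (a name I made up) spawns the four threads. I have assumed a non-empty list-of-rows bitmap and half-open bounds (start included, end excluded), which is the usual Python habit. Bear in mind that in standard CPython the global interpreter lock stops pure Python loops like this from actually running in parallel, so this only shows the shape of the code.

```python
import threading


def my_method(bitmap, start_x, end_x, start_y, end_y):
    """Filter one region in place. Bounds are half-open: [start, end)."""
    for y in range(start_y, end_y):
        for x in range(max(start_x, 1), end_x):  # x = 0 has no pixel to its left
            bitmap[y][x] = bitmap[y][x - 1] + bitmap[y][x]


def filter_in_quarters(bitmap):
    """Spawn four threads, one per quarter of the bitmap."""
    height, width = len(bitmap), len(bitmap[0])
    mid_x, mid_y = width // 2, height // 2
    quarters = [
        (0, mid_x, 0, mid_y),           # top left
        (mid_x, width, 0, mid_y),       # top right
        (0, mid_x, mid_y, height),      # bottom left
        (mid_x, width, mid_y, height),  # bottom right
    ]
    threads = [threading.Thread(target=my_method, args=(bitmap, *q)) for q in quarters]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for all four quarters to finish
    return bitmap
```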
Again, the process is the same as before, but this time all the cores are performing the same processing on the bitmap. Great.
But there is a deliberate problem with the method I described. As well as sharing out the rows, I cut every row in half, so each row is handled partly by a left-hand thread and partly by a right-hand thread. This means that the threads dealing with the right-hand side of the bitmap will give an incorrect value for their leftmost pixels unless the threads on the left-hand side have already finished processing their rightmost pixels.
Had I created a method that split the bitmap into bands of complete rows (dividing the Y range only), my algorithm would have worked, because no pixel would depend on a pixel owned by another thread.
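To check this, here is a rough Python sketch, under the same assumptions as before (a non-empty list-of-rows bitmap, half-open bounds, made-up function names). It gives each worker a band of complete rows and compares the result with a plain single-threaded pass. For the left/right split, it fakes an unlucky schedule by running the right half before the left half, so the failure shows up every time instead of only now and then.

```python
from concurrent.futures import ThreadPoolExecutor


def my_method(bitmap, start_x, end_x, start_y, end_y):
    """Filter one region in place. Bounds are half-open: [start, end)."""
    for y in range(start_y, end_y):
        for x in range(max(start_x, 1), end_x):
            bitmap[y][x] = bitmap[y][x - 1] + bitmap[y][x]


def filter_by_row_bands(bitmap, workers=4):
    """Give each worker a band of complete rows, so no row is shared."""
    height, width = len(bitmap), len(bitmap[0])
    band = -(-height // workers)  # ceiling division
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(my_method, bitmap, 0, width, top, min(top + band, height))
            for top in range(0, height, band)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return bitmap


def make_bitmap():
    return [[1] * 8 for _ in range(8)]


serial = make_bitmap()
my_method(serial, 0, 8, 0, 8)  # single-threaded reference result

banded = filter_by_row_bands(make_bitmap())

# Simulated unlucky schedule for a left/right split: right half runs first.
split = make_bitmap()
my_method(split, 4, 8, 0, 8)
my_method(split, 0, 4, 0, 8)

print(banded == serial)  # True
print(split == serial)   # False
```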
Hopefully, this rather long illustration shows the problems encountered with multi-threading.
For a core to know that I relied on the value in a pixel that, at some point, was going to be processed by another core, and for it to wait until that core had processed the pixel, would be impossible.
It takes the programmer to know how an algorithm can be multi-threaded, if at all.
A compiler may be able to work out this simple example, but I doubt that it could cope with more complex problems.
Still, get your head around multi-threading and your salary should increase!
Disclaimer: this is the best simple problem I can come up with. If someone wants to post a better example or link, please feel free to do so. I won't be offended. I don't mind being corrected either!
Regards,
Pad