Divide and conquer - a less absurd multithreading
A followup to End of the threads: My extreme version of Multithreading Mandelbrot turned out to be less than perfect, after all. I experienced frequent deadlocks on other subsets of the Mandelbrot set, forcing me to forcibly reboot the computer. Very bad. I'm not exactly sure what caused those deadlocks but introducing a far better method of taking care of the data generated by the computing threads solved it.
Instead of letting each thread keep a file handle to the Targa file and write its own results directly to the file, I introduced a result queue to which the threads add their resulting point data, and from which the main thread consumes and writes out the points to disk. Using a simple mutex locking mechanism for this queue made it safe and sound, too.
After this little change I decided to implement a proper D&C algorithm instead of the totally dumb equal subdivision technique I used before. Thus, I now push the entire area on a stack before entering the main loop, and then inside the loop, I pop one element and handle it according to simple rules:
- If the area is larger than, say, 300x300 pixels, I divide it into four parts and push them back.
- If it's 300x300 or less and I haven't already done so, I start a check thread to see if it's fully inside the Mandelbrot set.
- If it has already been checked and found to be not fully inside the Set and is larger than, say, 10x10pixels, I divide it into four pieces.
- Otherwise, I start a proper worker thread that calculates each pixel in the area.
The effect is that the program never starts a thread to do exhaustive calculations of more than 10x10 pixels at a time. Also, if a certain, larger, area is not found to be inside the Set, it will be subdivided and re-tested, and one or more of those parts may actually turn out to be inside the Set, which means we don't really miss many larger bodies of black pixels.
This technique also meant that I could easily keep the number of threads down, to match the number of actual CPU cores. No point in running 1600 CPU-intensive threads on eight cores, if you can run eight at a time instead.
This is merely a new version of my first truly multithreaded program, but I feel pretty confident that I've hit on the perfect design. All I really need to do is tweak parameters.
Oh, and the result? The same huge image as before was now calculated in about 7h22m real time, using a total of 49h42m CPU time, which means it would be feasible to run it even on a single core CPU - thanks to the Divide & Conquer algorithm.
And now, on to colour!