Perpendiculous Programming, Personal Finance, and Personal musings

2008.08.06

I WANT MY F*ing Semaphores!

Filed under: Uncategorized — cwright @ 4:34 pm

Modern applications can often realize a substantial performance boost by dividing the work they have to do into logical groups that can be performed in parallel. Humans call this “Multi-tasking”, computer scientists call this “multithreading”. Not all workloads easily lend themselves to such divisions without a bit of effort, and even ones that endure the transformation well still require a way to re-group and coordinate once their work is done. In Computer Science lingo, these are called “synchronization points”.

Several years ago, before multi-core CPUs became popular, I spent a fair amount of time figuring out how this all worked (strangely prescient, huh? :), and wrote about my adventures here: http://softpixel.com/~cwright/programming/threads/

In general, synchronization points come in 2 flavors: mutexes (MUtual EXclusion locks) and semaphores (signals/flags). Mutexes are wonderful for keeping two competing threads from romping on the same data at the same time, while semaphores are marvelously well-suited for flagging asynchronous operations — when threads take vastly different lengths of time to complete, you can use semaphores to indicate each thread’s “doneness”, wait on them all, and poof, you’ve got everything nice and sane upon completion. When I was first beginning, mutexes were my favorite synchronization object, but as I grew, matured, and worked on non-trivial projects, I came to love semaphores for all their elegance, simplicity, and versatility. And, if you want to get _Really_ nerdy, you can even call mutexes a simplified semaphore, where the count is limited to 1. But I digress.

So today, I was tackling a problem where there 2 threads that needed to work somewhat in tandem: one thread produced data (either very quickly, or very slowly, with lots of variance) in discreet sized pieces, and one thread consumed that data once each piece was complete. The magic was that the producer could start working on the next piece as soon as it handed off the first piece to the consumer. It would then wait for the consumer to say “Give me more pieces!”, and it would hand off the already-done (hopefully) piece, and start working again immediately. The end goal, of course, was to have either the producer or the consumer running non-stop as the performance bottleneck, whereas the current model has them tied together serially (meaning no work is done in parallel). Splitting it adds complexity, of course, but offers the potential to cut the run time in half (realistically, it would yield a 10%-25% decrease, depending on the workload).

Then I got to the part where I needed to signal what’s going on between threads. Mutexes weren’t suited for the job at all, so I naturally chose to use semaphores. Then I discovered some magical OS X goodness that never seemed to catch my attention before:

First: unnamed posix semaphores are NOT implemented. At all. Meaning you have to resort to named ones, which means leaving litter on the harddrive and crap, not to mention filename management in addition to semaphore management. And don’t even get me started on the goofiness of this with multiple jobs going on simultaneously, and the ability for a job to be aborted from user input or from an error while running.

Second: The default policy for the unimplemented semaphores is ALWAYS UNLOCKED (meaning everything just takes off running, often to a cataclysmic segmentation fault with incomprehensible fallout).

I hear that this is because posix thread stuff is based on Mach thread stuff. that’s ok. But how difficult is it to add unnamed semaphore support? And why on Earth did someone choose to make them unlocked, rather than locked by default? This probably has to do with how they’re implemented (just an int in memory space), and how there’s a 4,294,967,295/4,294,967,296 chance that it’ll appear unlocked (or, a 2.32×10^-10% chance that it’s locked), but still, how hard would it be to make the “not implemented” stub zero the value so that everything stops, and the programmer investigates the locks rather than memory management (which is usually where one looks when stuff explodes strangely, like in the case of locks not doing anything…).

Even more annoying, is Cocoa’s apparent lack of semaphore-like objects (NSLock is a mutex, NSConditionLock is a joke), and Apple’s decree at WWDC to pervasively multithread applications from here on out to give them a performance edge. Kinda difficult to do when the tools that make this work easy are non-extant, and non-functional…. 🙁

*sigh*

2 Comments »

  1. MPCreateSemaphore is your friend.

    Comment by Valda — 2010.03.03 @ 11:10 am

  2. It seems that MPCreateSemaphore is not an option going forward. It has been deprecated from 10.7 onwards (See: https://developer.apple.com/library/mac/#documentation/Carbon/Reference/Multiprocessing_Services/Reference/reference.html)

    Comment by Brian Matthews — 2012.04.11 @ 5:53 am

RSS feed for comments on this post. TrackBack URL

Leave a comment

Powered by WordPress