threading/malloc/fork interaction -> hang

Stephen Warren swarren at wwwdotorg.org
Wed Oct 26 18:34:25 UTC 2005


I'm seeing a hang in our application that appears to be caused by a
uClibc problem. Currently, I'm using 0.9.27, but I don't see any obvious
changes in SVN that would fix this.

I believe the problem is as follows:

Some or all of our threads periodically malloc/free data. The
implementation of malloc/free locks the heap to protect against
data corruption by other threads.

Another of our threads occasionally calls system(), which maps down to
calling fork()/exec() internal to uClibc.

In uClibc's implementation of fork() (or rather, the
libpthread/linuxthreads wrapper around it, in ptfork.c), the child
process immediately attempts to reset some pthread state
(__pthread_reset_main_thread), which ends up calling free() on some
manager data.

Now, if one thread is executing malloc() whilst another thread fork()s,
then the heap lock will be held in the parent process when the fork()
occurs.

Since the child process consists only of a copy of the thread that
called fork() in the parent process, no thread in the child will unlock
the heap.

Since the VM sharing is copy-on-write, when the parent process's thread
completes the malloc, and unlocks the heap, this lock state change won't
be seen in the child process.

Consequently, the heap lock is held in the child process and will never
be released. When the child process executes
free(__pthread_manager_thread_bos), the child hangs. The parent
process's thread then hangs too, since the wait() for the child (inside
system()) never completes because the child never exits.
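
The mechanism can be shown in isolation with a small sketch. This is not uClibc code: heap_lock is a hypothetical stand-in for the allocator's internal lock, and worker() plays the thread that is in the middle of malloc() when the fork() happens. The child only probes the lock with pthread_mutex_trylock() so that the demonstration itself cannot hang.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the allocator's internal heap lock. */
static pthread_mutex_t heap_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t lock_taken;   /* posted once the worker holds heap_lock */
static sem_t may_release;  /* posted when the worker may unlock */

/* Plays the thread that is inside malloc() at the moment of the fork(). */
static void *worker(void *arg)
{
    (void)arg;
    int rc = pthread_mutex_lock(&heap_lock);
    assert(rc == 0);
    (void)rc;
    sem_post(&lock_taken);
    sem_wait(&may_release);
    pthread_mutex_unlock(&heap_lock);
    return NULL;
}

/* Returns 1 if the child found heap_lock held, 0 if free, -1 on error. */
int child_inherits_held_lock(void)
{
    pthread_t tid;
    int status = 0;

    if (sem_init(&lock_taken, 0, 0) != 0 || sem_init(&may_release, 0, 0) != 0)
        return -1;
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    sem_wait(&lock_taken);      /* the worker now owns heap_lock */

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: only the forking thread exists here, so nothing in
         * this process will ever unlock the copied mutex. */
        int rc = pthread_mutex_trylock(&heap_lock);
        _exit(rc == EBUSY ? 1 : 0);
    }

    /* The parent's unlock happens in the parent's copy of memory only. */
    sem_post(&may_release);
    pthread_join(tid, NULL);
    if (pid < 0 || waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling child_inherits_held_lock() from main() should report that the child sees the lock as held, even though the parent's worker releases it right after the fork(). A blocking pthread_mutex_lock() in the child would wait forever at that point, which is the situation free() ends up in.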

Looking at glibc 2.3.5, there is a somewhat complex set of atfork
handlers registered to work around this. Porting this to uClibc looks
non-trivial!

Should uClibc's __fork (in ptfork.c) just re-initialize the heap's lock
mutex to force it to be unlocked? Registering a child atfork handler
won't work, since these are run after the __pthread_reset_main_thread
call, which hangs.
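
As a rough sketch of this first option, assuming the reset is done in the child immediately after fork() returns: fork_resetting_heap_lock() and heap_lock are hypothetical names used for illustration, not the actual uClibc symbols, and the holder thread only exists to put the lock in the held state before forking.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the allocator's internal heap lock. */
static pthread_mutex_t heap_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t lock_taken, may_release;

/* Hypothetical fork wrapper: the child resets the lock before doing
 * anything else that might call malloc()/free(). */
pid_t fork_resetting_heap_lock(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        /* The thread that held the lock does not exist in the child,
         * so discard the inherited state rather than wait for it. */
        pthread_mutex_init(&heap_lock, NULL);
    }
    return pid;
}

/* Holds heap_lock across the fork(), like a thread inside malloc(). */
static void *holder(void *arg)
{
    (void)arg;
    int rc = pthread_mutex_lock(&heap_lock);
    assert(rc == 0);
    (void)rc;
    sem_post(&lock_taken);
    sem_wait(&may_release);
    pthread_mutex_unlock(&heap_lock);
    return NULL;
}

/* Returns 1 if the child could take heap_lock after the reset,
 * 0 if it could not, -1 on error. */
int child_can_lock_after_reset(void)
{
    pthread_t tid;
    int status = 0;

    if (sem_init(&lock_taken, 0, 0) != 0 || sem_init(&may_release, 0, 0) != 0)
        return -1;
    if (pthread_create(&tid, NULL, holder, NULL) != 0)
        return -1;
    sem_wait(&lock_taken);      /* lock is held when we fork */

    pid_t pid = fork_resetting_heap_lock();
    if (pid == 0)
        _exit(pthread_mutex_trylock(&heap_lock) == 0 ? 1 : 0);

    sem_post(&may_release);
    pthread_join(tid, NULL);
    if (pid < 0 || waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

One caveat with this approach: resetting the mutex only clears the lock state. If the other thread was halfway through updating the heap's data structures when the fork() happened, the child still inherits them in that half-updated state.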

Or, should fork() and malloc()/free() both be made to acquire the same lock?
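
A minimal sketch of this second option, under the assumption that the fork path can take the allocator's lock directly: heap_lock, lock_heap(), unlock_heap() and fork_holding_heap_lock() are hypothetical names, and allocating_thread() merely imitates threads that keep entering and leaving malloc()/free().

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the allocator's internal heap lock. */
static pthread_mutex_t heap_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int stop;

static void lock_heap(void)
{
    int rc = pthread_mutex_lock(&heap_lock);
    assert(rc == 0);
    (void)rc;
}

static void unlock_heap(void)
{
    int rc = pthread_mutex_unlock(&heap_lock);
    assert(rc == 0);
    (void)rc;
}

/* Hypothetical wrapper: fork() only while holding the heap lock. */
pid_t fork_holding_heap_lock(void)
{
    lock_heap();            /* waits for any in-progress malloc()/free() */
    pid_t pid = fork();     /* no other thread is inside the heap here */
    unlock_heap();          /* runs in the parent and in the child */
    return pid;
}

/* Imitates threads that repeatedly lock the heap in malloc()/free(). */
static void *allocating_thread(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop)) {
        lock_heap();
        unlock_heap();
        sched_yield();
    }
    return NULL;
}

/* Forks `rounds` times while another thread hammers the lock. Returns
 * how many children could take heap_lock, or -1 on error. */
int count_children_able_to_lock(int rounds)
{
    pthread_t tid;
    int ok = 0;

    atomic_store(&stop, 0);
    if (pthread_create(&tid, NULL, allocating_thread, NULL) != 0)
        return -1;
    for (int i = 0; i < rounds; i++) {
        int status = 0;
        pid_t pid = fork_holding_heap_lock();
        if (pid == 0)
            _exit(pthread_mutex_trylock(&heap_lock) == 0 ? 0 : 1);
        if (pid < 0 || waitpid(pid, &status, 0) != pid) {
            ok = -1;
            break;
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    atomic_store(&stop, 1);
    pthread_join(tid, NULL);
    return ok;
}
```

Because the forking thread owns the lock at the moment of the copy, the child starts with a lock that its only thread can release, and the heap is not mid-update. The cost is that every fork() has to wait for whichever thread is currently inside the allocator.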