bugs in malloc

Tue Nov 24 03:42:05 UTC 2009

On Monday 23 November 2009 14:55:26 Freeman Wang wrote:
> 1. We found with some special application, the application would get
> stuck at line 162 of malloc.c and the reason was mem->next points back
> to itself.

please try to reduce the allocation patterns of your 'special' application.  
it should be easy to enable debugging and capture the malloc/free sequences 
and run them again manually.
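
one way to capture such a sequence, as a rough sketch only: wrap the allocator calls and log each one so the pattern can be replayed later.  the names traced_malloc, traced_free and trace_out are hypothetical and not part of uClibc; its own debugging hooks may be the better choice where available.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical trace sink; must be set before the wrappers are used. */
static FILE *trace_out;

/* Allocate through malloc() and record the requested size and the result. */
static void *traced_malloc(size_t size)
{
    void *p;

    assert(trace_out != NULL);
    p = malloc(size);
    fprintf(trace_out, "malloc %zu = %p\n", size, p);
    return p;
}

/* Record the pointer first, then release it with free(). */
static void traced_free(void *p)
{
    assert(trace_out != NULL);
    fprintf(trace_out, "free %p\n", p);
    free(p);
}
```

the resulting log gives one line per call, in order, which is enough to write a small standalone program that repeats the same sizes and the same free order.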

> It turns out, we believe, to be because new_mmb is allocated after the
> mmb list is iterated through to find the insertion point. When the
> mmb_heap also runs out and needs to be extended when the regular heap is
> just extended, the mmb list could be messed up. We moved the new_mmb
> allocation up and the problem seems to have been fixed.

i don't see why the current code is a problem.  it's a singly linked list, which 
means that if the list is walked to the end, new_mmb will be 'inserted' as the 
last item in the linked list.  prev_mmb points to the last valid entry in the 
list and mmb is null, so the last valid entry will be updated to point to 
new_mmb, and new_mmb will have its next member set to null.  i don't see any 
place where the mmb list 'could be messed up'.
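
the walk-and-insert pattern described here can be sketched as follows.  this is a simplified illustration, not the actual malloc.c code: struct mmb_node and mmb_insert are hypothetical names, and ordering the list by the mem address is an assumption.  only prev_mmb, mmb, new_mmb, mem and next come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type for one mmapped block. */
struct mmb_node {
    void *mem;
    size_t size;
    struct mmb_node *next;
};

/* Insert new_mmb into the list at *head, kept sorted by address (assumed). */
static void mmb_insert(struct mmb_node **head, struct mmb_node *new_mmb)
{
    struct mmb_node *prev_mmb = NULL;
    struct mmb_node *mmb = *head;

    assert(head != NULL && new_mmb != NULL);

    /* Walk to the insertion point.  If the walk reaches the end, mmb is
       NULL and prev_mmb is the last valid entry. */
    while (mmb != NULL && (uintptr_t)mmb->mem <= (uintptr_t)new_mmb->mem) {
        prev_mmb = mmb;
        mmb = mmb->next;
    }

    /* Link the new record in front of mmb (NULL when appending). */
    new_mmb->next = mmb;
    if (prev_mmb != NULL)
        prev_mmb->next = new_mmb;
    else
        *head = new_mmb;
}
```

in this sketch nothing touches the list between the walk and the link step, so prev_mmb and mmb stay valid; the reported problem would need something to modify the list in between.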

if you look a few lines up, the recursive memory-full issue should already be 
handled because an mmap for more memory is done, and that mmap is put into the 
heap by the heap free call.

> 2. While trying to fix the above issue, we read the code and found a
> multi-threading issue with this mmb list handling. This list is halfway
> protected in free.c and not protected by any lock at all in malloc.c. Is
> it intentional?

looks like the locking fixes we have in the blackfin tree weren't pushed 
upstream.  i'll have to rebase them first, but it should at least partially 
cover what you see.  if it doesn't, i'll stitch in your pieces.
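
for reference, the general shape of consistent locking on such a list might look like the sketch below.  it is not the blackfin tree's fix: mmb_list, mmb_list_lock, mmb_add and mmb_take are hypothetical names, and a plain pthread mutex is assumed in place of whatever lock macros the library uses internally.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical record type for one mmapped block. */
struct mmb_node {
    void *mem;
    struct mmb_node *next;
};

static struct mmb_node *mmb_list;
static pthread_mutex_t mmb_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* malloc side: add a record while holding the lock. */
static void mmb_add(struct mmb_node *new_mmb)
{
    assert(new_mmb != NULL);
    pthread_mutex_lock(&mmb_list_lock);
    new_mmb->next = mmb_list;
    mmb_list = new_mmb;
    pthread_mutex_unlock(&mmb_list_lock);
}

/* free side: find and unlink the record for mem under the same lock.
   Returns the unlinked record, or NULL if mem is not in the list. */
static struct mmb_node *mmb_take(void *mem)
{
    struct mmb_node *prev_mmb = NULL;
    struct mmb_node *mmb;

    pthread_mutex_lock(&mmb_list_lock);
    for (mmb = mmb_list; mmb != NULL; prev_mmb = mmb, mmb = mmb->next) {
        if (mmb->mem == mem) {
            if (prev_mmb != NULL)
                prev_mmb->next = mmb->next;
            else
                mmb_list = mmb->next;
            mmb->next = NULL;
            break;
        }
    }
    pthread_mutex_unlock(&mmb_list_lock);
    return mmb;
}
```

the point is only that both the malloc path and the free path take the same lock for the whole walk-and-update, so neither can see a half-linked list.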

> 3. In an embedded world without MMU, it is not guaranteed that the mmap
> syscall would always get back a valid block, and that's probably why the
> return value, block, is checked immediately after the syscall. But it
> seems we are not checking the return value of new_mmb which is allocated
> from the mmb_heap? Is it a potential issue?

you have no guarantee of mmap returning valid memory on an mmu system 
either.  typically an oom situation will have an application crash quickly, so 
this particular missing check isn't a big deal, but it should probably still be 
added.  i imagine in a threaded situation, one thread could grab the fresh 
memory before the original thread got a chance to use it, and the original 
thread would thus get null back.
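
the missing check could look roughly like this.  it is a sketch under assumptions, not a patch: map_block, record_alloc_fn and failing_alloc are hypothetical names, and the record allocator is passed in as a function pointer so the failure path can be exercised (the real code takes new_mmb from the mmb_heap).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Hypothetical record type for one mmapped block. */
struct mmb_node {
    void *mem;
    size_t size;
    struct mmb_node *next;
};

/* Stand-in for the allocator that hands out list records. */
typedef void *(*record_alloc_fn)(size_t);

/* Map a block and allocate its record; undo the mapping if the record
   cannot be allocated, so nothing leaks and NULL is never dereferenced. */
static struct mmb_node *map_block(size_t size, record_alloc_fn alloc_record)
{
    struct mmb_node *new_mmb;
    void *block;

    assert(alloc_record != NULL);

    block = mmap(NULL, size, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (block == MAP_FAILED)
        return NULL;

    new_mmb = alloc_record(sizeof *new_mmb);
    if (new_mmb == NULL) {
        munmap(block, size);
        return NULL;
    }

    new_mmb->mem = block;
    new_mmb->size = size;
    new_mmb->next = NULL;
    return new_mmb;
}

/* Allocator that always fails, used only to exercise the error path. */
static void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}
```

calling map_block(4096, malloc) returns a filled-in record when both steps succeed, while map_block(4096, failing_alloc) unmaps the block again and returns NULL.
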
-mike