reentrant functions

Sun Jun 8 15:35:22 UTC 2008

On Sunday 08 June 2008 16:21, Bernd Schmidt wrote:
> Denys Vlasenko wrote:
> > Ok, this is a scenario: the user runs a "passwd" utility on NOMMU box.
> > This utility has 20k of text and 80k of data. 70k of this data
> > is occupied by des.c static buffer.
> > 
> > At this moment the machine has only 90k RAM available, has no swap etc.
> > It cannot satisfy load request for this application.
> > So user gets some error message from the shell to this effect.
> > 
> > If des.c buffer is not static but is __uc_malloc'ed, program will
> > load successfully, and then __uc_malloc will fail and exit
> > with "no memory!" message.
> > 
> > In both cases user's experience is essentially the same - [s]he cannot
> > run the program because there is not enough memory.
> 
> Except the program that called crypt may have left some state - 
> temporary files, partial modifications in other files, whatever - 
> because it did not expect to fail at this point.

If this program does not want to be killed on OOM, it needs to install
a handler for this situation. It is not difficult at all.

> Essentially your argument comes down to "it doesn't really matter what 
> the documented interface is". 

No, my argument is twofold:

1. The minority of people who are careful enough to handle OOM conditions
   correctly will not be deterred by the need to have a tiny speck
   of uclibc-specific glue:

   #ifdef __UCLIBC__
   static void uc_malloc_failed() {
       ....OOM handling......
   }
   #endif

   int main() {
   #ifdef __UCLIBC__
       __uc_malloc_failed = uc_malloc_failed;
   #endif

   because they will have a much bigger infrastructure for handling OOM
   (emergency pools, swapping to disk (GIMP has this), whatever).

2. Most programs do not handle OOM situations correctly anyway:
   - take any sufficiently big program and you will easily find
     a malloc call or new operator in C++ which is not checked for NULL.
     For example, Mozilla:
     https://bugzilla.mozilla.org/show_bug.cgi?id=336152
     https://bugzilla.mozilla.org/show_bug.cgi?id=336154
     https://bugzilla.mozilla.org/show_bug.cgi?id=336155
     https://bugzilla.mozilla.org/show_bug.cgi?id=336107
     https://bugzilla.mozilla.org/show_bug.cgi?id=336157
     https://bugzilla.mozilla.org/show_bug.cgi?id=336151
     https://bugzilla.mozilla.org/show_bug.cgi?id=336150
     https://bugzilla.mozilla.org/show_bug.cgi?id=336145
     https://bugzilla.mozilla.org/show_bug.cgi?id=336143
     https://bugzilla.mozilla.org/show_bug.cgi?id=336141
     https://bugzilla.mozilla.org/show_bug.cgi?id=336140
     https://bugzilla.mozilla.org/show_bug.cgi?id=336122
     https://bugzilla.mozilla.org/show_bug.cgi?id=336119
     ...
     As you probably guessed by now, I was reading the Mozilla source and
     filing bugs. Rather quickly I realized that it's impractical to report
     all of the OOM bugs, let alone fix them faster than new ones are made.

   - some people are more honest about it and they implement
     and document this behavior, rather than just being lazy
     and allowing their programs to segfault when malloc fails.
     For example, gtk/Gnome libraries do this. They have allocators
     which just exit on allocation error. Well, busybox uses xmalloc
     extensively. etc...
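
For illustration, an exit-on-failure allocator of this kind fits in a
few lines of C. This is only a minimal sketch, not the actual busybox or
gtk code; the names xmalloc_sketch and xstrdup_sketch are made up for
the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical die-on-failure wrapper in the spirit of xmalloc.
 * It never returns NULL for a non-zero size: it exits instead. */
static void *xmalloc_sketch(size_t size)
{
    void *p = malloc(size);
    if (p == NULL && size != 0) {
        fputs("out of memory\n", stderr);
        exit(EXIT_FAILURE);
    }
    return p;
}

/* A caller deep in the call chain: no NULL check is needed here,
 * because the failure policy lives in one documented place. */
static char *xstrdup_sketch(const char *s)
{
    size_t len = strlen(s) + 1;          /* include the trailing NUL */
    char *copy = xmalloc_sketch(len);
    memcpy(copy, s, len);
    return copy;
}
```

The point of the design is that the behavior on OOM is explicit and
uniform, instead of depending on which unchecked malloc happens to
fail first.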

Which prompted me to think about a more realistic approach
to this problem. Yes, from the POV of the malloc (libc) author,
"let's return NULL on failure and let the user deal with it"
is good enough. But for the user, it's not good enough. In many cases,
the user calls malloc deep inside the call chain, and just
HAS NO SANE WAY TO REACT TO OOM at that place!

This prompted me to propose an alternative approach to Mozilla
people - see https://bugzilla.mozilla.org/show_bug.cgi?id=335951

Please read that bug description, it has quite a bit of writeup.

It's easy to get solid proof that the "return NULL" approach is flawed.
I started Mozilla under a memory limit of 60 megabytes, opened
a few pages and it crashed. I repeated this test many times
and Mozilla NEVER FAILS GRACEFULLY. It always crashes.

Try to do the same test with any non-trivial program.
Let's see how many real-world programs will survive it.
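
One possible way to set up such a test, assuming a POSIX system where
setrlimit(RLIMIT_AS) is honored (this is not necessarily how the Mozilla
test above was done, and limit_address_space is a made-up helper name):

```c
#include <sys/resource.h>

/* Hypothetical helper: lower the soft address-space limit of the
 * current process to at most `bytes`. The limit is inherited across
 * fork/exec, so the program under test can be exec'ed afterwards.
 * Returns 0 on success, -1 on error. */
static int limit_address_space(rlim_t bytes)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_AS, &lim) != 0)
        return -1;
    /* The soft limit may not exceed the hard limit. */
    if (lim.rlim_max != RLIM_INFINITY && bytes > lim.rlim_max)
        bytes = lim.rlim_max;
    lim.rlim_cur = bytes;
    return setrlimit(RLIMIT_AS, &lim);
}
```

A small launcher would call limit_address_space() with the desired cap
and then execvp() the program to be tested; once the cap is reached,
malloc in that program starts returning NULL.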

Again, my point is: if the program dies on OOM anyway because it
is badly written and does not expect NULL from malloc,
why are we avoiding malloc inside libc like the plague?

This is masochistic! malloc is not a bad thing to use!
It is actually a good thing, because it allocates memory
WHEN it is needed, and AS MUCH as needed for the particular case,
whereas static_buffer[WORST_CASE_SIZE] is always there and
frequently way bigger than needed.
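
To make the contrast concrete, here is a minimal sketch. The 70k figure
is taken from the passwd scenario above; work_buf and get_work_buf are
hypothetical names, not the actual des.c code, and plain malloc is used
here instead of __uc_malloc.

```c
#include <stdlib.h>

#define WORST_CASE_SIZE (70 * 1024)   /* assumed, from the passwd example */

/* The static variant would be:
 *     static char work_buf[WORST_CASE_SIZE];
 * It occupies 70k of data in every process containing this code,
 * whether or not the buffer is ever used. */

/* On-demand variant: costs one pointer until the first real use. */
static char *work_buf;

/* Allocate the buffer on first call and reuse it afterwards.
 * Returns NULL if the allocation fails. */
static char *get_work_buf(void)
{
    if (work_buf == NULL)
        work_buf = malloc(WORST_CASE_SIZE);
    return work_buf;
}
```

A program that never reaches the code path needing the buffer never
pays for it.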

But we don't want to be "bad guys", and our internal malloc use
should allow careful users to still handle OOM correctly.

Which it does! __uc_malloc() provides a callback pointer,
and a careful user can hook a handler to it.
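
The general shape of such a mechanism can be sketched as follows. This
is an illustration of the idea only, not the uClibc implementation: all
names (hook_malloc, hook_malloc_failed, hook_backend, flaky_alloc,
count_failure) are hypothetical, and the retry-after-handler behavior
is an assumption of the sketch.

```c
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* User-installable handler, called when an allocation fails.
 * It may free some memory and return, in which case we retry. */
static void (*hook_malloc_failed)(size_t size);

/* Underlying allocator; a pointer only so that a failure can be
 * simulated in a demo. */
static void *(*hook_backend)(size_t size) = malloc;

/* Internal allocator: never returns NULL for a non-zero size. */
static void *hook_malloc(size_t size)
{
    for (;;) {
        void *p = hook_backend(size);
        if (p != NULL || size == 0)
            return p;
        if (hook_malloc_failed == NULL)
            _exit(1);                 /* no handler installed: die */
        hook_malloc_failed(size);     /* handler returned: retry */
    }
}

/* Demo scaffolding: an allocator that fails `fail_count` times, and
 * a handler that only counts how often it was called. */
static int fail_count;
static int handler_calls;

static void *flaky_alloc(size_t size)
{
    if (fail_count > 0) {
        fail_count--;
        return NULL;
    }
    return malloc(size);
}

static void count_failure(size_t size)
{
    (void)size;
    handler_calls++;
}
```

A careful program would install a handler that releases an emergency
pool or cleans up its state before exiting; a careless one installs
nothing and gets the "exit on OOM" behavior.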
--
vda