Using environment variables without leaking memory?

Tue Oct 24 19:51:35 UTC 2006

On Tuesday 24 October 2006 12:54 pm, David Daney wrote:
> > Except for the part where execvp looks at $PATH in a library function
> > that's using getenv() behind the scenes, you mean?
> 
> No, I meant to call the system call execve(), which knows nothing about
> $PATH and as far as I can see is not influenced by the caller's
> environment in any way.

Yeah, but the one I was _using_ was execvp(), and since there isn't a
separate "find executable in $PATH" function, I now have to reimplement
that search myself.
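
For illustration, here is a minimal sketch of what such a reimplemented
search might look like.  The name find_in_path() and its signature are
hypothetical (nothing in libc provides this), and it takes the search path
as an argument so a shell can pass its own variable rather than rely on
getenv().  It does not attempt the finer points of execvp(), such as
continuing past EACCES or handing ENOEXEC files to a shell.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: return a malloc()ed path to the first regular,
 * executable file called `name` in the colon-separated list `path`, or
 * NULL if there is none.  A name containing '/' is not searched for. */
char *find_in_path(const char *name, const char *path)
{
    size_t nlen;
    struct stat st;

    assert(name && *name && path);
    if (strchr(name, '/')) return access(name, X_OK) ? NULL : strdup(name);

    nlen = strlen(name);
    for (;;) {
        size_t dlen = strcspn(path, ":");
        /* An empty component means the current directory. */
        const char *dir = dlen ? path : ".";
        size_t len = dlen ? dlen : 1;
        char *full = malloc(len + nlen + 2);   /* dir + '/' + name + NUL */

        if (!full) return NULL;
        sprintf(full, "%.*s/%s", (int)len, dir, name);
        /* Skip directories and anything we are not allowed to execute. */
        if (!stat(full, &st) && S_ISREG(st.st_mode) && !access(full, X_OK))
            return full;
        free(full);
        if (!path[dlen]) return NULL;          /* that was the last component */
        path += dlen + 1;
    }
}
```

A caller would hand the result to execve() together with an explicit envp
array, and free() it if the exec fails.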

> > it exists it's global, although I haven't implemented
> > "PATH=blah:$PATH ./thingy" yet and expect that will require careful
> > sequencing but that _is_ an implicit post-fork export, but still
> > pre-execvp)...
> 
> Basically you are saying you would like libc to take care of managing 
> the environment in a manner that will not leak memory so that writing a 
> new shell is easier.

It'd be darn nice, but I see now that it doesn't do it.

But I might submit a patch to fix up uClibc to do this, because it _is_ a 
solvable problem.  It does need a new function added, but oh well.

You start with a fixed number of non-freeable variables allocated by exec()
outside of your heap.  All the others are allocated with malloc() and we can
free 'em.  (At least in the setenv() case.  Passing a constant string to
putenv() and then calling the envfree() proposed below instead of unsetenv()
is "pilot error".)

First thing to do is maintain a count of the number of unfreeable entries.  
This is initialized to the number of entries in __environ before we do our 
first modification of the environment (and we know which modification is our 
first because last_environ is null).  Whenever you add a new environment 
variable, put it at the end of the array (already the case).  Whenever you 
remove an environment variable whose index in __environ is less than this 
count, decrement the count.  Whenever you remove an environment variable
whose index is greater than or equal to this count, it's something we added
and thus something we could potentially free().  (Corner case: whenever you
replace an environment variable whose index is less than this count, remove
the old one and then add the new one to the end rather than updating in
place.)
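
The bookkeeping above can be sketched on a private table rather than on the
real __environ.  Everything here (struct envtab, envtab_remove(),
envtab_set()) is a hypothetical name invented for the illustration, not
uClibc code, and envtab_set() assumes nobody still holds a pointer into a
value it replaces.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model: vars[0..fixed-1] came from exec() and must never be
 * freed; vars[fixed..count-1] were malloc()ed by us. */
struct envtab {
    char **vars;    /* malloc()ed, NULL-terminated array, like __environ */
    size_t count;   /* number of entries, not counting the terminator */
    size_t fixed;   /* number of leading unfreeable entries */
};

/* Return the index of `name` in the table, or t->count if it is absent. */
static size_t envtab_find(const struct envtab *t, const char *name)
{
    size_t len = strlen(name), i;

    for (i = 0; i < t->count; i++)
        if (!strncmp(t->vars[i], name, len) && t->vars[i][len] == '=') break;
    return i;
}

/* Remove `name`.  Returns the removed string only when we allocated it,
 * so the caller may free() it; returns NULL otherwise. */
char *envtab_remove(struct envtab *t, const char *name)
{
    size_t i = envtab_find(t, name);
    char *old;

    if (i == t->count) return NULL;
    old = t->vars[i];
    /* Close the gap; this moves the NULL terminator down as well. */
    memmove(t->vars + i, t->vars + i + 1, (t->count - i) * sizeof(char *));
    t->count--;
    if (i < t->fixed) {
        t->fixed--;             /* one fewer unfreeable entry */
        return NULL;
    }
    return old;
}

/* Set name=value: drop any old entry, then append a malloc()ed copy at the
 * end (never update in place).  Returns 0, or -1 if allocation fails. */
int envtab_set(struct envtab *t, const char *name, const char *value)
{
    size_t nlen = strlen(name), vlen = strlen(value);
    char *entry = malloc(nlen + vlen + 2);
    char **grown;

    if (!entry) return -1;
    memcpy(entry, name, nlen);
    entry[nlen] = '=';
    memcpy(entry + nlen + 1, value, vlen + 1);

    grown = realloc(t->vars, (t->count + 2) * sizeof(char *));
    if (!grown) {
        free(entry);
        return -1;
    }
    t->vars = grown;
    free(envtab_remove(t, name));   /* free(NULL) is a no-op */
    t->vars[t->count++] = entry;
    t->vars[t->count] = NULL;
    return 0;
}
```

Replacing one of the original entries therefore shrinks the unfreeable
prefix by one and leaves the new copy in the freeable tail.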

This keeps track of when you _can_ free an environment variable, but doesn't 
say when you should.  Just calling free() blindly screws up existing programs,
because what if somebody did a getenv() and kept the pointer around after 
doing an unsetenv()?  So we create a new function, envfree(name), which acts 
like unsetenv(name) but doesn't leak memory.  Then it's the caller's job to 
get the usage right.  (I.e., don't keep pointers from getenv() past the
corresponding envfree(), and if you putenv() something that can't be freed 
use unsetenv() instead of envfree() to get rid of it.  Or just always use 
setenv() which creates a copy already.)
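
As a rough sketch only, the two calls could look something like this when
they operate on the real environ.  The names xsetenv() and envfree() are
the ones used in this message, but the bodies and the helpers (env_init(),
env_find(), env_unlink(), the `fixed` counter) are guesses made for the
example.  It assumes every change to the environment goes through these two
functions, so mixing in libc's setenv(), putenv() or unsetenv() is not
handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

extern char **environ;

static char **our_array;  /* pointer array we malloc()ed; NULL until first use */
static size_t fixed;      /* leading entries inherited from exec(): never freed */

static size_t env_count(void)
{
    size_t n = 0;

    while (environ && environ[n]) n++;
    return n;
}

/* First use: copy the inherited pointer array (not the strings) into the
 * heap and record how many entries are unfreeable. */
static int env_init(void)
{
    size_t n = env_count();

    if (our_array) return 0;
    our_array = malloc((n + 1) * sizeof(char *));
    if (!our_array) return -1;
    if (n) memcpy(our_array, environ, n * sizeof(char *));
    our_array[n] = NULL;
    environ = our_array;
    fixed = n;
    return 0;
}

/* Index of `name`, or the index of the NULL terminator if it is absent. */
static size_t env_find(const char *name)
{
    size_t len = strlen(name), i;

    for (i = 0; environ[i]; i++)
        if (!strncmp(environ[i], name, len) && environ[i][len] == '=') break;
    return i;
}

/* Unlink entry i.  Returns the string if we allocated it, else NULL. */
static char *env_unlink(size_t i)
{
    char *old = environ[i];

    memmove(environ + i, environ + i + 1, (env_count() - i) * sizeof(char *));
    if (i < fixed) {
        fixed--;
        return NULL;
    }
    return old;
}

/* Like setenv(name, value, 1), but always appends a fresh copy and frees
 * the previous value when it was ours.  Returns 0, or -1 on failure. */
int xsetenv(const char *name, const char *value)
{
    size_t nlen = strlen(name), vlen = strlen(value), n, i;
    char *entry, **grown;

    if (env_init()) return -1;
    entry = malloc(nlen + vlen + 2);
    if (!entry) return -1;
    memcpy(entry, name, nlen);
    entry[nlen] = '=';
    memcpy(entry + nlen + 1, value, vlen + 1);

    n = env_count();
    grown = realloc(our_array, (n + 2) * sizeof(char *));
    if (!grown) {
        free(entry);
        return -1;
    }
    environ = our_array = grown;
    i = env_find(name);
    if (i < n) {
        free(env_unlink(i));
        n--;
    }
    environ[n] = entry;
    environ[n + 1] = NULL;
    return 0;
}

/* Like unsetenv(name), but frees the string when we allocated it. */
int envfree(const char *name)
{
    size_t i;

    assert(name && *name && !strchr(name, '='));
    if (env_init()) return -1;
    i = env_find(name);
    if (environ[i]) free(env_unlink(i));
    return 0;
}
```

The usage rule from above still applies: a pointer obtained from getenv()
is dead once the matching envfree() or a replacing xsetenv() has run.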

Not actually all that hard to implement, really.  Why people have accepted
such a screwed-up status quo for 15 years (forget about Unix, why have the
_linux_ people accepted it?) is an open question.

At the moment, I'm implementing my own xsetenv() and envfree() in toybox's
library.  (It would be freeenv() except that the three consecutive e's are a
bit much.)  I'll happily post them here under the LGPL when I've got 'em
working, though...

> > So your suggestion for modifying the environment is "don't"?
> 
> Correct.

Not good enough. :)

> > Is this inherently broken and unworkable?
> 
> Assuming you are referring to the libc and not my idea, I have no idea.
> My suggestion was just off the top of my head.  My suggestion is of
> course bug free :)

My implementation of an unrelated solution to the larger problem isn't bug
free yet, but I'm working on it...

> David Daney

Rob
-- 
"Perfection is reached, not when there is no longer anything to add, but
when there is no longer anything to take away." - Antoine de Saint-Exupery