[git commit prelink] linuxthreads.old: fix nommu initial thread stack detection

Mike Frysinger vapier at gentoo.org
Wed Mar 30 11:53:36 UTC 2011

commit: http://git.uclibc.org/uClibc/commit/?id=7a583ea370974998b4584595b9a4088fc070df1f
branch: http://git.uclibc.org/uClibc/commit/?id=refs/heads/prelink

Because the nommu address space is flat, and the application stack can
literally be located anywhere, we cannot rely on the assumptions that the
mmu port gets away with.  Namely, that the first thread's stack lives at
the top of memory and nothing will be created above it.

Currently, the code rounds the current stack up a page and sets that as
the "top" of the stack, and then marks the "bottom" of the stack as "1".
Then as new threads are created, this assumption is further refined by
slowly backing off the "bottom" when a new stack is created within the
range of the initial stack.
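
As a rough illustration of the bookkeeping described above (not the
actual libpthread code), the old behaviour can be modelled with plain
integers standing in for addresses.  The struct and function names here
are hypothetical; only the rounding expression and the overlap test
follow the lines removed by this commit.

```c
#include <stdint.h>

/* Hypothetical model of the old estimate of the initial thread's stack. */
struct old_bounds {
    uintptr_t bos;  /* estimated bottom of stack */
    uintptr_t tos;  /* estimated top of stack */
};

/* Old initialization: round the current stack address up to the next
 * page boundary and treat that as the top; the bottom starts at 1. */
static void old_init(struct old_bounds *b, uintptr_t sp, uintptr_t page_size)
{
    b->tos = (sp + page_size) & ~(page_size - 1);
    b->bos = 1;
}

/* Old refinement: when a new stack [new_bos, new_tos] overlaps the
 * estimated range, back the bottom off to just above the new stack.
 * The top is never adjusted. */
static void old_refine(struct old_bounds *b, uintptr_t new_tos, uintptr_t new_bos)
{
    if (new_tos >= b->bos && new_bos < b->tos)
        b->bos = new_tos + 1;
}
```

Note that a new stack allocated above the estimated top fails the
overlap test and changes nothing, which is why the estimate of the top
has to be right from the start.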

Simple ascii example (tid0 is the initial thread):

1 thread:
 [bos          tid0 stack           tos]

2 threads:
                 [     tid0 stack      ]
      [tid1 stack]

3 threads:
                 [     tid0 stack      ]
      [tid1 stack]
                                            [tid2 stack]

As you can see, this algorithm rests on one basic assumption:
the initial top of stack calculation is the absolute top of the stack.
While this assumption was fairly safe in the original nommu days of yore
where the only file format was FLAT (which defaults to a 4KiB stack --
exactly 1 page), and memory was fairly tight, we can see that this falls
apart pretty quickly as soon as the initial stack is larger than a page.

The issue that crops up now is simple to hit: start an application with
an 8KiB stack, execute some functions that put pressure on the stack so
that it exceeds 4KiB, then start up some threads.  The initial tos will
be rounded up by a page, but this is actually the middle of the stack.
Now when the initial thread returns from its functions (thus unwinding
the stack) and tries to call something which calls back into libpthread,
the thread_self() func fails to detect itself as the initial thread as
the current stack is now above the tos.  The __pthread_find_self() func
kicks in, walks all the thread arrays, fails to find a hit, and then
walks into uninitialized memory for the thread descriptor.  Use of this
garbage memory has obvious results -- things fall down & go boom.
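
To put numbers on that scenario, here is a small sketch with made-up
addresses.  Both helper names are hypothetical, and the range check is
only an assumption about how the initial thread is recognized; the
rounding expression follows the line removed by this commit.

```c
#include <stdint.h>

/* Old estimate of the initial thread's top of stack: the stack address
 * seen at init time, rounded up to the next page boundary. */
static uintptr_t old_tos_estimate(uintptr_t sp_at_init, uintptr_t page_size)
{
    return (sp_at_init + page_size) & ~(page_size - 1);
}

/* Hypothetical stand-in for the range check: nonzero when sp falls
 * inside [bos, tos). */
static int looks_like_initial_thread(uintptr_t sp, uintptr_t bos, uintptr_t tos)
{
    return sp >= bos && sp < tos;
}
```

Suppose the 8KiB stack spans 0x10000 to 0x12000 and grows down.  If
threading is initialized while the stack pointer is at 0x10f00 (more
than 4KiB deep), the estimated tos is 0x11000.  Once the initial thread
unwinds to, say, 0x11f00, the check against [1, 0x11000) fails even
though that address is well inside the real stack.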

To address this, I extend the current algorithm to automatically scale
back both the bottom and the top stack limits of the initial thread.
We use the current stack pointer at "thread boot time" only as a single
known point.  The initial thread stack bottom is set to the bottom of
memory and the initial thread stack top is set to the top of memory.
Then as we create new thread stacks, we figure out whether the new stack
is above or below the single known good address, and then scale back
either the tos or the bos accordingly.
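
The revised scheme can be sketched in the same integer model.  This is
an illustration under assumptions, not the patch itself: the struct and
function names are hypothetical, UINTPTR_MAX stands in for (char *) -1,
and the refinement mirrors the new NOMMU_INITIAL_THREAD_BOUNDS() macro
in the diff below.

```c
#include <stdint.h>

/* Hypothetical model of the new estimate: two bounds plus one address
 * known to lie inside the initial thread's stack. */
struct nommu_bounds {
    uintptr_t bos;  /* estimated bottom of stack */
    uintptr_t mid;  /* known-good address captured at init time */
    uintptr_t tos;  /* estimated top of stack */
};

/* Start with the widest possible range around the known address.
 * bos is 1 rather than 0 so that it doubles as an "initialized" flag. */
static void bounds_init(struct nommu_bounds *b, uintptr_t sp)
{
    b->mid = sp;
    b->tos = UINTPTR_MAX;
    b->bos = 1;
}

/* When a new stack [new_bos, new_tos] overlaps the estimated range,
 * shrink whichever bound lies on the same side of mid as the new stack. */
static void bounds_refine(struct nommu_bounds *b, uintptr_t new_tos, uintptr_t new_bos)
{
    if (new_tos >= b->bos && new_bos < b->tos) {
        if (new_bos < b->mid)
            b->bos = new_tos;   /* new stack is below: raise the bottom */
        else
            b->tos = new_bos;   /* new stack is above: lower the top */
    }
}
```

With mid at 0x10f00, a new stack at 0x8000..0x8fff raises bos to 0x8fff,
and one at 0x20000..0x20fff lowers tos to 0x20000.  An unwound stack
pointer such as 0x11f00 still falls inside the estimated range.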

Reviewed-by: Steven J. Magnani <steve at digidescorp.com>
Signed-off-by: Mike Frysinger <vapier at gentoo.org>
 libpthread/linuxthreads.old/internals.h |   24 ++++++++++++++++--------
 libpthread/linuxthreads.old/pthread.c   |   23 ++++++++++++++---------
 2 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/libpthread/linuxthreads.old/internals.h b/libpthread/linuxthreads.old/internals.h
index 637fcea..110dd9d 100644
--- a/libpthread/linuxthreads.old/internals.h
+++ b/libpthread/linuxthreads.old/internals.h
@@ -252,17 +252,25 @@ extern pthread_descr __pthread_main_thread;
    Initially 0, meaning that the current thread is (by definition)
    the initial thread. */
-/* For non-MMU systems also remember to stack top of the initial thread.
- * This is adapted when other stacks are malloc'ed since we don't know
- * the bounds a-priori. -StS */
 extern char *__pthread_initial_thread_bos;
 #ifndef __ARCH_USE_MMU__
-extern char *__pthread_initial_thread_tos;
+/* For non-MMU systems, we have no idea the bounds of the initial thread
+ * stack, so we have to track it on the fly relative to other stacks.  Do
+ * so by scaling back our assumptions on the limits of the bos/tos relative
+ * to the known mid point.  See also the comments in pthread_initialize(). */
+extern char *__pthread_initial_thread_tos, *__pthread_initial_thread_mid;
-	if ((tos)>=__pthread_initial_thread_bos \
-	    && (bos)<__pthread_initial_thread_tos) \
-		__pthread_initial_thread_bos = (tos)+1
+	do { \
+		char *__tos = (tos); \
+		char *__bos = (bos); \
+		if (__tos >= __pthread_initial_thread_bos && \
+		    __bos < __pthread_initial_thread_tos) { \
+			if (__bos < __pthread_initial_thread_mid) \
+				__pthread_initial_thread_bos = __tos; \
+			else \
+				__pthread_initial_thread_tos = __bos; \
+		} \
+	} while (0)
 #define NOMMU_INITIAL_THREAD_BOUNDS(tos,bos) /* empty */
 #endif /* __ARCH_USE_MMU__ */
diff --git a/libpthread/linuxthreads.old/pthread.c b/libpthread/linuxthreads.old/pthread.c
index ad392e3..a8830b1 100644
--- a/libpthread/linuxthreads.old/pthread.c
+++ b/libpthread/linuxthreads.old/pthread.c
@@ -168,12 +168,10 @@ pthread_descr __pthread_main_thread = &__pthread_initial_thread;
 char *__pthread_initial_thread_bos = NULL;
-/* For non-MMU systems also remember to stack top of the initial thread.
- * This is adapted when other stacks are malloc'ed since we don't know
- * the bounds a-priori. -StS */
 #ifndef __ARCH_USE_MMU__
+/* See nommu notes in internals.h and pthread_initialize() below. */
 char *__pthread_initial_thread_tos = NULL;
+char *__pthread_initial_thread_mid = NULL;
 #endif /* __ARCH_USE_MMU__ */
 /* File descriptor for sending requests to the thread manager. */
@@ -457,12 +455,19 @@ static void pthread_initialize(void)
     setrlimit(RLIMIT_STACK, &limit);
-  /* For non-MMU assume __pthread_initial_thread_tos at upper page boundary, and
-   * __pthread_initial_thread_bos at address 0. These bounds are refined as we
-   * malloc other stack frames such that they don't overlap. -StS
+  /* For non-MMU, the initial thread stack can reside anywhere in memory.
+   * We don't have a way of knowing where the kernel started things -- top
+   * or bottom (well, that isn't exactly true, but the solution is fairly
+   * complex and error prone).  All we can determine here is an address
+   * that lies within that stack.  Save that address as a reference so that
+   * as other thread stacks are created, we can adjust the estimated bounds
+   * of the initial thread's stack appropriately.
+   *
+   * This checking is handled in NOMMU_INITIAL_THREAD_BOUNDS(), so see that
+   * for a few more details.
-  __pthread_initial_thread_tos =
-    (char *)(((long)CURRENT_STACK_FRAME + getpagesize()) & ~(getpagesize() - 1));
+  __pthread_initial_thread_mid = CURRENT_STACK_FRAME;
+  __pthread_initial_thread_tos = (char *) -1;
   __pthread_initial_thread_bos = (char *) 1; /* set it non-zero so we know we have been here */
   PDEBUG("initial thread stack bounds: bos=%p, tos=%p\n",
 	 __pthread_initial_thread_bos, __pthread_initial_thread_tos);
