Weirdness in elf.h?

Rob Landley rob at landley.net
Thu Nov 12 01:37:54 UTC 2009


On Monday 09 November 2009 16:09:05 Mike Frysinger wrote:
> On Sunday 08 November 2009 19:12:57 Rob Landley wrote:
> > #define EM_XTENSA       94              /* Tensilica Xtensa Architecture */
> > #define EM_IP2K         101             /* Ubicom IP2022 micro controller */
> > #define EM_CR           103             /* National Semiconductor CompactRISC */
> > #define EM_MSP430       105             /* TI msp430 micro controller */
> > #define EM_BLACKFIN     106             /* Analog Devices Blackfin */
> > #define EM_ALTERA_NIOS2 113     /* Altera Nios II soft-core processor */
> > #define EM_CRX          114             /* National Semiconductor CRX */
> > #define EM_NUM          95
> >
> > Isn't EM_NUM supposed to be one higher than the largest number used?
>
> yes, but i dont think anyone actually uses this thing.

If I wasn't trying to use it I wouldn't have asked about it.
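
To make the problem concrete, here is a minimal sketch of what a stale EM_NUM does to a consumer, assuming that consumer uses EM_NUM as an upper bound for a range check or a table size. The DEMO_ names and the helper functions are hypothetical and exist only for illustration; the numeric values are the ones quoted above.

```c
#include <assert.h>

/* Hypothetical copies of values from the quoted list. The DEMO_ prefix
 * keeps them from clashing with a real <elf.h>. */
#define DEMO_EM_XTENSA    94
#define DEMO_EM_BLACKFIN  106
#define DEMO_EM_CRX       114
#define DEMO_EM_NUM_STALE 95                  /* what the header says today */
#define DEMO_EM_NUM_FIXED (DEMO_EM_CRX + 1)   /* one past the largest value */

/* Returns 1 if `machine` passes a "machine < EM_NUM" style bounds check
 * (an assumed usage pattern, not taken from any particular program). */
static int demo_machine_in_range(unsigned machine, unsigned em_num)
{
    return machine < em_num;
}

/* Self-check: with the stale limit, Blackfin and CRX are rejected even
 * though the header defines them; with the fixed limit they pass. */
static void demo_em_num_self_check(void)
{
    assert(demo_machine_in_range(DEMO_EM_XTENSA, DEMO_EM_NUM_STALE));
    assert(!demo_machine_in_range(DEMO_EM_BLACKFIN, DEMO_EM_NUM_STALE));
    assert(!demo_machine_in_range(DEMO_EM_CRX, DEMO_EM_NUM_STALE));
    assert(demo_machine_in_range(DEMO_EM_CRX, DEMO_EM_NUM_FIXED));
}
```

Anything sized or bounded by the stale value silently treats every machine above 94 as out of range.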

As for one obscure little user, when the Linux kernel creates an x86 bzImage 
from a vmlinux image it uses arch/x86/boot/compressed/relocs.c, and line 8 is 
"#include <elf.h>".  So if you yank that header, you can't build an ARCH=x86 
Linux kernel with a uClibc toolchain.  (And probably other ARCHes, that's just 
the one I'm familiar with off the top of my head.  They all start with 
vmlinux...)

The reason I know that is if you try to build a Linux kernel on a macintosh 
host, it dies when it gets to that point due to mac's mach-o based headers not 
having ELF support.  (The Linux kernel guys' rationale is that relocs.c is a 
utility you build and run on the host, and thus uses the host headers.  You 
can patch it to use linux/elf.h, and that's what you in fact need to do to 
build a Linux kernel on a macintosh host, but it's nontrivial and they haven't 
wanted to take the patch upstream yet.)

By the way, how do you think uClibc's own dynamic linker is supposed to get 
these constants?  Hint: ldso/ldso/$ARCH/dl-sysdep.h #includes <elf.h>, for 
over a dozen different $ARCH values.  Without this header, the dynamic linker 
can't get the EM_ values it needs to recognize native binaries.

The reason I know _that_ is because sparc gets it wrong, and I've been 
carrying this patch for ages:

Sparc v8 and v9 should still support EM_SPARC binaries, not _just_ 
SPARC32PLUS.

--- uClibc/ldso/ldso/sparc/dl-sysdep.h	2008-09-15 11:36:11.000000000 -0500
+++ uClibc.bak/ldso/ldso/sparc/dl-sysdep.h	2009-04-08 01:09:53.000000000 -0500
@@ -29,13 +29,8 @@
 /* Here we define the magic numbers that this dynamic loader should accept
  * Note that SPARCV9 doesn't use EM_SPARCV9 since the userland is still 32-bit.
  */
-#if defined(__sparc_v9__) || defined(__sparc_v8__)
 #define MAGIC1 EM_SPARC32PLUS
-#else
-#define MAGIC1 EM_SPARC
-#endif
-
-#undef  MAGIC2
+#define MAGIC2 EM_SPARC

 /* Used for error messages */
 #define ELF_TARGET "sparc"

Support for the newer binary type doesn't necessarily mean you drop support 
for the older one...
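
A rough sketch of what the patch is after, assuming the loader compares the ELF header's e_machine field against MAGIC1 and, when it is defined, MAGIC2. The DEMO_ names and the function are hypothetical stand-ins, not uClibc's actual code; the two machine numbers are the values the ELF specification assigns to EM_SPARC and EM_SPARC32PLUS.

```c
#include <assert.h>
#include <stdint.h>

/* Machine numbers from the ELF specification. The DEMO_ prefix keeps
 * them from clashing with a real <elf.h>. */
#define DEMO_EM_SPARC        2
#define DEMO_EM_SPARC32PLUS  18

/* What the patched dl-sysdep.h would provide (under hypothetical names). */
#define DEMO_MAGIC1 DEMO_EM_SPARC32PLUS
#define DEMO_MAGIC2 DEMO_EM_SPARC

/* Returns 1 if a binary with this e_machine value should be accepted. */
static int demo_machine_accepted(uint16_t e_machine)
{
    if (e_machine == DEMO_MAGIC1)
        return 1;
#ifdef DEMO_MAGIC2
    /* Second accepted value, only present when the port defines it. */
    if (e_machine == DEMO_MAGIC2)
        return 1;
#endif
    return 0;
}

/* Self-check: both SPARC flavours load, anything else is refused. */
static void demo_magic_self_check(void)
{
    assert(demo_machine_accepted(DEMO_EM_SPARC32PLUS));
    assert(demo_machine_accepted(DEMO_EM_SPARC));
    assert(!demo_machine_accepted(3)); /* 3 is EM_386, not a sparc binary */
}
```

With MAGIC2 left undefined, as in the unpatched v8/v9 case, the second comparison disappears and plain EM_SPARC binaries are refused.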

> might be easier to just punt it.

And abandoning the entire project rather than ever fixing any issues pointed 
out in it would be even easier to do, but might not be the right thing from an 
engineering standpoint.

(I still don't understand why Erik's readelf.c got yanked.  Worked fine for 
years, and it took me about 5 minutes to get it building against both glibc 
and uClibc with a simple "gcc readelf.c".  Too much trouble to maintain?  
Really?)

> if we do fix it, we should import the holes from
> binutils. -mike

It seems to me that the logical thing to do with the EM_ symbols list 
(presumably what was giving you headaches with readelf.c) is to trim it down 
to the set of systems uClibc actually supports.  Go ahead and let the rest be 
unknown, there's nothing wrong with that.  There _will_ be unknowns, and 
identifying what _kind_ of lack of support we have is of dubious utility.  
(The "file" command has its own database for that sort of thing.)  But lots of 
things need the constants for the current host, so we _do_ need the values for 
all the targets uClibc runs on.

It also seems to me that most of the interesting stuff is in linux/elf.h 
already, which could presumably be #included from an elf.h one level up.  The 
kernel guys don't quite sanitize it in headers_install (the 32 and 64 bit base 
types are still in kernel __u64 and friends), but Linux adheres to the LP64 
standard:

  http://www.unix.org/whitepapers/64bit.html
  http://www.unix.org/version2/whatsnew/lp64_wp.htm

So it's a fairly simple sed to convert 'em all.  Something like (off the top of 
my head, untested):

sed -e 's/__u16/unsigned short/g' -e 's/__u32/unsigned int/g' \
    -e 's/__u64/unsigned long long/g' -e 's/__s16/signed short/g' \
    -e 's/__s32/signed int/g' -e 's/__s64/signed long long/g' \
    -i linux/elf.h

And then 2/3 of elf.h presumably becomes "#include <linux/elf.h>"

(Dunno if it would actually work, but the kernel guys are riffing on a standard, 
and gratuitous deviations from that standard are something they're likely to 
avoid for code hygiene reasons, and might take upstream patches for if any 
were found.)

There are still things linux/elf.h hasn't got, such as the EM_ constants and 
the wrappers to provide "current host size" for 32 bit or 64 bit types.   The 
top level elf.h wouldn't be _just_ a wrapper.  But it's a lot _less_ to 
maintain, and the Linux guys have to support their elf.h or they can't _run_ 
elf binaries.
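
As an illustration of the "current host size" wrapper part, here is a minimal sketch that picks the 32 or 64 bit flavour of a structure by token pasting. Everything in it is an assumption made for the example: the Demo_ types stand in for the fixed-width structures a kernel header would supply, and DemoElfW is a hypothetical macro name, not something the proposed elf.h is known to define.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the fixed-width structures that would come
 * from the lower-level header. Only one field is shown. */
typedef struct { uint32_t e_entry; } Demo_Elf32_Ehdr;
typedef struct { uint64_t e_entry; } Demo_Elf64_Ehdr;

/* Pick the native word size from the width of uintptr_t. */
#if UINTPTR_MAX > 0xffffffffu
# define DEMO_ELF_BITS 64
#else
# define DEMO_ELF_BITS 32
#endif

/* Two-level paste so DEMO_ELF_BITS is expanded before concatenation. */
#define DEMO_PASTE_(a, b, c) a##b##c
#define DEMO_PASTE(a, b, c)  DEMO_PASTE_(a, b, c)

/* DemoElfW(Ehdr) becomes Demo_Elf32_Ehdr or Demo_Elf64_Ehdr. */
#define DemoElfW(type) DEMO_PASTE(Demo_Elf, DEMO_ELF_BITS, _##type)

/* Self-check: the selected structure's entry field matches the host width. */
static void demo_wrapper_self_check(void)
{
    DemoElfW(Ehdr) hdr = { 0 };
    assert(sizeof hdr.e_entry * 8 == DEMO_ELF_BITS);
}
```

The wrapper itself is a handful of lines; the bulk of the structure definitions would live in the included header.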

Anyway, that seems like a more interesting direction to explore than "I don't 
understand this, let's drop it".

Rob
-- 
Latency is more important than throughput. It's that simple. - Linus Torvalds

