Weirdness in elf.h?
Rob Landley
rob at landley.net
Thu Nov 12 01:37:54 UTC 2009
On Monday 09 November 2009 16:09:05 Mike Frysinger wrote:
> On Sunday 08 November 2009 19:12:57 Rob Landley wrote:
> > #define EM_XTENSA       94  /* Tensilica Xtensa Architecture */
> > #define EM_IP2K         101 /* Ubicom IP2022 micro controller */
> > #define EM_CR           103 /* National Semiconductor */
> > #define EM_MSP430       105 /* TI msp430 micro controller */
> > #define EM_BLACKFIN     106 /* Analog Devices Blackfin */
> > #define EM_ALTERA_NIOS2 113 /* Altera Nios II soft-core processor */
> > #define EM_CRX          114 /* National Semiconductor CRX */
> > #define EM_NUM          95
> >
> > Isn't EM_NUM supposed to be one higher than the largest number used?
>
> yes, but i dont think anyone actually uses this thing.
If I wasn't trying to use it I wouldn't have asked about it.
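To make the EM_NUM concern concrete, here is a minimal sketch of the usual reason such a constant exists: sizing a table indexed by machine number. The DEMO_ names and the lookup helper are made up for illustration (they are not from uClibc), and it assumes EM_NUM is meant as "one past the largest EM_ value", which with the list quoted above would be 115.

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from the quoted list; the DEMO_ prefix avoids
 * clashing with a real <elf.h>. */
#define DEMO_EM_XTENSA 94
#define DEMO_EM_CRX    114

/* Assumption: EM_NUM means "one past the largest EM_ value". */
#define DEMO_EM_NUM    (DEMO_EM_CRX + 1)

static_assert(DEMO_EM_CRX < DEMO_EM_NUM,
              "EM_NUM must exceed every EM_ value");

/* Table sized by EM_NUM; slots without an initializer stay NULL. */
static const char *const demo_names[DEMO_EM_NUM] = {
    [DEMO_EM_XTENSA] = "xtensa",
    [DEMO_EM_CRX]    = "crx",
};

/* Hypothetical helper: map a machine number to a name. */
const char *demo_machine_name(unsigned machine)
{
    if (machine >= DEMO_EM_NUM || demo_names[machine] == NULL)
        return "unknown";
    return demo_names[machine];
}
```

With DEMO_EM_NUM set to 95 instead, the [DEMO_EM_CRX] initializer would fall outside the array and the compiler would reject it, which is the kind of breakage a too-small EM_NUM invites.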
As for one obscure little user, when the Linux kernel creates an x86 bzImage
from a vmlinux image it uses arch/x86/boot/compressed/relocs.c, and line 8 is
"#include <elf.h>". So if you yank that header, you can't build an ARCH=x86
Linux kernel with a uClibc toolchain. (And probably other ARCHes, that's just
the one I'm familiar with off the top of my head. They all start with
vmlinux...)
The reason I know that is that if you try to build a Linux kernel on a Macintosh
host, it dies at that point because the Mac's Mach-O based headers have no ELF
support. (The Linux kernel guys' rationale is that relocs.c is a
utility you build and run on the host, and thus uses the host headers. You
can patch it to use linux/elf.h, and that's in fact what you need to do to
build a Linux kernel on a Macintosh host, but it's nontrivial and they haven't
wanted to take the patch upstream yet.)
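For a sense of what a host-side tool wants from an ELF header file, here is a rough sketch that pulls e_machine out of a raw header with the offsets spelled out locally, so it compiles even on a host with no <elf.h>. This is not what relocs.c does and not the patch mentioned above; the DEMO_ names and the helper are hypothetical, and the offsets are the ones from the ELF specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout facts from the ELF specification, defined locally so the
 * sketch does not depend on any host <elf.h>. */
enum {
    DEMO_EI_DATA     = 5,  /* index of the byte-order byte in e_ident */
    DEMO_ELFDATA2LSB = 1,  /* little-endian file */
    DEMO_ELFDATA2MSB = 2,  /* big-endian file */
    DEMO_E_MACHINE   = 18  /* offset of e_machine, same in ELF32 and ELF64 */
};

/* Hypothetical helper: returns 0 and stores e_machine on success,
 * -1 if buf does not look like the start of an ELF file. */
int demo_read_e_machine(const unsigned char *buf, size_t len, uint16_t *out)
{
    assert(buf != NULL && out != NULL);

    if (len < 20 || buf[0] != 0x7f || buf[1] != 'E' ||
        buf[2] != 'L' || buf[3] != 'F')
        return -1;

    if (buf[DEMO_EI_DATA] == DEMO_ELFDATA2LSB)
        *out = (uint16_t)(buf[DEMO_E_MACHINE] |
                          (buf[DEMO_E_MACHINE + 1] << 8));
    else if (buf[DEMO_EI_DATA] == DEMO_ELFDATA2MSB)
        *out = (uint16_t)((buf[DEMO_E_MACHINE] << 8) |
                          buf[DEMO_E_MACHINE + 1]);
    else
        return -1;

    return 0;
}
```

Every one of those magic numbers is something a shared header would normally supply, which is why yanking the header hurts host tools.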
By the way, how do you think uClibc's own dynamic linker is supposed to get
these constants? Hint: ldso/ldso/$ARCH/dl-sysdep.h #includes <elf.h>, for
over a dozen different $ARCH values. Without this header, the dynamic linker
can't get the EM_ values it needs to recognize native binaries.
The reason I know _that_ is because sparc gets it wrong, and I've been
carrying this patch for ages:
Sparc v8 and v9 should still support EM_SPARC binaries, not _just_
SPARC32PLUS.
--- uClibc/ldso/ldso/sparc/dl-sysdep.h 2008-09-15 11:36:11.000000000 -0500
+++ uClibc.bak/ldso/ldso/sparc/dl-sysdep.h 2009-04-08 01:09:53.000000000 -0500
@@ -29,13 +29,8 @@
 /* Here we define the magic numbers that this dynamic loader should accept
  * Note that SPARCV9 doesn't use EM_SPARCV9 since the userland is still 32-bit.
  */
-#if defined(__sparc_v9__) || defined(__sparc_v8__)
 #define MAGIC1 EM_SPARC32PLUS
-#else
-#define MAGIC1 EM_SPARC
-#endif
-
-#undef MAGIC2
+#define MAGIC2 EM_SPARC
 
 /* Used for error messages */
 #define ELF_TARGET "sparc"
Support for the newer binary type doesn't necessarily mean you drop support
for the older one...
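A rough sketch of what the two-magic-number arrangement buys you, assuming the loader compares e_machine against MAGIC1 and, when it is defined, MAGIC2. The check function is hypothetical and is not uClibc's actual loader code; the EM_ values (2 and 18) are the standard ones, redefined locally with a DEMO_ prefix so the sketch stands alone.

```c
#include <assert.h>
#include <stdint.h>

/* Standard ELF machine numbers, redefined locally for the sketch. */
#define DEMO_EM_SPARC        2
#define DEMO_EM_SPARC32PLUS 18

/* What the patched header ends up defining. */
#define MAGIC1 DEMO_EM_SPARC32PLUS
#define MAGIC2 DEMO_EM_SPARC

static_assert(MAGIC1 != MAGIC2, "a second magic only helps if it differs");

/* Hypothetical acceptance check: nonzero if the binary is "native". */
int demo_machine_ok(uint16_t e_machine)
{
    if (e_machine == MAGIC1)
        return 1;
#ifdef MAGIC2
    if (e_machine == MAGIC2)
        return 1;
#endif
    return 0;
}
```

With the unpatched header on v8/v9, MAGIC2 is undefined, so only the first comparison survives and plain EM_SPARC binaries get rejected.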
> might be easier to just punt it.
And abandoning the entire project rather than ever fixing any issues pointed
out in it would be even easier to do, but might not be the right thing from an
engineering standpoint.
(I still don't understand why Erik's readelf.c got yanked. Worked fine for
years, and it took me about 5 minutes to get it building against both glibc
and uClibc with a simple "gcc readelf.c". Too much trouble to maintain?
Really?)
> if we do fix it, we should import the holes from
> binutils. -mike
It seems to me that the logical thing to do with the EM_ symbols list
(presumably what was giving you headaches with readelf.c) is to trim it down
to the set of systems uClibc actually supports. Go ahead and let the rest be
unknown, there's nothing wrong with that. There _will_ be unknowns, and
identifying what _kind_ of lack of support we have is of dubious utility.
(The "file" command has its own database for that sort of thing.) But lots of
things need the constants for the current host, so we _do_ need the values for
all the targets uClibc runs on.
It also seems to me that most of the interesting stuff is in linux/elf.h
already, which could presumably be #included from an elf.h one level up. The
kernel guys don't quite sanitize it in headers_install (the 32 and 64 bit base
types are still in kernel __u64 and friends), but Linux adheres to the LP64
standard:
http://www.unix.org/whitepapers/64bit.html
http://www.unix.org/version2/whatsnew/lp64_wp.htm
So it's a fairly simple sed to convert 'em all. Something like (off the top of
my head, untested):
sed -e 's/__u16/unsigned short/g' -e 's/__u32/unsigned int/g' \
-e 's/__u64/unsigned long long/g' -e 's/__s16/signed short/g' \
-e 's/__s32/signed int/g' -e 's/__s64/signed long long/g' \
-i linux/elf.h
And then 2/3 of elf.h presumably becomes "#include <linux/elf.h>"
(Dunno if it would actually work, but the kernel guys are riffing on a standard,
and gratuitous deviations from that standard are something they're likely to
avoid for code hygiene reasons, and something they might take upstream patches
to fix if any were found.)
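The sed only works if the replacement C types really have the widths the kernel names promise. A small compile-time check, as a sketch: these assertions hold on the usual 32-bit and 64-bit Linux targets, but that is an assumption about the target rather than something the C standard guarantees. The helper function is made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

static_assert(CHAR_BIT == 8, "sizes below are counted in 8-bit bytes");

/* One check per substitution in the sed expression. */
static_assert(sizeof(unsigned short) == 2,     "__u16 -> unsigned short");
static_assert(sizeof(unsigned int) == 4,       "__u32 -> unsigned int");
static_assert(sizeof(unsigned long long) == 8, "__u64 -> unsigned long long");
static_assert(sizeof(signed short) == 2,       "__s16 -> signed short");
static_assert(sizeof(signed int) == 4,         "__s32 -> signed int");
static_assert(sizeof(signed long long) == 8,   "__s64 -> signed long long");

/* Hypothetical helper: "long" is the type that changes size between
 * 32-bit and LP64 targets, which is why none of the substitutions use it. */
size_t demo_long_bytes(void)
{
    return sizeof(long);
}
```

If any of the assertions fails on some target, the build stops there instead of silently producing ELF structures of the wrong size.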
There are still things linux/elf.h hasn't got, such as the EM_ constants and
the wrappers to provide "current host size" for 32 bit or 64 bit types. The
top level elf.h wouldn't be _just_ a wrapper. But it's a lot _less_ to
maintain, and the Linux guys have to support their elf.h or they can't _run_
elf binaries.
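Roughly what the leftover top-level wrapper might look like, as a sketch only. The two structs stand in for what <linux/elf.h> would supply (trimmed to one field each so the example is self-contained), and every DEMO_ name is invented here. The real header would #include <linux/elf.h> instead of declaring stand-ins.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the structures the kernel header would provide. */
typedef struct { uint32_t e_entry; } Demo_Elf32_Ehdr;
typedef struct { uint64_t e_entry; } Demo_Elf64_Ehdr;

/* Piece 1: machine constants for the targets actually supported
 * (standard values, two shown as examples). */
#define DEMO_EM_386     3
#define DEMO_EM_X86_64 62

/* Piece 2: "current host size" wrapper, picked from the pointer width. */
#if UINTPTR_MAX > 0xffffffffu
# define DEMO_ELF_CLASS 64
typedef Demo_Elf64_Ehdr Demo_Ehdr;
#else
# define DEMO_ELF_CLASS 32
typedef Demo_Elf32_Ehdr Demo_Ehdr;
#endif

static_assert(sizeof(((Demo_Ehdr *)0)->e_entry) * 8 == DEMO_ELF_CLASS,
              "native header type must match the native word size");

/* Hypothetical helper so callers can ask which flavor was selected. */
int demo_elf_class(void)
{
    return DEMO_ELF_CLASS;
}
```

Code that wants "the ELF header for this machine" would then use Demo_Ehdr and never mention 32 or 64 itself.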
Anyway, that seems like a more interesting direction to explore than "I don't
understand this, let's drop it".
Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds