[uClibc 0000385]: add support for ARM thumb

bugs at busybox.net bugs at busybox.net
Sat Jan 21 06:18:37 UTC 2006


A NOTE has been added to this issue. 
====================================================================== 
http://busybox.net/bugs/view.php?id=385 
====================================================================== 
Reported By:                jbowler
Assigned To:                uClibc
====================================================================== 
Project:                    uClibc
Issue ID:                   385
Category:                   Architecture Specific
Reproducibility:            always
Severity:                   major
Priority:                   normal
Status:                     assigned
====================================================================== 
Date Submitted:             08-13-2005 22:51 PDT
Last Modified:              01-20-2006 22:18 PST
====================================================================== 
Summary:                    add support for ARM thumb
Description: 
Various parts of the ARM specific uClibc implementation use mov pc,lr and
other non-interwork compatible stuff.

====================================================================== 

---------------------------------------------------------------------- 
 vapier - 08-14-05 06:21  
---------------------------------------------------------------------- 
might be a good idea to add another submenu to the Target Architecture
Features and Options so someone who wishes to build for a thumb target
just selects the option rather than worry about setting compiler options
properly ...

suggestion on a submenu name ?  Target Processor Thumbness ? ;) 

---------------------------------------------------------------------- 
 jocke - 08-14-05 08:05  
---------------------------------------------------------------------- 
Had a quick look at this and though I don't know anything about ARM I
wonder
if you can make the changes to _dl_find_hash in such a way that
non ARM arches don't have to execute extra code.

Also, have a look at SVN head. Quite a lot has changed in the code that
these patches touch, hopefully to the better :) 

---------------------------------------------------------------------- 
 jbowler - 08-14-05 08:54  
---------------------------------------------------------------------- 
The issue of architecture specific settings is handled fine by setting CC
on the command line - that's the way OpenEmbedded does it - I raised bug
386 for the one failure this causes.

Notice that the ARM switches instruction sets dynamically (32 bit arm, 16
bit thumb or java).  It's not a compile time issue - the only thing which
can be done at compile time is to disable the support for it in the
generate code.

I think the _dl_find_hash needs the eye of an expert - I fail to
understand why _dl_linux_resolve should get an STT_OBJECT match for a
function name, but there is probably a good reason.

It seems to me that passing in the full list of all acceptable symbol
types is a significant improvement for all architectures.  I don't see why
the extra code:

+	if ((type_class & ELF_RTYPE_CLASS_ALL) == 0)
+		type_class += ELF_RTYPE_CLASS_DEFAULT;

is a big deal, that's at most four instructions on the ARM (16 bytes, 8
for thumb!), and the patched result is going to be considerably faster
than the original.  Indeed if all the callers are changed to pass in an
explicit list of types the above add can be removed (the size should be
unchanged from the origina) but a humungous number of strcmp operations
are going to be removed for each lookup... 

---------------------------------------------------------------------- 
 jocke - 08-14-05 10:26  
---------------------------------------------------------------------- 
Don't quite follow the STT_OBJECT match for a function name statement.
Could
you elaborate? It is quite possible that it is wrong, if it is, how would
you like to change the function?

There is also this chage that adds a litte code
-			if (ELF32_ST_TYPE(sym->st_info) > STT_FUNC)
+			if ((ELF_RTYPE_CLASS_OF(ELF32_ST_TYPE(sym->st_info)) & type_class) ==
0)

Could you demonstrate how you would like to change all callers to pass
relevant flags? 

---------------------------------------------------------------------- 
 jbowler - 08-14-05 12:48  
---------------------------------------------------------------------- 
The change to the test in _dl_find_hash changes an expression of the form
(exp > 2) to one of the form ((2 << exp) & type_class), which, depending
on what is in the registers, is either 0 or 1 exra instruction on ARM (if
there is a '2' in a register r<2> it is 0 - CMP rx,
http://busybox.net/bugs/view.php?id=2 becomes TST S rt,
r<2>, LSL rx).

The implementation dependent calls to _dl_find_hash are in
ldso/ldso/*/elfinterp.c.  There are two cases - a call from
'_dl_linux_resolver' to find the real address of a function call and a
call to do relocations.  The implementation independent calls are in
ldso/ldso and are calls to find specific functions (atexit, on_exit,
malloc) or data objects (__environ).  So far as I can see all of those
calls *except* the relocation ones require a specific symbol type - e.g.
if the search for 'malloc' is satsified by an STT_OBJECT, not an STT_FUNC,
bad things are happening.  The patch to arm/elfinterp.c changes the ARM
version to just search for STT_FUNC when doing PLT lookups and adds
STT_ARM_TFUNC when doing relocations (I think the absence of the latter
must be a bug).  The result works - i.e. it works even though
_dl_linux_resolver no longer resolves a PLT function call to anything
other than STT_FUNC (or STT_ARM_TFUNC).

The two cases can be seen in the patch for arm/elfinterp.c: 

-				 ELF_RTYPE_CLASS_PLT);
+		ELF_RTYPE_CLASS_PLT+ELF_RTYPE_CLASS_OF(STT_FUNC));

(there needs to be a separate case for STT_ARM_TFUNC because _dl_find_hash
is returning the function call symbol value with the bottom bit unset - so
_dl_linux_resolver needs to set it - that's a separate issue which needs
investigation).

-				scope, tpnt, elf_machine_type_class(reloc_type));
+				scope, tpnt, elf_machine_type_class(reloc_type) +
+				ELF_RTYPE_CLASS_DEFAULT + ELF_RTYPE_CLASS_OF(STT_ARM_TFUNC));

I.e. for the relocation case just add 'ELF_RTYPE_CLASS_DEFAULT' (plus, on
ARM, STT_ARM_TFUNC).  In fact it's not clear to me which of these types
are valid in a relocation - it may be that there are additional valid
types in other architectures too.  It's also not clear to my how
STT_ARM_TFUNC should behave - the base value does not have the bottom bit
set, but direct uses of the value need the bit set and relocations (e.g.
'main+0x22') should presumably not have it set... 

---------------------------------------------------------------------- 
 jocke - 08-14-05 13:32  
---------------------------------------------------------------------- 
OK, so its not much extra code.

I think a function can have both STT_FUNC and STT_NOTYPE. The NOTYPE
comes from symbols in assembler that hasn't been given a type.
Therefore you cannot assume that every function will have the correct
STT type.

I suspect that if your app defines malloc as variable, its a programming
error.

What does glibc do to handle Thumb?

This is a bit short but I don't much time at the moment, sorry. 

---------------------------------------------------------------------- 
 jbowler - 08-14-05 16:16  
---------------------------------------------------------------------- 
STT_NOTYPE: ah ha, I suspected there might be a good reason, it's good to
have expert input :)

glibc: I believe when I first tried this I was using glibc and the basic
setup worked, but the programs went kaboom somewhere around the first
library calls.  If that's right uclibc is (with these patches) in a much
better state than glibc...

glibc must do something different with "main" - to avoid the need for the
glue code (although maybe I never tried a glibc static link).  It seems to
me that, in libc/elf/rtld.c _dl_start_final returns the address of "main". 
That's what I was trying to do with the final, unused, patch (in
__arm_main) - that fails because if main is thumb the address returned
does *not* have the low bit set.

It seems to me that uClibc should probably doing an indirect call rather
than a direct one - then the reference to main is a relocated STT_OBJECT,
not an actual STT_FUNC, and that is likely to remain more compatible with
the GNU ld.

I guess the answer to how to relocate STT_ARM_TFUNC is deep within
binutils bfd somewhere (of course, given that my glibc attempt went kaboom
it may not work.) 

---------------------------------------------------------------------- 
 vapier - 08-14-05 20:31  
---------------------------------------------------------------------- 
merged thumb-swi.patch, thumb-swp.patch, and the parts of
arm-thumb-defined.patch which dont touch ldso 

---------------------------------------------------------------------- 
 jbowler - 08-15-05 09:37  
---------------------------------------------------------------------- 
>From Joakim Tjernlund:

"Function symbols (those with type STT_FUNC) in shared object files have
special significance.
When another object file references a function from a shared object, the
link editor automatically
creates a procedure linkage table entry for the referenced symbol. Shared
object symbols with types
other than STT_FUNC will not be referenced automatically through the
procedure linkage table."

http://publibn.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixprggd/genprogc/binary_symtabSymb.htm

So just searching for the function types (STT_FUNC on most architectures,
STT_ARM_TFUNC too on ARM) should be safe and, with the move of the
_dl_strcmp to after the test on type, it should be a speed improvement. 

---------------------------------------------------------------------- 
 jocke - 08-15-05 09:48  
---------------------------------------------------------------------- 
Mind trying the this change to dl_find_hash instead?
Index: dl-hash.c
===================================================================
--- dl-hash.c	(revision 11157)
+++ dl-hash.c	(working copy)
@@ -184,14 +184,15 @@
 		for (si = tpnt->elf_buckets[hn]; si != STN_UNDEF; si =
tpnt->chains[si]) {
 			sym = &symtab[si];
 
-			if (type_class & (sym->st_shndx == SHN_UNDEF))
+			if (type_class & (sym->st_shndx == SHN_UNDEF ||
(ELF_ST_TYPE(sym->st_info) < STT_FUNC)))
 				continue;
 			if (_dl_strcmp(strtab + sym->st_name, name) != 0)
 				continue;
 			if (sym->st_value == 0)
 				continue;
-			if (ELF_ST_TYPE(sym->st_info) > STT_FUNC)
+			if (ELF_ST_BIND(sym->st_info) == STB_LOCAL) /* not sure if this is
needed */
 				continue;
+			/* what about COMMOM types ? */
 
 			switch (ELF_ST_BIND(sym->st_info)) {
 			case STB_WEAK: 

---------------------------------------------------------------------- 
 jbowler - 08-15-05 10:22  
---------------------------------------------------------------------- 
There's also a whole thread about STT_ARM_TFUNC which basically says 'get
rid of it', see, for example
http://sources.redhat.com/ml/binutils/2004-11/msg00290.html and this
comment from the patch:

+  /* We convert STT_ARM_TFUNC symbols into STT_FUNC with the low bit
+     of the address set, as per the new EABI.  We do this
unconditionally
+     because objcopy does not set the elf header flags until after
+     it writes out the symbol table.  */
+  if (ELF_ST_TYPE (src->st_info) == STT_ARM_TFUNC)


OE is using binutils 2.16 in general, but for uclibc uses a 2.14 variant
because of 'bitrot in the uclibc patches' - i.e. the 2.14 patches for
uclibc don't work with 2.16 - so since this is an EABI change it may be
necessary to upgrade to 2.16 

---------------------------------------------------------------------- 
 jbowler - 08-15-05 11:16  
---------------------------------------------------------------------- 
The STT_ARM_TFUNC hackery is in binutils 2.16, but gcc 3.4.4 is still
generating objects using STT_ARM_TFUNC.  The are no references to
STT_ARM_TFUNC in unpatched uclibc 0.9.27, therefore there is no thumb func
handling in uclibc (but there is handling for the thumb relocations)

I'll probably come up with a better patch which does the same thing as
binutils 2.16 - i.e. set the low bit in an STT_ARM_TFUNC symbol value and
convert it to STT_FUNC, then check the low bit in elfinterp.c (which is, I
believe, the only place in uclibc which hacks processor specific ELF).

This affects the dll lookup stuff only - all the asm patches (i.e. changes
to the assembler code) are, so far as I can see, good, it's just the stuff
in elfinterp.c which needs more work.

Of course the current binutils being used in OE is probably broken too,
but that's a separate problem. 

---------------------------------------------------------------------- 
 jocke - 08-15-05 11:17  
---------------------------------------------------------------------- 
Using buildroot you should be able to use a never(2.16) binutils 

---------------------------------------------------------------------- 
 jbowler - 09-08-05 22:25  
---------------------------------------------------------------------- 
I've verified the fixes in 0.9.28, they all look fine.  I've rearranged the
other fixes (the ones not in 0.9.28) to isolate the ARM only changes from
the resolver change and to minimise the impact on the shared code
(dl-hash.c).  I still find it necessary to change dl-hash.c because the
code to handle STT_ARM_TFUNC needs to know the symbol type (the function
in dl-hash.c just returns the value).  I've verified this stuff to the
level of having a full NSLU2 build with only libgcc and uclibc build using
arm as opposed to thumb functions - everything looks fine.  (And
libgcc/uclibc may work fine too, I just haven't tested that yet.)

The three additional patch sets will be attached:

# These patches should be applied in this order to the SVN head
# (in fact order is probably unimportant, I believe there are no
# overlaps).
thumb-defined-arm-or-thumb.patch
thumb-mov-pc-bx.patch
thumb-resolve.patch 

---------------------------------------------------------------------- 
 jbowler - 09-08-05 22:27  
---------------------------------------------------------------------- 
These patches are to the SVN head (20050908).
thumb-resolve.patch contains the only change to a non-arm-specific file 

---------------------------------------------------------------------- 
 jocke - 09-09-05 02:22  
---------------------------------------------------------------------- 
Just looked at the dl-hash.c stuff:

Don't quite understand why you need the STT_ARM_TFUNC here. Didn't
newer binutils already encode the STT_ARM_TFUNC as STT_FUNC 
toghter with sym_val | 1?
If this is to support older binutiles, please don't. Make
newer binutils a requirement for Thumb instead.

In case STT_ARM_TFUNC must be there with never binutils:
I think you can remove the 
 if (ELF_ST_TYPE(sym->st_info) > STT_FUNC) 
test, or replace it with 
 if (ELF_ST_BIND(sym->st_info) == STB_LOCAL)

Not sure moving the _dl_strcmp is a good idea since this test
will mostly be true and the if (sym->st_value == 0) will mostly
be false which will make dl_find_hash execute sym->st_value == 0 test
needlessly.

Please don't use #ifdefs, do a ARCH_<insert a good name here>(tpnt, sym)
macro for all archs instead. 

---------------------------------------------------------------------- 
 jbowler - 01-20-06 22:18  
---------------------------------------------------------------------- 
This is an updated patch set against svn 20060120 (i.e. the head on that
date).  The tested patches are (in this order):

thumb-defined-arm-or-thumb.patch
thumb-mov-pc-bx.patch
thumb-swi-r7.patch
thumb-sysnum-h.patch
thumb-asm-swi.patch
thumb-call-via-rx.patch
thumb-resolve.patch

thumb-resolve dl-hash.c may be unnecessary except that the weak symbol
handling is required (it is actually used for some DLL stuff - not sure
uClibc works without it).  I haven't tested this. 

Issue History 
Date Modified   Username       Field                    Change               
====================================================================== 
08-13-05 22:51  jbowler        New Issue                                    
08-13-05 22:51  jbowler        Status                   new => assigned     
08-13-05 22:51  jbowler        Assigned To               => uClibc          
08-13-05 22:51  jbowler        File Added: thumb-swi.patch                    
08-13-05 22:52  jbowler        File Added: thumb-swp.patch                    
08-13-05 22:52  jbowler        File Added: arm-thumb-defined.patch              
     
08-13-05 22:53  jbowler        File Added: thumb-ldso-dlboot.patch              
     
08-13-05 22:53  jbowler        File Added: thumb-interwork-asm.patch            
       
08-13-05 22:54  jbowler        File Added: thumb-static-main.patch              
     
08-14-05 06:21  vapier         Note Added: 0000409                          
08-14-05 06:21  vapier         Summary                  ARM thumb (including
interworking) support is non-functional => add support for ARM thumb
08-14-05 08:05  jocke          Note Added: 0000410                          
08-14-05 08:54  jbowler        Note Added: 0000411                          
08-14-05 10:26  jocke          Note Added: 0000412                          
08-14-05 12:48  jbowler        Note Added: 0000413                          
08-14-05 13:32  jocke          Note Added: 0000414                          
08-14-05 16:16  jbowler        Note Added: 0000415                          
08-14-05 20:31  vapier         Note Added: 0000417                          
08-15-05 09:37  jbowler        Note Added: 0000419                          
08-15-05 09:48  jocke          Note Added: 0000420                          
08-15-05 10:22  jbowler        Note Added: 0000421                          
08-15-05 11:16  jbowler        Note Added: 0000422                          
08-15-05 11:17  jocke          Note Added: 0000423                          
09-08-05 22:25  jbowler        Note Added: 0000509                          
09-08-05 22:25  jbowler        File Added: thumb-defined-arm-or-thumb.patch     
              
09-08-05 22:26  jbowler        File Added: thumb-mov-pc-bx.patch                
   
09-08-05 22:26  jbowler        File Added: thumb-resolve.patch                  
 
09-08-05 22:27  jbowler        Note Added: 0000510                          
09-09-05 02:22  jocke          Note Added: 0000511                          
01-20-06 22:18  jbowler        Note Added: 0000967                          
======================================================================




More information about the uClibc-cvs mailing list