[PATCH 0/8] ARC updates to uClibc

Vineet Gupta Vineet.Gupta1 at synopsys.com
Wed Feb 18 05:51:17 UTC 2015


On Monday 16 February 2015 08:34 PM, Bernhard Reutner-Fischer wrote:
>> While at it I also did some arch specific adjustment in sigaction path
>> - inlining the rt_sigaction syscall stub detour to reduce branch return
>> stack mispredicts etc - which is what 6/8 does !
> This sounds suspicious.
> IIRC we already had that argument, last time around _dl_do_reloc and _dl_do_lazy_reloc.
> Could it be that your port has a bug here ( missed optimisation ) around ifunc handling? Sounds like back then on ARM https://gcc.gnu.org/PR40887#c6
> 
> What am I missing?


I don't think my use-case is close to the ARM issue you pointed to above, as there
is no ifunc or function pointer involved.

With the original code, we get 2 function calls on ARC:

0000b504 <__libc_sigaction>:
    b504:	push_s     blink
    b506:	sub_s      sp,sp,12
    b508:	bl.d       36b20 <__st_r13_to_r15>
...

    b540:	bl.d       b750 <__syscall_rt_sigaction>   <--- DIRECT CALL
    b544:	mov_s      r3,8
    b546:	add_s      sp,sp,20
    b548:	mov_s      r12,12
    b54a:	b          36b88 <__ld_r13_to_r15_ret>
    b54e:	nop_s

0000b750 <__syscall_rt_sigaction>:
    b750:	mov        r8,134
    b754:	swi                                <---- SYSCALL TRAP INTO KERNEL
    b758:	cmp        r0,0xfffffc00
    b75c:	bls_s      b76a
    b75e:	st.a       blink,[sp,-4]
    b762:	bl         b550 <__syscall_error>
    b766:	ld.ab      blink,[sp,4]
    b76a:	j_s        [blink]
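
The tail of the stub above (cmp r0,0xfffffc00 / bls_s) is the usual kernel
return-value check. A minimal C sketch of what it amounts to, assuming
__syscall_error stores the negated value in errno and returns -1 (the helper
name syscall_ret_to_libc is hypothetical, not a uClibc symbol):

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical helper mirroring "cmp r0,0xfffffc00 ; bls_s <skip>".
 * r0 is the raw 32-bit value the kernel left in r0 after the trap. */
static int32_t syscall_ret_to_libc(uint32_t r0)
{
    /* bls = unsigned "lower or same": anything up to 0xfffffc00 is success */
    if (r0 <= 0xfffffc00u)
        return (int32_t)r0;

    /* Assumed behaviour of __syscall_error: values above the threshold
     * encode -errno, so negate, store in errno and return -1. */
    errno = (int)(0u - r0);
    return -1;
}
```

For example, a raw return of (uint32_t)-22 would come back as -1 with errno
set to 22, while 0 passes through unchanged.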

A call to such a small function is not necessarily good micro-architecturally:
the return can mispredict because the call return stack has a limited number of
entries. That cost is amortized if the function is largish.

I do understand that these small syscall wrappers are a common uClibc design
pattern and exist all over the place, but given that this was all arch code I took
the liberty of removing the one hop, and the code now looks as below:

0000b4d8 <__libc_sigaction>:
    b4d8:	st.a       gp,[sp,-4]
    b4dc:	sub_s      sp,sp,20
    b4de:	add        gp,pcl,0x00065284
    b4e6:	breq_s     r1,0,b516
    b4e8:	ld_s       r3,[r1,4]
...
    b516:	mov        r8,134
    b51a:	mov_s      r3,8
    b51c:	swi
    b520:	cmp        r0,0xfffffc00
    b524:	bls_s      b532
    b526:	st.a       blink,[sp,-4]
    b52a:	bl         b53c <__syscall_error>
    b52e:	ld.ab      blink,[sp,4]
    b532:	ld.a       gp,[sp,20]
    b536:	j_s.d      [blink]
    b538:	add_s      sp,sp,4
    b53a:	nop_s
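
In C terms the change is roughly the difference between the two shapes below.
This is only a sketch, not the code from 6/8: raw_call_outofline,
raw_call_inline, pid_via_stub and pid_direct are made-up names, SYS_getpid is
used as a harmless stand-in for rt_sigaction, the libc syscall() function
stands in for the trap sequence that the real code emits inline, and the
attributes are GCC/Clang-specific.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Shape 1: separate stub. The caller has to branch-and-link to it and the
 * stub returns through blink, which is the extra call/return hop. */
__attribute__((noinline))
static long raw_call_outofline(long nr)
{
    return syscall(nr);
}

/* Shape 2: the same body forced inline, so it becomes part of the caller
 * and the hop (with its return-stack entry) disappears. */
static inline __attribute__((always_inline))
long raw_call_inline(long nr)
{
    return syscall(nr);
}

/* Caller using the out-of-line stub (like the original __libc_sigaction). */
long pid_via_stub(void)
{
    return raw_call_outofline(SYS_getpid);
}

/* Caller with the stub body folded in (like the patched version). */
long pid_direct(void)
{
    return raw_call_inline(SYS_getpid);
}
```

Both callers return the same value; only the number of call/return pairs on
the way to the kernel differs.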

-Vineet

