[PATCH] Fix nextafterf

Mon Feb 9 09:58:30 UTC 2009

Denys Vlasenko wrote:
> On Friday 06 February 2009 03:39, Jie Zhang wrote:
>>>> A patch is attached. This patch copies nextafterf from glibc. Tested on 
>>>> Blackfin with the above test case. If it's OK, please apply. Thanks.
>>> Why the patch touches libm/s_nextafter.c (the double version)?
>>> Your comment says nothing about it.
>>>
>> Sorry for not explain that.
>>
>> math_opt_barrier and math_force_eval were added in glibc to fix glibc bug:
>>
>> http://sources.redhat.com/bugzilla/show_bug.cgi?id=3306
>>
>> When I copied nextafterf, I noticed this. So I modified nextafter, too.
> 
> Ok, I took a look.
> 
> There is optimized version of them for i386 and x86_64.
> You didn't use that.
> 
I cannot find them in glibc CVS HEAD. Maybe you are referring to the 
long double versions of i386 and x86_64.

> Also, glibc didn't bother to explain what these macros do.
> I would like to do better than that, I want people
> to be able to hack on uclibc not in a "blind" mode.
> Code readability needs to be improved.
> 
Maybe glibc developers thought this piece of code is obvious.

> I added this:
> 
> /*
>  * math_opt_barrier(x): force expression x to be evaluated and put into
>  * a floating point register or memory. This macro returns the value.
>  *
>  * math_force_eval(x): force expression x to be evaluated and put into
>  * a floating point register or memory *of the appropriate size*.
>  * This forces floating point flags to be set correctly
>  * (for example, when float value is overflowing, but FPU registers
>  * are wide enough to "hide" this).
>  */
> 
> Is it correct?
> 
Not completely.

math_opt_barrier(x) prevents compiler doing optimization across it.

math_force_eval(x) prevents compiler optimizing away calculation of x.

> 
> Now, let's see how they are used in nextafterf.
> 
>         if (ix == 0) { /* x == 0? */
>                 float u;
>                 /* return +-minsubnormal */
>                 SET_FLOAT_WORD(x, (hy & 0x80000000) | 1);
>                 u = math_opt_barrier(x);
>                 u = u * u;
>                 math_force_eval(u); /* raise underflow flag */
>                 return x;
> 
> Ok, we constructed smallest possible subnormal in x.
> Why do we load it through math_opt_barrier()?

As stated above, prevent compiler doing optimization.

> Why we multiply it by itself?

To raise the underflow flag.

> Can't we just math_force_eval(x)?

This will not work.

> which tells me that  /* raise underflow flag */ comment
> is on the wrong line! _multiplication_ is the operation which sets
> underflow flag, FST instruction never sets it (I just checked
> the manual). In all cases, FST instruction generated here
> is useless, it does no useful work whatsoever.
> 
It's not in the wrong line completely. Without math_force_eval, the 
underflow fag might not be set because of compiler optimization. You may 
like comments as below

     u = math_opt_barrier(x); /* don't optimize across this */
     u = u * u;               /* raise underflow flag */
     math_force_eval(u);      /* force evaluation of the above expr */

But doesn't this looks a little wordy?
> 
>>> You also did not remove the wrapper from float_wrappers.c
>>>
>> I indeed removed it in my patch.
> 
> Ah, now I see. sorry.
> 
No problem.

> 
>>> I think math_opt_barrier macro should be either eliminated
>>> or moved to its sole user, libm/s_nextafterf.c
>>>
>> nextafter also needs them. But if uClibc does not care of the spec on 
>> this corner, they can be removed from both.
> 
> Btw, can you give an example C code which demonstrates
> how math_opt_barrier() and math_force_eval() are resulting
> in correct implemetation. So far it sounds somewhat nebulous.
> 
I attached a test case in this email, which is adapted from the test case in

http://sources.redhat.com/bugzilla/show_bug.cgi?id=3306

Put all three files in the same directory.

$ gcc -o test test.c -O2 -lm -DMATH_OPT_BARRIER -DMATH_FORCE_EVAL
$ ./test
fetestexcept: 30
FE_UNDERFLOW: 10
FE_INEXACT: 20

Drop -DMATH_OPT_BARRIER or -DMATH_FORCE_EVAL or all of them will get us:

$ ./test
fetestexcept: 0
FE_UNDERFLOW: 10
FE_INEXACT: 20

> IOW: do you have a testcase which checks that nextafter[f]
> sets these exception flags and whatnot?

I think the test code in

http://sources.redhat.com/bugzilla/show_bug.cgi?id=3306

is a good one.

Regards,
Jie