[PATCH] prevent retries on fclose/fflush after write errors

Rich Felker dalias at aerifal.cx
Thu Mar 15 00:18:28 UTC 2012


On Wed, Mar 14, 2012 at 06:13:03AM -0700, Michael Deutschmann wrote:
> On Tue, 13 Mar 2012, Rich Felker wrote:
> > I would consider this a flaw in the standard since it largely prevents
> > using EINTR in any useful way.
> 
> EINTR wasn't invented to be useful, it was invented because it was easier
> to implement in pre-sigaction() SysV kernels than SA_RESTART semantics.
> Known as the "PCLSR problem", it is an often cited example of the "Worse
> is Better" design philosophy at work.

It was cited in "Worse is Better", yes, but I don't entirely believe
the anecdote. SA_RESTART semantics are no harder to implement; it's
just a matter of whether you save the address of the syscall
instruction or the instruction immediately after it when invoking a
signal handler. (These days it's a little more complicated due to the
required semantics, but at the time when there was no standard for the
semantics, either behavior would have been equally easy to implement.)

> > I was assuming the old standard idiom of installing a do-nothing signal
> > handler (without SA_RESTART) for SIGALRM so that fgets would return with
> > an error and the program could proceed.
> 
> That's a broken idiom, since if the signal should arrive just one opcode
> before the read syscall begins, it will be ignored.  If you need reliable
> signal interruption, you must use sigsuspend() or longjmp out of the
> handler.

Agreed. A smarter version of the dumb idiom would call ualarm(100); or
so from the signal handler, but it's still dumb. The situation where
it's not dumb is where the signal is generated by a user at the
terminal pressing ^C or ^\ instead of a timer expiration.

> One historical note:
> 
> Circa 2000, I was actively reading glibc's bugs list and trying to help
> out.  Someone posted a bug report ("libc/1174" in the old GNATS system)
> citing this very issue, and I suggested a patch to restart after EINTR
> within stdio -- since any deliberate use of interruption would involve a
> race condition.

It can be solved (albeit in an ugly way) by having the signal handler
re-arm the alarm with exponential falloff in delay (in case the system
is so loaded that it can't return from the signal handler before
another timer expiration happens).

> Ulrich Drepper denied it, on the grounds that:
> ) But this is not correct.  You must be able to set an alarm() and
> ) terminate a blocked fwrite() code.  This is required and tested by all
> ) kinds of standard test suites.
> 
> Although that exchange eventually had one productive result -- the node
> "Error Recovery" in the glibc manual, which I basically wrote.
> 
> Sounds like in the intervening decade, glibc may have come around to my
> thinking....

Thankfully glibc is not the standard of correctness. :)

Even if your approach is preferable to users, I don't think it's
conformant, since POSIX specifies the EINTR error for fgetc. Since all
other stdio read operations are specified in terms of fgetc, and since
they're specified to stop early with an error if fgetc reaches EOF or
an error, it appears that a conformant implementation must treat EINTR
as a hard error whether it likes to or not.
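
If the implementation reports EINTR as an error, a caller that wants
restart semantics has to do the retry itself. A minimal sketch of
what that might look like follows; fgetc_retry() is a hypothetical
helper, not part of any libc and not something proposed in this
thread.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: like fgetc(), but retries when the failure
 * was EINTR. Any other error is left set on the stream. */
static int fgetc_retry(FILE *f)
{
    int c;

    for (;;) {
        errno = 0;
        c = fgetc(f);
        if (c != EOF || feof(f))
            return c;           /* got a byte, or genuine end of file */
        if (ferror(f) && errno == EINTR) {
            clearerr(f);        /* drop the sticky error indicator */
            continue;           /* and try the read again */
        }
        return EOF;             /* some other error */
    }
}
```

The clearerr() call matters: the error indicator is sticky, so code
that later checks ferror() would otherwise still see the stale EINTR
failure.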

Rich

