[uClibc 0000687]: utf-8 mbrtowc accepts invalid bytes

bugs at busybox.net
Mon Feb 6 08:59:54 UTC 2006


The following issue has been SUBMITTED. 
====================================================================== 
http://busybox.net/bugs/view.php?id=687 
====================================================================== 
Reported By:                rfelker
Assigned To:                uClibc
====================================================================== 
Project:                    uClibc
Issue ID:                   687
Category:                   Standards Compliance
Reproducibility:            always
Severity:                   minor
Priority:                   normal
Status:                     assigned
====================================================================== 
Date Submitted:             02-06-2006 00:59 PST
Last Modified:              02-06-2006 00:59 PST
====================================================================== 
Summary:                    utf-8 mbrtowc accepts invalid bytes
Description: 
According to section 3.9 of the Unicode Standard, UTF-8 is a mapping
between byte sequences and "Unicode scalar values", which are integers in
one of the ranges 0-0xd7ff or 0xe000-0x10ffff. The standard is clear that
UTF-8 sequences are one to four bytes in length. uClibc's mbrtowc accepts
the illegal lead bytes 0xf5-0xfd, which allows out-of-range 4-byte
sequences as well as 5- and 6-byte sequences, for code points up to
0x7fffffff.

Although the two standards conflicted in the past, my understanding is
that ISO/IEC 10646 now agrees that UCS codes go only up through 0x10ffff
and that UTF-8 is a 1-4 byte encoding, not 1-6 byte.
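
For illustration, the rules described above can be sketched as a small
strict decoder. This is not uClibc code and not a proposed patch; the
function name utf8_decode_strict and its interface are hypothetical,
chosen only to show which inputs a conforming decoder would reject
(lead bytes 0xf5 and above, surrogates, overlong forms, and anything
past 0x10ffff).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (an assumption for illustration, not uClibc's API):
 * decode one UTF-8 sequence from s[0..n). Returns the number of bytes
 * consumed (1-4) and stores the scalar value in *out, or returns 0 if
 * the input is invalid or truncated.
 */
static size_t utf8_decode_strict(const unsigned char *s, size_t n,
                                 uint32_t *out)
{
    unsigned char b0;
    size_t len, i;
    uint32_t cp, min;

    if (n == 0)
        return 0;
    b0 = s[0];

    if (b0 < 0x80) {                      /* ASCII, one byte */
        *out = b0;
        return 1;
    } else if (b0 >= 0xc2 && b0 <= 0xdf) {
        len = 2; cp = b0 & 0x1f; min = 0x80;
    } else if (b0 >= 0xe0 && b0 <= 0xef) {
        len = 3; cp = b0 & 0x0f; min = 0x800;
    } else if (b0 >= 0xf0 && b0 <= 0xf4) {
        len = 4; cp = b0 & 0x07; min = 0x10000;
    } else {
        /* 0x80-0xc1 and 0xf5-0xff never start a valid sequence. */
        return 0;
    }

    if (n < len)
        return 0;                         /* truncated input */

    for (i = 1; i < len; i++) {
        if ((s[i] & 0xc0) != 0x80)
            return 0;                     /* not a continuation byte */
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3f);
    }

    if (cp < min)
        return 0;                         /* overlong encoding */
    if (cp >= 0xd800 && cp <= 0xdfff)
        return 0;                         /* surrogate, not a scalar value */
    if (cp > 0x10ffff)
        return 0;                         /* beyond the Unicode range */

    *out = cp;
    return len;
}
```

With this sketch, the 5-byte sequence f8 88 80 80 80 and the 6-byte
sequence fc 84 80 80 80 80 are both rejected at the first byte, while
f4 8f bf bf still decodes to 0x10ffff.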

====================================================================== 

Issue History 
Date Modified   Username       Field                    Change               
====================================================================== 
02-06-06 00:59  rfelker        New Issue                                    
02-06-06 00:59  rfelker        Status                   new => assigned     
02-06-06 00:59  rfelker        Assigned To               => uClibc          
======================================================================



