UTF-8 runtime environment

Brian Clarke brian.clarke at oxsemi.com
Wed Mar 7 11:12:58 UTC 2007


I'm wondering whether the segmentation faults occur because the locale 
data generated on my x86 host system is in some way incompatible with 
the ARM target.

The extra/locale/README includes the following statement:

NOTE: While its possible to use this stuff for native != target arch,
you'll have to either write a converter to account for endianess and
struct padding issues, or run the mmap file generator on your target
arch.  But all these programs will be rewritten at some point.

I'm guessing that the "mmap file generator" has something to do with 
extra/locale/gen_mmap.c, but have yet to work out how to build or run 
this on my ARM target.

I found this quote when googling:

mjn3: this is what I do to build uclibc 1) I build for my target 
extra/locale/gen_mmap, 2) then I run it and produce locale.mmap and copy 
it back to extra/locale/ 3) I run make in that dir. 4) do a regular 
uclibc build.    So, I don't know how to figure out what locales I have 
selected

which would also suggest that gen_mmap has to be run on the target 
system, but I haven't been able to work out how to build gen_mmap.c into 
an executable yet. Can anyone provide any pointers?

Brian.

Brian Clarke wrote:

>Hi Todd,
>
>I've rebuilt my system from scratch using the pre-generated locales and 
>got a little further.
>
>Running the tst_nl_langinfo program without any arguments I now get 
>output, but the CODESET is still ASCII and the strings printed for the 
>various locale settings all appear corrupt.
>
>If I specify any valid locale to the test program, e.g. en_US.UTF-8, 
>then I get a segmentation fault from the C library. I also found that if 
>I try to set the LC_ALL environment variable to a valid locale, my bash 
>login shell dies and I get back to the login prompt.
>
>I'm currently rebuilding with debug symbols in uClibc to see if I can 
>work out where things are going wrong.
>
>Brian.
>
>rockwell618 at gmail.com wrote:
>
>>Hi Brian,
>>
>>No, to date I see no evidence of locale support on the ARM.  The
>>behaviors of the pre- and post-locale-enabled builds are identical.
>>In both cases I get illegal byte sequence errors whenever my
>>application encounters multi-byte chars.  (Everything still works fine
>>for chars in the ASCII range; i.e., single-byte chars.)
>>
>>I recently made config changes to the kernel, busybox, and to uClibc
>>-- my problems may be related to some interaction effects, I don't
>>know.  I am in the process of rolling back other changes in order to
>>isolate locale-related changes.
>>
>>I'll try again with a clean build and let you know if anything changes
>>for the better.
>>
>>Todd
>>
>>
>>On 2/23/07, Brian Clarke <brian.clarke at oxsemi.com> wrote:
>>
>>>Hi Todd,
>>>
>>>I've been using a non-locale Buildroot/uClibc ARM based system for many
>>>months now without problems.
>>>
>>>I too am now trying to build uClibc locale support into a Buildroot
>>>built root filesystem with the aim of being able to process UTF-8
>>>encoded multibyte strings.
>>>
>>>I selected Locale support from the Buildroot menus, which appears to
>>>select downloading of the pre-generated locale data via Buildroot's
>>>toolchain/uClibc/uClibc.config-locale config file for uClibc.
>>>
>>>I do not seem to get any functioning locale support. For example, the
>>>date command from Busybox returns a corrupt string (ÓÈÊ  sön  1
>>>00:54:37 UTC 2006), and if I compile and run the tst_nl_langinfo.c locale
>>>test program from uClibc's extra/locale directory, I get "couldn't set
>>>locale" for any locale I try. That includes not specifying a locale at
>>>all, which I believe should select the default locale.
>>>
>>>Do you see any evidence on your ARM system that there is any functioning
>>>locale support present?
>>>
>>>Brian.
>>>
>>>rockwell618 at gmail.com wrote:
>>>
>>>>Hi,
>>>>
>>>>I am trying to enable support for the en_US.UTF-8 locale in an
>>>>ARM/Linux 2.6.18 environment, using uClibc-0.9.28.
>>>>
>>>>Using Buildroot, I compiled uClibc after selecting the String and
>>>>Stdio Support->Wide Character Support options: "Locale Support,"
>>>>"Use Pre-generated Locale Data," and "Automatically Download the
>>>>Pre-generated Locale Data (if necessary)."
>>>>
>>>>My problem is that I do not have a sense for how the runtime
>>>>environment should be configured on the target device so that the
>>>>locale support can actually be used by my application.  For instance,
>>>>as with glibc, are the LANG and LC_* env vars used? Should
>>>>/usr/lib/locale be created to contain the locale data?  If not, where
>>>>is the "Pre-generated Locale Data" supposed to reside on the target
>>>>device -- or is it actually built into uClibc?   It's all been
>>>>documented I'm sure but I've overlooked it somehow.
>>>>
>>>>I would greatly appreciate any pointers to documentation (or other
>>>>guidance) on the proper runtime environment for uClibc UTF-8 support.
>>>>
>>>>Thanks,
>>>>Todd
>>>>_______________________________________________
>>>>uClibc mailing list
>>>>uClibc at uclibc.org
>>>>http://busybox.net/cgi-bin/mailman/listinfo/uclibc




