This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.



Re: codeset problems in wprintf and wcsftime


On Mar 31 22:57, Andy Koppe wrote:
> On 2 March 2010 20:47, Corinna Vinschen:
> >> - Add another flag __HAVE_WCHAR_LOCALE_INFO__ or something.  Used in
> >>   vfwprintf and wcsftime it would use different code for targets
> >>   which only have the multibyte locale info and targets which have
> >>   also the wide char representation.
> >
> > Attached is a patch which does that and more.  I do not intend to apply
> > this patch before the next Cygwin release 1.7.2 is out (which will be
> > RSN), but I wanted to show what I have, so you guys can make a sanity
> > check on the new code if you're interested in this stuff.
> >
> > Here's what the patch does, in random order:
> >
> > - Add a new macro called __HAVE_LOCALE_INFO_EXTENDED__ which guards
> >   (almost) all of the new locale definitions.
> >
> > - Add two files locale/lctype.[ch], which define a basic LC_CTYPE
> >   datastructure and functions as in the other categories.
> >
> >   This adds two features.  One of them is for all targets.
> >
> >   - The global variables lc_ctype_charset, __mb_cur_max, and
> >     lc_message_charset are no longer used on targets defining
> >     __HAVE_LOCALE_INFO__.  The reason is that global variables disallow
> >     to define thread-local locales.  These are a GNU extension which
> >     have been added to the latest POSIX-1.2008 standard, see
> >     uselocale(3).
> >
> >     The replacement are codeset and mb_cur_max members in the LC_CTYPE
> >     structure, and codeset in the LC_MESSAGES structure.
> >
> >     This affects especially the MB_CUR_MAX definition in stdlib.h.  It's
> >     not any longer just __mb_cur_max, rather it calls the new
> >     __locale_mb_cur_max() function.
> 
> I'm concerned about the level of indirection that this entails.
> 
> Leaving out the ifdef's, we've got:
> 
> int
> _DEFUN_VOID(__locale_mb_cur_max)
> {
>   return __get_current_ctype_locale ()->mb_cur_max;
> }
> 
> Actually, I think there's a bug here: __get_current_ctype_locale()
> returns a pointer to lc_ctype_T struct, and the mb_cur_max field in
> that is of type 'const char *'. Looking at the code in
> __ctype_load_locale(), this is intentional, because apparently it's
> meant to point into a char array pointed to by _ctype_locale_buf. So I
> think there's a '*' missing in the function above.

Yes, you're right.  That should have been

  return __get_current_ctype_locale ()->mb_cur_max[0];

I've fixed that locally.
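
To illustrate the fix, here is a minimal sketch, not the actual newlib
code.  The names demo_ctype, demo_ctype_buf and demo_locale_mb_cur_max
are made up for this example; only the 'const char *' member type and
the buffer layout (charset name, NUL, one byte holding mb_cur_max,
trailing NUL) follow what is discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for lc_ctype_T: both members are string
   pointers into one shared buffer. */
struct demo_ctype
{
  const char *codeset;     /* charset name, NUL-terminated */
  const char *mb_cur_max;  /* points at ONE byte holding the value */
};

/* Assumed layout: "ASCII", NUL, the byte value 1, trailing NUL. */
static char demo_ctype_buf[] = { 'A', 'S', 'C', 'I', 'I', '\0', 1, '\0' };

static const struct demo_ctype demo_ctype_locale = {
  demo_ctype_buf,      /* codeset starts at offset 0 */
  demo_ctype_buf + 6   /* mb_cur_max byte follows the name's NUL */
};

int
demo_locale_mb_cur_max (void)
{
  assert (demo_ctype_locale.mb_cur_max != NULL);
  /* Returning the member itself would yield a pointer converted to
     int; the value lives in the first byte it points to. */
  return demo_ctype_locale.mb_cur_max[0];
}
```

With this layout, returning the member without the [0] hands back an
address rather than the stored count, which is the bug pointed out
above.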

> But I'm puzzled about why that indirection into _ctype_locale_buf is
> there in the first place. Why not just store mb_cur_max directly in
> the lc_ctype_T struct?

No, that spoils the way the data is read from locale files.  For
Cygwin that would be no problem, but it doesn't match the way the
__part_load_locale function works.

> The __ctype_locale_buf also holds the name of the selected charset,
> and I don't like the charset and mb_cur_max being mixed up in that
> way, and I think this is really ugly:
> 
> /* Max encoding_len + NUL byte + 1 byte mb_cur_max plus trailing NUL byte */
> #define _CTYPE_BUF_SIZE	34
> static char _ctype_locale_buf[_CTYPE_BUF_SIZE];
> 
> What's the idea behind that? And indeed, why not store the codeset
> string directly in an array in lc_ctype_T?

Again, that's how __part_load_locale works.  Compare with BSD.
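
As a rough sketch of why the members end up as string pointers into a
single buffer: a BSD-style loader reads the locale data into one
buffer, cuts it into lines in place, and points each member at the
start of its line.  The code below is a simplified illustration under
that assumption, not the real __part_load_locale; demo_ctype,
demo_split_lines and demo_load are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for lc_ctype_T. */
struct demo_ctype
{
  const char *codeset;
  const char *mb_cur_max;
};

/* Split buf in place at each '\n' and record where every line starts.
   Returns the number of lines stored in fields[]. */
static size_t
demo_split_lines (char *buf, const char **fields, size_t max_fields)
{
  size_t n = 0;
  char *p = buf;

  while (n < max_fields && *p != '\0')
    {
      char *nl;

      fields[n++] = p;
      nl = strchr (p, '\n');
      if (nl == NULL)
        break;
      *nl = '\0';       /* terminate this line inside the buffer */
      p = nl + 1;
    }
  return n;
}

/* Fill dst with pointers into buf.  buf must outlive dst. */
int
demo_load (struct demo_ctype *dst, char *buf)
{
  const char *fields[2];

  assert (dst != NULL && buf != NULL);
  if (demo_split_lines (buf, fields, 2) != 2)
    return -1;
  dst->codeset = fields[0];
  dst->mb_cur_max = fields[1];
  return 0;
}
```

Because every member is filled the same way, a value such as
mb_cur_max is carried as a one-byte "string" rather than as an int
member.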

> Also, in __get_current_ctype, there's this:
> 
> struct lc_ctype_T *
> __get_current_ctype_locale(void) {
> 	return (_ctype_using_locale
> 		? &_ctype_locale
> 		: (struct lc_ctype_T *)&_C_ctype_locale);
> }
> 
> Checking whether we're in the "C" locale every single time seems
> rather inefficient. Wouldn't it be better to initialise _ctype_locale
> with the contents of _C_ctype_locale and later copy it in again when the
> C locale is selected?

That's exactly how it's implemented in the other files, like 
lmonetary.c, etc.  There is no comparison with the "C" locale, rather
it just depends on the _ctype_using_locale variable.  The idea (from BSD)
is that the function always returns a valid structure, even while the
__part_load_locale function is changing the contents of _ctype_locale_buf.
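
The flag pattern can be sketched roughly as follows.  This is a
simplified illustration, not the newlib or BSD source: all demo_*
names are hypothetical, the loader takes ready-made strings instead of
reading a locale file, and no claim about thread safety is intended.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for lc_ctype_T. */
struct demo_ctype
{
  const char *codeset;
  const char *mb_cur_max;
};

/* Constant fallback data for the "C" locale. */
static const struct demo_ctype demo_C_ctype = { "ASCII", "\1" };

/* Data filled in by the loader, plus the flag that selects it. */
static struct demo_ctype demo_loaded_ctype;
static int demo_using_locale;

/* No comparison against "C" here: only the flag is tested. */
const struct demo_ctype *
demo_get_current_ctype (void)
{
  return demo_using_locale ? &demo_loaded_ctype : &demo_C_ctype;
}

/* Simplified loader: the strings passed in must outlive the locale. */
int
demo_load_ctype (const char *name, const char *codeset,
                 const char *mb_cur_max)
{
  assert (name != NULL);
  if (strcmp (name, "C") == 0 || strcmp (name, "POSIX") == 0)
    {
      demo_using_locale = 0;   /* fall back to the constant data */
      return 0;
    }
  if (codeset == NULL || mb_cur_max == NULL)
    return -1;
  demo_loaded_ctype.codeset = codeset;
  demo_loaded_ctype.mb_cur_max = mb_cur_max;
  demo_using_locale = 1;       /* switch over only after filling */
  return 0;
}
```

In this sketch the getter costs one flag test per call, and switching
back to the "C" locale needs no copying because the constant
structure is always available.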


Corinna

-- 
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat

