This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Fix localedef collation handling of <U0000> (bug 15948)


As described in bug 15948, if a locale file has a collation entry for
the NUL character <U0000>, ld-collate.c sets up a zero-length
wide-character sequence L"\0" for it.  Later code is unprepared to
handle nwcs == 0 and, as a result, allocates too little memory in the
following code:

              added = (1 + 1 + runp->nwcs - 1) * sizeof (int32_t);
              if (sizeof (int) == sizeof (int32_t))
                obstack_make_room (atwc.extrapool, added);

              obstack_int32_grow_fast (atwc.extrapool, weightidx);
              obstack_int32_grow_fast (atwc.extrapool, runp->nwcs - 1);
              for (i = 1; i < runp->nwcs; ++i)
                obstack_int32_grow_fast (atwc.extrapool, runp->wcs[i]);

as can be seen by adding "assert (runp->nwcs > 0);" before that code,
resulting in errors from "make localedata/install-locales" (first in
ar_SA.UTF-8).

This patch causes such a sequence to be treated as length 1 instead.
(This is the last substantive localedef change I'm expecting to merge
from the cross-localedef changes in EGLIBC; there are still a few
changes to types I need to investigate to work out whether they are
good to merge or not, and I still need to do something about page
sizes but that's a new change rather than any part of the pre-existing
cross-localedef changes.  As of 2006, the bad memory allocation
actually caused problems in some cases together with cross-localedef
support for different int32_t alignment.  While int32_t alignment is
being dealt with in
<https://sourceware.org/ml/libc-alpha/2013-09/msg00432.html> by
eliminating the architecture variation in the locale file format,
rather than by adding a new option to localedef, this memory handling
still seems like a clear localedef bug that should be fixed even if
it's hard to trigger actual failures from it.)

Tested x86_64.

2013-09-12  Richard Sandiford  <richard@codesourcery.com>

	[BZ #15948]
	* locale/programs/ld-collate.c (new_element): Handle <U0000> as a
	single character.

diff --git a/locale/programs/ld-collate.c b/locale/programs/ld-collate.c
index c4d7e3d..1c622a1 100644
--- a/locale/programs/ld-collate.c
+++ b/locale/programs/ld-collate.c
@@ -348,6 +348,9 @@ new_element (struct locale_collate_t *collate, const char *mbs, size_t mbslen,
     {
       size_t nwcs = wcslen ((wchar_t *) wcs);
       uint32_t zero = 0;
+      /* Handle <U0000> as a single character.  */
+      if (nwcs == 0)
+	nwcs = 1;
       obstack_grow (&collate->mempool, wcs, nwcs * sizeof (uint32_t));
       obstack_grow (&collate->mempool, &zero, sizeof (uint32_t));
       newp->wcs = (uint32_t *) obstack_finish (&collate->mempool);


-- 
Joseph S. Myers
joseph@codesourcery.com

