This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: strxfrm and man


On Wed, Nov 08, 2006 at 11:36:01AM +0300, Nikita Shulga wrote:
> The strxfrm man page says that the return value is the number of bytes
> required to store the transformed string, excluding the terminating '\0'
> character.
> But this is not always true: if the third argument is less than the number
> of bytes required to store the result and the locale is, for example,
> "en_US.utf8", then the return value is the length of the transformed
> string including the terminating character. If the locale is C or POSIX,
> it behaves as described in the man page.
> For example, strxfrm(NULL,"a",0) != strxfrm(buf,"a",10) for the
> "en_US.utf8" locale, but the return values are equal if the locale is "C".
> 
> Do you think it's OK, or should a bug report be filed in glibc bugzilla?

This is caused by a glibc strxfrm optimization that removes
a trailing \1, but only when the third argument is big enough.
The current glibc behavior looks like a bug to me: neither the ISO C99 nor
the POSIX wording seems to allow returning different values depending on
what third argument was passed (as long as the source string
is identical and the locale is the same too).
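
The comparison from the report can be written as a small helper. This is
only a sketch: the name strxfrm_len_consistent and the 256-byte buffer are
assumptions made for illustration, not part of glibc or of the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: return 1 if strxfrm reports the same length for
   SRC when called as a pure size query (NULL, 0) and when called with a
   real buffer, 0 otherwise.  The buffer size is an arbitrary choice; for
   very long inputs both calls would be "buffer too small" calls.  */
static int
strxfrm_len_consistent (const char *src)
{
  char buf[256];

  assert (src != NULL);
  size_t probe = strxfrm (NULL, src, 0);        /* size query only */
  size_t full = strxfrm (buf, src, sizeof buf); /* actual transformation */
  return probe == full;
}
```

To try it under the locale from the report, call
setlocale (LC_COLLATE, "en_US.utf8") from <locale.h> first and check that it
did not return NULL, since that locale may not be installed. In the "C"
locale the two values are expected to match, as the report says.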

The following patch should fix it, by checking the length of the
last rule's additions instead of checking whether the last char before
'\0' is '\1'.
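
To show why the length-based test does not depend on the buffer size, here
is a toy model of the bookkeeping. It is not glibc code: toy_key, its
parameters and the use of plain strings as per-pass weights are all
assumptions for illustration. It only mimics the pattern of counting every
byte in "needed" while storing at most n of them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, not glibc code.  Each string in PASSES stands
   for the weight bytes one collation pass emits.  Passes are separated
   by \1 and the key ends with \0.  At most N bytes are stored in DEST,
   but NEEDED counts every byte, including the terminator.  */
static size_t
toy_key (char *dest, size_t n, const char *const *passes, size_t npasses,
         int check_by_length)
{
  size_t needed = 0, last_needed = 0;

  assert (npasses > 0);
  for (size_t pass = 0; pass < npasses; ++pass)
    {
      last_needed = needed;
      for (const char *p = passes[pass]; *p != '\0'; ++p)
        {
          if (needed < n)
            dest[needed] = *p;
          ++needed;
        }
      /* Separate the passes or terminate the key.  */
      if (needed < n)
        dest[needed] = pass + 1 < npasses ? '\1' : '\0';
      ++needed;
    }

  if (check_by_length)
    {
      /* Length-based test: the last pass added only the terminator.
         The shortened key is terminated whenever it fits in N bytes.  */
      if (needed > 2 && needed == last_needed + 1)
        if (--needed <= n)
          dest[needed - 1] = '\0';
    }
  else
    {
      /* Content-based test: reads DEST, so it is skipped when the key
         did not fit.  */
      if (needed <= n && needed > 2 && dest[needed - 2] == '\1')
        {
          --needed;
          dest[needed - 1] = '\0';
        }
    }
  return needed - 1;
}
```

With passes { "ab", "" }, the content-based variant returns 3 for
toy_key (NULL, 0, ...) but 2 when given a 16-byte buffer, while the
length-based variant returns 2 in both cases.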

2006-11-08  Jakub Jelinek  <jakub@redhat.com>

	* string/strxfrm_l.c (STRXFRM): Do the trailing \1 removal
	optimization even if needed > n.

--- libc/string/strxfrm_l.c.jj	2005-10-15 22:49:18.000000000 +0200
+++ libc/string/strxfrm_l.c	2006-11-08 22:18:38.000000000 +0100
@@ -1,4 +1,5 @@
-/* Copyright (C) 1995,96,97,2002, 2004, 2005 Free Software Foundation, Inc.
+/* Copyright (C) 1995, 1996, 1997, 2002, 2004, 2005, 2006
+   Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Written by Ulrich Drepper <drepper@gnu.org>, 1995.
 
@@ -95,7 +96,7 @@ STRXFRM (STRING_TYPE *dest, const STRING
   const USTRING_TYPE *extra;
   const int32_t *indirect;
   uint_fast32_t pass;
-  size_t needed;
+  size_t needed, last_needed;
   const USTRING_TYPE *usrc;
   size_t srclen = STRLEN (src);
   int32_t *idxarr;
@@ -197,6 +198,7 @@ STRXFRM (STRING_TYPE *dest, const STRING
 	 this is true for all of them.  */
       int position = rule & sort_position;
 
+      last_needed = needed;
       if (position == 0)
 	{
 	  for (idxcnt = 0; idxcnt < idxmax; ++idxcnt)
@@ -426,11 +428,11 @@ STRXFRM (STRING_TYPE *dest, const STRING
      a `position' rule at the end and if no non-ignored character
      is found the last \1 byte is immediately followed by a \0 byte
      signalling this.  We can avoid the \1 byte(s).  */
-  if (needed <= n && needed > 2 && dest[needed - 2] == L('\1'))
+  if (needed > 2 && needed == last_needed + 1)
     {
       /* Remove the \1 byte.  */
-      --needed;
-      dest[needed - 1] = L('\0');
+      if (--needed <= n)
+	dest[needed - 1] = L('\0');
     }
 
   /* Free the memory if needed.  */


	Jakub

