This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug stdio/6530] *printf() and incomplete multibyte sequences returns -1 bogusly


http://sourceware.org/bugzilla/show_bug.cgi?id=6530

--- Comment #14 from Jonathan Nieder <jrnieder at gmail dot com> 2012-06-23 02:31:05 UTC ---
Hi,

Rich Felker wrote:

> --- vfprintf.c.orig
> +++ vfprintf.c
> @@ -1168,42 +1168,7 @@
>  	else if (!is_long && spec != L_('S'))				      \
>  	  {								      \
>  	    if (prec != -1)						      \
> -	      {								      \
> -		/* Search for the end of the string, but don't search past    \
> -		   the length (in bytes) specified by the precision.  Also    \
> -		   don't use incomplete characters.  */			      \
> -		if (_NL_CURRENT_WORD (LC_CTYPE, _NL_CTYPE_MB_CUR_MAX) == 1)   \
>  		  len = __strnlen (string, prec);			      \
> -		else							      \
> -		  {							      \
> -		    /* In case we have a multibyte character set the	      \
> -		       situation is more complicated.  We must not copy	      \
> -		       bytes at the end which form an incomplete character. */\
[...]
> -		      len = str2 - string - (ps.__count & 7);		      \
> -		  }							      \
> -	      }								      \
>  	    else							      \
>  	      len = strlen (string);					      \
>  	  }								      \
> This patch fixes the bug. As I suspected, it's all -'s, no +'s.

Thanks.  wprintf(3) tells me:

    s
        If no l modifier is present: The const char * argument
        is expected to be a pointer to an array of character
        type (pointer to a string) containing a multibyte character
        sequence beginning in the initial shift state.
        Characters from the array are converted to wide
        characters (each by a call to the mbrtowc(3) function
        with a conversion state starting in the initial state
        before the first byte).  The resulting wide characters
        are written up to (but not including) the terminating
        null wide character.
[etc]

C99 says that in this case "the argument shall be a pointer to the
initial element of a character array containing a multibyte character
sequence beginning in the initial shift state".  POSIX uses similar
wording.  C99 describes an error condition:

    The fwprintf function returns the number of wide characters
    transmitted, or a negative value if an output or encoding
    error occurred.

POSIX does not document what value errno should have in this case, but
presumably EILSEQ is typical, for consistency with mbrtowc.
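
To make the mbrtowc comparison concrete, here is a minimal sketch of how
its return values separate an encoding error from an incomplete sequence.
The helper name first_char_status is hypothetical and is not part of
glibc; which byte sequences count as invalid depends on the current
LC_CTYPE locale, so the sketch only reports what mbrtowc says.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not a glibc function): convert the first
   multibyte character of S, looking at no more than N bytes, starting
   from the initial shift state.
   Returns the number of bytes consumed (0 for the null character),
   -1 on an encoding error (mbrtowc sets errno to EILSEQ), or
   -2 if the N bytes form an incomplete but possibly valid character.  */
int first_char_status(const char *s, size_t n)
{
    mbstate_t ps;
    wchar_t wc;
    size_t r;

    memset(&ps, 0, sizeof ps);   /* initial conversion state */
    r = mbrtowc(&wc, s, n, &ps);

    if (r == (size_t) -1)
        return -1;               /* errno == EILSEQ here */
    if (r == (size_t) -2)
        return -2;               /* incomplete sequence */
    return (int) r;
}
```

In a UTF-8 locale I would expect a lone lead byte such as "\xc3" with
n == 1 to give -2, but that depends on the locale being installed and
selected with setlocale.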

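Luckily these considerations only matter in the defined(COMPILE_WPRINTF)
case, where the %s argument has to be converted with mbrtowc.  Your patch
only modifies the !defined(COMPILE_WPRINTF) code, which copies bytes
without converting them.  So for what it's worth: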

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
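
For reference, the length computation that remains in the
!defined(COMPILE_WPRINTF) branch after the patch can be sketched roughly
as below.  The function name narrow_s_len is hypothetical; the real code
is a macro body inside vfprintf.c and calls the internal __strnlen, for
which the public strnlen stands in here.  As in the quoted hunk, prec ==
-1 is assumed to mean "no precision given".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a glibc function): number of bytes that a
   narrow "%s" or "%.Ns" conversion would take from STRING.
   PREC is the precision in bytes, or -1 if none was given.
   The count is purely byte based, so a precision may cut a multibyte
   character in the middle; no locale lookup or mbrtowc call is made.  */
size_t narrow_s_len(const char *string, int prec)
{
    assert(string != NULL);

    if (prec != -1)
        /* Never scan past PREC bytes; the array need not be
           null-terminated within that range.  */
        return strnlen(string, (size_t) prec);

    return strlen(string);
}
```

With the two-byte UTF-8 encoding of U+00E9, narrow_s_len ("\xc3\xa9", 1)
is 1: the first byte is counted even though it is an incomplete
character on its own.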


