This is the mail archive of the libc-alpha@sourceware.cygnus.com mailing list for the glibc project.



iconv(3) return value



Hi Ulrich,

The iconv documentation in glibc has two mistakes. It says:

    If all input from the input buffer is successfully converted
    and stored in the output buffer the function returns the number
    of conversions performed.

The SUSv2 manual page for iconv(3) says: "The iconv() function ...
returns the number of non-identical conversions performed." Judging from
the other use of the word "identical" on the same page, this means that
the return value is the number of characters converted in a
non-reversible way during the call. A call in which every character is
converted reversibly should therefore return 0, not the total number of
conversions performed.

The glibc iconv implementation needs to be fixed accordingly.
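
To illustrate what "number of non-reversible conversions" means as a
return value, here is a toy sketch. It is not iconv and says nothing about
how glibc works internally; the function name toy_latin1_to_ascii and the
'?' substitution are made up for this example. It copies ISO-8859-1 bytes
to ASCII and counts only the bytes that had to be replaced, i.e. those
that cannot be converted back:

```c
#include <stddef.h>

/* Toy converter (hypothetical, NOT iconv): copies ISO-8859-1 bytes to
 * ASCII.  Bytes below 0x80 are copied unchanged (a reversible, "identical"
 * conversion).  Bytes >= 0x80 have no ASCII equivalent and are replaced
 * by '?', which is a non-reversible conversion.
 * The caller must provide an output buffer of at least inlen bytes.
 * Returns the number of non-reversible conversions only. */
static size_t
toy_latin1_to_ascii(const unsigned char *in, size_t inlen, char *out)
{
  size_t lossy = 0;
  size_t i;

  for (i = 0; i < inlen; i++) {
    if (in[i] < 0x80) {
      out[i] = (char) in[i];
    } else {
      out[i] = '?';
      lossy++;
    }
  }
  return lossy;
}
```

For the input bytes 'a', 0xE9, 'b', 0xFC this returns 2, and for pure
ASCII input it returns 0 no matter how many bytes were converted.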

    In all other cases the return value is (size_t) -1 and errno is
    set appropriately. In this case the value pointed to by inbytesleft
    is nonzero.

The SUSv2 manual page for iconv(3) says: "The iconv() function updates
the variables pointed to by the arguments to reflect the extent of the
conversion and returns the number of non-identical conversions performed.
If the entire string in the input buffer is converted, the value pointed
to by inbytesleft will be 0. If the input conversion is stopped due to
any conditions mentioned above, the value pointed to by inbytesleft will
be non-zero and errno is set to indicate the condition. If an error occurs
iconv() returns (size_t)-1 and sets errno to indicate the error."

When the conversion stops with *inbytesleft > 0, the sentence about a
stopped conversion ("If the input conversion is stopped ...") does not
specify the return value, and the last sentence does not apply because
it talks about an "error", not a "condition". Hence the first sentence
must determine the return value. Thus, in this case, iconv() should set
errno _and_ return a value != (size_t)(-1).

The glibc iconv implementation needs to be fixed accordingly. Its current
behaviour is as follows: it sets errno but returns (size_t)(-1).
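
As long as implementations differ on this point, a caller can avoid
depending on the return value in the stopped case by clearing errno before
the call (as the attached test program does) and inspecting errno and
*inbytesleft afterwards. A minimal sketch of that idea; the enum and the
function name classify_conversion are hypothetical, and it assumes the
caller set errno = 0 before calling iconv() and passes in the errno value
saved right after the call:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical outcome categories, not part of any iconv API. */
enum conv_status {
  CONV_COMPLETE,          /* whole input buffer was converted */
  CONV_INVALID_INPUT,     /* EILSEQ: invalid sequence in the input */
  CONV_INCOMPLETE_INPUT,  /* EINVAL: input ends in an incomplete sequence */
  CONV_OUTPUT_FULL,       /* E2BIG: no room left in the output buffer */
  CONV_OTHER_ERROR        /* anything else */
};

/* Classify the outcome of one iconv() call.
 * res          value returned by iconv()
 * inbytesleft  value of *inbytesleft after the call
 * saved_errno  errno after the call (assumes errno was 0 before it)
 * The stopped cases are decided by errno alone, so the result is the
 * same whether iconv() returned (size_t)(-1) or a conversion count. */
static enum conv_status
classify_conversion(size_t res, size_t inbytesleft, int saved_errno)
{
  if (saved_errno == 0 && inbytesleft == 0 && res != (size_t)(-1))
    return CONV_COMPLETE;

  switch (saved_errno) {
    case EILSEQ: return CONV_INVALID_INPUT;
    case EINVAL: return CONV_INCOMPLETE_INPUT;
    case E2BIG:  return CONV_OUTPUT_FULL;
    default:     return CONV_OTHER_ERROR;
  }
}
```

With this, a stopped conversion is classified the same way under either
reading of the specification: classify_conversion((size_t)(-1), 1, EILSEQ)
and classify_conversion(0, 1, EILSEQ) both yield CONV_INVALID_INPUT.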

A test program is attached.

Bruno

===============================================================================
#include <iconv.h>
#include <stdio.h>
#include <errno.h>

int main()
{
  /* 'E', a stray UTF-8 continuation byte (invalid as a first byte), '4'. */
  char in [100] = { 0x45, (char) 0x89, 0x34, };
  char out [100];

  iconv_t cd = iconv_open("ISO-8859-1", "UTF-8");

  const char* inptr;
  size_t inbytesleft;
  char* outptr;
  size_t outbytesleft;
  long res;
  char buf[10];

  if (cd == (iconv_t)(-1)) {
    perror("iconv_open");
    return 1;
  }

  /* Test return value in normal case. */
  inptr = in; inbytesleft = 1; outptr = out; outbytesleft = sizeof(out);
  errno = 0;
  res = iconv(cd, &inptr, &inbytesleft, &outptr, &outbytesleft);
  printf("res=%ld errno=%s inptr+=%ld inbytesleft=%ld outptr+=%ld outbytesleft=%ld\n",
         res, (errno == EILSEQ ? "EILSEQ" : (sprintf(buf,"%d",errno),buf)),
         (long)(inptr-in), (long)inbytesleft,
         (long)(outptr-out), (long)outbytesleft);
  /* Correct:
   * res=0 errno=0 inptr+=1 inbytesleft=0 outptr+=1 outbytesleft=99
   */

  /* Test return value if conversion stops at the beginning of buffer. */
  inptr = in+1; inbytesleft = 1; outptr = out; outbytesleft = sizeof(out);
  errno = 0;
  res = iconv(cd, &inptr, &inbytesleft, &outptr, &outbytesleft);
  printf("res=%ld errno=%s inptr+=%ld inbytesleft=%ld outptr+=%ld outbytesleft=%ld\n",
         res, (errno == EILSEQ ? "EILSEQ" : (sprintf(buf,"%d",errno),buf)),
         (long)(inptr-(in+1)), (long)inbytesleft,
         (long)(outptr-out), (long)outbytesleft);
  /* Correct:
   * res=0 errno=EILSEQ inptr+=0 inbytesleft=1 outptr+=0 outbytesleft=100
   */

  /* Test return value if conversion stops after some valid conversions. */
  inptr = in; inbytesleft = 2; outptr = out; outbytesleft = sizeof(out);
  errno = 0;
  res = iconv(cd, &inptr, &inbytesleft, &outptr, &outbytesleft);
  printf("res=%ld errno=%s inptr+=%ld inbytesleft=%ld outptr+=%ld outbytesleft=%ld\n",
         res, (errno == EILSEQ ? "EILSEQ" : (sprintf(buf,"%d",errno),buf)),
         (long)(inptr-in), (long)inbytesleft,
         (long)(outptr-out), (long)outbytesleft);
  /* Correct:
   * res=0 errno=EILSEQ inptr+=1 inbytesleft=1 outptr+=1 outbytesleft=99
   */

  iconv_close(cd);
  return 0;
}

/*
Correct:
res=0 errno=0 inptr+=1 inbytesleft=0 outptr+=1 outbytesleft=99
res=0 errno=EILSEQ inptr+=0 inbytesleft=1 outptr+=0 outbytesleft=100
res=0 errno=EILSEQ inptr+=1 inbytesleft=1 outptr+=1 outbytesleft=99

glibc-2.1:
res=1 errno=0 inptr+=1 inbytesleft=0 outptr+=1 outbytesleft=99
res=-1 errno=EILSEQ inptr+=0 inbytesleft=1 outptr+=0 outbytesleft=100
res=-1 errno=EILSEQ inptr+=1 inbytesleft=1 outptr+=1 outbytesleft=99

libiconv-1.0:
res=0 errno=0 inptr+=1 inbytesleft=0 outptr+=1 outbytesleft=99
res=-1 errno=EILSEQ inptr+=0 inbytesleft=1 outptr+=0 outbytesleft=100
res=-1 errno=EILSEQ inptr+=1 inbytesleft=1 outptr+=1 outbytesleft=99

Solaris 2.7:
res=0 errno=0 inptr+=1 inbytesleft=0 outptr+=1 outbytesleft=99
res=-1 errno=EILSEQ inptr+=0 inbytesleft=1 outptr+=0 outbytesleft=100
res=-1 errno=EILSEQ inptr+=1 inbytesleft=1 outptr+=1 outbytesleft=99

Irix 6.5:
res=0 errno=0 inptr+=1 inbytesleft=0 outptr+=1 outbytesleft=99
res=-1 errno=22 inptr+=0 inbytesleft=1 outptr+=0 outbytesleft=100
res=-1 errno=22 inptr+=1 inbytesleft=1 outptr+=1 outbytesleft=99

FreeBSD iconv-0.4:
res=1 errno=0 inptr+=1 inbytesleft=0 outptr+=1 outbytesleft=99
res=1 errno=0 inptr+=1 inbytesleft=0 outptr+=1 outbytesleft=99
res=2 errno=0 inptr+=2 inbytesleft=0 outptr+=2 outbytesleft=98

*/
===============================================================================
