This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



fputs() is too slow


While testing cpplib I noticed a performance problem with fputs.
cpplib's output routine makes lots of fputs calls on short strings, and
fputs_unlocked plus its callee _IO_new_file_xsputn account for 51% of
runtime.  Here's the gprof output (flat profile, then call graph):

  %   cumulative   self              self     total           
 time   seconds   seconds    calls  us/call  us/call  name    
 29.41      0.15     0.15   815472     0.18     0.32  fputs_unlocked
 21.57      0.26     0.11   815481     0.13     0.13  _IO_new_file_xsputn
 11.76      0.32     0.06   816284     0.07     0.10  _cpp_get_token
 11.76      0.38     0.06        1 60000.00 509998.79  scan_buffer
  9.80      0.43     0.05   815472     0.06     0.38  cpp_output_token

                0.05    0.26  815472/815472      scan_buffer [3]
[4]     60.8    0.05    0.26  815472         cpp_output_token [4]
                0.15    0.11  815472/815472      fputs_unlocked [5]
                0.00    0.00     198/198         __overflow [539]
-----------------------------------------------
                0.15    0.11  815472/815472      cpp_output_token [4]
[5]     51.0    0.15    0.11  815472         fputs_unlocked [5]
                0.11    0.00  815472/815481      _IO_new_file_xsputn [6]
-----------------------------------------------
                0.00    0.00       9/815481      vfprintf [13]
                0.11    0.00  815472/815481      fputs_unlocked [5]
[6]     21.6    0.11    0.00  815481         _IO_new_file_xsputn [6]
                0.00    0.00     741/939         _IO_file_overflow [536]
                0.00    0.00     741/781         _IO_default_xsputn [538]

We are using the _unlocked variants; this is not a
default-to-thread-safe problem.  I modified cpp_output_token to call
putc_unlocked in a loop instead of fputs and got a substantial
improvement (total time under scan_buffer drops from about 0.51 s to
0.35 s):

 37.14      0.13     0.13   815472     0.16     0.16  cpp_output_token
 22.86      0.21     0.08        1 80000.00 350000.00  scan_buffer
 20.00      0.28     0.07   816284     0.09     0.12  _cpp_get_token

                0.13    0.00  815472/815472      scan_buffer [2]
[5]     37.1    0.13    0.00  815472         cpp_output_token [5]
                0.00    0.00     938/938         __overflow [536]
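
For illustration, the putc_unlocked loop amounts to something like the
sketch below.  This is not the actual cpplib code: put_string_unlocked
is a hypothetical helper name, and it assumes the caller already holds
whatever lock the stream needs.

```c
#define _POSIX_C_SOURCE 200809L  /* for putc_unlocked */
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from cpplib): write STR to FP one character
   at a time without taking the stream lock.  Returns 0 on success and
   EOF on a write error.  */
static int
put_string_unlocked (const char *str, FILE *fp)
{
  assert (str != NULL && fp != NULL);
  /* The increment is kept out of the putc_unlocked argument list
     because putc_unlocked may be implemented as a macro.  */
  for (; *str; ++str)
    if (putc_unlocked ((unsigned char) *str, fp) == EOF)
      return EOF;
  return 0;
}
```

A call such as put_string_unlocked (token, stdout) would then stand in
for fputs_unlocked (token, stdout) at the call site.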

fputs_unlocked currently looks like

{
  _IO_size_t len = strlen (str);
  int result = EOF;
  CHECK_FILE (fp, EOF);
  if (_IO_fwide (fp, -1) == -1 && _IO_sputn (fp, str, len) == len)
    result = 1;
  return result;
}

_IO_sputn dispatches to _IO_new_file_xsputn, which is three pages of
spaghetti.

On the face of it, replacing fputs_unlocked with

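{
  /* A plain loop rather than do/while, so an empty string writes
     nothing; _IO_putc_unlocked takes the character first, then the
     stream.  */
  for (; *str; ++str)
    if (_IO_putc_unlocked (*str, fp) == EOF)
      return EOF;
  return 1;
}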

would be appropriate, but I'm not sure what happens with longer
strings.  Note that putc_unlocked does not touch _IO_fwide, so I
do not see why fputs_unlocked needs to.
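
One way to settle the longer-string question would be to time both
approaches over a range of string lengths.  A rough harness might look
like the sketch below; time_fputs and time_putc_loop are hypothetical
names, clock() resolution is coarse, and fputs_unlocked is a GNU
extension, so this assumes glibc.

```c
#define _GNU_SOURCE  /* for fputs_unlocked, a GNU extension */
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical harness: write TOKEN to FP N times via fputs_unlocked
   and return the CPU seconds consumed.  */
static double
time_fputs (const char *token, long n, FILE *fp)
{
  assert (token != NULL && fp != NULL);
  clock_t start = clock ();
  for (long i = 0; i < n; i++)
    fputs_unlocked (token, fp);
  return (double) (clock () - start) / CLOCKS_PER_SEC;
}

/* Same workload, but one putc_unlocked call per character.  */
static double
time_putc_loop (const char *token, long n, FILE *fp)
{
  assert (token != NULL && fp != NULL);
  clock_t start = clock ();
  for (long i = 0; i < n; i++)
    for (const char *p = token; *p; ++p)
      putc_unlocked ((unsigned char) *p, fp);
  return (double) (clock () - start) / CLOCKS_PER_SEC;
}
```

Running both on tokens of, say, 1, 8, 64 and 512 characters against
the same output stream should show where, if anywhere, the
per-character loop stops winning.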

In any case, _IO_new_file_xsputn badly needs to be streamlined.

$ /lib/libc.so.6 
GNU C Library development release version 2.1.95, by Roland McGrath et al.
Compiled by GNU CC version 2.95.2 20000220 (Debian GNU/Linux).
Compiled on a Linux 2.2.17 system on 2000-10-13.
Available extensions:
        GNU libio by Per Bothner
        crypt add-on version 2.1 by Michael Glad and others
        linuxthreads-0.9 by Xavier Leroy
        BIND-8.2.3-T5B
        libthread_db work sponsored by Alpha Processor Inc
        NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk

I have reason to believe the problem is present in 2.1 stable as well.

zw
