This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.
fputs() is too slow
- To: libc-alpha at sources dot redhat dot com
- Subject: fputs() is too slow
- From: "Zack Weinberg" <zackw at stanford dot edu>
- Date: Mon, 30 Oct 2000 08:43:52 -0800
While testing cpplib I noticed a performance problem with fputs.
cpplib's output routine calls fputs on a great many short strings, and
those calls account for over half of the runtime. Here's a gprof trace:
     %  cumulative     self               self      total
  time     seconds  seconds    calls   us/call    us/call  name
 29.41        0.15     0.15   815472      0.18       0.32  fputs_unlocked
 21.57        0.26     0.11   815481      0.13       0.13  _IO_new_file_xsputn
 11.76        0.32     0.06   816284      0.07       0.10  _cpp_get_token
 11.76        0.38     0.06        1  60000.00  509998.79  scan_buffer
  9.80        0.43     0.05   815472      0.06       0.38  cpp_output_token

               0.05    0.26  815472/815472      scan_buffer [3]
[4]    60.8    0.05    0.26  815472         cpp_output_token [4]
               0.15    0.11  815472/815472      fputs_unlocked [5]
               0.00    0.00     198/198         __overflow [539]
-----------------------------------------------
               0.15    0.11  815472/815472      cpp_output_token [4]
[5]    51.0    0.15    0.11  815472         fputs_unlocked [5]
               0.11    0.00  815472/815481      _IO_new_file_xsputn [6]
-----------------------------------------------
               0.00    0.00       9/815481      vfprintf [13]
               0.11    0.00  815472/815481      fputs_unlocked [5]
[6]    21.6    0.11    0.00  815481         _IO_new_file_xsputn [6]
               0.00    0.00     741/939         _IO_file_overflow [536]
               0.00    0.00     741/781         _IO_default_xsputn [538]
We are using the _unlocked variants; this is not a
default-to-thread-safe problem. I modified cpp_output_token to call
putc_unlocked in a loop instead of fputs and got a substantial
improvement:
 37.14        0.13     0.13   815472      0.16       0.16  cpp_output_token
 22.86        0.21     0.08        1  80000.00  350000.00  scan_buffer
 20.00        0.28     0.07   816284      0.09       0.12  _cpp_get_token

               0.13    0.00  815472/815472      scan_buffer [2]
[5]    37.1    0.13    0.00  815472         cpp_output_token [5]
               0.00    0.00     938/938         __overflow [536]
fputs_unlocked currently looks like
{
  _IO_size_t len = strlen (str);
  int result = EOF;
  CHECK_FILE (fp, EOF);
  if (_IO_fwide (fp, -1) == -1 && _IO_sputn (fp, str, len) == len)
    result = 1;
  return result;
}
_IO_sputn dispatches to _IO_new_file_xsputn, which is three pages of
spaghetti.
On the face of it, replacing fputs_unlocked with
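{
  /* _IO_putc_unlocked takes (ch, fp); testing *str before each write
     keeps an empty string from emitting its terminating NUL.  */
  for (; *str; ++str)
    if (_IO_putc_unlocked (*str, fp) == EOF)
      return EOF;
  return 1;
}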
would be appropriate, but I'm not sure what happens with longer
strings. Note that putc_unlocked does not touch _IO_fwide, so I
do not see why fputs_unlocked needs to.
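
One way to hedge against the long-string case would be to loop only over
short strings and hand anything longer to a block write. The sketch below
is an illustration, not a proposed patch: fputs_hybrid and
SHORT_STRING_MAX are hypothetical names, the cut-over value of 16 is an
arbitrary guess rather than a measured one, and it uses the standard
fwrite (which takes the stream lock) so that it builds without GNU
extensions.

```c
#include <stdio.h>
#include <string.h>

/* Assumed cut-over length; chosen arbitrarily for illustration.  */
#define SHORT_STRING_MAX 16

/* Hypothetical fputs-like routine: short strings go out one character
   at a time, longer ones in a single fwrite call.  Returns 1 on
   success and EOF on failure, like the fputs_unlocked shown above.
   Assumes a single-threaded caller, since putc_unlocked takes no lock.  */
static int
fputs_hybrid (const char *str, FILE *fp)
{
  size_t len = strlen (str);

  if (len > SHORT_STRING_MAX)
    return fwrite (str, 1, len, fp) == len ? 1 : EOF;

  for (size_t i = 0; i < len; ++i)
    if (putc_unlocked ((unsigned char) str[i], fp) == EOF)
      return EOF;
  return 1;
}
```

Whether the extra strlen pass and branch pay for themselves would have to
be measured against the profile above.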
In any case, _IO_new_file_xsputn badly needs to be streamlined.
$ /lib/libc.so.6
GNU C Library development release version 2.1.95, by Roland McGrath et al.
Compiled by GNU CC version 2.95.2 20000220 (Debian GNU/Linux).
Compiled on a Linux 2.2.17 system on 2000-10-13.
Available extensions:
        GNU libio by Per Bothner
        crypt add-on version 2.1 by Michael Glad and others
        linuxthreads-0.9 by Xavier Leroy
        BIND-8.2.3-T5B
        libthread_db work sponsored by Alpha Processor Inc
        NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk
I have reason to believe the problem is present in 2.1 stable as well.
zw