This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: glibc aio performance
- From: "Don Capps" <don dot capps2 at verizon dot net>
- To: "Amos P Waterland" <waterland at us dot ibm dot com>
- Cc: <libc-alpha at sources dot redhat dot com>,"Thomas Gall" <tom_gall at vnet dot ibm dot com>
- Date: Thu, 30 May 2002 19:00:16 -0500
- Subject: Re: glibc aio performance
- Organization: Self
- References: <OF71498DF7.F1B104B8-ON87256BC9.006CB74A-86256BC9.007784B4@boulder.ibm.com>
- Reply-to: "Don Capps" <don dot capps2 at verizon dot net>
Amos,
I tried a few more tests. This time I used a file
size that is two times the size of the memory
in the system.
Normal write/read
./iozone -r 64 -s 600M -i 0 -i 1
KB        reclen   write  rewrite    read   reread
614400        64   27287    30701   27743    27737
POSIX async I/O (no bcopy)
./iozone -r 64 -s 600M -i 0 -i 1 -k 32
KB        reclen   write  rewrite    read   reread
614400        64   25169    28362   26998    27111
The difference in performance for async I/O is
around 7 percent on write/rewrite and around
2 percent for read/re-read.
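
The percentages above can be rechecked from the two result rows. The following is a minimal sketch in Python; the dictionary and function names are illustrative only and are not part of Iozone.

```python
# Throughput figures (KB/s) copied from the two iozone runs above.
sync_run = {"write": 27287, "rewrite": 30701, "read": 27743, "reread": 27737}
aio_run = {"write": 25169, "rewrite": 28362, "read": 26998, "reread": 27111}


def slowdown_percent(baseline, measured):
    """Return how much slower `measured` is than `baseline`, in percent."""
    return (baseline - measured) / baseline * 100.0


slowdowns = {
    name: slowdown_percent(sync_run[name], aio_run[name]) for name in sync_run
}

for name, pct in slowdowns.items():
    print(f"{name:8s} {pct:4.1f}% slower with async I/O")
```

With these numbers the write and rewrite slowdowns come out between 7 and 8 percent, and read and reread between 2 and 3 percent.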
This is expected because:
1. There is only one disk and multiple threads don't help
2. There is more work for the system when using
multiple threads. (overhead)
3. The operating system already has a read-ahead algorithm for
sequential readers so the extra async I/O threads
don't help much.
4. More threads means more context switching, more
CPU cache flushes, and more TLB activity.
5. Normal read/write have a nice fast I/O completion
notification model that is implemented in the
operating system. POSIX async I/O was a "group think"
design and has a very poor I/O completion model that
slows things down. Polling for I/O completion or signals
was and is a very poor design. Most vendors have
custom async I/O routines that have a fast I/O completion
mechanism. (call back notification)
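
The contrast in item 5 between polling for completion and call back notification can be illustrated with a rough analogy. This sketch uses Python's concurrent.futures, not POSIX async I/O or any vendor routine; fake_io and wait_by_polling are hypothetical names, and the sleep merely stands in for a real I/O request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(n):
    """Stand-in for one I/O request: sleep briefly, then return n."""
    time.sleep(0.01)
    return n


def wait_by_polling(future, interval=0.001):
    """Polling model: repeatedly ask whether the request has finished."""
    polls = 0
    while not future.done():
        polls += 1
        time.sleep(interval)
    return future.result(), polls


completed = []

with ThreadPoolExecutor(max_workers=2) as pool:
    # Polling: the caller spends time checking for completion.
    result, polls = wait_by_polling(pool.submit(fake_io, 1))

    # Callback: completion is reported once, with no checking loop.
    fut = pool.submit(fake_io, 2)
    fut.add_done_callback(lambda f: completed.append(f.result()))
# Leaving the with-block waits for outstanding work, so the callback has run.

print(f"polled {polls} times; callback delivered {completed}")
```

The number of polls depends on timing and will vary from run to run; the callback path does no checking at all.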
When to use async I/O.
There are times when async I/O is a better way to go.
If you have:
A filesystem striped across multiple disks, so that
async I/O requests can proceed in parallel without
causing head seek contention.
and
The application accesses random offsets in a file,
so that multiple threads can reach different parts of
the file on different spindles in parallel, and so that
the application gets the effect of read-ahead, which the
operating system will not provide for a random reader.
and
The async I/O threads are reused over and over to
do the I/O, so that the spawn/join times are not
a factor.
If your application meets the above criteria then it may
benefit from using async I/O. If it does not meet the
criteria then it is likely that it will not benefit.
Although the above criteria seem overly constraining,
they map very nicely onto one application type:
database applications (Oracle, Sybase, ...).
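
The access pattern described by these criteria (random offsets, serviced by a small set of threads that are reused) can be sketched as follows. This is an illustration in Python rather than POSIX aio code; the record count, pool size, and helper name are hypothetical, and os.pread is only available on Unix-like systems.

```python
import os
import random
import tempfile
from concurrent.futures import ThreadPoolExecutor

RECORD = 64 * 1024   # 64 KB records, as in the -r 64 runs
RECORDS = 64         # small hypothetical file so the sketch runs quickly
WORKERS = 4          # hypothetical pool size


def read_record(fd, index):
    """Read one record by index with pread (no shared file offset)."""
    return os.pread(fd, RECORD, index * RECORD)


# Build a scratch file; record i is filled with the byte value i.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    for i in range(RECORDS):
        tmp.write(bytes([i]) * RECORD)
    path = tmp.name

fd = os.open(path, os.O_RDONLY)
try:
    order = list(range(RECORDS))
    random.shuffle(order)  # random access pattern
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        # The same WORKERS threads service all RECORDS requests.
        chunks = list(pool.map(lambda i: read_record(fd, i), order))
finally:
    os.close(fd)
    os.unlink(path)

ops_per_worker = RECORDS / WORKERS  # average requests handled per thread
print(f"{RECORDS} reads by {WORKERS} threads, about {ops_per_worker:.0f} each")
```

On a single disk this arrangement gains nothing, as the results above show; the point is only that the threads are created once and then reused for many requests.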
Hope this helps,
Don Capps
----- Original Message -----
From: "Amos P Waterland" <waterland@us.ibm.com>
To: "Don Capps" <don.capps2@verizon.net>
Cc: <libc-alpha@sources.redhat.com>; "Thomas Gall" <tom_gall@vnet.ibm.com>
Sent: Thursday, May 30, 2002 4:46 PM
Subject: Re: glibc aio performance
>
> Don:
>
> Thank you very much for your analysis. I have a few questions.
>
> I ran the suggested options on my Redhat 7.3 box, and got the following
> results. (I noticed that in your results, your second command line had -s
> 200M, but the report had 307200: maybe the 2 was just a typo in the
> email?)
>
> % iozone -r 64 -s 300M -i 0 -i 1
> KB        reclen   write  rewrite    read   reread
> 307200        64   32881    32494   42239    43256
> % iozone -k 32 -r 64 -s 300M -i 0 -i 1
> KB        reclen   write  rewrite    read   reread
> 307200        64   18548    19151   26866    26512
>
> As you can see, the results are much better, but the AIO is still 30-40%
> slower than the SIO. (I re-ran the tests several times to try to iron out
> timing anomalies.) Do you think that thread setup, teardown, and overhead
> accounts for this? (I did try using just two threads, but did not get
> significantly better throughput.)
>
> Thanks in advance.
>
> Amos Waterland
>
>
>
>
>
> "Don Capps" <don.capps2@verizon.net>
> 05/30/02 02:34 PM
> Please respond to "Don Capps"
>
> To: <libc-alpha@sources.redhat.com>, Amos P Waterland/Austin/IBM@IBMUS
> cc: Thomas Gall/Rochester/IBM@IBMUS
> Subject: Re: glibc aio performance
>
>
>
>
>
> Amos,
>
> I believe that this may be a case of pilot error :-)
>
> In the case where you ran
> iozone -i0 -i1 -k128
>
> You asked Iozone to use POSIX async I/O with
> 128 async writes/reads of 4k each, and wrote/read
> a file that was 512 Kbytes in size.
> Thus there are 128 async read threads, each doing
> exactly one 4k operation.
>
> Let's look under the hood.
> If one spawns 128 async reads/writes then this is the
> equivalent of calling fork() 128 times and having
> the new process do a single 4k operation and then
> terminate. The overhead of creating the threads,
> and terminating the threads is eating your lunch.
>
> Here are some suggestions that you might find useful.
>
> Try using a file size that is much larger.
> Example: -s 300M
> Try using a transfer size that is larger.
> Example: -r 64
> Try using fewer async ops.
> Example: -k 32
>
> In the examples above the threads will do a more
> significant amount of work and they will be
> re-used many times before they terminate.
>
> There would be 32 threads and each thread will
> do 64 kbyte transfers. The total number of
> transfers will be 4800. Each thread will now
> do 150 ops before it terminates.
>
> Here is output from my Redhat 7.2 box.
>
> ./iozone -r 64 -s 300M -i 0 -i 1
> KB        reclen   write  rewrite    read   reread
> 307200        64   37314    29174   28042    28057
>
>
> ./iozone -k 32 -r 64 -s 200M -i 0 -i 1
> KB        reclen   write  rewrite    read   reread
> 307200        64   25793    30625   27376    27335
>
> Since the filesystem is on a single disk drive there
> is no advantage to the async operations. But, the
> result is well within reason.
>
> I don't believe that there is anything wrong with
> glibc or with Iozone. It is more a case of getting
> what you asked for and finding out that the question
> was probably not a good one :-)
>
> The moral of the story is: If you are going to spawn
> a thread then it would be wise to have it do some
> significant work before it terminates.
>
> Hope this helps,
> Don Capps
>
>
> ----- Original Message -----
> From: "Amos P Waterland" <waterland@us.ibm.com>
> To: <libc-alpha@sources.redhat.com>
> Cc: <capps@iozone.org>; <wnorcott@us.oracle.com>; <tom_gall@vnet.ibm.com>
> Sent: Thursday, May 30, 2002 1:01 PM
> Subject: glibc aio performance
>
>
> > Hello, I have been looking at the glibc asynchronous I/O implementation,
> > and have run into a bit of an issue.
> >
> > When I run IOzone (an open source filesystem benchmark tool) with AIO
> > enabled, it reports write KB/s throughput on the order of 45 times
> > slower than that reported without AIO.
> >
> > % iozone -i0 -i1 #run just the write and read tests
> > [snip]
> > KB        reclen   write  rewrite    read   reread  [snip]
> > 512            4  101769   178334  331595   345702
> > % iozone -i0 -i1 -k128 #do same, but use no-bcopy aio
> > [snip]
> > KB        reclen   write  rewrite    read   reread  [snip]
> > 512            4    2232    47210  121874   103372
> >
> > I have looked at the source code for the glibc implementation, and it is
> > not obvious to me why keeping a thread pool, each of whose constituents
> > performs a pread(2) or pwrite(2), should be so much slower than
> > synchronous I/O. I looked at the source code for IOzone, and found that
> > it uses libasync.c for AIO, but could find no obvious performance
> > problems in its code.
> >
> > So my question is: Might there be a problem with the way IOzone is using
> > glibc's implementation of AIO, or is glibc's implementation known to
> > have performance problems?
> >
> > Amos Waterland
> >
> >