This is the mail archive of the
mailing list for the binutils project.
Re: [patch] Use unaligned access on x86_64
- From: Cary Coutant <ccoutant at gmail dot com>
- To: Rafael Espíndola <rafael dot espindola at gmail dot com>
- Cc: Binutils <binutils at sourceware dot org>
- Date: Mon, 1 Jun 2015 16:48:27 -0700
- Subject: Re: [patch] Use unaligned access on x86_64
- References: <CAG3jRe+iwbjGWdqG-0gTb-8yNWqDN=S-1iVLkNKP2cXEscONhA at mail dot gmail dot com>
> x86_64 has exquisite support for unaligned loads. It is a shame not to use it.
> The attached patch avoids aligning archive members on x86_64. The
> results when linking clang are very interesting:
> * massif reports that the malloc memory usage goes from 331,295,192
> bytes to just 133,415,136 bytes.
> * the linking time (30 runs average) goes from
> 1.310065610 seconds time elapsed ( +- 0.19% ) to
> 1.162564763 seconds time elapsed ( +- 0.14% )
Hmmm, I guess x86 has gotten a lot better with this.
I'd rather have a configure flag that tells us whether the host
platform can do unaligned access without (much) penalty. I did a quick
search but didn't come up with anything provided by autoconf. Maybe
add a configure option like --enable-fast-unaligned-access? Other
suggestions? Write a micro-benchmark for configure to run on the fly?
(I'm kind of surprised that I couldn't find an autoconf macro for this
-- I'd think that the ability to use unaligned loads/stores is
something that lots of programs would want to test for at configure
time.)
On the other hand, the archive format should generally keep things on
4-byte boundaries -- the magic string is 8 bytes, archive headers are
60 bytes, and ELF file members will generally be a multiple of 4 or 8
bytes in length. The symbol map should be a multiple of 4, but I'll
bet it's the long-file name table that's throwing everything out of
alignment. If we could just fix that, we could probably improve
archive performance on many platforms where unaligned loads are not
fast. Of course, for 64-bit targets, we're going to insist on 8-byte
alignment, so to avoid the malloc-and-copy, we'd have to arrange for
archive members to be 8-byte aligned.
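
To make the alignment arithmetic concrete, here is a rough Python sketch. It assumes the usual ar layout (8-byte magic, 60-byte member headers, member data padded to an even length). The member sizes and the function names are made up for illustration; nothing here is taken from the linker sources.

```python
# Rough model of a classic ar archive: 8-byte magic, then for every member
# a 60-byte header followed by the data, padded to an even length.
# All member sizes used with this sketch are hypothetical.

AR_MAGIC_SIZE = 8     # "!<arch>\n"
AR_HEADER_SIZE = 60   # fixed-size member header


def member_data_offsets(member_sizes):
    """Return the file offset at which each member's data starts."""
    offsets = []
    pos = AR_MAGIC_SIZE
    for size in member_sizes:
        pos += AR_HEADER_SIZE       # skip the member header
        offsets.append(pos)         # data starts right after the header
        pos += size + (size % 2)    # ar pads odd-sized members by one byte
    return offsets


def is_aligned(member_sizes, alignment):
    """For each member, report whether its data offset is aligned."""
    return [off % alignment == 0 for off in member_data_offsets(member_sizes)]


if __name__ == "__main__":
    # symbol map, long-name table, then two object files (made-up sizes)
    for sizes in ([1000, 38, 4096, 2048], [1000, 40, 4096, 2048]):
        print(sizes, member_data_offsets(sizes),
              is_aligned(sizes, 4), is_aligned(sizes, 8))
```

With a 38-byte long-name table, the two objects after it start at offsets 1226 and 5382, neither of which is a multiple of 4. Padding the table to 40 bytes moves them to 1228 and 5384, which are 4-byte aligned, but 1228 is still not 8-byte aligned: each 60-byte header shifts the next member by 4 modulo 8, so 8-byte alignment would need extra padding on top of fixing the name table.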
Also, have you tried thin archives?