This is the mail archive of the
glibc-bugs@sources.redhat.com
mailing list for the glibc project.
[Bug libc/206] malloc does not align memory correctly for sse capable systems
- From: "wg at malloc dot de" <sourceware-bugzilla at sources dot redhat dot com>
- To: glibc-bugs at sources dot redhat dot com
- Date: 5 Jun 2004 22:24:39 -0000
- Subject: [Bug libc/206] malloc does not align memory correctly for sse capable systems
- References: <20040605142238.206.ma1flfs@bath.ac.uk>
- Reply-to: sourceware-bugzilla at sources dot redhat dot com
------- Additional Comments From wg at malloc dot de 2004-06-05 22:24 -------
Subject: Re: malloc does not align memory correctly for sse capable systems
Hello,
> The alignment size is defined as MALLOC_ALIGNMENT defined as
> (2*sizeof(INTERNAL_SIZE_T)). INTERNAL_SIZE_T is size_t (4 byte),
> so MALLOC_ALIGNMENT is 8 bytes.
Yes. I believe setting MALLOC_ALIGNMENT to 16 would work in the
source, but _should not_ be done in the general case, because of the
significant performance drop (many more objects, about half, would need
to be rounded up to a multiple of 16 in size).
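To illustrate the "about half" estimate, here is a minimal sketch in C. It is not glibc's actual size computation; `round_up` and `count_grown` are hypothetical helpers, and the model simply assumes that every chunk size is already a multiple of 8 and asks which of those sizes would grow under a 16-byte requirement.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round `size` up to the next multiple of `align`.
   `align` must be a power of two. */
static size_t round_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Hypothetical helper: among the 8-byte-aligned sizes 8, 16, ..., max,
   count those that would grow if 16-byte alignment were required. */
static unsigned count_grown(size_t max)
{
    unsigned grown = 0;
    for (size_t s = 8; s <= max; s += 8) {
        if (round_up(s, 16) != s)
            grown++;
    }
    return grown;
}
```

Under this simplified model, every second 8-byte-aligned size (8, 24, 40, ...) gains 8 bytes of padding, which is where the extra memory use would come from.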
> Bug#206 says it's not suitable
> for SSE instruction that needs 16 byte alignment as follows:
> http://sources.redhat.com/bugzilla/show_bug.cgi?id=206
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15795
>
> I don't know that changing memory alignment size is acceptable
> on i386. IMHO, SSE is special instruction, so I think in this
> case using posix_memalign() is safe. Please check this bug?
I am well aware that the C standard says that alignment needs to be
"suitable for any type of object". However, SSE instructions are IMHO
clearly out of the scope of the C standard -- at least it is not a
clear case of non-conformity to the standard.
Best would be to use posix_memalign() in the applications only for the
allocations where the alignment is really required, because that would
give optimal performance and least memory waste.
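As a rough sketch of that first option, an application could request 16-byte alignment only for the buffers it passes to SSE code. The helper names `alloc_floats_aligned16` and `is_aligned16` below are made up for illustration; only posix_memalign() and free() are real library calls.

```c
#define _POSIX_C_SOURCE 200112L  /* expose posix_memalign() in <stdlib.h> */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: report whether a pointer is 16-byte aligned. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: allocate `count` floats on a 16-byte boundary.
   Returns NULL on failure; release the result with free(). */
static float *alloc_floats_aligned16(size_t count)
{
    void *p = NULL;

    if (count > SIZE_MAX / sizeof(float))
        return NULL;                      /* size computation would overflow */
    if (posix_memalign(&p, 16, count * sizeof(float)) != 0)
        return NULL;                      /* EINVAL or ENOMEM */
    assert(is_aligned16(p));              /* guaranteed by posix_memalign() */
    return p;
}
```

All other allocations in the program keep using plain malloc(), so only the SSE buffers pay for the stricter alignment.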
Second best (if you don't want to, or can't, change the apps _at all_) would
be to have an additional shared library (libmalloc16?) where malloc is
compiled with MALLOC_ALIGNMENT=16, and link that library only into the
SSE-using applications. A bit tricky because of interdependency with
libpthread, but most probably doable.
Worst and IMHO unacceptable would be to make MALLOC_ALIGNMENT dynamic;
malloc would become much slower.
What do you think? How do other systems handle this?
Regards,
Wolfram.
--
http://sources.redhat.com/bugzilla/show_bug.cgi?id=206