This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: Updated x86-64 memcpy and New x86-64 memset
- From: Evandro Menezes <evandro at yahoo dot com>
- To: Ulrich Drepper <drepper at redhat dot com>
- Cc: List GLIBC <libc-alpha at sourceware dot org>, harsha dot jagasia at amd dot com, michael dot meissner at amd dot com, christophe dot harle at amd dot com
- Date: Sat, 25 Aug 2007 11:36:45 -0700 (PDT)
- Subject: Re: Updated x86-64 memcpy and New x86-64 memset
Ulrich,
> You compare (in memcpy) what is larger:
>
> __x86_64_core_cache_size_half
> __x86_64_data_cache_size_half
>
> The result is never going to change. Therefore this should be done in
> the cacheinfo initialization. If one of the vars is never used
> otherwise remove it (I haven't checked it).
The way I wrote the initialization code, if a cache level is absent, its corresponding variable has a value of zero.
What I'm trying to account for is the range of cache configurations found in common processors:
            Data  Core  Shared
Opteron     L1    L2    -
Barcelona   L1    L2    L3
P4          L1    L2    -
P4 Xeon     L1    L2    L3
Core2       L1    -     L2
So, in the example you mentioned above, on Core2 the data cache is L1 and the core cache is absent, so its variable is zero and L1 is the larger of the two. On all the other processors above, the core cache is always larger.
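
To make the suggestion concrete, here is a rough sketch of doing the comparison once at initialization time rather than on every memcpy call. It is not the code from the patch: the names data_cache_size_half, core_cache_size_half, copy_threshold and init_cacheinfo_sketch are made up for illustration, and the sizes are passed in as parameters instead of being read from CPUID.

```c
/* Hypothetical stand-ins for the cacheinfo variables.
   A value of zero means the corresponding cache level is absent. */
static long data_cache_size_half;
static long core_cache_size_half;

/* Hypothetical precomputed value: the larger of the two halves,
   so that memcpy would not need to compare them on each call. */
static long copy_threshold;

/* Sketch of an initialization routine. The real code would obtain
   the sizes from the processor; here they are plain arguments. */
static void init_cacheinfo_sketch(long data_size, long core_size)
{
    data_cache_size_half = data_size / 2;
    core_cache_size_half = core_size / 2;

    /* The result never changes after startup, so pick it once here. */
    copy_threshold = data_cache_size_half > core_cache_size_half
                         ? data_cache_size_half
                         : core_cache_size_half;
}
```

With illustrative sizes, a Core2-like layout (32 KB data cache, no core cache) gives init_cacheinfo_sketch(32 * 1024, 0) and a threshold of 16 KB, while an Opteron-like layout (64 KB data cache, 1 MB core cache) gives a threshold of 512 KB.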
> Where are the performance numbers for memset? For Intel and AMD.
They should be coming up soon.
Thanks, --
__________________________________________________________________
Evandro Menezes Austin, TX http://themenezes.us