This is the mail archive of the mailing list for the glibc project.


Re: DT_GNU_HASH: ~ 50% dynamic linking improvement

First, thank you for your reply. I am sending this mail to explain my idea further. My suggestion was made primarily because of two assumptions:

1- The case to be optimized is the one where the expected number of unsuccessful queries is much larger than the number of successful ones.
2- Each unsuccessful query performs two memory accesses to two different cache lines: one to the bucket inside the hash table and one to the hash chain; those two cache lines are not consecutive. A successful query performs a third memory access to compare the name of the queried symbol with the symbol inside the symbol table.

The problem with the current hash table is that its size is too small to prevent unsuccessful queries from reaching the hash chains: it seems that the number of buckets is nbuckets <= nsymbols/2, which leaves the table with too few empty buckets.
Increasing the size of the current hash table is not a good solution because it occupies 32 bits per bucket. My suggestion, therefore, was to add a second hash table in front of the current one to filter out most of the unsuccessful queries. This table uses only one bit per bucket.

For nsymbols == 3072 this table will have 4096*8 = 32768 buckets, occupying 4096 bytes.
The hash table is declared as:

uint32_t bit_hash_nbuckets = 32768;      /* must be a power of two */
uint32_t bit_hash_table[32768 >> 5];     /* 1024 32-bit words = 4096 bytes */

and the lookup code looks roughly like:

	uint32_t h = hash (name) & (bit_hash_nbuckets - 1);
	if (!(bit_hash_table[h >> 5] & (1u << (h & 31))))
		return unsuccessful_query;
	/* otherwise continue the search in the original hash table */

With the help of this hash table, the number of memory accesses (and correspondingly L2 cache misses) in the common unsuccessful case will be reduced from 2 to 1.

If we suppose that the currently implemented hash table has nbuckets = nsymbols/2, then the buckets occupy nsymbols*2 bytes, and the hash chains occupy exactly nsymbols*4 bytes.
An unsuccessful query does a first random access into the hash table buckets and then a random access into the hash chains, so there are two random accesses into a space of nsymbols*6 bytes. With the new hash table placed before the currently implemented one, the common unsuccessful case does a single access into a space of about nsymbols bytes. There is a much better chance that a space of nsymbols bytes fits into the cache than a space of nsymbols*6 bytes. In fact, the cache miss rate may be much smaller for the newly added hash table than for the original one, because it is accessed much more frequently than the original table's buckets and the hash chains.
Although I am sure this solution is faster than the one currently implemented, I already know that it is unlikely to be adopted because it overcomplicates the file format.
Best regards.

