This is the mail archive of the
gdb-patches@sources.redhat.com
mailing list for the GDB project.
Re: [rfa] symbol hashing, part 1/n - updates to hash functions
- To: Elena Zannoni <ezannoni at cygnus dot com>
- Subject: Re: [rfa] symbol hashing, part 1/n - updates to hash functions
- From: Daniel Berlin <dan at cgsoftware dot com>
- Date: Thu, 11 Oct 2001 19:57:27 -0400
- Cc: Daniel Jacobowitz <drow at mvista dot com>, gdb-patches at sources dot redhat dot com
On Thursday, October 11, 2001, at 07:58 PM, Elena Zannoni wrote:
> Daniel Jacobowitz writes:
>> This patch still has two logical parts; if you strongly prefer I can
>> break
>> it up further, but they are somewhat intertwined and I think neither
>> should
>> be objectionable. They are:
>> - Fix a looping bug in msymbol_hash_iw. It would not stop on '(' if
>> there
>> was whitespace before it.
>> - Update to use the identifier hash function that libiberty uses, and
>> more buckets.
>>
>> Is this OK?
>
> Looks ok to me in theory. Except that, why was the
>
> '% MINIMAL_SYMBOL_HASH_SIZE;'
>
> bit moved outside of the msymbol_hash and msymbol_hash_iw functions?
So msymbol_hash and hash_iw could be used elsewhere.
> You still do the same operation with the results returned by the two
> functions anyway.
>
Except, now they are just hash functions, not hash functions that only
work for the minsym hash tables.
> Also, where are these 2 functions used besides mynsyms.c?
In a further symbol hashing patch, unless he changed it.
> I think we
> should make them static and remove the extern from symtab.h.
>
> Can you give me an example where the '(' error comes up? (Just so I
> understand it better). How did you come up with the number of
> buckets?
Averaging based on a large number of gnome, kde, and other real
applications (ie emacs), compiled with debug info.
349 is way too small, we ended up with chains > length 100, all the time.
> Is this also used in libiberty?
Which, the hash function?
> Can you fix it and resubmit?
>
> Thanks
> Elena
>
>>
>> --
>> Daniel Jacobowitz Carnegie Mellon University
>> MontaVista Software Debian GNU/Linux Developer
>>
>> 2001-10-01 Daniel Jacobowitz <drow@mvista.com>
>>
>> * minsyms.c (msymbol_hash): Use better hash function.
>> Return hash value without taking modulus.
>> (msymbol_hash_iw): Likewise. Terminate loop at '(' properly.
>> (add_minsym_to_hash_table): Take modulus of msymbol_hash's return
>> value.
>> (add_minsym_to_demangled_hash_table): Likewise for msymbol_hash_iw.
>> (lookup_minimal_symbol): Likewise for both.
>>
>> * objfiles.h: Increase MINIMAL_SYMBOL_HASH_SIZE to match modern
>> binaries.
>>
>> Index: gdb/minsyms.c
>> ===================================================================
>> RCS file: /cvs/src/src/gdb/minsyms.c,v
>> retrieving revision 1.17
>> diff -u -p -r1.17 minsyms.c
>> --- minsyms.c 2001/05/29 10:45:10 1.17
>> +++ minsyms.c 2001/10/01 22:20:47
>> @@ -96,10 +96,12 @@ msymbol_hash_iw (const char *string)
>> while (isspace (*string))
>> ++string;
>> if (*string && *string != '(')
>> - hash = (31 * hash) + *string;
>> - ++string;
>> + {
>> + hash = hash * 67 + *string - 113;
>> + ++string;
>> + }
>> }
>> - return hash % MINIMAL_SYMBOL_HASH_SIZE;
>> + return hash;
>> }
>>
>> /* Compute a hash code for a string. */
>> @@ -109,8 +111,8 @@ msymbol_hash (const char *string)
>> {
>> unsigned int hash = 0;
>> for (; *string; ++string)
>> - hash = (31 * hash) + *string;
>> - return hash % MINIMAL_SYMBOL_HASH_SIZE;
>> + hash = hash * 67 + *string - 113;
>> + return hash;
>> }
>>
>> /* Add the minimal symbol SYM to an objfile's minsym hash table,
>> TABLE. */
>> @@ -120,7 +122,7 @@ add_minsym_to_hash_table (struct minimal
>> {
>> if (sym->hash_next == NULL)
>> {
>> - unsigned int hash = msymbol_hash (SYMBOL_NAME (sym));
>> + unsigned int hash = msymbol_hash (SYMBOL_NAME (sym)) %
>> MINIMAL_SYMBOL_HASH_SIZE;
>> sym->hash_next = table[hash];
>> table[hash] = sym;
>> }
>> @@ -134,7 +136,7 @@ add_minsym_to_demangled_hash_table (stru
>> {
>> if (sym->demangled_hash_next == NULL)
>> {
>> - unsigned int hash = msymbol_hash_iw (SYMBOL_DEMANGLED_NAME
>> (sym));
>> + unsigned int hash = msymbol_hash_iw (SYMBOL_DEMANGLED_NAME
>> (sym)) % MINIMAL_SYMBOL_HASH_SIZE;
>> sym->demangled_hash_next = table[hash];
>> table[hash] = sym;
>> }
>> @@ -162,8 +164,8 @@ lookup_minimal_symbol (register const ch
>> struct minimal_symbol *found_file_symbol = NULL;
>> struct minimal_symbol *trampoline_symbol = NULL;
>>
>> - unsigned int hash = msymbol_hash (name);
>> - unsigned int dem_hash = msymbol_hash_iw (name);
>> + unsigned int hash = msymbol_hash (name) % MINIMAL_SYMBOL_HASH_SIZE;
>> + unsigned int dem_hash = msymbol_hash_iw (name) %
>> MINIMAL_SYMBOL_HASH_SIZE;
>>
>> #ifdef SOFUN_ADDRESS_MAYBE_MISSING
>> if (sfile != NULL)
>>
>> Index: gdb/objfiles.h
>> ===================================================================
>> RCS file: /cvs/src/src/gdb/objfiles.h,v
>> retrieving revision 1.8
>> diff -u -p -r1.8 objfiles.h
>> --- objfiles.h 2001/03/06 08:21:11 1.8
>> +++ objfiles.h 2001/10/01 22:20:53
>> @@ -202,7 +202,7 @@ extern void print_objfile_statistics (vo
>> extern void print_symbol_bcache_statistics (void);
>>
>> /* Number of entries in the minimal symbol hash table. */
>> -#define MINIMAL_SYMBOL_HASH_SIZE 349
>> +#define MINIMAL_SYMBOL_HASH_SIZE 2039
>>
>> /* Master structure for keeping track of each file from which
>> gdb reads symbols. There are several ways these get allocated: 1.