This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

[RFC]: Greatly speed symbol lookups


This is a prerelease of  a patch that changes the syms array in a
block to be a hashtable. It adds a hash_next member to the symbol
structure so we can chain them, since we can have multiple symbols of
the same name. 
We leave blocks that have BLOCK_FUNCTION set alone (IE they are just a
linear array), just like BLOCK_SHOULD_SORT would, to avoid changing
the argument order.

Besides the major changes to mdebugread I need to perform, it's pretty
much ready to go.

The hash key is SYMBOL_SOURCE_NAME, and the hash function is a
modified msymbol_hash_iw (it's based on the FNV-1 hash now, which is
the FNV-0 hash in bcache.c with a different starting value.
Obviously, it has been modified to be insensitive to whitespace and
anything after the first paren, just like msymbol_hash_iw has been in
the past).  No regressions are caused by this patch on
hosts/targets not using those symbol readers.

The speed improvement on symbol lookups is at least 30-40% on average.  Average lookup time is
split about 50% between msymbol_hash_iw and 50% between the new
lookup_block symbol.  Considering how little work lookup_block_symbol
does now (linear search if BLOCK_FUNCTION is set, 6 lines of code to
run through the hash chain of symbols if it's not set), this is actually very good.

I arrived at the figure by profiling gdb while doing random large
program debugging, of both c and c++ programs i needed  to debug, as
well as medium sized programs like GDB.

The at least 30-40% appears pretty much everywhere.

For example, just doing something like "info var a" in GDB, which does 1.6 million
lookup_block symbols, will run ~60% faster (4.65 seconds vs 2.06)

However both jv-lang, dstread and mdebugread completely avoid the normal way of making the
blocks, directly filling the arrays, so I can't just change "sym =
BLOCK_SYM (block, i)" to "for (sym = BLOCK_BUCKET (block, i); sym; sym
= sym->hash_next) { ... }" as I can with things that want to run
through all the symbols.

dstread and jv-lang aren't all that bad.
But mdebugread basically duplicates all of buildsym, I assume it's
been that way since before buildsym came from dbxread, and mdebugread
was split from mipsread. 
It's going to take me a week or two worth of work to fix it.

Anyway, I've modified all the other references in GDB to use the approriate
macros for what they are doing (although BLOCK_NBUCKETS and
BLOCK_NSYMS, and BLOCK_SYM and BLOCK_BUCKET currently point to the
same things, i figured i might as well future proof them in case
someone wants to change the block structure).

I got the idea for doing this, and realized it would work, from my new
copy of Muchnick's "Advanced compiler design and implementation".  He
describes a scheme for using a  symbol stack,  a hash table,  and an
associated block stack,  where the hash table is used for fast symbol
lookup, *as well as* determining your visibility, and you
reorder the chains in a neat way to implement the visibility rules (so
that the first symbol of a certain name you come across is the right
one for where you are now).

He uses a stack plus a hash table because obviously the
compiler only needs to do one function at a time.  We could do the
same if we actually tracked the scope we were in, and this is a
possible future  enhancement (we only  need the symbols at the global
level, and the current visible scope, unless specifically
requested. We could keep a cache of 
everything but the nearest scope [since it would change a lot, i
imagine], of a fixed size, and just reread stuff as necessary.  Or we
could use a huge file with all the symbol structures, and mmap
different parts.  Apple's GDB already does Mapped symbols properly
again.). 
I realized since we effectively have much the same structure (we just
have a vector of blockstacks, and the block contain the symbol stacks
themselves, rather than the symbol stack being seperate), that we
could use the block symbol arrays as a hash table, and just chain the
symbols.  Our scheme just avoids the reordering,  at the cost of having
to look through 8 million more blocks than we should :(.  Whoops.

To handle things like exporting things to other scopes, we'd probably
want to go with his idea of seperating the actual symbol stack and block
structures, and including a level number with a symbol name that tells
you the deepest scope it's visible in (we'd probably need a list of
what blocks it's visible in instead, since we have the whole tree. But
just thinking about it, it seems this scheme would certainly work for
GDB).

But that's in the future.

2001-05-30  Daniel Berlin  <dan@cgsoftware.com>

	* coffread.c (patch_opaque_types): Use BLOCK_NBUCKETS,
	BLOCK_BUCKETS, handle the fact that we now have symbol chains.

2001-05-29  Daniel Berlin  <dan@cgsoftware.com>

	* bcache.c (hash): Fix indenting, make it FNV-1 hash rather than
	FNV-0 hash (just a change in the starting constant).

	* minsyms.c (msymbol_hash_iw): Rewrite, based on FNV-1 hash from
	bcache.c.  Don't modulo the hash by the table size here, so we can
	reuse this function elsewhere. 
	Add comments about changing the loop this function uses, and how
	symbol lookup is dominated by this function now.
	(msymbol_hash): Just use FNV-1 hash from bcache.c. Also don't
	modulo the hash by the table size here, so we can reuse this
	function.
	(add_minsym_to_hash_table): Modulo the msymbol_hash return value
	by the hash table size.
	(add_minsym_to_demangled_hash_table): Ditto for msymbol_hash_iw.
	(lookup_minimal_symbol): Ditto for both.

	* symtab.c (lookup_block_symbol): Complete rewrite.
	(search_symbols): Use BLOCK_NBUCKETS, BLOCK_BUCKET, and walk the
	chain, since it's trying to go through all the symbols in blocks
	we know to be in hash table form.
	(make_symbol_completion_list): Ditto.
	(make_symbol_overload_list): Ditto.

	* buildsym.c (finish_block): Use the syms array in the block as a
	hash table if possible (IE, the block isn't a function).  Default
	number of buckets is 1/4 the number of symbols. 

	* symtab.h: (BLOCK_BUCKET): New macro. Currently the same as
	BLOCK_SYM, since we reuse the sym array as the hash table.
	(BLOCK_NBUCKETS): New macro. Currently the same as BLOCK_NSYMS,
	for the same reason.
	(BLOCK_SHOULD_SHORT): Never sort, not necessary anymore.
	(struct symbol): Add hash_next member.

	
Index: symtab.c
===================================================================
RCS file: /cvs/src/src/gdb/symtab.c,v
retrieving revision 1.38
diff -c -3 -p -w -B -b -r1.38 symtab.c
*** symtab.c	2001/05/14 18:49:54	1.38
--- symtab.c	2001/05/31 20:48:04
*************** lookup_partial_symbol (struct partial_sy
*** 984,994 ****
  	  center = bottom + (top - bottom) / 2;
  	  if (!(center < top))
  	    internal_error (__FILE__, __LINE__, "failed internal consistency check");
- 	  if (!do_linear_search
- 	      && (SYMBOL_LANGUAGE (*center) == language_java))
- 	    {
- 	      do_linear_search = 1;
- 	    }
  	  if (strcmp (SYMBOL_SOURCE_NAME (*center), name) >= 0)
  	    {
  	      top = center;
--- 901,906 ----
*************** find_main_psymtab (void)
*** 1168,1272 ****
  
  /* Search BLOCK for symbol NAME in NAMESPACE.
  
!    Note that if NAME is the demangled form of a C++ symbol, we will fail
!    to find a match during the binary search of the non-encoded names, but
!    for now we don't worry about the slight inefficiency of looking for
!    a match we'll never find, since it will go pretty quick.  Once the
!    binary search terminates, we drop through and do a straight linear
!    search on the symbols.  Each symbol which is marked as being a C++
!    symbol (language_cplus set) has both the encoded and non-encoded names
!    tested for a match. */
  
  struct symbol *
  lookup_block_symbol (register const struct block *block, const char *name,
  		     const namespace_enum namespace)
- {
-   register int bot, top, inc;
-   register struct symbol *sym;
-   register struct symbol *sym_found = NULL;
-   register int do_linear_search = 1;
- 
-   /* If the blocks's symbols were sorted, start with a binary search.  */
- 
-   if (BLOCK_SHOULD_SORT (block))
-     {
-       /* Reset the linear search flag so if the binary search fails, we
-          won't do the linear search once unless we find some reason to
-          do so */
- 
-       do_linear_search = 0;
-       top = BLOCK_NSYMS (block);
-       bot = 0;
- 
-       /* Advance BOT to not far before the first symbol whose name is NAME. */
- 
-       while (1)
- 	{
- 	  inc = (top - bot + 1);
- 	  /* No need to keep binary searching for the last few bits worth.  */
- 	  if (inc < 4)
- 	    {
- 	      break;
- 	    }
- 	  inc = (inc >> 1) + bot;
- 	  sym = BLOCK_SYM (block, inc);
- 	  if (!do_linear_search && (SYMBOL_LANGUAGE (sym) == language_java))
- 	    {
- 	      do_linear_search = 1;
- 	    }
- 	  if (SYMBOL_SOURCE_NAME (sym)[0] < name[0])
- 	    {
- 	      bot = inc;
- 	    }
- 	  else if (SYMBOL_SOURCE_NAME (sym)[0] > name[0])
- 	    {
- 	      top = inc;
- 	    }
- 	  else if (strcmp (SYMBOL_SOURCE_NAME (sym), name) < 0)
- 	    {
- 	      bot = inc;
- 	    }
- 	  else
- 	    {
- 	      top = inc;
- 	    }
- 	}
- 
-       /* Now scan forward until we run out of symbols, find one whose
-          name is greater than NAME, or find one we want.  If there is
-          more than one symbol with the right name and namespace, we
-          return the first one; I believe it is now impossible for us
-          to encounter two symbols with the same name and namespace
-          here, because blocks containing argument symbols are no
-          longer sorted.  */
- 
-       top = BLOCK_NSYMS (block);
-       while (bot < top)
- 	{
- 	  sym = BLOCK_SYM (block, bot);
- 	  if (SYMBOL_NAMESPACE (sym) == namespace &&
- 	      SYMBOL_MATCHES_NAME (sym, name))
  	    {
! 	      return sym;
! 	    }
! 	  bot++;
! 	}
!     }
! 
!   /* Here if block isn't sorted, or we fail to find a match during the
!      binary search above.  If during the binary search above, we find a
!      symbol which is a C++ symbol, then we have re-enabled the linear
!      search flag which was reset when starting the binary search.
! 
!      This loop is equivalent to the loop above, but hacked greatly for speed.
! 
!      Note that parameter symbols do not always show up last in the
!      list; this loop makes sure to take anything else other than
!      parameter symbols first; it only uses parameter symbols as a
!      last resort.  Note that this only takes up extra computation
!      time on a match.  */
! 
!   if (do_linear_search)
      {
        top = BLOCK_NSYMS (block);
        bot = 0;
--- 1080,1097 ----
  
  /* Search BLOCK for symbol NAME in NAMESPACE.
  
!    If we couldn't use a hashtable for the block, we just do a linear search.
!    Otherwise, we hash the name, and walk the bucket's symbol chain to see if we find
!    the symbol we want. 
! */
  
  struct symbol *
  lookup_block_symbol (register const struct block *block, const char *name,
  		     const namespace_enum namespace)
  {
!   register unsigned int top,bot;
!   register struct symbol *sym, *sym_found=NULL;
!   if (BLOCK_FUNCTION (block) != NULL)
      {
        top = BLOCK_NSYMS (block);
        bot = 0;
*************** lookup_block_symbol (register const stru
*** 1313,1320 ****
  	  bot++;
  	}
      }
!   return (sym_found);		/* Will be NULL if not found. */
  }
  
  /* Given a main symbol SYM and ADDR, search through the alias
     list to determine if an alias is active at ADDR and return
--- 1138,1157 ----
  	  bot++;
  	}
      }
!   else
!     {
!       top = msymbol_hash_iw (name) % BLOCK_NBUCKETS (block);
!       sym_found = BLOCK_BUCKET (block, top);
!       while (sym_found != NULL)
! 	{
! 	  if (SYMBOL_NAMESPACE (sym_found) == namespace 
! 	      && SYMBOL_MATCHES_NAME (sym_found, name))
! 	    break;
! 	  sym_found = sym_found->hash_next;
  	}
+     }
+   return sym_found;
+ }
  
  /* Given a main symbol SYM and ADDR, search through the alias
     list to determine if an alias is active at ADDR and return
*************** find_pc_sect_symtab (CORE_ADDR pc, asect
*** 1373,1379 ****
    register struct partial_symtab *ps;
    register struct objfile *objfile;
    CORE_ADDR distance = 0;
! 
    /* Search all symtabs for the one whose file contains our address, and which
       is the smallest of all the ones containing the address.  This is designed
       to deal with a case like symtab a is at 0x1000-0x2000 and 0x3000-0x4000
--- 1210,1217 ----
    register struct partial_symtab *ps;
    register struct objfile *objfile;
    CORE_ADDR distance = 0;
!   if (pc == 0)
! 	  return NULL;
    /* Search all symtabs for the one whose file contains our address, and which
       is the smallest of all the ones containing the address.  This is designed
       to deal with a case like symtab a is at 0x1000-0x2000 and 0x3000-0x4000
*************** find_pc_sect_symtab (CORE_ADDR pc, asect
*** 1415,1427 ****
  	  {
  	    int i;
  
! 	    for (i = 0; i < b->nsyms; i++)
  	      {
! 		fixup_symbol_section (b->sym[i], objfile);
! 		if (section == SYMBOL_BFD_SECTION (b->sym[i]))
  		  break;
  	      }
! 	    if (i >= b->nsyms)
  	      continue;		/* no symbol in this symtab matches section */
  	  }
  	distance = BLOCK_END (b) - BLOCK_START (b);
--- 1253,1269 ----
  	  {
  	    int i;
  
! 	    for (i = 0; i < BLOCK_NBUCKETS(b); i++)
  	      {
! 		register struct symbol *tempsym;
! 		for (tempsym = BLOCK_BUCKET (b,i); tempsym; tempsym = tempsym->hash_next)
! 		  {
! 		    fixup_symbol_section (tempsym, objfile);
! 		    if (section == SYMBOL_BFD_SECTION (tempsym))
  		      break;
+ 		  }
  	      }
! 	    if (i >= BLOCK_NBUCKETS(b))
  	      continue;		/* no symbol in this symtab matches section */
  	  }
  	distance = BLOCK_END (b) - BLOCK_START (b);
*************** search_symbols (char *regexp, namespace_
*** 2491,2503 ****
        for (i = GLOBAL_BLOCK; i <= STATIC_BLOCK; i++)
  	{
  	  b = BLOCKVECTOR_BLOCK (bv, i);
! 	  /* Skip the sort if this block is always sorted.  */
! 	  if (!BLOCK_SHOULD_SORT (b))
! 	    sort_block_syms (b);
! 	  for (j = 0; j < BLOCK_NSYMS (b); j++)
  	    {
  	      QUIT;
! 	      sym = BLOCK_SYM (b, j);
  	      if (file_matches (s->filename, files, nfiles)
  		  && ((regexp == NULL || SYMBOL_MATCHES_REGEXP (sym))
  		      && ((kind == VARIABLES_NAMESPACE && SYMBOL_CLASS (sym) != LOC_TYPEDEF
--- 2333,2343 ----
        for (i = GLOBAL_BLOCK; i <= STATIC_BLOCK; i++)
  	{
  	  b = BLOCKVECTOR_BLOCK (bv, i);
! 	  for (j = 0; j < BLOCK_NBUCKETS (b); j++)
  	    {
  	      QUIT;
! 	      for (sym = BLOCK_BUCKET (b, j); sym; sym = sym->hash_next)
! 		{
  		  if (file_matches (s->filename, files, nfiles)
  		      && ((regexp == NULL || SYMBOL_MATCHES_REGEXP (sym))
  		      && ((kind == VARIABLES_NAMESPACE && SYMBOL_CLASS (sym) != LOC_TYPEDEF
*************** search_symbols (char *regexp, namespace_
*** 2525,2530 ****
--- 2365,2371 ----
  		    }
  		}
  	    }
+ 	}
      prev_bv = bv;
    }
    
*************** make_symbol_completion_list (char *text,
*** 3004,3012 ****
        /* Also catch fields of types defined in this places which match our
           text string.  Only complete on types visible from current context. */
  
!       for (i = 0; i < BLOCK_NSYMS (b); i++)
  	{
! 	  sym = BLOCK_SYM (b, i);
  	  COMPLETION_LIST_ADD_SYMBOL (sym, sym_text, sym_text_len, text, word);
  	  if (SYMBOL_CLASS (sym) == LOC_TYPEDEF)
  	    {
--- 2845,2855 ----
        /* Also catch fields of types defined in this places which match our
           text string.  Only complete on types visible from current context. */
  
!       for (i = 0; i < BLOCK_NBUCKETS (b); i++)
  	{
! 	  for (sym = BLOCK_BUCKET (b, i); sym; sym=sym->hash_next)
! 	    {
! 	      
  	      COMPLETION_LIST_ADD_SYMBOL (sym, sym_text, sym_text_len, text, word);
  	      if (SYMBOL_CLASS (sym) == LOC_TYPEDEF)
  		{
*************** make_symbol_completion_list (char *text,
*** 3027,3032 ****
--- 2870,2876 ----
  		}
  	    }
  	}
+     }
  
    /* Go through the symtabs and check the externs and statics for
       symbols which match.  */
*************** make_symbol_completion_list (char *text,
*** 3035,3043 ****
    {
      QUIT;
      b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), GLOBAL_BLOCK);
!     for (i = 0; i < BLOCK_NSYMS (b); i++)
        {
! 	sym = BLOCK_SYM (b, i);
  	COMPLETION_LIST_ADD_SYMBOL (sym, sym_text, sym_text_len, text, word);
        }
    }
--- 2879,2887 ----
    {
      QUIT;
      b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), GLOBAL_BLOCK);
!     for (i = 0; i < BLOCK_NBUCKETS (b); i++)
        {
! 	for (sym = BLOCK_BUCKET (b, i); sym; sym = sym->hash_next)
  	  COMPLETION_LIST_ADD_SYMBOL (sym, sym_text, sym_text_len, text, word);
        }
    }
*************** make_symbol_completion_list (char *text,
*** 3049,3057 ****
      /* Don't do this block twice.  */
      if (b == surrounding_static_block)
        continue;
!     for (i = 0; i < BLOCK_NSYMS (b); i++)
        {
! 	sym = BLOCK_SYM (b, i);
  	COMPLETION_LIST_ADD_SYMBOL (sym, sym_text, sym_text_len, text, word);
        }
    }
--- 2893,2901 ----
      /* Don't do this block twice.  */
      if (b == surrounding_static_block)
        continue;
!     for (i = 0; i < BLOCK_NBUCKETS (b); i++)
        {
! 	for (sym = BLOCK_BUCKET (b, i); sym; sym = sym->hash_next)
  	  COMPLETION_LIST_ADD_SYMBOL (sym, sym_text, sym_text_len, text, word);
        }
    }
*************** make_symbol_overload_list (struct symbol
*** 3271,3279 ****
        /* Also catch fields of types defined in this places which match our
           text string.  Only complete on types visible from current context. */
  
!       for (i = 0; i < BLOCK_NSYMS (b); i++)
  	{
! 	  sym = BLOCK_SYM (b, i);
  	  overload_list_add_symbol (sym, oload_name);
  	}
      }
--- 3115,3123 ----
        /* Also catch fields of types defined in this places which match our
           text string.  Only complete on types visible from current context. */
  
!       for (i = 0; i < BLOCK_NBUCKETS (b); i++)
  	{
! 	  for (sym = BLOCK_BUCKET (b, i); sym; sym=sym->hash_next)
  	    overload_list_add_symbol (sym, oload_name);
  	}
      }
*************** make_symbol_overload_list (struct symbol
*** 3285,3293 ****
    {
      QUIT;
      b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), GLOBAL_BLOCK);
!     for (i = 0; i < BLOCK_NSYMS (b); i++)
        {
! 	sym = BLOCK_SYM (b, i);
  	overload_list_add_symbol (sym, oload_name);
        }
    }
--- 3129,3137 ----
    {
      QUIT;
      b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), GLOBAL_BLOCK);
!     for (i = 0; i < BLOCK_NBUCKETS (b); i++)
        {
! 	for (sym = BLOCK_BUCKET (b, i); sym; sym=sym->hash_next)
  	  overload_list_add_symbol (sym, oload_name);
        }
    }
*************** make_symbol_overload_list (struct symbol
*** 3299,3307 ****
      /* Don't do this block twice.  */
      if (b == surrounding_static_block)
        continue;
!     for (i = 0; i < BLOCK_NSYMS (b); i++)
        {
! 	sym = BLOCK_SYM (b, i);
  	overload_list_add_symbol (sym, oload_name);
        }
    }
--- 3143,3151 ----
      /* Don't do this block twice.  */
      if (b == surrounding_static_block)
        continue;
!     for (i = 0; i < BLOCK_NBUCKETS (b); i++)
        {
! 	for (sym = BLOCK_BUCKET (b, i); sym; sym=sym->hash_next)
  	  overload_list_add_symbol (sym, oload_name);
        }
    }
Index: symtab.h
===================================================================
RCS file: /cvs/src/src/gdb/symtab.h,v
retrieving revision 1.21
diff -c -3 -p -w -B -b -r1.21 symtab.h
*** symtab.h	2001/04/27 00:19:09	1.21
--- symtab.h	2001/05/31 20:48:05
*************** struct blockvector
*** 418,424 ****
  
  struct block
    {
! 
      /* Addresses in the executable code that are in this block.  */
  
      CORE_ADDR startaddr;
--- 415,422 ----
  
  struct block
    {
!     /* Unique block ID */
!     unsigned int id;
      /* Addresses in the executable code that are in this block.  */
  
      CORE_ADDR startaddr;
*************** struct block
*** 448,478 ****
         of this flag is undefined.  */
  
      unsigned char gcc_compile_flag;
! 
!     /* Number of local symbols.  */
! 
!     int nsyms;
! 
!     /* The symbols.  If some of them are arguments, then they must be
!        in the order in which we would like to print them.  */
! 
      struct symbol *sym[1];
    };
  
  #define BLOCK_START(bl)		(bl)->startaddr
  #define BLOCK_END(bl)		(bl)->endaddr
- #define BLOCK_NSYMS(bl)		(bl)->nsyms
- #define BLOCK_SYM(bl, n)	(bl)->sym[n]
  #define BLOCK_FUNCTION(bl)	(bl)->function
  #define BLOCK_SUPERBLOCK(bl)	(bl)->superblock
  #define BLOCK_GCC_COMPILED(bl)	(bl)->gcc_compile_flag
  
- /* Nonzero if symbols of block BL should be sorted alphabetically.
-    Don't sort a block which corresponds to a function.  If we did the
-    sorting would have to preserve the order of the symbols for the
-    arguments.  */
- 
- #define BLOCK_SHOULD_SORT(bl) ((bl)->nsyms >= 40 && BLOCK_FUNCTION (bl) == NULL)
  
  
  /* Represent one symbol name; a variable, constant, function or typedef.  */
--- 446,467 ----
         of this flag is undefined.  */
  
      unsigned char gcc_compile_flag;
!     unsigned int nsyms;
      struct symbol *sym[1];
    };
  
  #define BLOCK_START(bl)		(bl)->startaddr
  #define BLOCK_END(bl)		(bl)->endaddr
  #define BLOCK_FUNCTION(bl)	(bl)->function
  #define BLOCK_SUPERBLOCK(bl)	(bl)->superblock
  #define BLOCK_GCC_COMPILED(bl)	(bl)->gcc_compile_flag
+ #define BLOCK_NBUCKETS(bl)	(bl)->nsyms
+ #define BLOCK_NSYMS(bl)		(bl)->nsyms
+ #define BLOCK_ID(bl)		(bl)->id
+ #define BLOCK_BUCKET(bl, i)	(bl)->sym[i]
+ #define BLOCK_SYM(bl, i)	(bl)->sym[i]
+ #define BLOCK_SHOULD_SORT(bl)   0
  
  
  
  /* Represent one symbol name; a variable, constant, function or typedef.  */
*************** struct symbol
*** 722,727 ****
--- 725,733 ----
      /* List of ranges where this symbol is active.  This is only
         used by alias symbols at the current time.  */
      struct range_list *ranges;
+     
+     struct symbol *hash_next;
+ 
    };
  
  
Index: buildsym.c
===================================================================
RCS file: /cvs/src/src/gdb/buildsym.c,v
retrieving revision 1.11
diff -c -3 -p -w -B -b -r1.11 buildsym.c
*** buildsym.c	2001/04/30 10:30:27	1.11
--- buildsym.c	2001/05/31 20:48:06
*************** struct complaint anon_block_end_complain
*** 79,85 ****
  {"block end address 0x%lx less than block start address 0x%lx (patched it)", 0, 0};
  
  struct complaint innerblock_complaint =
! {"inner block not inside outer block in %s", 0, 0};
  
  struct complaint innerblock_anon_complaint =
  {"inner block (0x%lx-0x%lx) not inside outer block (0x%lx-0x%lx)", 0, 0};
--- 78,84 ----
  {"block end address 0x%lx less than block start address 0x%lx (patched it)", 0, 0};
  
  struct complaint innerblock_complaint =
! {"inner block (0x%lx-0x%lx) not inside outer block (0x%lx-0x%lx) in %s", 0, 0};
  
  struct complaint innerblock_anon_complaint =
  {"inner block (0x%lx-0x%lx) not inside outer block (0x%lx-0x%lx)", 0, 0};
*************** really_free_pendings (PTR dummy)
*** 199,215 ****
  void
  free_pending_blocks (void)
  {
- #if 0				/* Now we make the links in the
- 				   symbol_obstack, so don't free
- 				   them.  */
-   struct pending_block *bnext, *bnext1;
- 
-   for (bnext = pending_blocks; bnext; bnext = bnext1)
-     {
-       bnext1 = bnext->next;
-       xfree ((void *) bnext);
-     }
- #endif
    pending_blocks = NULL;
  }
  
--- 198,203 ----
*************** finish_block (struct symbol *symbol, str
*** 229,235 ****
    struct pending_block *opblock;
    register int i;
    register int j;
! 
    /* Count the length of the list of symbols.  */
  
    for (next = *listhead, i = 0;
--- 217,223 ----
    struct pending_block *opblock;
    register int i;
    register int j;
!   unsigned int closest_size;
    /* Count the length of the list of symbols.  */
  
    for (next = *listhead, i = 0;
*************** finish_block (struct symbol *symbol, str
*** 238,244 ****
      {
        /* EMPTY */ ;
      }
! 
    block = (struct block *) obstack_alloc (&objfile->symbol_obstack,
  	    (sizeof (struct block) + ((i - 1) * sizeof (struct symbol *))));
  
--- 226,235 ----
      {
        /* EMPTY */ ;
      }
!   /* If it's a block representing a function, we can't use a hash
!      table, or else the args will get out of order. */
!   if (symbol)
!     {
        block = (struct block *) obstack_alloc (&objfile->symbol_obstack,
  					      (sizeof (struct block) + ((i - 1) * sizeof (struct symbol *))));
        
*************** finish_block (struct symbol *symbol, str
*** 243,257 ****
  	    (sizeof (struct block) + ((i - 1) * sizeof (struct symbol *))));
  
    /* Copy the symbols into the block.  */
- 
    BLOCK_NSYMS (block) = i;
    for (next = *listhead; next; next = next->next)
      {
        for (j = next->nsyms - 1; j >= 0; j--)
  	{
  	  BLOCK_SYM (block, --i) = next->symbol[j];
  	}
      }
  
    BLOCK_START (block) = start;
    BLOCK_END (block) = end;
--- 234,271 ----
  					      (sizeof (struct block) + ((i - 1) * sizeof (struct symbol *))));
        
        /* Copy the symbols into the block.  */
        BLOCK_NSYMS (block) = i;
+     }
+   /* Otherwise, use a hash table representation */
+   else
+     {
+       block = (struct block *) obstack_alloc (&objfile->symbol_obstack,
+ 					      (sizeof (struct block) + ((((i - 1)/4)+1) * sizeof (struct symbol *))));
+       BLOCK_NBUCKETS (block) = (i/4) + 1;
+       for (j = 0; j < BLOCK_NBUCKETS (block); j++)
+ 	BLOCK_BUCKET (block, j) = 0;
+     }
    for (next = *listhead; next; next = next->next)
      {
        for (j = next->nsyms - 1; j >= 0; j--)
  	{
+ 	  if (symbol) 
+ 	    {
  	      BLOCK_SYM (block, --i) = next->symbol[j];
  	    }
+ 	  else
+ 	    {
+ 	      unsigned int hashval;
+ 	      SYMBOL_INIT_DEMANGLED_NAME (next->symbol[j], &objfile->symbol_obstack);
+ 	      hashval = msymbol_hash_iw (SYMBOL_SOURCE_NAME (next->symbol[j])) % BLOCK_NBUCKETS (block);
+ 	      if (BLOCK_BUCKET (block, hashval) != NULL)
+ 		{
+ 		  next->symbol[j]->hash_next = BLOCK_BUCKET (block, hashval);
+ 		}
+ 	      BLOCK_BUCKET (block, hashval) = next->symbol[j];
  	    }
+ 	}
+     }
    
    BLOCK_START (block) = start;
    BLOCK_END (block) = end;
*************** finish_block (struct symbol *symbol, str
*** 372,378 ****
  	}
        else
  	{
! 	  complain (&anon_block_end_complaint, BLOCK_END (block), BLOCK_START (block));
  	}
        /* Better than nothing */
        BLOCK_END (block) = BLOCK_START (block);
--- 387,396 ----
  	}
        else
  	{
! 	  /* FIXME 32x64 */
! 	  complain (&anon_block_end_complaint,
! 		    (unsigned long) BLOCK_END (block),
! 		    (unsigned long) BLOCK_START (block));
  	}
        /* Better than nothing */
        BLOCK_END (block) = BLOCK_START (block);
*************** finish_block (struct symbol *symbol, str
*** 396,409 ****
  	    {
  	      if (symbol)
  		{
  		  complain (&innerblock_complaint,
  			    SYMBOL_SOURCE_NAME (symbol));
  		}
  	      else
  		{
! 		  complain (&innerblock_anon_complaint, BLOCK_START (pblock->block),
! 			    BLOCK_END (pblock->block), BLOCK_START (block),
! 			    BLOCK_END (block));
  		}
  	      if (BLOCK_START (pblock->block) < BLOCK_START (block))
  		BLOCK_START (pblock->block) = BLOCK_START (block);
--- 414,435 ----
  	    {
  	      if (symbol)
  		{
+ 		  /* FIXME 32x64 */
  		  complain (&innerblock_complaint,
+ 			    (unsigned long) BLOCK_START (pblock->block),
+ 			    (unsigned long) BLOCK_END (pblock->block),
+ 			    (unsigned long) BLOCK_START (block),
+ 			    (unsigned long) BLOCK_END (block),
  			    SYMBOL_SOURCE_NAME (symbol));
  		}
  	      else
  		{
! 		  /* FIXME 32x64 */
! 		  complain (&innerblock_anon_complaint,
! 			    (unsigned long) BLOCK_START (pblock->block),
! 			    (unsigned long) BLOCK_END (pblock->block),
! 			    (unsigned long) BLOCK_START (block),
! 			    (unsigned long) BLOCK_END (block));
  		}
  	      if (BLOCK_START (pblock->block) < BLOCK_START (block))
  		BLOCK_START (pblock->block) = BLOCK_START (block);
*************** record_pending_block (struct objfile *ob
*** 447,452 ****
--- 473,496 ----
      }
  }
  
+ static int
+ compare_blocks (const void *v1, const void *v2)
+ {
+   const struct block *const *b1 = v1;
+   const struct block *const *b2 = v2;
+ 
+   if ((*b1)->startaddr < (*b2)->startaddr)
+     return -1;
+   else if ((*b1)->startaddr > (*b2)->startaddr)
+     return 1;
+   else if ((*b1)->endaddr < (*b2)->endaddr)
+     return 1;
+   else if ((*b1)->endaddr > (*b2)->endaddr)
+     return -1;
+   else
+     return 0;
+ }
+ 
  /* Note that this is only used in this file and in dstread.c, which
     should be fixed to not need direct access to this function.  When
     that is done, it can be made static again. */
*************** make_blockvector (struct objfile *objfil
*** 481,517 ****
        BLOCKVECTOR_BLOCK (blockvector, --i) = next->block;
      }
  
- #if 0				/* Now we make the links in the
- 				   obstack, so don't free them.  */
-   /* Now free the links of the list, and empty the list.  */
- 
-   for (next = pending_blocks; next; next = next1)
-     {
-       next1 = next->next;
-       xfree (next);
-     }
- #endif
    pending_blocks = NULL;
  
! #if 1				/* FIXME, shut this off after a while
! 				   to speed up symbol reading.  */
!   /* Some compilers output blocks in the wrong order, but we depend on
!      their being in the right order so we can binary search. Check the
!      order and moan about it.  FIXME.  */
!   if (BLOCKVECTOR_NBLOCKS (blockvector) > 1)
!     {
!       for (i = 1; i < BLOCKVECTOR_NBLOCKS (blockvector); i++)
  	{
! 	  if (BLOCK_START (BLOCKVECTOR_BLOCK (blockvector, i - 1))
! 	      > BLOCK_START (BLOCKVECTOR_BLOCK (blockvector, i)))
  	    {
! 	      CORE_ADDR start
! 		= BLOCK_START (BLOCKVECTOR_BLOCK (blockvector, i));
! 
! 	      complain (&blockvector_complaint,
! 			longest_local_hex_string ((LONGEST) start));
! 	    }
  	}
      }
  #endif
  
--- 525,554 ----
        BLOCKVECTOR_BLOCK (blockvector, --i) = next->block;
      }
  
    pending_blocks = NULL;
  
! #if 1
!   if (objfile->flags & OBJF_REORDERED)
      {
! #if 0
!       for (i = 2; i < BLOCKVECTOR_NBLOCKS (blockvector) - 2; i++)
  	{
! 	  struct block *b = BLOCKVECTOR_BLOCK (blockvector, i);
! 	  struct minimal_symbol *min_sym = (struct minimal_symbol *)
! 	  lookup_minimal_symbol_by_pc_section_from_objfile
! 	  (BLOCK_START (b), find_pc_mapped_section (BLOCK_START (b)), objfile);
! 	  if (min_sym &&
! 	      ((SYMBOL_VALUE_ADDRESS (min_sym) > BLOCK_END (b)) ||
! 	       (SYMBOL_VALUE_ADDRESS (min_sym + 1) &&
! 		(SYMBOL_VALUE_ADDRESS (min_sym + 1) < BLOCK_END (b)))))
! 	    BLOCK_END (b) = SYMBOL_VALUE_ADDRESS (min_sym + 1);
  	}
+ #endif
+       if (BLOCKVECTOR_NBLOCKS (blockvector) > 2)
+ 	qsort (&blockvector->block[2],
+ 	       BLOCKVECTOR_NBLOCKS (blockvector) - 2,
+ 	       sizeof (struct block *),
+ 	       compare_blocks);
      }
  #endif
  
*************** start_subfile (char *name, char *dirname
*** 533,539 ****
  
    for (subfile = subfiles; subfile; subfile = subfile->next)
      {
!       if (FILENAME_CMP (subfile->name, name) == 0)
  	{
  	  current_subfile = subfile;
  	  return;
--- 571,577 ----
  
    for (subfile = subfiles; subfile; subfile = subfile->next)
      {
!       if (STREQ (subfile->name, name))
  	{
  	  current_subfile = subfile;
  	  return;
*************** push_subfile (void)
*** 669,675 ****
    subfile_stack = tem;
    if (current_subfile == NULL || current_subfile->name == NULL)
      {
!       internal_error (__FILE__, __LINE__, "failed internal consistency check");
      }
    tem->name = current_subfile->name;
  }
--- 707,713 ----
    subfile_stack = tem;
    if (current_subfile == NULL || current_subfile->name == NULL)
      {
!       error ("no entry to push onto subfile stack");
      }
    tem->name = current_subfile->name;
  }
*************** pop_subfile (void)
*** 682,692 ****
  
    if (link == NULL)
      {
!       internal_error (__FILE__, __LINE__, "failed internal consistency check");
      }
    name = link->name;
    subfile_stack = link->next;
!   xfree ((void *) link);
    return (name);
  }
  
--- 720,730 ----
  
    if (link == NULL)
      {
!       error ("subfile stack empty");
      }
    name = link->name;
    subfile_stack = link->next;
!   free ((void *) link);
    return (name);
  }
  
*************** record_line (register struct subfile *su
*** 727,733 ****
  
    e = subfile->line_vector->item + subfile->line_vector->nitems++;
    e->line = line;
!   e->pc = ADDR_BITS_REMOVE(pc);
  }
  
  /* Needed in order to sort line tables from IBM xcoff files.  Sigh!  */
--- 765,771 ----
  
    e = subfile->line_vector->item + subfile->line_vector->nitems++;
    e->line = line;
!   e->pc = pc;
  }
  
  /* Needed in order to sort line tables from IBM xcoff files.  Sigh!  */
*************** end_symtab (CORE_ADDR end_addr, struct o
*** 835,872 ****
  	}
      }
  
-   /* Reordered executables may have out of order pending blocks; if
-      OBJF_REORDERED is true, then sort the pending blocks.  */
-   if ((objfile->flags & OBJF_REORDERED) && pending_blocks)
-     {
-       /* FIXME!  Remove this horrid bubble sort and use merge sort!!! */
-       int swapped;
-       do
- 	{
- 	  struct pending_block *pb, *pbnext;
- 
- 	  pb = pending_blocks;
- 	  pbnext = pb->next;
- 	  swapped = 0;
- 
- 	  while (pbnext)
- 	    {
- 	      /* swap blocks if unordered! */
- 
- 	      if (BLOCK_START (pb->block) < BLOCK_START (pbnext->block))
- 		{
- 		  struct block *tmp = pb->block;
- 		  pb->block = pbnext->block;
- 		  pbnext->block = tmp;
- 		  swapped = 1;
- 		}
- 	      pb = pbnext;
- 	      pbnext = pbnext->next;
- 	    }
- 	}
-       while (swapped);
-     }
- 
    /* Cleanup any undefined types that have been left hanging around
       (this needs to be done before the finish_blocks so that
       file_symbols is still good).
--- 874,879 ----
*************** end_symtab (CORE_ADDR end_addr, struct o
*** 891,902 ****
      }
    else
      {
!       /* Define the STATIC_BLOCK & GLOBAL_BLOCK, and build the
!          blockvector.  */
!       finish_block (0, &file_symbols, 0, last_source_start_addr, end_addr,
! 		    objfile);
!       finish_block (0, &global_symbols, 0, last_source_start_addr, end_addr,
! 		    objfile);
        blockvector = make_blockvector (objfile);
      }
  
--- 898,906 ----
      }
    else
      {
!       /* Define STATIC_BLOCK and GLOBAL_BLOCK and build the blockvector. */
!       finish_block (0, &file_symbols, 0, last_source_start_addr, end_addr, objfile);
!       finish_block (0, &global_symbols, 0, last_source_start_addr, end_addr, objfile);
        blockvector = make_blockvector (objfile);
      }
  
*************** end_symtab (CORE_ADDR end_addr, struct o
*** 921,934 ****
  	    {
  	      linetablesize = sizeof (struct linetable) +
  	        subfile->line_vector->nitems * sizeof (struct linetable_entry);
- #if 0
- 	      /* I think this is artifact from before it went on the
- 	         obstack. I doubt we'll need the memory between now
- 	         and when we free it later in this function.  */
- 	      /* First, shrink the linetable to make more memory.  */
- 	      subfile->line_vector = (struct linetable *)
- 		xrealloc ((char *) subfile->line_vector, linetablesize);
- #endif
  
  	      /* Like the pending blocks, the line table may be
  	         scrambled in reordered executables.  Sort it if
--- 925,930 ----
*************** end_symtab (CORE_ADDR end_addr, struct o
*** 995,1017 ****
  	}
        if (subfile->name != NULL)
  	{
! 	  xfree ((void *) subfile->name);
  	}
        if (subfile->dirname != NULL)
  	{
! 	  xfree ((void *) subfile->dirname);
  	}
        if (subfile->line_vector != NULL)
  	{
! 	  xfree ((void *) subfile->line_vector);
  	}
        if (subfile->debugformat != NULL)
  	{
! 	  xfree ((void *) subfile->debugformat);
  	}
  
        nextsub = subfile->next;
!       xfree ((void *) subfile);
      }
  
    /* Set this for the main source file.  */
--- 991,1013 ----
  	}
        if (subfile->name != NULL)
  	{
! 	  free ((void *) subfile->name);
  	}
        if (subfile->dirname != NULL)
  	{
! 	  free ((void *) subfile->dirname);
  	}
        if (subfile->line_vector != NULL)
  	{
! 	  free ((void *) subfile->line_vector);
  	}
        if (subfile->debugformat != NULL)
  	{
! 	  free ((void *) subfile->debugformat);
  	}
  
        nextsub = subfile->next;
!       free ((void *) subfile);
      }
  
    /* Set this for the main source file.  */
Index: coffread.c
===================================================================
RCS file: /cvs/src/src/gdb/coffread.c,v
retrieving revision 1.17
diff -c -3 -p -w -B -b -r1.17 coffread.c
*** coffread.c	2001/03/26 02:50:46	1.17
--- coffread.c	2001/05/31 20:48:08
*************** patch_type (struct type *type, struct ty
*** 1419,1430 ****
    TYPE_FIELDS (target) = (struct field *) TYPE_ALLOC (target, field_size);
  
    memcpy (TYPE_FIELDS (target), TYPE_FIELDS (real_target), field_size);
- 
    if (TYPE_NAME (real_target))
      {
!       if (TYPE_NAME (target))
! 	xfree (TYPE_NAME (target));
!       TYPE_NAME (target) = concat (TYPE_NAME (real_target), NULL);
      }
  }
  
--- 1419,1427 ----
    TYPE_FIELDS (target) = (struct field *) TYPE_ALLOC (target, field_size);
  
    memcpy (TYPE_FIELDS (target), TYPE_FIELDS (real_target), field_size);
    if (TYPE_NAME (real_target))
      {
!       TYPE_NAME (target) = TYPE_NAME (real_target);
      }
  }
  
*************** patch_opaque_types (struct symtab *s)
*** 1440,1456 ****
  
    /* Go through the per-file symbols only */
    b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), STATIC_BLOCK);
!   for (i = BLOCK_NSYMS (b) - 1; i >= 0; i--)
      {
        /* Find completed typedefs to use to fix opaque ones.
           Remove syms from the chain when their types are stored,
           but search the whole chain, as there may be several syms
           from different files with the same name.  */
!       real_sym = BLOCK_SYM (b, i);
        if (SYMBOL_CLASS (real_sym) == LOC_TYPEDEF &&
  	  SYMBOL_NAMESPACE (real_sym) == VAR_NAMESPACE &&
  	  TYPE_CODE (SYMBOL_TYPE (real_sym)) == TYPE_CODE_PTR &&
! 	  TYPE_LENGTH (TYPE_TARGET_TYPE (SYMBOL_TYPE (real_sym))) != 0)
  	{
  	  register char *name = SYMBOL_NAME (real_sym);
  	  register int hash = hashname (name);
--- 1437,1454 ----
  
    /* Go through the per-file symbols only */
    b = BLOCKVECTOR_BLOCK (BLOCKVECTOR (s), STATIC_BLOCK);
!   for (i = BLOCK_NBUCKETS (b) - 1; i >= 0; i--)
      {
        /* Find completed typedefs to use to fix opaque ones.
           Remove syms from the chain when their types are stored,
           but search the whole chain, as there may be several syms
           from different files with the same name.  */
!       for (real_sym = BLOCK_BUCKET (b, i); real_sym; real_sym = real_sym->hash_next)
! 	{
  	  if (SYMBOL_CLASS (real_sym) == LOC_TYPEDEF &&
  	      SYMBOL_NAMESPACE (real_sym) == VAR_NAMESPACE &&
  	      TYPE_CODE (SYMBOL_TYPE (real_sym)) == TYPE_CODE_PTR &&
! 	      TYPE_LENGTH (POINTER_TARGET_TYPE (SYMBOL_TYPE (real_sym))) != 0)
  	    {
  	      register char *name = SYMBOL_NAME (real_sym);
  	      register int hash = hashname (name);
*************** patch_opaque_types (struct symtab *s)
*** 1491,1496 ****
--- 1489,1495 ----
  	    }
  	}
      }
+ }
  
  static struct symbol *
  process_coff_symbol (register struct coff_symbol *cs,
Index: symmisc.c
===================================================================
RCS file: /cvs/src/src/gdb/symmisc.c,v
retrieving revision 1.5
diff -c -3 -p -w -B -b -r1.5 symmisc.c
*** symmisc.c	2001/03/06 08:21:17	1.5
--- symmisc.c	2001/05/31 20:48:09
*************** static void
*** 86,96 ****
  free_symtab_block (struct objfile *objfile, struct block *b)
  {
    register int i, n;
!   n = BLOCK_NSYMS (b);
    for (i = 0; i < n; i++)
      {
        mfree (objfile->md, SYMBOL_NAME (BLOCK_SYM (b, i)));
!       mfree (objfile->md, (PTR) BLOCK_SYM (b, i));
      }
    mfree (objfile->md, (PTR) b);
  }
--- 86,98 ----
  free_symtab_block (struct objfile *objfile, struct block *b)
  {
    register int i, n;
!   n = BLOCK_NBUCKETS (b);
    for (i = 0; i < n; i++)
      {
+       register struct symbol *sym;
        mfree (objfile->md, SYMBOL_NAME (BLOCK_SYM (b, i)));
!       for (sym = BLOCK_BUCKET (b, i); sym; sym = sym->hash_next)
! 	      mfree (objfile->md, (PTR) sym);
      }
    mfree (objfile->md, (PTR) b);
  }
*************** dump_symtab (struct objfile *objfile, st
*** 455,461 ****
--- 458,467 ----
  	      gdb_print_host_address (BLOCK_SUPERBLOCK (b), outfile);
  	    }
  	  blen = BLOCK_NSYMS (b);
+ 	  if (BLOCK_FUNCTION (b))
  	    fprintf_filtered (outfile, ", %d syms in ", blen);
+ 	  else
+ 	    fprintf_filtered (outfile, ", %d buckets of symbols in ", blen);
  	  print_address_numeric (BLOCK_START (b), 1, outfile);
  	  fprintf_filtered (outfile, "..");
  	  print_address_numeric (BLOCK_END (b), 1, outfile);
*************** dump_symtab (struct objfile *objfile, st
*** 475,485 ****
  	  for (j = 0; j < blen; j++)
  	    {
  	      struct print_symbol_args s;
! 	      s.symbol = BLOCK_SYM (b, j);
  	      s.depth = depth + 1;
  	      s.outfile = outfile;
  	      catch_errors (print_symbol, &s, "Error printing symbol:\n",
  			    RETURN_MASK_ALL);
  	    }
  	}
        fprintf_filtered (outfile, "\n");
--- 481,493 ----
  	  for (j = 0; j < blen; j++)
  	    {
  	      struct print_symbol_args s;
! 	      for (s.symbol = BLOCK_BUCKET (b, j); s.symbol; s.symbol = s.symbol->hash_next)
! 		{
  		  s.depth = depth + 1;
  		  s.outfile = outfile;
  		  catch_errors (print_symbol, &s, "Error printing symbol:\n",
  				RETURN_MASK_ALL);
+ 		}
  	    }
  	}
        fprintf_filtered (outfile, "\n");
Index: minsyms.c
===================================================================
RCS file: /cvs/src/src/gdb/minsyms.c,v
retrieving revision 1.16
diff -c -3 -p -w -B -b -r1.16 minsyms.c
*** minsyms.c	2001/05/22 21:02:41	1.16
--- minsyms.c	2001/05/31 20:48:10
***************
*** 54,60 ****
     At the end, copy them all into one newly allocated location on an objfile's
     symbol obstack.  */
  
! #define BUNCH_SIZE 127
  
  struct msym_bunch
    {
--- 54,60 ----
     At the end, copy them all into one newly allocated location on an objfile's
     symbol obstack.  */
  
! #define BUNCH_SIZE 4096
  
  struct msym_bunch
    {
*************** compact_minimal_symbols (struct minimal_
*** 85,105 ****
  static void add_minsym_to_demangled_hash_table (struct minimal_symbol *sym,
  						struct minimal_symbol **table);
  
! /* Compute a hash code based using the same criteria as `strcmp_iw'.  */
  
! unsigned int
! msymbol_hash_iw (const char *string)
! {
!   unsigned int hash = 0;
    while (*string && *string != '(')
      {
        while (isspace (*string))
  	++string;
        if (*string && *string != '(')
! 	hash = (31 * hash) + *string;
        ++string;
      }
!   return hash % MINIMAL_SYMBOL_HASH_SIZE;
  }
  
  /* Compute a hash code for a string.  */
--- 85,153 ----
  static void add_minsym_to_demangled_hash_table (struct minimal_symbol *sym,
  						struct minimal_symbol **table);
  
! /* Compute a hash code based using the same criteria as `strcmp_iw'. 
!    Based on FNV-1 hash used in bcache.c.
  
!    Note, we used to have a loop that, with the new hashing algorithm, would have looked like:
!    
!   unsigned int hash = 0x811c9dc5;
    while (*string && *string != '(')
      {
        while (isspace (*string))
  	++string;
        if (*string && *string != '(')
! 	{
! 	  hash *= 16777619;
! 	  hash ^= *string;
! 	}
        ++string;
+     }
+   return hash;
+ }
+ 
+ This loop is over twice as slow as the loop used in the function.
+ 
+ The other possibility would have been
+ 
+   unsigned int hash = 0x811c9dc5;
+   for (*string; *string && *string != '('; ++string)
+   {
+ 	  if (isspace (*string))
+ 	     continue;
+ 	  hash *= 16777619;
+ 	  hash ^= *string;
+   }
+   return hash;
+ 
+ This loop is 1.5 times as slow as below.
+ 
+ All three were hand verified to give the same hash values for symbols
+ with/without spaces in them, ending/not ending in parens, etc.
+ 
+ These timings are consistent on a 500mhz G3 PPC, a 266mhz PII, and a
+ 500mhz PIII, and a 300mhz Ultrasparc IIi, using gcc 2.95.3 on all of
+ them, so AFAIK, it's because the compiler can better optimize the loop
+ below , not because of the  platform I tested on.
+ 
+ Since this function now almost completely dominates symbol lookup
+ time, please be careful in changing it.
+ 
+ */
+ 
+ unsigned int
+ msymbol_hash_iw (const char *string)
+ {
+   unsigned int hash = 0x811c9dc5;
+   unsigned int c;
+   unsigned int len = strlen(string);
+   for (c=0; c < len && string[c] != '('; c++)
+     {
+       if (isspace (string[c]))
+ 	continue;
+       hash *= 16777619;
+       hash ^= string[c];
      }
!   return hash;
  }
  
  /* Compute a hash code for a string.  */
*************** msymbol_hash_iw (const char *string)
*** 107,116 ****
  unsigned int
  msymbol_hash (const char *string)
  {
!   unsigned int hash = 0;
!   for (; *string; ++string)
!     hash = (31 * hash) + *string;
!   return hash % MINIMAL_SYMBOL_HASH_SIZE;
  }
  
  /* Add the minimal symbol SYM to an objfile's minsym hash table, TABLE.  */
--- 155,161 ----
  unsigned int
  msymbol_hash (const char *string)
  {
!   return hash((char *)string, strlen(string));
  }
  
  /* Add the minimal symbol SYM to an objfile's minsym hash table, TABLE.  */
*************** add_minsym_to_hash_table (struct minimal
*** 120,126 ****
  {
    if (sym->hash_next == NULL)
      {
!       unsigned int hash = msymbol_hash (SYMBOL_NAME (sym));
        sym->hash_next = table[hash];
        table[hash] = sym;
      }
--- 165,171 ----
  {
    if (sym->hash_next == NULL)
      {
!       unsigned int hash = msymbol_hash (SYMBOL_NAME (sym)) % MINIMAL_SYMBOL_HASH_SIZE;
        sym->hash_next = table[hash];
        table[hash] = sym;
      }
*************** add_minsym_to_demangled_hash_table (stru
*** 134,140 ****
  {
    if (sym->demangled_hash_next == NULL)
      {
!       unsigned int hash = msymbol_hash_iw (SYMBOL_DEMANGLED_NAME (sym));
        sym->demangled_hash_next = table[hash];
        table[hash] = sym;
      }
--- 179,185 ----
  {
    if (sym->demangled_hash_next == NULL)
      {
!       unsigned int hash = msymbol_hash_iw (SYMBOL_DEMANGLED_NAME (sym)) % MINIMAL_SYMBOL_HASH_SIZE;
        sym->demangled_hash_next = table[hash];
        table[hash] = sym;
      }
*************** lookup_minimal_symbol (register const ch
*** 162,169 ****
    struct minimal_symbol *found_file_symbol = NULL;
    struct minimal_symbol *trampoline_symbol = NULL;
  
!   unsigned int hash = msymbol_hash (name);
!   unsigned int dem_hash = msymbol_hash_iw (name);
  
  #ifdef SOFUN_ADDRESS_MAYBE_MISSING
    if (sfile != NULL)
--- 207,214 ----
    struct minimal_symbol *found_file_symbol = NULL;
    struct minimal_symbol *trampoline_symbol = NULL;
  
!   unsigned int hash = msymbol_hash (name) % MINIMAL_SYMBOL_HASH_SIZE;
!   unsigned int dem_hash = msymbol_hash_iw (name)  % MINIMAL_SYMBOL_HASH_SIZE;
  
  #ifdef SOFUN_ADDRESS_MAYBE_MISSING
    if (sfile != NULL)
Index: bcache.c
===================================================================
RCS file: /cvs/src/src/gdb/bcache.c,v
retrieving revision 1.6
diff -c -3 -p -w -B -b -r1.6 bcache.c
*** bcache.c	2001/03/06 08:21:05	1.6
--- bcache.c	2001/05/31 20:48:10
***************
*** 28,35 ****
  #include "bcache.h"
  #include "gdb_string.h"		/* For memcpy declaration */
  
! /* The old hash function was stolen from SDBM. This is what DB 3.0 uses now,
   * and is better than the old one. 
   */
  
  unsigned long
--- 28,36 ----
  #include "bcache.h"
  #include "gdb_string.h"		/* For memcpy declaration */
  
! /* The old hash function was stolen from SDBM. This is what DB 3.1 uses now,
   * and is better than the old one. 
+  * It's the FNV hash.
   */
  
  unsigned long
*************** hash(void *addr, int length)
*** 40,46 ****
  		
  		k = (const unsigned char *)addr;
  		e = k+length;
! 		for (h=0; k< e;++k)
  		{
  				h *=16777619;
  				h ^= *k;
--- 41,47 ----
    
    k = (const unsigned char *)addr;
    e = k+length;
!   for (h=0x811c9dc5; k< e;++k)
      {
        h *=16777619;
        h ^= *k;
*************** hash(void *addr, int length)
*** 48,53 ****
--- 49,55 ----
    return (h);
  }
  
+ 
  /* Growing the bcache's hash table.  */
  
  /* If the average chain length grows beyond this, then we want to

-- 
"If you saw a heat wave, would you wave back?
"-Steven Wright


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]