This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFA] Kill some linear searches in minsyms.c


Daniel Jacobowitz writes:
 > So uh, yeah, and stuff.  I found an interesting one today when I wasn't even
 > looking for it.  Earlier today I posted an updated patch to make it easy,
 > very easy, to profile GDB.  Here's a pretty good sample of why I think this
 > is important.  I was actually looking for something completely different,
 > but I noticed this on the way.
 > 
 > Consider this patch:
 > 
 > 2000-03-06  Jim Blandy  <jimb@redhat.com>
 > 
 >         From Tom Tromey <tromey@cygnus.com> and Keith Seitz <?>:
 > 
 >         * minsyms.c: #include <ctype.h>, for msymbol_hash_iw.
 >         (compact_minimal_symbols): Added `objfile' argument.
 >         Put symbols in the objfile's hash table.
 >         (install_minimal_symbols): Put symbols in the objfile's demangled
 >         hash table.
 >         (lookup_minimal_symbol): Use hash table to find symbol in
 >         objfile.
 >         (msymbol_hash_iw, msymbol_hash, add_minsym_to_hash_table): New
 >         functions.
 >         (prim_record_minimal_symbol_and_info): Initialize the
 >         hash link fields of the new minimal symbol.
 >         * symtab.h (struct minimal_symbol): New fields `hash_next',
 >         `demangled_hash_next'.
 >         (msymbol_hash_iw, msymbol_hash, add_minsym_to_hash_table): Declare.
 >         * objfiles.h (MINIMAL_SYMBOL_HASH_SIZE): New define.
 >         (struct objfile): New fields `msymbol_hash',
 >         `msymbol_demangled_hash'.
 > 
 > These replaced a linear search with a hash table.  We'd better use it, too,
 > since a good-sized application has an awful lot of minimal symbols. 
 > However, two other related functions (that's lookup_minimal_symbol_text and
 > lookup_minimal_symbol_solib_trampoline) were not converted.  These get
 > called reasonably often, vis:
 > 

What command gdb executing at this point?

 >                 0.75    0.43      62/310         create_overlay_event_breakpoint [16]
 >                 3.01    1.72     248/310         create_longjmp_breakpoint [5]
 > [4]     48.1    3.76    2.16     310         lookup_minimal_symbol_text [4]
 > 
 > That "3.76" is in _seconds_, by the way.  Then they proceed to do:
 >                 1.26    0.00 9567591/9567734     strcmp_iw [15]
 >                 0.90    0.00 27551335/30742029     symbol_demangled_name [17]
 > 
 > 27.5 million calls to a wrapper function.  My first instinct was that a
 > wrapper function was a bad idea, then.  My second instinct was to smack my
 > first instinct upside the head and find out why we were calling it so many
 > times, since I knew we hashed minsyms.  Converting them to the hash table
 > once I found the problem was quite easy.  The patch is attached; is it OK to
 > commit?  No regressions on i386-pc-linux-gnu, for what that's worth, and
 > it's quite straightforward.
 > 

how do the numbers for those functions look after your patch?

 > 
 > 
 > Future things to examine:
 > 
 > Before, GDB's CPU time to load symbols for all of mozilla, get through a
 > large number of threaded shared library events (ow, but I'm glad the pread64
 > patch is in, or this measurement would have been just impossible to sort out
 > of the noise), and reach the first dialog box was 26.630000 seconds.  After
 > the attached patch, 17.820000 seconds.  Much better.  The new hot spots are:
 > 
 >  21.32      1.42     1.42  1594586     0.00     0.00  hash
 >  10.66      2.13     0.71   185486     0.00     0.00  find_corresponding_bincl_psymtab
 >   8.11      2.67     0.54  1128766     0.00     0.00  bcache
 >   6.76      3.12     0.45       41    10.98    91.28  read_dbx_symtab
 >   6.31      3.54     0.42     1519     0.28     0.30  end_psymtab
 > 
 > I suspect we can cut find_corresponding_bincl_psymtab out of the way too;
 > and maybe the bcache needs some rethinking into a more specialized data
 > structure since we're calling it so often; but maybe there's nothing that
 > can be done about that (this test has a terrifyingly large number of symbols
 > in it).  There's 3.3 million calls to mmalloc, too - a million of them are
 > from a call to save_inferior_ptid in thread_db_xfer_memory.  And
 > symbol_demangled_name is still on the top ten, via compare_psymbols.  That's
 > another good candidate for a clever data structure if it becomes worth the
 > time, which it isn't in this test.
 > 
 > -- 
 > Daniel Jacobowitz
 > MontaVista Software                         Debian GNU/Linux Developer
 > 
 > 2003-01-05  Daniel Jacobowitz  <drow@mvista.com>
 > 
 > 	* minsyms.c (enum ms_search_which): New.
 > 	(lookup_minimal_symbol_internal): New function, broken out from
 > 	lookup_minimal_symbol.
 > 	(lookup_minimal_symbol, lookup_minimal_symbol_text)
 > 	(lookup_minimal_symbol_solib_trampoline): Use them.
                                                      ^^^^ 
                                     lookup_minimal_symbol_internal???

I don't like the _internal part, could you call it '_aux' instead?
This seems more in line with the rest of gdb's names. Hmm, but maybe
we already have a function by that name.  I also winder how much work
would it be to eventually remove these new functions, that now become
wrappers.


 > 
 > Index: minsyms.c
 > ===================================================================
 > RCS file: /cvs/src/src/gdb/minsyms.c,v
 > retrieving revision 1.22
 > diff -u -p -r1.22 minsyms.c
 > --- minsyms.c	11 Jul 2002 20:46:19 -0000	1.22
 > +++ minsyms.c	6 Jan 2003 04:19:16 -0000
 > @@ -1,5 +1,6 @@
 >  /* GDB routines for manipulating the minimal symbol tables.
 > -   Copyright 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001
 > +   Copyright 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001,
 > +   2002, 2003
 >     Free Software Foundation, Inc.
 >     Contributed by Cygnus Support, using pieces from other GDB modules.
 >  
 > @@ -132,11 +133,18 @@ add_minsym_to_demangled_hash_table (stru
 >      }
 >  }
 >  
 > +enum ms_search_which {
 > +  ms_search_all,
 > +  ms_search_text,
 > +  ms_search_trampoline
 > +};
 >  
 >  /* Look through all the current minimal symbol tables and find the
 >     first minimal symbol that matches NAME.  If OBJF is non-NULL, limit
 > -   the search to that objfile.  If SFILE is non-NULL, limit the search
 > -   to that source file.  Returns a pointer to the minimal symbol that
 > +   the search to that objfile.  If SFILE is non-NULL, only find file-scope
 > +   symbols from that source file (global symbols are still preferred).
 > +   WHICH indicates which minimal symbols to accept: all, text symbols only,
 > +   or trampolines only.  Returns a pointer to the minimal symbol that
 >     matches, or NULL if no match is found.
 >  
 >     Note:  One instance where there may be duplicate minimal symbols with
 > @@ -144,9 +152,10 @@ add_minsym_to_demangled_hash_table (stru
 >     symbol tables for an executable contain global symbols with the same
 >     names (the dynamic linker deals with the duplication). */
 >  
 > -struct minimal_symbol *
 > -lookup_minimal_symbol (register const char *name, const char *sfile,
 > -		       struct objfile *objf)
 > +static struct minimal_symbol *
 > +lookup_minimal_symbol_internal (register const char *name, const char *sfile,
 > +				struct objfile *objf,
 > +				enum ms_search_which which)
 >  {
 >    struct objfile *objfile;
 >    struct minimal_symbol *msymbol;
 > @@ -170,68 +179,77 @@ lookup_minimal_symbol (register const ch
 >         objfile != NULL && found_symbol == NULL;
 >         objfile = objfile->next)
 >      {
 > -      if (objf == NULL || objf == objfile)
 > +      int pass;
 > +      if (objf && objf != objfile)
 > +	continue;
 > +
 > +      /* Do two passes: the first over the ordinary hash table,
 > +	 and the second over the demangled hash table.  */
 > +      for (pass = 1; pass <= 2 && found_symbol == NULL; pass++)
 >  	{
 > -	  /* Do two passes: the first over the ordinary hash table,
 > -	     and the second over the demangled hash table.  */
 > -        int pass;
 > -
 > -        for (pass = 1; pass <= 2 && found_symbol == NULL; pass++)
 > +	  /* Select hash list according to pass.  */
 > +	  if (pass == 1)
 > +	    msymbol = objfile->msymbol_hash[hash];
 > +	  else
 > +	    msymbol = objfile->msymbol_demangled_hash[dem_hash];
 > +
 > +	  for (; msymbol != NULL && found_symbol == NULL;
 > +	       msymbol = (pass == 1)
 > +		 ? msymbol->hash_next
 > +		 : msymbol->demangled_hash_next)

please, don't do this. This is much worse than the previous while, with the 
updates at the end.

 >  	    {
 > -            /* Select hash list according to pass.  */
 > -            if (pass == 1)
 > -              msymbol = objfile->msymbol_hash[hash];
 > -            else
 > -              msymbol = objfile->msymbol_demangled_hash[dem_hash];
 > +	      if (!SYMBOL_MATCHES_NAME (msymbol, name))
 > +		continue;
 >  
 > -            while (msymbol != NULL && found_symbol == NULL)
 > +	      if (which == ms_search_trampoline)
 >  		{
 > -                if (SYMBOL_MATCHES_NAME (msymbol, name))
 > -		    {
 > -                    switch (MSYMBOL_TYPE (msymbol))
 > -                      {
 > -                      case mst_file_text:
 > -                      case mst_file_data:
 > -                      case mst_file_bss:
 > +		  if (MSYMBOL_TYPE (msymbol) == mst_solib_trampoline)
 > +		    return msymbol;
 > +		  continue;
 > +		}
 > +
 > +	      if (which == ms_search_text
 > +		  && MSYMBOL_TYPE (msymbol) != mst_file_text
 > +		  && MSYMBOL_TYPE (msymbol) != mst_text)
 > +		continue;
 > +

I don't like the fact that the check for ms_search_all is buried
inside the switch statement, while the other 2 are here. Could you
have an outer switch for the ms_search_*? Also with your changes will
the internal switch case mst_solib_trampoline ever be reached? It
seems like we would find it earlier and return right away.
The whole flow of control of this function seems a bit confused.

 > +	      switch (MSYMBOL_TYPE (msymbol))
 > +		{
 > +		case mst_file_data:
 > +		case mst_file_bss:
 > +		case mst_file_text:
 >  #ifdef SOFUN_ADDRESS_MAYBE_MISSING
 > -                        if (sfile == NULL || STREQ (msymbol->filename, sfile))
 > -                          found_file_symbol = msymbol;
 > +		  if (sfile == NULL || STREQ (msymbol->filename, sfile))
 > +		    found_file_symbol = msymbol;
 >  #else
 > -                        /* We have neither the ability nor the need to
 > -                           deal with the SFILE parameter.  If we find
 > -                           more than one symbol, just return the latest
 > -                           one (the user can't expect useful behavior in
 > -                           that case).  */
 > -                        found_file_symbol = msymbol;
 > +		  /* We have neither the ability nor the need to deal
 > +		     with the SFILE parameter.  If we find more than
 > +		     one symbol, just return the latest one (the user
 > +		     can't expect useful behavior in that case).  */
 > +		  found_file_symbol = msymbol;
 >  #endif
 > -                        break;
 > +		  break;
 >  
 > -                      case mst_solib_trampoline:
 > +		case mst_solib_trampoline:
 >  
 > -                        /* If a trampoline symbol is found, we prefer to
 > -                           keep looking for the *real* symbol. If the
 > -                           actual symbol is not found, then we'll use the
 > -                           trampoline entry. */
 > -                        if (trampoline_symbol == NULL)
 > -                          trampoline_symbol = msymbol;
 > -                        break;
 > -
 > -                      case mst_unknown:
 > -                      default:
 > -                        found_symbol = msymbol;
 > -                        break;
 > -                      }
 > -		    }
 > -
 > -                /* Find the next symbol on the hash chain.  */
 > -                if (pass == 1)
 > -                  msymbol = msymbol->hash_next;
 > -                else
 > -                  msymbol = msymbol->demangled_hash_next;
 > +		  /* If a trampoline symbol is found, we prefer to
 > +		     keep looking for the *real* symbol. If the actual
 > +		     symbol is not found, then we'll use the
 > +		     trampoline entry. */
 > +		  if (trampoline_symbol == NULL)
 > +		    trampoline_symbol = msymbol;
 > +		  break;
 > +
 > +		case mst_unknown:
 > +		default:
 > +		  if (which == ms_search_all)
 > +		    found_symbol = msymbol;
 > +		  break;
 >  		}
 >  	    }
 >  	}
 >      }
 > +
 >    /* External symbols are best.  */
 >    if (found_symbol)
 >      return found_symbol;
 > @@ -247,127 +265,28 @@ lookup_minimal_symbol (register const ch
 >    return NULL;
 >  }
 >  
 > -/* Look through all the current minimal symbol tables and find the
 > -   first minimal symbol that matches NAME and of text type.  
 > -   If OBJF is non-NULL, limit
 > -   the search to that objfile.  If SFILE is non-NULL, limit the search
 > -   to that source file.  Returns a pointer to the minimal symbol that
 > -   matches, or NULL if no match is found.
 > - */

Please, don't deleted the comments, they are still relevant, in a sense.
I mean, the function still searches for the NAME symbol of text type.


Do we have a comment for this function?

 > +struct minimal_symbol *
 > +lookup_minimal_symbol (register const char *name, const char *sfile,
 > +		       struct objfile *objf)
 > +{
 > +  return lookup_minimal_symbol_internal (name, sfile, objf, ms_search_all);
 > +}
 >  
 >  struct minimal_symbol *
 >  lookup_minimal_symbol_text (register const char *name, const char *sfile,
 >  			    struct objfile *objf)
 >  {
 > -  struct objfile *objfile;
 > -  struct minimal_symbol *msymbol;
 > -  struct minimal_symbol *found_symbol = NULL;
 > -  struct minimal_symbol *found_file_symbol = NULL;
 > -
 > -#ifdef SOFUN_ADDRESS_MAYBE_MISSING
 > -  if (sfile != NULL)
 > -    {
 > -      char *p = strrchr (sfile, '/');
 > -      if (p != NULL)
 > -	sfile = p + 1;
 > -    }
 > -#endif
 > -
 > -  for (objfile = object_files;
 > -       objfile != NULL && found_symbol == NULL;
 > -       objfile = objfile->next)
 > -    {
 > -      if (objf == NULL || objf == objfile)
 > -	{
 > -	  for (msymbol = objfile->msymbols;
 > -	       msymbol != NULL && SYMBOL_NAME (msymbol) != NULL &&
 > -	       found_symbol == NULL;
 > -	       msymbol++)
 > -	    {
 > -	      if (SYMBOL_MATCHES_NAME (msymbol, name) &&
 > -		  (MSYMBOL_TYPE (msymbol) == mst_text ||
 > -		   MSYMBOL_TYPE (msymbol) == mst_file_text))
 > -		{
 > -		  switch (MSYMBOL_TYPE (msymbol))
 > -		    {
 > -		    case mst_file_text:
 > -#ifdef SOFUN_ADDRESS_MAYBE_MISSING
 > -		      if (sfile == NULL || STREQ (msymbol->filename, sfile))
 > -			found_file_symbol = msymbol;
 > -#else
 > -		      /* We have neither the ability nor the need to
 > -		         deal with the SFILE parameter.  If we find
 > -		         more than one symbol, just return the latest
 > -		         one (the user can't expect useful behavior in
 > -		         that case).  */
 > -		      found_file_symbol = msymbol;
 > -#endif
 > -		      break;
 > -		    default:
 > -		      found_symbol = msymbol;
 > -		      break;
 > -		    }
 > -		}
 > -	    }
 > -	}
 > -    }
 > -  /* External symbols are best.  */
 > -  if (found_symbol)
 > -    return found_symbol;
 > -
 > -  /* File-local symbols are next best.  */
 > -  if (found_file_symbol)
 > -    return found_file_symbol;
 > -
 > -  return NULL;
 > +  return lookup_minimal_symbol_internal (name, sfile, objf, ms_search_text);
 >  }
 >  
 > -/* Look through all the current minimal symbol tables and find the
 > -   first minimal symbol that matches NAME and of solib trampoline type.  
 > -   If OBJF is non-NULL, limit
 > -   the search to that objfile.  If SFILE is non-NULL, limit the search
 > -   to that source file.  Returns a pointer to the minimal symbol that
 > -   matches, or NULL if no match is found.
 > - */
 > -

Same here, leave the comment (updated of course).

 >  struct minimal_symbol *
 >  lookup_minimal_symbol_solib_trampoline (register const char *name,
 > -					const char *sfile, struct objfile *objf)
 > +					const char *sfile,
 > +					struct objfile *objf)
 >  {
 > -  struct objfile *objfile;
 > -  struct minimal_symbol *msymbol;
 > -  struct minimal_symbol *found_symbol = NULL;
 > -
 > -#ifdef SOFUN_ADDRESS_MAYBE_MISSING
 > -  if (sfile != NULL)
 > -    {
 > -      char *p = strrchr (sfile, '/');
 > -      if (p != NULL)
 > -	sfile = p + 1;
 > -    }
 > -#endif
 > -
 > -  for (objfile = object_files;
 > -       objfile != NULL && found_symbol == NULL;
 > -       objfile = objfile->next)
 > -    {
 > -      if (objf == NULL || objf == objfile)
 > -	{
 > -	  for (msymbol = objfile->msymbols;
 > -	       msymbol != NULL && SYMBOL_NAME (msymbol) != NULL &&
 > -	       found_symbol == NULL;
 > -	       msymbol++)
 > -	    {
 > -	      if (SYMBOL_MATCHES_NAME (msymbol, name) &&
 > -		  MSYMBOL_TYPE (msymbol) == mst_solib_trampoline)
 > -		return msymbol;
 > -	    }
 > -	}
 > -    }
 > -
 > -  return NULL;
 > +  return lookup_minimal_symbol_internal (name, sfile, objf,
 > +					 ms_search_trampoline);
 >  }
 > -
 >  
 >  /* Search through the minimal symbol table for each objfile and find
 >     the symbol whose address is the largest address that is still less


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]