This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: PATCH: PR ld/3111: LD very slow linking object files containing dwarf2 symbols


On Tue, Aug 29, 2006 at 04:00:12PM -0700, H. J. Lu wrote:
> When we are comparing symbols in a section against another section in
> a different file, we swap in 2 symbol tables, sort them and free the
> symbol tables for each section. It is a O^2 problem. When there are
> many sections and symbols in a file, it can be very slow. This patch
> caches the symbol table unless --reduce-memory-overheads is used. The
> results for the testcase are
> 
> /usr/bin/time ./ld -shared -o libfoo.so b2dynamic_nonlinear_solver.o b2dynamic_nonlinear_utile.o b2static_nonlinear_utile.o b2static_nonlinear_solver.o b2free_vibration_solver.o b2frequency_dependent_free_vibration_solver.o b2linearized_prebuckling_solver.o
> 1.06user 0.13system 0:01.20elapsed 99%CPU
> /usr/bin/time ./ld --reduce-memory-overheads -shared -o libbar.so b2dynamic_nonlinear_solver.o b2dynamic_nonlinear_utile.o b2static_nonlinear_utile.o b2static_nonlinear_solver.o b2free_vibration_solver.o b2frequency_dependent_free_vibration_solver.o b2linearized_prebuckling_solver.o
> 139.24user 8.93system 2:28.19elapsed 99%CPU
> 
> It is about 130X speed up. The only concern I have is the memory
> overhead. It seems small for the testcase.

I was debugging this too on older (FC6) binutils for
http://bugzilla.redhat.com/bugzilla/223181
(initially thought the main problem is caching, but as backport of
http://sources.redhat.com/ml/binutils/2006-11/msg00190.html
didn't help, I looked at why is bfd_elf_match_symbols_in_sections
so slow) and came up with a very different patch.

What the routine used to do is:
1) swap in both symbol tables
2) qsort both of them, with huge entry size (32 bytes on 64-bit host)
   such that SHN_UNDEF symbols come last, other symbols are sorted
   based on st_shndx and when st_shndx is equal based on binding
   (this was actually the most expensive part, on the RH#223181
   testcase oprofile said 55.98% of total time spent in mempcpy,
   19.26% in msort_with_tmp, 10.12% in memcpy and 4.23% in
   elf_sort_elf_symbol, all of which is from this qsort)
3) then we scan the sorted array and find the consecutive
   region in it which has st_shndx equal to the section,
   count how many symbols there are
4) if both sections have the same number of such symbols,
   create symtable{1,2} arrays and for symbols with st_shndx
   equal to shndx{1,2} store in the array a pointer to the
   isym and name pointer
5) qsort symtable{1,2} arrays based solely on symbol name
6) compare the symbol names, st_info, st_other

Now, unless I grossly misunderstood this, we are doing step 2)
only to select isyms with isym->st_shndx == shndx{1,2} and count them
(we don't care about SHN_UNDEF symbols, those don't have isym->st_shndx
equal to shndx{1,2}, we don't care about the ST_BIND sorting, because
we shortly after that re-sort the array based on name.
But, to select isyms with isym->st_shndx == shndx{1,2} we don't need
to qsort at all, that can be done linearly.

So the patch below (against 2.17.50.0.6) removes the expensive 2) step
and step 3) at the expense of allocating bigger temporary symtable{1,2} arrays
(previously they were just 2 * sizeof (void *) * count{1,2}, with
this patch they are * symcount{1,2}).  But those are freed before
the function returns and it pays off a lot.  I'm getting similar
speedup as you are getting with your patch.

Now, your patch instead caches when not reduce_memory_overheads
steps 1) and 2), which is quite memory hungry (some sections have
huge number of (mostly SHN_UNDEF or other section) symbols).

I wonder if we can't combine the benefits of both approaches.

--- bfd/elf.c.jj	2006-10-20 20:50:57.000000000 +0200
+++ bfd/elf.c	2007-01-19 14:54:04.000000000 +0100
@@ -8623,33 +8623,6 @@ _bfd_elf_get_synthetic_symtab (bfd *abfd
   return n;
 }
 
-/* Sort symbol by binding and section. We want to put definitions
-   sorted by section at the beginning.  */
-
-static int
-elf_sort_elf_symbol (const void *arg1, const void *arg2)
-{
-  const Elf_Internal_Sym *s1;
-  const Elf_Internal_Sym *s2;
-  int shndx;
-
-  /* Make sure that undefined symbols are at the end.  */
-  s1 = (const Elf_Internal_Sym *) arg1;
-  if (s1->st_shndx == SHN_UNDEF)
-    return 1;
-  s2 = (const Elf_Internal_Sym *) arg2;
-  if (s2->st_shndx == SHN_UNDEF)
-    return -1;
-
-  /* Sorted by section index.  */
-  shndx = s1->st_shndx - s2->st_shndx;
-  if (shndx != 0)
-    return shndx;
-
-  /* Sorted by binding.  */
-  return ELF_ST_BIND (s1->st_info)  - ELF_ST_BIND (s2->st_info);
-}
-
 struct elf_symbol
 {
   Elf_Internal_Sym *sym;
@@ -8675,9 +8648,8 @@ bfd_elf_match_symbols_in_sections (asect
   Elf_Internal_Shdr *hdr1, *hdr2;
   bfd_size_type symcount1, symcount2;
   Elf_Internal_Sym *isymbuf1, *isymbuf2;
-  Elf_Internal_Sym *isymstart1 = NULL, *isymstart2 = NULL, *isym;
-  Elf_Internal_Sym *isymend;
-  struct elf_symbol *symp, *symtable1 = NULL, *symtable2 = NULL;
+  Elf_Internal_Sym *isym, *isymend;
+  struct elf_symbol *symtable1 = NULL, *symtable2 = NULL;
   bfd_size_type count1, count2, i;
   int shndx1, shndx2;
   bfd_boolean result;
@@ -8733,75 +8705,36 @@ bfd_elf_match_symbols_in_sections (asect
   if (isymbuf1 == NULL || isymbuf2 == NULL)
     goto done;
 
-  /* Sort symbols by binding and section. Global definitions are at
-     the beginning.  */
-  qsort (isymbuf1, symcount1, sizeof (Elf_Internal_Sym),
-	 elf_sort_elf_symbol);
-  qsort (isymbuf2, symcount2, sizeof (Elf_Internal_Sym),
-	 elf_sort_elf_symbol);
+  symtable1 = bfd_malloc2 (symcount1, sizeof (struct elf_symbol));
+  symtable2 = bfd_malloc2 (symcount2, sizeof (struct elf_symbol));
 
-  /* Count definitions in the section.  */
-  count1 = 0;
-  for (isym = isymbuf1, isymend = isym + symcount1;
-       isym < isymend; isym++)
-    {
-      if (isym->st_shndx == (unsigned int) shndx1)
-	{
-	  if (count1 == 0)
-	    isymstart1 = isym;
-	  count1++;
-	}
+  if (symtable1 == NULL || symtable2 == NULL)
+    goto done;
 
-      if (count1 && isym->st_shndx != (unsigned int) shndx1)
-	break;
-    }
+  /* Choose symbol definitions in the section.  */
+  count1 = 0;
+  for (isym = isymbuf1, isymend = isym + symcount1; isym < isymend; isym++)
+    if (isym->st_shndx == (unsigned int) shndx1)
+      symtable1[count1++].sym = isym;
 
   count2 = 0;
-  for (isym = isymbuf2, isymend = isym + symcount2;
-       isym < isymend; isym++)
-    {
-      if (isym->st_shndx == (unsigned int) shndx2)
-	{
-	  if (count2 == 0)
-	    isymstart2 = isym;
-	  count2++;
-	}
-
-      if (count2 && isym->st_shndx != (unsigned int) shndx2)
-	break;
-    }
+  for (isym = isymbuf2, isymend = isym + symcount2; isym < isymend; isym++)
+    if (isym->st_shndx == (unsigned int) shndx2)
+      symtable2[count2++].sym = isym;
 
   if (count1 == 0 || count2 == 0 || count1 != count2)
     goto done;
 
-  symtable1 = bfd_malloc (count1 * sizeof (struct elf_symbol));
-  symtable2 = bfd_malloc (count1 * sizeof (struct elf_symbol));
-
-  if (symtable1 == NULL || symtable2 == NULL)
-    goto done;
+  for (i = 0; i < count1; i++)
+    symtable1[i].name
+      = bfd_elf_string_from_elf_section (bfd1, hdr1->sh_link,
+					 symtable1[i].sym->st_name);
+
+  for (i = 0; i < count2; i++)
+    symtable2[i].name
+      = bfd_elf_string_from_elf_section (bfd2, hdr2->sh_link,
+					 symtable2[i].sym->st_name);
 
-  symp = symtable1;
-  for (isym = isymstart1, isymend = isym + count1;
-       isym < isymend; isym++)
-    {
-      symp->sym = isym;
-      symp->name = bfd_elf_string_from_elf_section (bfd1,
-						    hdr1->sh_link,
-						    isym->st_name);
-      symp++;
-    }
- 
-  symp = symtable2;
-  for (isym = isymstart2, isymend = isym + count1;
-       isym < isymend; isym++)
-    {
-      symp->sym = isym;
-      symp->name = bfd_elf_string_from_elf_section (bfd2,
-						    hdr2->sh_link,
-						    isym->st_name);
-      symp++;
-    }
-  
   /* Sort symbol by name.  */
   qsort (symtable1, count1, sizeof (struct elf_symbol),
 	 elf_sym_name_compare);


	Jakub


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]