This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.
[commit] SPU i-cache: Correct calculation of number of required stubs (1/2)
- From: "Ulrich Weigand" <uweigand at de dot ibm dot com>
- To: binutils at sourceware dot org
- Date: Thu, 28 May 2009 12:56:49 +0200 (CEST)
- Subject: [commit] SPU i-cache: Correct calculation of number of required stubs (1/2)
Hello,
spu_elf_auto_overlay computes the number of stubs required per overlay /
i-cache section by counting the number of distinct outgoing function calls.
This is correct for the overlay case but incorrect for the i-cache case:
because of branch rewriting, each call site needs its own stub.
This patch uses the call_info->count field to properly account for multiple
calls to the same symbol. However, spu_elf_auto_overlay operates on
copies of the call_info structs, and those copies lost their proper counts
while going through copy_callee / insert_callee, so the patch fixes that as well.
Tested on spu-elf with no regressions.
Approved off-line by Alan Modra; committed to mainline.
Bye,
Ulrich
ChangeLog:

	* elf32-spu.c (insert_callee): Accumulate incoming callee->count.
	(mark_functions_via_relocs): Initialize callee->count to 1.
	(pasted_function): Likewise.
	(spu_elf_auto_overlay): Honor call counts when determining number
	of stubs required in software i-cache mode.

--- src/bfd/elf32-spu.c.orig 2009-05-26 21:23:13.000000000 +0200
+++ src/bfd/elf32-spu.c 2009-05-26 21:24:41.000000000 +0200
@@ -2588,7 +2589,7 @@ insert_callee (struct function_info *cal
p->fun->start = NULL;
p->fun->is_func = TRUE;
}
- p->count += 1;
+ p->count += callee->count;
/* Reorder list so most recent call is first. */
*pp = p->next;
p->next = caller->call_list;
@@ -2596,7 +2597,6 @@ insert_callee (struct function_info *cal
return FALSE;
}
callee->next = caller->call_list;
- callee->count += 1;
caller->call_list = callee;
return TRUE;
}
@@ -2786,7 +2786,7 @@ mark_functions_via_relocs (asection *sec
callee->is_tail = !is_call;
callee->is_pasted = FALSE;
callee->priority = priority;
- callee->count = 0;
+ callee->count = 1;
if (callee->fun->last_caller != sec)
{
callee->fun->last_caller = sec;
@@ -2878,7 +2878,7 @@ pasted_function (asection *sec)
callee->fun = fun;
callee->is_tail = TRUE;
callee->is_pasted = TRUE;
- callee->count = 0;
+ callee->count = 1;
if (!insert_callee (fun_start, callee))
free (callee);
return TRUE;
@@ -4434,14 +4439,18 @@ spu_elf_auto_overlay (struct bfd_link_in
for (call = dummy_caller.call_list; call; call = call->next)
{
unsigned int k;
+ unsigned int stub_delta = 1;
+
+ if (htab->params->ovly_flavour == ovly_soft_icache)
+ stub_delta = call->count;
+ num_stubs += stub_delta;
- ++num_stubs;
/* If the call is within this overlay, we won't need a
stub. */
for (k = base; k < i + 1; k++)
if (call->fun->sec == ovly_sections[2 * k])
{
- --num_stubs;
+ num_stubs -= stub_delta;
break;
}
}
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
Ulrich.Weigand@de.ibm.com