This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][PR ld/10144] MIPS/BFD: Don't make debug section relocs dynamic


"Maciej W. Rozycki" <macro@codesourcery.com> writes:
>  There is a bug in BFD for MIPS ELF targets that affects symbol processing 
> by the linker.  The prerequisite is a pair objects being linked together 
> -- a shared object and a relocatable object -- for the purpose of creating 
> an executable.  They have to multiply-define a symbol, which has to be a 
> regular definition in the shared object and a common one in the 
> relocatable object.  Additionally there has to be a relocation from an 
> unallocatable (e.g. debug) section referring to the symbol in the 
> relocatable object.
>
>  If a symbol is multiply-defined like this, then the regular definition 
> should override the common one and the latter should be converted to an 
> undefined reference.  The relocation, if from a debug section, can be 
> discarded and resolve to nil as the symbol debug records describe is no 
> longer there; this is what other targets do and also what the MIPS target 
> did before non-PIC support was added.  If not from a debug section, then 
> other targets complain and fail, because a dynamic relocation cannot work 
> from an unallocatable section.
>
>  What the MIPS target does now is as follows:
>
> 1. If copy relocations are enabled, then the common definition is retained 
>    in the relocatable object and the relocation is statically resolved 
>    against it.  This invalid symbol definition may be discarded by the 
>    dynamic linker, covering the bug; I have not checked it.
>
> 2. If copy relocations are disabled, then the static link fails, 
>    complaining about the inability to convert the relocation to a dynamic 
>    one.
>
>  Here is a fix that works for me.  Following the practice from other 
> targets I decided to explicitly check for the SEC_DEBUGGING flag (rather 
> than SEC_ALLOC).  This makes our code work for the two cases mentioned 
> above as proved by the test case below.
>
>  The case of a non-debug non-alloc section is not covered; this is a rare 
> corner case and I haven't looked into it (not that it shouldn't be fixed).

This is what worries me.  The first part of the comment, which I realise
you've taken from elsewhere:

> +	  /* Ignore relocs from SEC_DEBUGGING sections because such
> +	     sections are not SEC_ALLOC and thus ld.so will not process
> +	     them.  Don't set has_static_relocs for the corresponding
> +	     symbol.
> +
> +	     This is needed in cases such as a global symbol definition
> +	     in a shared library causing a common symbol from an object
> +	     file to be converted to an undefined reference.  If that
> +	     happens, then all the relocations against this symbol from
> +	     SEC_DEBUGGING sections in the object file will resolve to
> +	     nil.  */
> +	  if ((sec->flags & SEC_DEBUGGING) != 0)
> +	    break;
>  	  /* Fall through.  */

makes it sound like we're doing something that would be acceptable for
all !SEC_ALLOC sections, which then raises the question why we're only
testing SEC_DEBUGGING.  If I've understood correctly, we're really
saying something like "We know what debugging sections are for.  It's
OK to silently set external addresses to 0: at worst it will impair
the debugging information."  I.e. it's _not_ something we can safely
apply to all SEC_ALLOC sections.

There also seems to be the implicit judgement call: "If the only
reference to a symbol is via debugging information, it's not worth
creating a copy reloc for it."  That has nothing to do with ld.so;
it's a decision we can make entirely in the static linker.  And it's
a perfectly sensible call of course, and maybe the thinking was that
it was so obvious it didn't need to be stated, but still, the comment
seems to gloss over the rationale.

(E.g. with the simple case you're describing:

--------------------------------------------------------------
cat <<EOF >foo.c
int x = 1;
EOF
cat <<EOF >bar.c
int x;
int main (void)
{
  return 0;
}
EOF
cat <<EOF >commands
break main
run
print x
EOF
gcc foo.c -o foo.so -shared -g
gcc bar.c foo.so -o bar -g
LD_LIBRARY_PATH=. gdb --batch --command=commands bar
--------------------------------------------------------------

I get:

--------------------------------------------------------------
Breakpoint 1 at 0x4005c8: file bar.c, line 4.

Breakpoint 1, main () at bar.c:4
4         return 0;
/tmp/commands:3: Error in sourced command file:
Cannot access memory at address 0x0
--------------------------------------------------------------

on my x86_64 box.  With a copy reloc I'd be able to see "1" instead.)

All of which is fine with me.  And even if it wasn't, I wouldn't
suggest MIPS should be different from other targets.  It's just that
the comments we're using to justify the code seem a little lacking.

So with all that, the patch is OK, thanks.  However, I think you
could change:

	.ifdef	ELF64
	.8byte	foo
	.else
	.4byte	foo
	.endif

to:

	dc.a	foo

and avoid the ELF64 symbol.  If that fails for any target, I think we
could legitimately say it's a bug in the target and Not Your Fault.
The test certainly looks nicely generic.

Also, in the MIPS test:

+# Exclude non-ELF targets.
+if ![is_elf_format] {
+    return
+}
+
+# Exclude non-Linux targets; feel free to include your favourite one
+# if you like.
+if ![istarget mips*-*-linux*] {
+    return
+}

I think we can safely dispense with the first check.  I realise you're
doing it for consistency with the target-independent version,
but I wouldn't expect to have to test both of these elsewhere
in ld-mips-elf.

Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]