This is the mail archive of the mailing list for the binutils project.

Re: Fix string merging problems on mips*-linux-gnu

On Sat, Jan 10, 2004 at 08:32:31PM +0000, Richard Sandiford wrote:
> This patch fixes -O1 failures for execute/930429-1.c and
> execute/ptr-arith-1.c on mips*-linux-gnu.
> Suppose we have a string such as:
> .LC0: .asciz "x"
> and .LC0 is in a mergeable string section.  If we need a "one past
> the end" pointer to this string, we can generate an insn such as:
>         la      reg,.LC0+2
> or its explicit-reloc equivalent.  But the linker will then think
> that we're referring to the string after "x", not "x" itself.  There's
> no guarantee that the two strings will still be together in the merged
> section.
> The patch makes sure that all references of the form "string_constant +
> offset" refer to a character within the string.  Bootstrapped &
> regression tested on mips64{,el}-linux-gnu, fixes the two test cases.
> Also tested on mips64vrel-elf to verify the mips16 change.  Does it
> look OK?

The patch changes the wrong place.
This is the assembler's and linker's responsibility, not the compiler's.
Please revert it and fix binutils instead.
It works on other architectures, so this is likely a MIPS-specific bug.

cat > test1.c <<EOF
char *r = "bob";

int main (void)
{
  return 0;
}
EOF
cat > test2.c <<EOF
char *p = "a" + 2;
char *q = "b";
EOF
gcc -O2 -o test test1.c test2.c
objdump -s -j .rodata -j .data test; readelf -Ws test | grep '[pqr]$'

test:     file format elf32-i386

Contents of section .rodata:
 80483f0 03000000 01000200 626f6200 6100      ........bob.a.
Contents of section .data:
 8049404 00000000 00000000 f0940408 f8830408  ................
 8049414 fe830408 fa830408                    ........
    44: 08049414     4 OBJECT  GLOBAL DEFAULT   16 p
    51: 08049410     4 OBJECT  GLOBAL DEFAULT   16 r
    67: 08049418     4 OBJECT  GLOBAL DEFAULT   16 q

As you can see, p still points one past the '\0' in "a", q at "b" and r at "bob".
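
To double-check the dump by hand, here is a small Python sketch (not part
of the original test). The byte strings and addresses are copied from the
objdump/readelf output above; the helper names pointer_value and
c_string_at are made up for illustration.

```python
import struct

# Section contents and load addresses, copied from the objdump output above.
RODATA_BASE = 0x80483F0
RODATA = bytes.fromhex("03000000" "01000200" "626f6200" "6100")
DATA_BASE = 0x8049404
DATA = bytes.fromhex(
    "00000000" "00000000" "f0940408" "f8830408"
    "fe830408" "fa830408"
)

# Symbol addresses, copied from the readelf -Ws output above.
SYMBOLS = {"p": 0x08049414, "r": 0x08049410, "q": 0x08049418}


def pointer_value(sym):
    """Read the 32-bit little-endian pointer stored in .data at symbol sym."""
    return struct.unpack_from("<I", DATA, SYMBOLS[sym] - DATA_BASE)[0]


def c_string_at(addr):
    """Return the NUL-terminated string in .rodata starting at addr."""
    off = addr - RODATA_BASE
    return RODATA[off:RODATA.index(b"\0", off)].decode("ascii")


p, q, r = (pointer_value(s) for s in "pqr")
print("r ->", hex(r), repr(c_string_at(r)))
print("q ->", hex(q), repr(c_string_at(q)))
# p is a one-past-the-end pointer, so look at the string two bytes before it.
print("p - 2 ->", hex(p - 2), repr(c_string_at(p - 2)))
```

Decoded this way, r points at "bob", q points at the "b" tail inside
"bob", and p sits two bytes past the start of "a".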
This works because the assembler for:

        .section        .rodata.str1.1,"aMS",@progbits,1
.LC0:
        .string "a"
.LC1:
        .string "b"
        .long   .LC0+2
        .long   .LC1

does not convert the first relocation into a section-relative one,
which would lose the information about what it points to:

gcc -c test2.c -O2; readelf -Wr test2.o; objdump -s -j .data test2.o

Relocation section '.rel.data' at offset 0x348 contains 2 entries:
 Offset     Info    Type                Sym. Value  Symbol's Name
00000000  00000701 R_386_32               00000000   .LC0
00000004  00000501 R_386_32               00000000   .rodata.str1.1

test2.o:     file format elf32-i386

Contents of section .data:
 0000 02000000 02000000                    ........

i.e. the linker sees a .LC0 + 2 relocation and a .rodata.str1.1 + 2
relocation. Although the symbol .LC0 has the same value as the section
symbol .rodata.str1.1, the two relocations mean different things, and
the difference shows up after string merging.
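
A toy model may make the difference concrete. The Python sketch below is
not how ld implements merging; it is a simplified tail-merging pass that
assumes a section-relative relocation is resolved by finding the input
string that contains (section start + addend), while a symbol-relative
one first maps the symbol and only then adds the addend. All function
names are hypothetical.

```python
def split_strings(section):
    """Map each input offset to the NUL-terminated string starting there."""
    strings, off = {}, 0
    while off < len(section):
        end = section.index(b"\0", off) + 1
        strings[off] = section[off:end]
        off = end
    return strings


def merge(inputs):
    """Tail-merge all strings; return (merged bytes, {(file, in_off): out_off})."""
    unique = {s for sec in inputs.values() for s in split_strings(sec).values()}
    merged, where = b"", {}
    # Longest first, so that shorter strings can reuse the tail of longer ones.
    for s in sorted(unique, key=lambda s: (-len(s), s)):
        pos = merged.find(s)
        if pos < 0:
            pos = len(merged)
            merged += s
        where[s] = pos
    layout = {(name, off): where[s]
              for name, sec in inputs.items()
              for off, s in split_strings(sec).items()}
    return merged, layout


def resolve_symbol(layout, name, sym_off, addend):
    """Symbol kept: map the symbol's own string, then apply the addend."""
    return layout[(name, sym_off)] + addend


def resolve_section(inputs, layout, name, addend):
    """Section-relative: only the addend says which string is meant."""
    for start, s in split_strings(inputs[name]).items():
        if start <= addend < start + len(s):
            return layout[(name, start)] + (addend - start)
    raise ValueError("addend outside section")


# Mergeable string sections of the two objects from the test above.
INPUTS = {"test1.o": b"bob\0", "test2.o": b"a\0b\0"}
MERGED, LAYOUT = merge(INPUTS)
p_kept = resolve_symbol(LAYOUT, "test2.o", 0, 2)           # .LC0 + 2
q_sect = resolve_section(INPUTS, LAYOUT, "test2.o", 2)     # .rodata.str1.1 + 2
print(MERGED, p_kept, q_sect)
```

In this model .LC0 + 2 lands at merged offset 6, one past the '\0' of
"a", while .rodata.str1.1 + 2 lands at offset 2, the "b" inside "bob".
Had the first relocation also been made section-relative, both would
resolve to offset 2.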

The rule gas uses is: if the addend is 0, the relocation can be changed
into a section-relative one (this is important: if no relocations
against mergeable sections were changed into section-relative ones,
objects and .a libraries with DWARF2 debug info would be huge);
otherwise (which is relatively rare) the symbol is kept even if it is local.
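
Written out as code, the rule looks roughly like this. It is only a
sketch of the rule as stated above, in Python, and not gas's actual
implementation; Reloc and emit_reloc are hypothetical names.

```python
from typing import NamedTuple


class Reloc(NamedTuple):
    target: str   # name of the symbol the relocation refers to
    addend: int   # value stored with (or in place at) the relocation


def emit_reloc(symbol, section, symbol_offset, addend):
    """Choose the relocation for a local symbol in a mergeable section."""
    if addend == 0:
        # Safe to drop the local symbol: the section symbol plus the
        # symbol's offset still identifies the same string.
        return Reloc(section, symbol_offset)
    # Non-zero addend: keep the symbol so the linker knows which string
    # the offset is relative to.
    return Reloc(symbol, addend)


# The two relocations from test2.o (.LC0 at offset 0, .LC1 at offset 2).
print(emit_reloc(".LC0", ".rodata.str1.1", 0, 2))
print(emit_reloc(".LC1", ".rodata.str1.1", 2, 0))
```

This reproduces the two entries in the readelf -Wr listing above: the
first stays against .LC0 with addend 2, the second becomes
.rodata.str1.1 with addend 2.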

