This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[BUG?] Re: [patch]: Fix auto-import for pe targets


[ In case this post fails to pick up the correct References: header, the
original thread can be seen in the the mailing list archive at
http://sourceware.org/ml/binutils/2008-09/threads.html#00204 ]

Kai Tietz wrote:
> Hello,
>
> this patch fixes the broken auto-import feature for pe(and pe+) targets.
>
> ChangeLog
>
> 2008-09-26  Kai Tietz  <kai.tietz@onevision >
>
>         * ld/pe-dll.c (make_import_fixup_entry): Make sure reloc is
> addend.

    Hello Kai (and happy new year),

  Funnily enough, in my usage, this patch breaks the fixed auto-import
feature for PE targets!  You didn't give any details in your post, so let me
explain the problem I've been having (in depth and at length, since there may
be bystanders who aren't familiar with how PE DLL auto-import works):

  Here's my simple testcase:

-----------------------------------<cout.C>-----------------------------------
#include <iostream>

main() {
std::cout << "Invalid vector range " << 3
  << " caught in f()" << std::endl;
  return 0;
}
-----------------------------------<cout.C>-----------------------------------

and it compiles to this:

-----------------------------------<cout.s>-----------------------------------
	.text
.globl _main
	.def	_main;	.scl	2;	.type	32;	.endef
_main:
LFB923:
	.loc 1 4 0
	leal	4(%esp), %ecx
LCFI9:
	andl	$-16, %esp
	pushl	-4(%ecx)
LCFI10:
	pushl	%ebp
LCFI11:
	movl	%esp, %ebp
LCFI12:
	pushl	%ecx
LCFI13:
	subl	$20, %esp
LCFI14:
	.loc 1 4 0
	call	___main
	.loc 1 6 0
	movl	$LC0, 4(%esp)
	movl	$__ZSt4cout, (%esp)
-----------------------------------<cout.s>-----------------------------------

  Now, that reference to __ZSt4cout works fine when you're linking against
static libstdc++, but I've got a DLL.  So, this is a data import, and it needs
a fixup: we need to point an import table (of size 1) at the address field of
the movl opcode, so that the run-time loader points it at the data export in
the DLL.

  However at runtime we can see this hasn't happened:

-----------------------------------<gdb.txt>----------------------------------
(gdb) disass
Dump of assembler code for function main:
0x00401176 <main+0>:    lea    0x4(%esp),%ecx
0x0040117a <main+4>:    and    $0xfffffff0,%esp
0x0040117d <main+7>:    pushl  -0x4(%ecx)
0x00401180 <main+10>:   push   %ebp
0x00401181 <main+11>:   mov    %esp,%ebp
0x00401183 <main+13>:   push   %ecx
0x00401184 <main+14>:   sub    $0x14,%esp
0x00401187 <main+17>:   call   0x401288 <__main>
0x0040118c <main+22>:   movl   $0x40204c,0x4(%esp)
0x00401194 <main+30>:   movl   $0x40514c,(%esp)
0x0040119b <_fu0___ZSt4cout+4>: call   0x4011f0 <_ZStlsISt11char_traitsIcEERSt13
basic_ostreamIcT_ES5_PKc>
-----------------------------------<gdb.txt>----------------------------------

  The movl above should in fact have been relocated directly to the address
of std::cout in the cygstdc++ dll.  Instead, it has been statically linked to
the __imp_ entry for std::cout.

-----------------------------------<gdb.txt>----------------------------------
(gdb) x/1xw 0x0040514c
0x40514c <_imp___ZSt4cout>:     0x6c4caf00
(gdb) print &std::cout
$9 = (ostream *) 0x6c4caf00
(gdb)
-----------------------------------<gdb.txt>----------------------------------

  So, the symbol _fu0___ZSt4cout is pointing at the address field of the

	movl	$__ZSt4cout, (%esp)

instruction, as it should be.  But somehow it ends up at runtime pointing to

(gdb) x/1xw 0x0040514c
0x40514c <_imp___ZSt4cout>:     0x6c4caf00

the import stub.  It should have been handled as an import and pointed
directly to the symbol imported from the DLL, no?  Why this doesn't happen?

  Well, it looks like the import table generated for the fixup is broken:
it has zero entries.  Here's what "dumpbin /imports" (from MSVC) shows

-----------------------------------<imports>----------------------------------

/win/i/FSF-Gcc/release/gcc4-4.3.2-2/build $ dumpbin /imports cout.exe
Microsoft (R) COFF Binary File Dumper Version 6.00.8168
Copyright (C) Microsoft Corp 1992-1998. All rights reserved.


Dump of file cout.exe

File Type: EXECUTABLE IMAGE

  Section contains the following imports:

[ .. snip healthy import dir entries for regular _imp_ imports .. ]

    cygstdc++-6.dll
                40119B Import Address Table
                4050EC Import Name Table
                     0 time date stamp
                     0 Index of first forwarder reference

-----------------------------------<imports>----------------------------------

  Looking at that last entry there, it has two problems: the address of the
IAT (the address where the imported addresses are going to be stored at, which
should be the address of the address operand field of the movl opcode shown
back above) is off by four, and there are no import entries.  (This
arrangement even crashes depends.exe!)  You have to look at the raw hex dumps
to see what's really going on.  Here's the fixup itself:

---------------------------------<raw binary>---------------------------------
Microsoft (R) COFF Binary File Dumper Version 6.00.8168
Copyright (C) Microsoft Corp 1992-1998. All rights reserved.


Dump of file cout.exe

PE signature found

File Type: EXECUTABLE IMAGE

 [ ... snip! ... ]

            5000 [     3B8] RVA [size] of Import Directory

 [ ... snip! ... ]

SECTION HEADER #5
  .idata name
     3B8 virtual size
    5000 virtual address
     400 size of raw data
     E00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C0300040 flags
         Initialized Data
         RESERVED - UNKNOWN
         RESERVED - UNKNOWN
         Read Write

RAW DATA #5
 [ ... snip! ... ]

  00405050: EC 50 00 00 00 00 00 00 00 00 00 00 A8 53 00 00  ìP..........¨S..
  00405060: 9B 11 00 00                                      ›...
---------------------------------<raw binary>---------------------------------

  This is an IMAGE_IMPORT_DESCRIPTOR struct, and what we're seeing is an RVA
of 0x50EC for the OriginalFirstThunk field, the list of symbols to import;  an
RVA of 0x53A8 for the Name field, and an RVA of 0x119B for the FirstThunk
field, AKA the IAT, the memory area where the imported symbol address(es)
is/are going to get stored to.  The Name field holds the RVA of the DLL name
to import from: the RVA 0x53A8 corresponds to the actual address 0x4053A8:

---------------------------------<raw binary>---------------------------------
  004053A0:                         63 79 67 73 74 64 63 2B          cygstdc+
  004053B0: 2B 2D 36 2E 64 6C 6C 00                          +-6.dll.
---------------------------------<raw binary>---------------------------------

  The RVAs 0x50EC and 0x119B correspond to the addresses shown in the
"dumpbin /imports" output as "Import Name Table" (0x4050EC) and "Import
Address Table" (0x40119B) and as mentioned before, the IAT pointer is off by
four from the actual address (0x401197) of the symbol _fu0___ZSt4cout.  Now we
look at the Import Name Table:

---------------------------------<raw binary>---------------------------------
  004050E0: 14 53 00 00 00 00 00 00 8C 52 00 00 00 00 00 00  .S......ŒR......
  004050F0: 00 00 00 00 60 51 00 00 6C 51 00 00 7C 51 00 00  ....`Q..lQ..|Q..
---------------------------------<raw binary>---------------------------------

  What we expect to see is a word at 0x4050EC pointing to the import name,
followed immediately by a zero sentinel; instead there's just a zero
immediately.  But look at the word immediately before (0x4050E8): that's a
plausible RVA in the .idata section, and sure enough at that address we find
the missing import (ordinal and name):

---------------------------------<raw binary>---------------------------------
 00405280:                                     78 07 5F 5A              x._Z
 00405290: 53 74 34 63 6F 75 74 00                          St4cout.
---------------------------------<raw binary>---------------------------------

  In other words, the 0x50EC is also off by four.  I stepped through the code
to see how it was getting this way, and it's caused by that line your patch
added in pe-dll.c/make_import_fixup_entry().  That function generates a fixup
that looks like this:

----------------------------------<pe-dll.c>----------------------------------
	.section	.idata$2
	; OriginalFirstThunk
  	.rva		__nm_thnk_SYM (singleton thunk with name of func)
	; TimeDateStamp
 	.long		0
	; ForwarderChain
 	.long		0
	; Name
 	.rva		__my_dll_iname (name of dll)
	; FirstThunk
 	.rva		__fuNN_SYM (pointer to reference (address) in text)
----------------------------------<pe-dll.c>----------------------------------

and it generates the three RVAs by pointing relocs of type BFD_RELOC_RVA at
the offsets 0, 12 and 16 in the fixup entry.  Your patch added the line

----------------------------------<pe-dll.c>----------------------------------
  d2[0] = d2[16] = PE_IDATA5_SIZE; /* Reloc addend.  */
----------------------------------<pe-dll.c>----------------------------------

just after these relocs are generated, and the results is that the first and
last of these RVAs are generated with an addend of four relative to the
symbols they point at, _nm_thnk_XXXXX (the lost-and-found name table shown
above) and __fuNN_SYM (the destination in the program text to write the
imported address to, the single-entry IAT) - resulting in the broken import
table I describe in the hex dumps above.

  Sure enough, I have tested a version of ld with this line disabled, and it
works (for me) in the way I would expect, dumpbin shows a single import in the
fixup image import descriptor, depends.exe doesn't crash, at runtime the
program code shows it's been correctly relocated to point directly into the DLL:

-----------------------------------<gdb.txt>----------------------------------
Breakpoint 1, main () at cout.C:4
4       main() {
(gdb) disass
Dump of assembler code for function main:
0x00401176 <main+0>:    lea    0x4(%esp),%ecx
0x0040117a <main+4>:    and    $0xfffffff0,%esp
0x0040117d <main+7>:    pushl  -0x4(%ecx)
0x00401180 <main+10>:   push   %ebp
0x00401181 <main+11>:   mov    %esp,%ebp
0x00401183 <main+13>:   push   %ecx
0x00401184 <main+14>:   sub    $0x14,%esp
0x00401187 <main+17>:   call   0x401288 <__main>
0x0040118c <main+22>:   movl   $0x40204c,0x4(%esp)
0x00401194 <main+30>:   movl   $0x6c4caf00,(%esp)
0x0040119b <_fu0___ZSt4cout+4>: call   0x4011f0 <_ZStlsISt11char_traitsIcEERSt13
basic_ostreamIcT_ES5_PKc>
-----------------------------------<gdb.txt>----------------------------------

and above all else the program works!  (Pretending to libstdc++ that some
random address in your import section is a fully constructed ostream object
turns out to be a no-no, <g>!)

  I wonder if we are having a 32-vs-64-bit clash here, because I know you
mostly work on win64?  I'm sure I don't understand your patch.  At first I
thought maybe it was something to do with the difference between
IMAGE_THUNK_DATA32 and IMAGE_THUNK_DATA64 and you were trying to offset a
32-bit reloc so it fell into the low part of a 64-bit field, and then I
started thinking about endianness, and then I got confused and decided to ask
you about it instead.  But I'm pretty sure that the right thing for
i686-pc-cygwin is not to have those offsets there; what target were you
working on at the time?

    cheers,
      DaveK


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]