This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [BUG?] Re: [patch]: Fix auto-import for pe targets


Hi Dave,

2009/1/3 Dave Korn <dave.korn.cygwin@googlemail.com>:
> [ In case this post fails to pick up the correct References: header, the
> original thread can be seen in the the mailing list archive at
> http://sourceware.org/ml/binutils/2008-09/threads.html#00204 ]
>
> Kai Tietz wrote:
>> Hello,
>>
>> this patch fixes the broken auto-import feature for pe(and pe+) targets.
>>
>> ChangeLog
>>
>> 2008-09-26  Kai Tietz  <kai.tietz@onevision >
>>
>>         * ld/pe-dll.c (make_import_fixup_entry): Make sure reloc is
>> addend.
>
>    Hello Kai (and happy new year),
>
>  Funnily enough, in my usage, this patch breaks the fixed auto-import
> feature for PE targets!  You didn't give any details in your post, so let me
> explain the problem I've been having (in depth and at length, since there may
> be bystanders who aren't familiar with how PE DLL auto-import works):
>
>  Here's my simple testcase:
>
> -----------------------------------<cout.C>-----------------------------------
> #include <iostream>
>
> main() {
> std::cout << "Invalid vector range " << 3
>  << " caught in f()" << std::endl;
>  return 0;
> }
> -----------------------------------<cout.C>-----------------------------------
>
> and it compiles to this:
>
> -----------------------------------<cout.s>-----------------------------------
>        .text
> .globl _main
>        .def    _main;  .scl    2;      .type   32;     .endef
> _main:
> LFB923:
>        .loc 1 4 0
>        leal    4(%esp), %ecx
> LCFI9:
>        andl    $-16, %esp
>        pushl   -4(%ecx)
> LCFI10:
>        pushl   %ebp
> LCFI11:
>        movl    %esp, %ebp
> LCFI12:
>        pushl   %ecx
> LCFI13:
>        subl    $20, %esp
> LCFI14:
>        .loc 1 4 0
>        call    ___main
>        .loc 1 6 0
>        movl    $LC0, 4(%esp)
>        movl    $__ZSt4cout, (%esp)
> -----------------------------------<cout.s>-----------------------------------
>
>  Now, that reference to __ZSt4cout works fine when you're linking against
> static libstdc++, but I've got a DLL.  So, this is a data import, and it needs
> a fixup: we need to point an import table (of size 1) at the address field of
> the movl opcode, so that the run-time loader points it at the data export in
> the DLL.
>
>  However at runtime we can see this hasn't happened:
>
> -----------------------------------<gdb.txt>----------------------------------
> (gdb) disass
> Dump of assembler code for function main:
> 0x00401176 <main+0>:    lea    0x4(%esp),%ecx
> 0x0040117a <main+4>:    and    $0xfffffff0,%esp
> 0x0040117d <main+7>:    pushl  -0x4(%ecx)
> 0x00401180 <main+10>:   push   %ebp
> 0x00401181 <main+11>:   mov    %esp,%ebp
> 0x00401183 <main+13>:   push   %ecx
> 0x00401184 <main+14>:   sub    $0x14,%esp
> 0x00401187 <main+17>:   call   0x401288 <__main>
> 0x0040118c <main+22>:   movl   $0x40204c,0x4(%esp)
> 0x00401194 <main+30>:   movl   $0x40514c,(%esp)
> 0x0040119b <_fu0___ZSt4cout+4>: call   0x4011f0 <_ZStlsISt11char_traitsIcEERSt13
> basic_ostreamIcT_ES5_PKc>
> -----------------------------------<gdb.txt>----------------------------------
>
>  The movl above should in fact have been relocated directly to the address
> of std::cout in the cygstdc++ dll.  Instead, it has been statically linked to
> the __imp_ entry for std::cout.
>
> -----------------------------------<gdb.txt>----------------------------------
> (gdb) x/1xw 0x0040514c
> 0x40514c <_imp___ZSt4cout>:     0x6c4caf00
> (gdb) print &std::cout
> $9 = (ostream *) 0x6c4caf00
> (gdb)
> -----------------------------------<gdb.txt>----------------------------------
>
>  So, the symbol _fu0___ZSt4cout is pointing at the address field of the
>
>        movl    $__ZSt4cout, (%esp)
>
> instruction, as it should be.  But somehow it ends up at runtime pointing to
>
> (gdb) x/1xw 0x0040514c
> 0x40514c <_imp___ZSt4cout>:     0x6c4caf00
>
> the import stub.  It should have been handled as an import and pointed
> directly to the symbol imported from the DLL, no?  Why this doesn't happen?
>
>  Well, it looks like the import table generated for the fixup is broken:
> it has zero entries.  Here's what "dumpbin /imports" (from MSVC) shows
>
> -----------------------------------<imports>----------------------------------
>
> /win/i/FSF-Gcc/release/gcc4-4.3.2-2/build $ dumpbin /imports cout.exe
> Microsoft (R) COFF Binary File Dumper Version 6.00.8168
> Copyright (C) Microsoft Corp 1992-1998. All rights reserved.
>
>
> Dump of file cout.exe
>
> File Type: EXECUTABLE IMAGE
>
>  Section contains the following imports:
>
> [ .. snip healthy import dir entries for regular _imp_ imports .. ]
>
>    cygstdc++-6.dll
>                40119B Import Address Table
>                4050EC Import Name Table
>                     0 time date stamp
>                     0 Index of first forwarder reference
>
> -----------------------------------<imports>----------------------------------
>
>  Looking at that last entry there, it has two problems: the address of the
> IAT (the address where the imported addresses are going to be stored at, which
> should be the address of the address operand field of the movl opcode shown
> back above) is off by four, and there are no import entries.  (This
> arrangement even crashes depends.exe!)  You have to look at the raw hex dumps
> to see what's really going on.  Here's the fixup itself:
>
> ---------------------------------<raw binary>---------------------------------
> Microsoft (R) COFF Binary File Dumper Version 6.00.8168
> Copyright (C) Microsoft Corp 1992-1998. All rights reserved.
>
>
> Dump of file cout.exe
>
> PE signature found
>
> File Type: EXECUTABLE IMAGE
>
>  [ ... snip! ... ]
>
>            5000 [     3B8] RVA [size] of Import Directory
>
>  [ ... snip! ... ]
>
> SECTION HEADER #5
>  .idata name
>     3B8 virtual size
>    5000 virtual address
>     400 size of raw data
>     E00 file pointer to raw data
>       0 file pointer to relocation table
>       0 file pointer to line numbers
>       0 number of relocations
>       0 number of line numbers
> C0300040 flags
>         Initialized Data
>         RESERVED - UNKNOWN
>         RESERVED - UNKNOWN
>         Read Write
>
> RAW DATA #5
>  [ ... snip! ... ]
>
>  00405050: EC 50 00 00 00 00 00 00 00 00 00 00 A8 53 00 00  ìP..........¨S..
>  00405060: 9B 11 00 00                                      ›...
> ---------------------------------<raw binary>---------------------------------
>
>  This is an IMAGE_IMPORT_DESCRIPTOR struct, and what we're seeing is an RVA
> of 0x50EC for the OriginalFirstThunk field, the list of symbols to import;  an
> RVA of 0x53A8 for the Name field, and an RVA of 0x119B for the FirstThunk
> field, AKA the IAT, the memory area where the imported symbol address(es)
> is/are going to get stored to.  The Name field holds the RVA of the DLL name
> to import from: the RVA 0x53A8 corresponds to the actual address 0x4053A8:
>
> ---------------------------------<raw binary>---------------------------------
>  004053A0:                         63 79 67 73 74 64 63 2B          cygstdc+
>  004053B0: 2B 2D 36 2E 64 6C 6C 00                          +-6.dll.
> ---------------------------------<raw binary>---------------------------------
>
>  The RVAs 0x50EC and 0x119B correspond to the addresses shown in the
> "dumpbin /imports" output as "Import Name Table" (0x4050EC) and "Import
> Address Table" (0x40119B) and as mentioned before, the IAT pointer is off by
> four from the actual address (0x401197) of the symbol _fu0___ZSt4cout.  Now we
> look at the Import Name Table:
>
> ---------------------------------<raw binary>---------------------------------
>  004050E0: 14 53 00 00 00 00 00 00 8C 52 00 00 00 00 00 00  .S......ŒR......
>  004050F0: 00 00 00 00 60 51 00 00 6C 51 00 00 7C 51 00 00  ....`Q..lQ..|Q..
> ---------------------------------<raw binary>---------------------------------
>
>  What we expect to see is a word at 0x4050EC pointing to the import name,
> followed immediately by a zero sentinel; instead there's just a zero
> immediately.  But look at the word immediately before (0x4050E8): that's a
> plausible RVA in the .idata section, and sure enough at that address we find
> the missing import (ordinal and name):
>
> ---------------------------------<raw binary>---------------------------------
>  00405280:                                     78 07 5F 5A              x._Z
>  00405290: 53 74 34 63 6F 75 74 00                          St4cout.
> ---------------------------------<raw binary>---------------------------------
>
>  In other words, the 0x50EC is also off by four.  I stepped through the code
> to see how it was getting this way, and it's caused by that line your patch
> added in pe-dll.c/make_import_fixup_entry().  That function generates a fixup
> that looks like this:
>
> ----------------------------------<pe-dll.c>----------------------------------
>        .section        .idata$2
>        ; OriginalFirstThunk
>        .rva            __nm_thnk_SYM (singleton thunk with name of func)
>        ; TimeDateStamp
>        .long           0
>        ; ForwarderChain
>        .long           0
>        ; Name
>        .rva            __my_dll_iname (name of dll)
>        ; FirstThunk
>        .rva            __fuNN_SYM (pointer to reference (address) in text)
> ----------------------------------<pe-dll.c>----------------------------------
>
> and it generates the three RVAs by pointing relocs of type BFD_RELOC_RVA at
> the offsets 0, 12 and 16 in the fixup entry.  Your patch added the line
>
> ----------------------------------<pe-dll.c>----------------------------------
>  d2[0] = d2[16] = PE_IDATA5_SIZE; /* Reloc addend.  */
> ----------------------------------<pe-dll.c>----------------------------------
>
> just after these relocs are generated, and the results is that the first and
> last of these RVAs are generated with an addend of four relative to the
> symbols they point at, _nm_thnk_XXXXX (the lost-and-found name table shown
> above) and __fuNN_SYM (the destination in the program text to write the
> imported address to, the single-entry IAT) - resulting in the broken import
> table I describe in the hex dumps above.
>
>  Sure enough, I have tested a version of ld with this line disabled, and it
> works (for me) in the way I would expect, dumpbin shows a single import in the
> fixup image import descriptor, depends.exe doesn't crash, at runtime the
> program code shows it's been correctly relocated to point directly into the DLL:
>
> -----------------------------------<gdb.txt>----------------------------------
> Breakpoint 1, main () at cout.C:4
> 4       main() {
> (gdb) disass
> Dump of assembler code for function main:
> 0x00401176 <main+0>:    lea    0x4(%esp),%ecx
> 0x0040117a <main+4>:    and    $0xfffffff0,%esp
> 0x0040117d <main+7>:    pushl  -0x4(%ecx)
> 0x00401180 <main+10>:   push   %ebp
> 0x00401181 <main+11>:   mov    %esp,%ebp
> 0x00401183 <main+13>:   push   %ecx
> 0x00401184 <main+14>:   sub    $0x14,%esp
> 0x00401187 <main+17>:   call   0x401288 <__main>
> 0x0040118c <main+22>:   movl   $0x40204c,0x4(%esp)
> 0x00401194 <main+30>:   movl   $0x6c4caf00,(%esp)
> 0x0040119b <_fu0___ZSt4cout+4>: call   0x4011f0 <_ZStlsISt11char_traitsIcEERSt13
> basic_ostreamIcT_ES5_PKc>
> -----------------------------------<gdb.txt>----------------------------------
>
> and above all else the program works!  (Pretending to libstdc++ that some
> random address in your import section is a fully constructed ostream object
> turns out to be a no-no, <g>!)
>
>  I wonder if we are having a 32-vs-64-bit clash here, because I know you
> mostly work on win64?  I'm sure I don't understand your patch.  At first I
> thought maybe it was something to do with the difference between
> IMAGE_THUNK_DATA32 and IMAGE_THUNK_DATA64 and you were trying to offset a
> 32-bit reloc so it fell into the low part of a 64-bit field, and then I
> started thinking about endianness, and then I got confused and decided to ask
> you about it instead.  But I'm pretty sure that the right thing for
> i686-pc-cygwin is not to have those offsets there; what target were you
> working on at the time?
>
>    cheers,
>      DaveK
>

Yes, I think this was a think related to the problem that by pe-dll.c
the stanard import sections .idata$4 and .idata$5 are prefixed with a
non standard zero element. By this IAT table has a offset of 4 bytes
(for 32-bit). For 64-bit the pseudo relocation works directly via
.idata$5 and therefore I didn't noticed that.
The last patch I sent should fix your subject, too.
The interesting part here is, that .idata$4 needs for the call to
make_singleton_name_thunk no offset of 4 for idata$4, but needs the
idata$5 needed it. So the thinko happens.

Could you try my last idata$4/idata$5 patch for binutils. It should
fix your problem, because it let reference the idata$2 section
directly (without the need of offsetting it) the idata$4 (import hash
table) and the idata$5 (IAT).

Cheers,
Kai
-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]