This is the mail archive of the cygwin-developers mailing list for the Cygwin project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

CFA: pseudo-reloc v2


CFA: Call For Assistance...

In January, as part of this thread:
http://cygwin.com/ml/cygwin-developers/2009-01/msg00009.html
we discussed adding support for Kai Tietz's new v2 pseudo-relocs (while
retaining current v1 compatibility).

In that earlier thread, there were a few concerns raised (licensing,
etc) that have since been addressed. Most prominently:
> I think we have to leave it to the MinGW team to replace the two files
> in the mingw directory.
which they have now done.

cgf said in
http://cygwin.com/ml/cygwin-developers/2009-01/msg00012.html
> Can anyone confirm that it works as advertised?  It will require linking
> an app which relies on this with the above object.  I don't think there
> is any reason to rebuild cygwin.  You just need that pseudo-reloc.o.
> 
> If it works, and I have no reason to think that it won't, would you mind
> checking this in, Chuck?

Unfortunately, it *doesn't* quite work as expected.  In 'basic' mode, it
works fine: accessing 'complex' data items in a DLL from either an
application or another DLL works fine -- as long as nobody calls fork().
 However, there are interactions with fork(), so a little debugging is
necessary.  Since mingw doesn't have a fork(), there's no reason why Kai
would have thought to check that.

So, it's gonna need a little debugging -- CFA?  Test cases attached.


Test case #1:
=====================================================
A dll provides the following data item:

foo_t foo[2] = { {1, {11, 12, 13}, {'a', 'b', 'c'}},
                 {2, {21, 22, 23}, {'d', 'e', 'f'}}};

where

typedef struct
  {
    int x;
    int y[3];
    char z[3];
  }
foo_t;

The application accesses several elements of the structure. When linking
the app using --disable-runtime-pseudo-reloc, you get errors:

$ make crtest_no.exe
gcc -Wl,--enable-auto-import -Wl,--disable-runtime-pseudo-reloc  -o
crtest_no.exe crtest.o -L. -lcrtest -L /usr/lib/w32api
crtest.o: In function `print_data':
/usr/src/devel/kernel/pseudo-reloc-tests/test-app/crtest.c:8: variable
'_foo' can't be auto-imported. Please read the documentation for ld's
--enable-auto-import for details.
...

cygwin-1.7.0-62
------------------------------------------------------------
So, here are the results on stock cygwin-1.7.0-62:

$ ./crtest_v1.exe
parent (before fork)  : data=1 23 e
child (before modify) : data=1 23 e
child  (after modify) : data=12 11 f
parent (before modify): data=1 23 e
parent  (after modify): data=12 11 f

These are the expected results. Three things to notice:

#1) the first line shows that the relocation does in fact work. The rest
of the test is checking behavior of the relocation access with respect
to fork.
#2) the child's version of the data is, before we modify it, identical
to the value of that data prior to the fork.
#3) but the child has its own COPY of that data, because the child can
modify it without disturbing the parent's version.


$ ./crtest_v2.exe
<hangs>
This is expected; we use the linker flag --enable-pseudo-reloc-v2, but
the 1.7.0-62 libcygwin.a/cygwin1.dll doesn't provide the proper v2 support.

custom cygwin
------------------------------------------------------------
So, using a custom cygwin1.dll and libcygwin.a with the pseudo-reloc.c
from winsup/mingw/, here are the results:

$ ./crtest_v1.exe
parent (before fork)  : data=1 23 e
child (before modify) : data=1 1
child  (after modify) : data=102 102 f
parent (before modify): data=1 23 e
parent  (after modify): data=12 11 f

Well, the basic relocation seems to work, at least initially (line 1).
The child appears to get its own copy of the data. But
#1) the child's version does not appear to have been initialized to the
the same value as the parent's before the fork
#2) when we try to set certain elements, we appear to modify those
values, but on reading the values back we don't get what we set them to
(should be '12 11 f' on the third line).

One possible explanation is that fork completely messes up the v1 reloc
data: in the child each read is looking in some unrelated place, and our
writes are going to la la land. Badness; something ain't right here.
(But don't give up, it gets even weirder, below).


$ ./crtest_v2.exe
parent (before fork)  : data=1 23 e
child (before modify) : data=1705517088 1627572261
child  (after modify) : data=12 11 f
parent (before modify): data=1 23 e
parent  (after modify): data=12 11 f

For v2 relocs, we again see that the basic relocation works (line 1).
With regards to fork, the child definitely gets its own copy of the
data, and can manipulate it without affecting the parent.  However, the
child's copy of the data does not get initialized to the same value as
that of the parent.

SUMMARY (of new pseudo-reloc code, when an app accesses 'complex' data
in a DLL):

Basic relocation works.
The interaction of fork and v1 relocs, while OK in the existing
pseudo-reloc implementation, is significantly broken in the new
pseudo-reloc impl.  The interaction of fork and v2 relocs is somewhat
better, but still not entirely correct (child data is not initialized
with parent values).


Test case #2:
=====================================================
We have the same dll as before, providing the same data. However, the
"application" code from test case #1 is moved into a second dll.  The
application simply calls a function in the second DLL (so, the app
doesn't need any 'pseudo-reloc' support at all; it doesn't access the
data in the original dll).

Since it is the crtestx_*.dll which requires reloc support, when you try
to link it with --disable-runtime-pseudo-relocs, you get a fail:

$ make crtestx_no.dll
gcc -g3 -shared -Wl,--image-base=0x20000000 -Wl,--export-all-symbols
-Wl,--out-implib=libcrtestx_no.dll.a -Wl,--enable-auto-import
-Wl,--disable-runtime-pseudo-reloc  -o crtestx_no.dll crtest.o -L.
-lcrtest -L /usr/lib/w32api
Creating library file: libcrtestx_no.dll.a
crtest.o: In function `print_data':
/usr/src/devel/kernel/pseudo-reloc-tests/test-dll/crtest.c:8: variable
'_foo' can't be auto-imported. Please read the documentation for ld's
--enable-auto-import for details.
...

This is expected.


All DLLs are compiled with the same image base -- and given the expected
load order (crtextx*.dll first) this forces crtest.dll, which contains
the data, to be relocated.

There are four test apps:
  crtest1_v1.exe [1]    --> crtestx_v1.dll [2] --> crtest.dll [3]

  crtest1_v2.exe [3]    --> crtestx_v2.dll [4] --> crtest.dll

  crtest1_no_v1.exe [5] --> crtestx_v1.dll     --> crtest.dll

  crtest1_no_v2.exe [6] --> crtestx_v2.dll     --> crtest.dll


[1] linked using v1 relocs (not that they are needed) and linked against
crtestx_v1.dll
[2] linked using v1 relocs against crtest.dll
[3] linked using v2 relocs (not that they are needed) and linked against
crtestx_v2.dll
[4] linked using v2 relocs against crtest.dll
[5] linked using --disable-runtime-pseudo-relocs and linked against
crtestx_v1.dll
[6] linked using --disable-runtime-pseudo-relocs and linked against
crtestx_v2.dll


cygwin-1.7.0-62
------------------------------------------------------------
So, here are the results on stock cygwin-1.7.0-62:

$ ./crtest1_v1.exe
parent (before fork)  : data=1 23 e
child (before modify) : data=1 23 e
child  (after modify) : data=12 11 f
parent (before modify): data=1 23 e
parent  (after modify): data=12 11 f

This is exactly what we should see.  Again, three things to notice:
#1) the basic relocation works (line 1)
#2) the child has its own copy
#3) which is initialized to the same value as that of the parent, before
the fork

$ ./crtest1_v2.exe
<no output>

More or less expected, as the crtestx_v2.dll fails to load because it
expects v2 relocs, but the linked-in pseudo-reloc.c code doesn't support
them.

$ ./crtest1_no_v1.exe
parent (before fork)  : data=1 23 e
child (before modify) : data=1 23 e
child  (after modify) : data=12 11 f
parent (before modify): data=1 23 e
parent  (after modify): data=12 11 f

$ ./crtest1_no_v2.exe
<no output>

These two are the same as above, as expected. It doesn't make any
difference whether this simple app is linked using
--disable-runtime-pseudo-relocs, or --enable-*-v1 or --enable-*-v2. This
app doesn't need relocs.

custom cygwin
------------------------------------------------------------
So, using a custom cygwin1.dll and libcygwin.a with the pseudo-reloc.c
from winsup/mingw/, here are the results:

$ ./crtest1_v1.exe
parent (before fork)  : data=1 23 e
child (before modify) : data=1 23 e
child  (after modify) : data=12 11 f
parent (before modify): data=1 23 e
parent  (after modify): data=12 11 f

...INTERESTING...
if a *DLL* needs relocation support to access data imported from another
DLL, then the new pseudo-reloc code works just as well as the old
pseudo-reloc code, even for v1 relocs.

$ ./crtest1_v2.exe
parent (before fork)  : data=1 23 e
child (before modify) : data=1 23 e
child  (after modify) : data=12 11 f
parent (before modify): data=1 23 e
parent  (after modify): data=12 11 f

And it works for v2 relocs, as well!

$ ./crtest1_no_v1.exe
parent (before fork)  : data=1 23 e
child (before modify) : data=1 23 e
child  (after modify) : data=12 11 f
parent (before modify): data=1 23 e
parent  (after modify): data=12 11 f

$ ./crtest1_no_v2.exe
parent (before fork)  : data=1 23 e
child (before modify) : data=1 23 e
child  (after modify) : data=12 11 f
parent (before modify): data=1 23 e
parent  (after modify): data=12 11 f

And, as expected, it doesn't matter whether this app is linked with
pseudo-reloc support or not -- in both cases you get the same results as
above.

SUMMARY (of new pseudo-reloc code, when a DLL accesses complex data
structures in another DLL):

Basic relocation works.
Behavior with regards to fork is consistent with existing pseudo-reloc
code -- e.g. it also works.


OVERALL SUMMARY
=====================================================
The new pseudo-reloc code works in simple (non-fork) scenarios. It also
works in fork scenarios IF the entity accessing relocated data in a DLL,
is itself a DLL -- but not if that entity is an application.

This tells me there is something *right* about the fork() procedure when
it reloads in the child address space any DLLs that contain pseudo-reloc
references.  But there is something wrong happening during fork() with
the new pseudo-reloc code if the application itself contains
pseudo-reloc references of either v1 or v2 type -- but the OLD
pseudo-reloc (v1 only) code did not suffer from this problem.

So, obviously this bug needs to be tracked down and fixed. In addition,
there are two bits of stylistic ugliness in the mingw/pseudo-reloc.c we
may need to change (either in the cygwin copy, or in both copies) before
committing the new pseudo-reloc code:

#1) Using fprintf and mingw-specific error messages. This is bad, even
if they are guarded by DEBUG:

#ifdef DEBUG
      fprintf (stderr, "internal mingw runtime error:"
               "psuedo_reloc version %d is unknown to this runtime.\n",
               (int) v2_hdr->version);
#endif

Maybe something like

#if defined(__INSIDE_CYGWIN__)
     system_printf (
        "Relocation error: invalid pseudo_reloc version %d.",
        int) v2_hdr->version);
#else
#if defined(DEBUG) && defined(__MINGW32__)
      fprintf (stderr, "internal mingw runtime error:"
               "psuedo_reloc version %d is unknown to this runtime.\n",
               (int) v2_hdr->version);
#endif

#2) __write_memory() does this:
  assert (VirtualQuery (addr, &b, sizeof(b)));
I'm wondering if we should be using assert at all (even though assert
'goes away' if !DEBUG).  Maybe:

#if defined(__INSIDE_CYGWIN__)
# if DEBUG
   if (!VirtualQuery (addr, &b, sizeof(b))
     {
       system_printf (
         "Relocation error: some informative error message");
       /* don't try to make page writable; expect a failure
        * (DLL failed to initialize) later...
        */
       return;
     }
# endif
#else
   assert (VirtualQuery (addr, &b, sizeof(b)));
#endif



I won't be able to look further into this for a week or two. If you'd
like to help (and please do!): to run this test case, you need to build
your own cygwin1.dll using CVS HEAD, AND:
  $ cd src/winsup
  $ cp mingw/pseudo-reloc.c cygwin/lib/
before building.  Then, you need to install your new cygwin1.dll AND
libcygwin.a.

Then you can build the test cases by simply typing 'make' in either the
test-app or test-dll subdirectories.

Once (if?) the bug related to the new pseudo-reloc implementation and
fork() is tracked down and fixed, then we could also address the two
stylistic issues above, and perhaps negotiate with the mingw folks to
ensure that both copies of the pseudo-reloc.c file are identical.

--
Chuck

Attachment: pseudo-reloc-tests-v2.tar.bz2
Description: Binary data


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]