This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH][BZ 17251] Calculate RPATH $ORIGIN from absolute path
- From: Brennan Shacklett <bpshacklett at gmail dot com>
- To: libc-alpha at sourceware dot org
- Date: Sun, 26 Oct 2014 12:43:58 -0700
- Subject: Re: [PATCH][BZ 17251] Calculate RPATH $ORIGIN from absolute path
- Authentication-results: sourceware.org; auth=none
- References: <1407860209-21797-1-git-send-email-bpshacklett at gmail dot com> <CAL8Wma9QBp9o7p2iipTUHDb38pNJxzopq8WHrgrU+pQTsqT1Ug at mail dot gmail dot com> <CAL8Wma-GKCeT=3qFdQC3Db_PVZZD2TpDbaxphko+PL7e2SYfSg at mail dot gmail dot com> <CAL8Wma_7uqxWhw4TKLZnXbUwwGXZ71Mc0mGakeLZ91u-wmtASg at mail dot gmail dot com>
ping
On Mon, Oct 6, 2014 at 9:18 AM, Brennan Shacklett <bpshacklett@gmail.com> wrote:
> ping
>
> On Wed, Oct 1, 2014 at 12:28 AM, Brennan Shacklett
> <bpshacklett@gmail.com> wrote:
>> ping
>>
>> On Tue, Sep 23, 2014 at 8:09 PM, Brennan Shacklett
>> <bpshacklett@gmail.com> wrote:
>>> On Tue, Aug 12, 2014 at 9:16 AM, Brennan Shacklett
>>> <bpshacklett@gmail.com> wrote:
>>>> Bug 17251 deals with (what I believe is) the incorrect calculation of $ORIGIN
>>>> for shared libraries. Currently $ORIGIN is calculated correctly for executables
>>>> on linux, because readlink is called on /proc/self/exe, which means the
>>>> resulting path is absolute and has no symlinks.
>>>> Shared libraries with relative paths on the other hand are based
>>>> off of appending the name / path of the library to the current working
>>>> directory, which means if the library is a symlink, it is not followed, which
>>>> breaks RPATH $ORIGIN in the following scenario:
>>>>
>>>> libone.so and libtwo.so are both in ~/lib
>>>> libone.so needs libtwo.so, so libone.so has an RPATH of $ORIGIN.
>>>> If I run ldd on ~/lib/libone.so, libtwo.so is found and all is good.
>>>> If I create a symlink named ~/libone.so to ~/lib/libone.so, and run ldd on it
>>>> libtwo is not found, because $ORIGIN for the library is calculated as ~
>>>> instead of ~/lib.
>>>>
>>>> If I was to repeat the above test but instead of libone.so use an executable,
>>>> everything would work as expected, which is why I think the shared library
>>>> behavior is a bug.
>>>>
>>>> I ran into this in the real world when attempting to dynamically load a .so
>>>> for python SWIG bindings, because the .so which python was loading was
>>>> symlinked to the actual library directory where all of the .so's dependencies
>>>> were located.
>>>>
>>>> The attached patch fixes this behavior by having realname evaluated with
>>>> __realpath in _dl_map_object_from_fd, before _dl_new_object is called (which is
>>>> where l_origin is assigned). This also somewhat simplifies the code in
>>>> elf/dl-object.c, because realname is guaranteed to be an absolute path
>>>> generated by realpath, so I was able to remove the code dealing with relative
>>>> paths.
>>>> I used the implementation of dl_realpath from sysdeps/tile/dl-runtime.c as the
>>>> generic implementation, and extended it with lstat and readlink in the linux
>>>> version. This means the bug is still present on other systems than linux (the
>>>> generic implementation only returns an absolute path, it doesn't do anything
>>>> with symlinks), but if there is a way to get the generic version to follow
>>>> symlinks on all systems please let me know.
>>>>
>>>> I tested and ran the elf test suite on Gentoo Linux x86-64. If this change is
>>>> wanted, I will happily write a test to go along with it.
>>>> I don't have any sort of copyright attribution set up with the FSF, let me know
>>>> if it is necessary for this change.
>>>>
>>>> Thanks,
>>>> Brennan Shacklett
>>>>
>>>> 2014-08-12 Brennan Shacklett <bpshacklett@gmail.com>
>>>>
>>>> [BZ #17251]
>>>> * elf/Makefile (rtld-routines): Add dl-realpath.
>>>> * elf/dl-load.c (_dl_map_object_from_fd):
>>>> Calculate realname with __realpath.
>>>> * elf/dl-object.c (_dl_new_object): Remove relative path handling.
>>>> * elf/dl-realpath.c: New file. Generic implementation of __realpath.
>>>> * elf/tile/dl-runtime.c (_dl_after_load): Remove dl_realpath and
>>>> change call to __realpath.
>>>> * sysdeps/unix/sysv/linux/Makefile (sysdep-rtld-routines):
>>>> Add dl-lxstat64.
>>>> * sysdeps/unix/sysv/linux/dl-lxstat64.c: New file.
>>>> * sysdeps/unix/sysv/linux/dl-realpath.c: New file. Handle symlinks
>>>> unlike generic implementation.
>>>> ---
>>>> elf/Makefile | 3 +-
>>>> elf/dl-load.c | 18 +++++
>>>> elf/dl-object.c | 54 ++-------------
>>>> elf/dl-realpath.c | 85 ++++++++++++++++++++++++
>>>> sysdeps/tile/dl-runtime.c | 66 +------------------
>>>> sysdeps/unix/sysv/linux/Makefile | 2 +-
>>>> sysdeps/unix/sysv/linux/dl-lxstat64.c | 1 +
>>>> sysdeps/unix/sysv/linux/dl-realpath.c | 121 ++++++++++++++++++++++++++++++++++
>>>> 8 files changed, 236 insertions(+), 114 deletions(-)
>>>> create mode 100644 elf/dl-realpath.c
>>>> create mode 100644 sysdeps/unix/sysv/linux/dl-lxstat64.c
>>>> create mode 100644 sysdeps/unix/sysv/linux/dl-realpath.c
>>>>
>>>> diff --git a/elf/Makefile b/elf/Makefile
>>>> index 25012cc..c130f21 100644
>>>> --- a/elf/Makefile
>>>> +++ b/elf/Makefile
>>>> @@ -43,7 +43,8 @@ shared-only-routines += dl-caller
>>>>
>>>> # ld.so uses those routines, plus some special stuff for being the program
>>>> # interpreter and operating independent of libc.
>>>> -rtld-routines := rtld $(dl-routines) dl-sysdep dl-environ dl-minimal
>>>> +rtld-routines := rtld $(dl-routines) dl-sysdep dl-environ dl-minimal \
>>>> + dl-realpath
>>>> all-rtld-routines = $(rtld-routines) $(sysdep-rtld-routines)
>>>>
>>>> CFLAGS-dl-runtime.c = -fexceptions -fasynchronous-unwind-tables
>>>> diff --git a/elf/dl-load.c b/elf/dl-load.c
>>>> index 016a99c..7353837 100644
>>>> --- a/elf/dl-load.c
>>>> +++ b/elf/dl-load.c
>>>> @@ -891,6 +891,7 @@ _dl_map_object_from_fd (const char *name, int fd, struct filebuf *fbp,
>>>> int errval = 0;
>>>> struct r_debug *r = _dl_debug_initialize (0, nsid);
>>>> bool make_consistent = false;
>>>> + char *absolutename = NULL;
>>>>
>>>> /* Get file information. */
>>>> if (__glibc_unlikely (__fxstat64 (_STAT_VER, fd, &st) < 0))
>>>> @@ -902,6 +903,23 @@ _dl_map_object_from_fd (const char *name, int fd, struct filebuf *fbp,
>>>> lose (errval, fd, name, realname, l, errstring,
>>>> make_consistent ? r : NULL, nsid);
>>>> }
>>>> + /* Find absolute pathname for object */
>>>> + absolutename = (char *) malloc (PATH_MAX);
>>>> + if (!absolutename)
>>>> + {
>>>> + errstring = N_("cannot allocate memory for absolute path");
>>>> + goto call_lose_errno;
>>>> + }
>>>> +
>>>> + if (!__realpath (realname, absolutename))
>>>> + {
>>>> + free (absolutename);
>>>> + errstring = N_("cannot find absolute path of shared object");
>>>> + goto call_lose_errno;
>>>> + }
>>>> +
>>>> + free (realname);
>>>> + realname = absolutename;
>>>>
>>>> /* Look again to see if the real name matched another already loaded. */
>>>> for (l = GL(dl_ns)[nsid]._ns_loaded; l; l = l->l_next)
>>>> diff --git a/elf/dl-object.c b/elf/dl-object.c
>>>> index afd80a6..d93935b 100644
>>>> --- a/elf/dl-object.c
>>>> +++ b/elf/dl-object.c
>>>> @@ -158,55 +158,15 @@ _dl_new_object (char *realname, const char *libname, int type,
>>>> char *origin;
>>>> char *cp;
>>>>
>>>> - if (realname[0] == '/')
>>>> + /* It is an absolute path, calculated by realpath. Use it.
>>>> + * But we have to make a copy since we strip out the trailing slash. */
>>>> + assert (realname[0] == '/');
>>>> + cp = origin = (char *) malloc (realname_len);
>>>> + if (origin == NULL)
>>>> {
>>>> - /* It is an absolute path. Use it. But we have to make a
>>>> - copy since we strip out the trailing slash. */
>>>> - cp = origin = (char *) malloc (realname_len);
>>>> - if (origin == NULL)
>>>> - {
>>>> - origin = (char *) -1;
>>>> - goto out;
>>>> - }
>>>> + origin = (char *) -1;
>>>> + goto out;
>>>> }
>>>> - else
>>>> - {
>>>> - size_t len = realname_len;
>>>> - char *result = NULL;
>>>> -
>>>> - /* Get the current directory name. */
>>>> - origin = NULL;
>>>> - do
>>>> - {
>>>> - char *new_origin;
>>>> -
>>>> - len += 128;
>>>> - new_origin = (char *) realloc (origin, len);
>>>> - if (new_origin == NULL)
>>>> - /* We exit the loop. Note that result == NULL. */
>>>> - break;
>>>> - origin = new_origin;
>>>> - }
>>>> - while ((result = __getcwd (origin, len - realname_len)) == NULL
>>>> - && errno == ERANGE);
>>>> -
>>>> - if (result == NULL)
>>>> - {
>>>> - /* We were not able to determine the current directory.
>>>> - Note that free(origin) is OK if origin == NULL. */
>>>> - free (origin);
>>>> - origin = (char *) -1;
>>>> - goto out;
>>>> - }
>>>> -
>>>> - /* Find the end of the path and see whether we have to add a
>>>> - slash. We could use rawmemchr but this need not be
>>>> - fast. */
>>>> - cp = (strchr) (origin, '\0');
>>>> - if (cp[-1] != '/')
>>>> - *cp++ = '/';
>>>> - }
>>>> -
>>>> /* Add the real file name. */
>>>> cp = __mempcpy (cp, realname, realname_len);
>>>>
>>>> diff --git a/elf/dl-realpath.c b/elf/dl-realpath.c
>>>> new file mode 100644
>>>> index 0000000..2549bb3
>>>> --- /dev/null
>>>> +++ b/elf/dl-realpath.c
>>>> @@ -0,0 +1,85 @@
>>>> +/* Dynamic loader version of realpath.
>>>> + Copyright (C) 2014 Free Software Foundation, Inc.
>>>> + This file is part of the GNU C Library.
>>>> +
>>>> + The GNU C Library is free software; you can redistribute it and/or
>>>> + modify it under the terms of the GNU Lesser General Public
>>>> + License as published by the Free Software Foundation; either
>>>> + version 2.1 of the License, or (at your option) any later version.
>>>> +
>>>> + The GNU C Library is distributed in the hope that it will be useful,
>>>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>>>> + Lesser General Public License for more details.
>>>> +
>>>> + You should have received a copy of the GNU Lesser General Public
>>>> + License along with the GNU C Library; if not, see
>>>> + <http://www.gnu.org/licenses/>. */
>>>> +
>>>> +#include <stdlib.h>
>>>> +#include <unistd.h>
>>>> +#include <limits.h>
>>>> +#include <ldsodefs.h>
>>>> +
>>>> +/* Simplified implementation of realpath(): no dynamic memory use, no lstat(),
>>>> + no set_errno(), no valid "rpath" on error, etc. This simplifies cases
>>>> + involving relative paths, specifically where $ORIGIN needs to be
>>>> + calculated. For this relatively rare case, one could also imagine using
>>>> + link_map.l_origin to avoid the getcwd() here, but the simpler code
>>>> + here seems like a better solution. */
>>>> +char * internal_function
>>>> +__realpath (const char *name, char *rpath)
>>>> +{
>>>> + char *dest;
>>>> + const char *start, *end;
>>>> +
>>>> + if (name[0] != '/')
>>>> + {
>>>> + if (!__getcwd (rpath, PATH_MAX))
>>>> + return NULL;
>>>> + dest = __rawmemchr (rpath, '\0');
>>>> + }
>>>> + else
>>>> + {
>>>> + rpath[0] = '/';
>>>> + dest = rpath + 1;
>>>> + }
>>>> +
>>>> + for (start = end = name; *start; start = end)
>>>> + {
>>>> + /* Skip sequence of multiple path-separators. */
>>>> + while (*start == '/')
>>>> + ++start;
>>>> +
>>>> + /* Find end of path component. */
>>>> + for (end = start; *end && *end != '/'; ++end)
>>>> + /* Nothing. */;
>>>> +
>>>> + if (end - start == 0)
>>>> + break;
>>>> + else if (end - start == 1 && start[0] == '.')
>>>> + /* nothing */;
>>>> + else if (end - start == 2 && start[0] == '.' && start[1] == '.')
>>>> + {
>>>> + /* Back up to previous component, ignore if at root already. */
>>>> + if (dest > rpath + 1)
>>>> + while ((--dest)[-1] != '/');
>>>> + }
>>>> + else
>>>> + {
>>>> + if (dest[-1] != '/')
>>>> + *dest++ = '/';
>>>> +
>>>> + if (dest + (end - start) >= rpath + PATH_MAX)
>>>> + return NULL;
>>>> +
>>>> + dest = __mempcpy (dest, start, end - start);
>>>> + *dest = '\0';
>>>> + }
>>>> + }
>>>> + if (dest > rpath + 1 && dest[-1] == '/')
>>>> + --dest;
>>>> + *dest = '\0';
>>>> +
>>>> + return rpath;
>>>> +}
>>>> diff --git a/sysdeps/tile/dl-runtime.c b/sysdeps/tile/dl-runtime.c
>>>> index bcc00bc..45251d0 100644
>>>> --- a/sysdeps/tile/dl-runtime.c
>>>> +++ b/sysdeps/tile/dl-runtime.c
>>>> @@ -29,70 +29,6 @@
>>>> #include <arch/sim.h>
>>>> #include <dl-unmap-segments.h>
>>>>
>>>> -/* Like realpath(), but simplified: no dynamic memory use, no lstat(),
>>>> - no set_errno(), no valid "rpath" on error, etc. This handles some
>>>> - simple cases where the simulator might not have a valid entry for
>>>> - a loaded Elf object, in particular dlopen() with a relative path.
>>>> - For this relatively rare case, one could also imagine using
>>>> - link_map.l_origin to avoid the getcwd() here, but the simpler code
>>>> - here seems like a better solution. */
>>>> -static char *
>>>> -dl_realpath (const char *name, char *rpath)
>>>> -{
>>>> - char *dest;
>>>> - const char *start, *end;
>>>> -
>>>> - if (name[0] != '/')
>>>> - {
>>>> - if (!__getcwd (rpath, PATH_MAX))
>>>> - return NULL;
>>>> - dest = __rawmemchr (rpath, '\0');
>>>> - }
>>>> - else
>>>> - {
>>>> - rpath[0] = '/';
>>>> - dest = rpath + 1;
>>>> - }
>>>> -
>>>> - for (start = end = name; *start; start = end)
>>>> - {
>>>> - /* Skip sequence of multiple path-separators. */
>>>> - while (*start == '/')
>>>> - ++start;
>>>> -
>>>> - /* Find end of path component. */
>>>> - for (end = start; *end && *end != '/'; ++end)
>>>> - /* Nothing. */;
>>>> -
>>>> - if (end - start == 0)
>>>> - break;
>>>> - else if (end - start == 1 && start[0] == '.')
>>>> - /* nothing */;
>>>> - else if (end - start == 2 && start[0] == '.' && start[1] == '.')
>>>> - {
>>>> - /* Back up to previous component, ignore if at root already. */
>>>> - if (dest > rpath + 1)
>>>> - while ((--dest)[-1] != '/');
>>>> - }
>>>> - else
>>>> - {
>>>> - if (dest[-1] != '/')
>>>> - *dest++ = '/';
>>>> -
>>>> - if (dest + (end - start) >= rpath + PATH_MAX)
>>>> - return NULL;
>>>> -
>>>> - dest = __mempcpy (dest, start, end - start);
>>>> - *dest = '\0';
>>>> - }
>>>> - }
>>>> - if (dest > rpath + 1 && dest[-1] == '/')
>>>> - --dest;
>>>> - *dest = '\0';
>>>> -
>>>> - return rpath;
>>>> -}
>>>> -
>>>> /* Support notifying the simulator about new objects. */
>>>> void internal_function
>>>> _dl_after_load (struct link_map *l)
>>>> @@ -117,7 +53,7 @@ _dl_after_load (struct link_map *l)
>>>> DLPUTC (':');
>>>>
>>>> /* Write the library path, including the terminating '\0'. */
>>>> - path = dl_realpath (l->l_name, pathbuf) ?: l->l_name;
>>>> + path = __realpath (l->l_name, pathbuf) ?: l->l_name;
>>>> for (size_t i = 0;; i++)
>>>> {
>>>> DLPUTC (path[i]);
>>>> diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
>>>> index 9ad6d22..4525429 100644
>>>> --- a/sysdeps/unix/sysv/linux/Makefile
>>>> +++ b/sysdeps/unix/sysv/linux/Makefile
>>>> @@ -176,7 +176,7 @@ endif
>>>>
>>>> ifeq ($(subdir),elf)
>>>> sysdep-rtld-routines += dl-brk dl-sbrk dl-getcwd dl-openat64 dl-opendir \
>>>> - dl-fxstatat64
>>>> + dl-fxstatat64 dl-lxstat64
>>>>
>>>> CPPFLAGS-lddlibc4 += -DNOT_IN_libc
>>>>
>>>> diff --git a/sysdeps/unix/sysv/linux/dl-lxstat64.c b/sysdeps/unix/sysv/linux/dl-lxstat64.c
>>>> new file mode 100644
>>>> index 0000000..63e6800
>>>> --- /dev/null
>>>> +++ b/sysdeps/unix/sysv/linux/dl-lxstat64.c
>>>> @@ -0,0 +1 @@
>>>> +#include <lxstat64.c>
>>>> diff --git a/sysdeps/unix/sysv/linux/dl-realpath.c b/sysdeps/unix/sysv/linux/dl-realpath.c
>>>> new file mode 100644
>>>> index 0000000..91652dc
>>>> --- /dev/null
>>>> +++ b/sysdeps/unix/sysv/linux/dl-realpath.c
>>>> @@ -0,0 +1,121 @@
>>>> +/* Dynamic loader version of realpath for linux.
>>>> + Copyright (C) 2014 Free Software Foundation, Inc.
>>>> + This file is part of the GNU C Library.
>>>> +
>>>> + The GNU C Library is free software; you can redistribute it and/or
>>>> + modify it under the terms of the GNU Lesser General Public
>>>> + License as published by the Free Software Foundation; either
>>>> + version 2.1 of the License, or (at your option) any later version.
>>>> +
>>>> + The GNU C Library is distributed in the hope that it will be useful,
>>>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>>>> + Lesser General Public License for more details.
>>>> +
>>>> + You should have received a copy of the GNU Lesser General Public
>>>> + License along with the GNU C Library; if not, see
>>>> + <http://www.gnu.org/licenses/>. */
>>>> +
>>>> +#include <stdlib.h>
>>>> +#include <unistd.h>
>>>> +#include <limits.h>
>>>> +#include <ldsodefs.h>
>>>> +#include <sys/stat.h>
>>>> +#include <kernel-features.h>
>>>> +#include <sysdep.h>
>>>> +
>>>> +/* Simplified version of realpath which extends elf/dl-realpath.c to use
>>>> + * linux syscalls for handling symlinks */
>>>> +char * internal_function
>>>> +__realpath (const char *name, char *rpath)
>>>> +{
>>>> + char *dest;
>>>> + char extra_buf[PATH_MAX];
>>>> + const char *start, *end;
>>>> + int num_links = 0;
>>>> +
>>>> + if (name[0] != '/')
>>>> + {
>>>> + if (!__getcwd (rpath, PATH_MAX))
>>>> + return NULL;
>>>> + dest = __rawmemchr (rpath, '\0');
>>>> + }
>>>> + else
>>>> + {
>>>> + rpath[0] = '/';
>>>> + dest = rpath + 1;
>>>> + }
>>>> +
>>>> + for (start = end = name; *start; start = end)
>>>> + {
>>>> + struct stat64 st;
>>>> + int n;
>>>> + /* Skip sequence of multiple path-separators. */
>>>> + while (*start == '/')
>>>> + ++start;
>>>> +
>>>> + /* Find end of path component. */
>>>> + for (end = start; *end && *end != '/'; ++end)
>>>> + /* Nothing. */;
>>>> +
>>>> + if (end - start == 0)
>>>> + break;
>>>> + else if (end - start == 1 && start[0] == '.')
>>>> + /* nothing */;
>>>> + else if (end - start == 2 && start[0] == '.' && start[1] == '.')
>>>> + {
>>>> + /* Back up to previous component, ignore if at root already. */
>>>> + if (dest > rpath + 1)
>>>> + while ((--dest)[-1] != '/');
>>>> + }
>>>> + else
>>>> + {
>>>> + if (dest[-1] != '/')
>>>> + *dest++ = '/';
>>>> +
>>>> + if (dest + (end - start) >= rpath + PATH_MAX)
>>>> + return NULL;
>>>> +
>>>> + dest = __mempcpy (dest, start, end - start);
>>>> + *dest = '\0';
>>>> + if (__lxstat64 (_STAT_VER, rpath, &st) < 0)
>>>> + return NULL;
>>>> +
>>>> + if (S_ISLNK (st.st_mode))
>>>> + {
>>>> + char buf[PATH_MAX];
>>>> + size_t len;
>>>> +
>>>> + INTERNAL_SYSCALL_DECL (err);
>>>> +
>>>> + n = INTERNAL_SYSCALL (readlink, err, 3, rpath, buf,
>>>> + PATH_MAX - 1);
>>>> +
>>>> + if (n <= 0 || buf[0] == '[')
>>>> + return NULL;
>>>> + buf[n] = '\0';
>>>> +
>>>> + len = strlen (end);
>>>> + if ((long int) (n + len) >= PATH_MAX)
>>>> + return NULL;
>>>> +
>>>> + /* Careful here, end may be a pointer into extra_buf... */
>>>> + memmove (&extra_buf[n], end, len + 1);
>>>> + name = end = memcpy (extra_buf, buf, n);
>>>> +
>>>> + if (buf[0] == '/')
>>>> + dest = rpath + 1; /* Absolute symlink */
>>>> + else
>>>> + /* Back up to previous component, ignore if at root already: */
>>>> + if (dest > rpath + 1)
>>>> + while ((--dest)[-1] != '/');
>>>> + }
>>>> + }
>>>> + }
>>>> + if (dest > rpath + 1 && dest[-1] == '/')
>>>> + --dest;
>>>> + *dest = '\0';
>>>> +
>>>> + return rpath;
>>>> +
>>>> +}
>>>> --
>>>> 2.0.4
>>>>
>>>
>>> ping.