This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Evolution of ELF symbol management


On 10/25/2016 05:37 PM, Joseph Myers wrote:
On Tue, 25 Oct 2016, Florian Weimer wrote:

There are a few existing __libc_foo exports at public symbol versions.  Do
all those satisfy the rule that where both foo and __libc_foo exist, the
latest version of foo and the latest version of __libc_foo are aliases or
otherwise have the same semantics?  (It would seem very confusing for old
and new __libc_* symbols to follow different rules in that regard.)

I found a few symbols which differ in the exported version.  The unprefixed
symbol has a regular version, and the prefixed one appears as GLIBC_PRIVATE.
These are:

clntudp_bufcreate
fork
longjmp
pread
pwrite
secure_getenv
siglongjmp
system
vfork

My concern is mainly about __libc_* symbols at public versions, not
GLIBC_PRIVATE, since we can freely change the ABIs for __libc_* at
GLIBC_PRIVATE if those are confusing.

I put together the attached Python script to check for collisions. It reports anything that is not UNDEF or GLIBC_PRIVATE and where the __libc_-prefixed and non-prefixed symbols have different values. It reports some mismatches in libasan, so I think it works. It does not flag anything for glibc on i386 and x86_64 with current master.

The script has a hard-coded path to elfutils readelf; it needs Mark's recent addition of the --symbols=SECTION argument.

So I think we are good on this front.
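
To make the check concrete, here is a minimal sketch of the same comparison run on made-up data instead of readelf output. The sample names and addresses are invented for illustration and are not taken from a real libc.so; the helper names parse_name and find_mismatches are likewise hypothetical and only mirror what the attached script does.

```python
import collections

Symbol = collections.namedtuple("Symbol", "name version value")

def parse_name(raw):
    """Split a readelf-style name into (name, version, is_default)."""
    if "@@" in raw:
        name, version = raw.split("@@", 1)
        return name, version, True
    if "@" in raw:
        name, version = raw.split("@", 1)
        return name, version, False
    return raw, None, False

def find_mismatches(symbols):
    """Return (foo, __libc_foo) pairs whose addresses differ."""
    # GLIBC_PRIVATE symbols are ignored, as in the attached script.
    public = [s for s in symbols if s.version != "GLIBC_PRIVATE"]
    prefixed = {s.name: s for s in public if s.name.startswith("__libc_")}
    mismatches = []
    for sym in public:
        other = prefixed.get("__libc_" + sym.name)
        if other is not None and other.value != sym.value:
            mismatches.append((sym, other))
    return mismatches

# Made-up sample entries (raw name, address); not from a real DSO.
SAMPLE = [
    ("fork@@GLIBC_2.2.5", 0x1000),
    ("__libc_fork@@GLIBC_PRIVATE", 0x1000),
    ("demo@@GLIBC_2.2.5", 0x2000),
    ("__libc_demo@@GLIBC_2.2.5", 0x2040),
    ("alias@@GLIBC_2.2.5", 0x3000),
    ("__libc_alias@@GLIBC_2.2.5", 0x3000),
]

symbols = []
for raw, value in SAMPLE:
    name, version, _default = parse_name(raw)
    symbols.append(Symbol(name, version, value))

for sym, other in find_mismatches(symbols):
    print(sym.name, other.name, hex(sym.value), hex(other.value))
```

Only the demo/__libc_demo pair is reported here: the fork pair is skipped because the prefixed symbol is at GLIBC_PRIVATE, and the alias pair shares one address.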

This is less relevant for functions in non-standard headers (which
applications would not include accidentally), but if we add something to
<stdio.h> (under _GNU_SOURCE) which is ripe for collisions, we need to somehow
make sure that a user-defined function of the same name does not end up
interposing the alias.

Same name and type, that is; if the type is wrong and the header
declaration is visible, a compile-time error will occur.

Good point. We could add artificial transparent unions to arguments to make it harder to write a matching definition, even with current GCC versions.

There is always the option of having the installed headers be generated
files, so the source tree has .h.in files that contain some sort of
annotations for use by a special glibc-specific preprocessor that does
things the C preprocessor cannot - converting something that looks like a
C macro call (say) into function declarations, __REDIRECT calls - and
function-like macro definitions.  Of course then you need to get those
headers generated at an early stage in the glibc build.

Interesting idea.

We could generate a different set of such headers for internal glibc use if required. This could allow us to compile more parts of glibc as standard C sources, without mangling of public symbols, which would help with things like unit testing and fuzz testing.
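
As a rough illustration of the generated-header idea, the sketch below expands an annotation in a .h.in file into a __REDIRECT declaration. The GLIBC_FUNCTION annotation syntax, the function name frobnicate, and the helpers expand_line and generate_header are all hypothetical, invented for this example; only the __REDIRECT (name, proto, alias) form is taken from how <sys/cdefs.h> is used in existing headers. A real generator would need to handle far more cases than a one-line regular expression can.

```python
import re

# Hypothetical annotation syntax for a .h.in file (not an existing glibc
# convention): GLIBC_FUNCTION (return-type, name, (parameters))
RE_ANNOTATION = re.compile(
    r"^GLIBC_FUNCTION \((?P<ret>[^,]+), (?P<name>\w+), (?P<proto>\(.*\))\)$")

def expand_line(line, internal_prefix="__libc_"):
    """Turn one annotation line into a __REDIRECT declaration.

    Lines that are not annotations are returned unchanged.
    """
    match = RE_ANNOTATION.match(line.strip())
    if match is None:
        return line
    return "extern {ret} __REDIRECT ({name}, {proto}, {prefix}{name});".format(
        ret=match.group("ret").strip(),
        name=match.group("name"),
        proto=match.group("proto"),
        prefix=internal_prefix)

def generate_header(template_text):
    """Expand every annotation in the text of a .h.in template."""
    lines = template_text.splitlines()
    return "\n".join(expand_line(line) for line in lines) + "\n"

# Made-up template contents; frobnicate is not a real glibc function.
TEMPLATE = """#include <sys/cdefs.h>
GLIBC_FUNCTION (int, frobnicate, (const char *__s, int __flags))
"""

print(generate_header(TEMPLATE), end="")
```

With such a redirect in the installed header, calls to frobnicate in code that includes the header bind to the __libc_ name at link time, so a user-defined frobnicate in another translation unit does not interpose them.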

I think this leads to the question whether we should prefer __ over __libc_
after all because as part of fixing the glibc-internal linknamespace issues,
we often added a __ symbol with a public version (but sometimes a

We shouldn't have added them with public versions, just internally (and
only at GLIBC_PRIVATE if needed by a separate library from the
definition).

It's a bit too late for that, unfortunately.

Florian

#!/usr/bin/python

import collections
import re
import subprocess
import sys

Version = collections.namedtuple("Version", "name default")
Symbol = collections.namedtuple("Symbol", "name version value")

RE_SYMBOL_LINE = re.compile(r"^\d+: ")
RE_SPACES = re.compile(r"\s+")

READELF = "/home/fweimer/src/ext/elfutils/e/src/readelf"

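def get_symbols(path):
    """Yield a Symbol for each defined, non-local .dynsym entry of PATH."""
    # universal_newlines=True makes readelf output text on Python 3 as well.
    p = subprocess.Popen([READELF, "--symbols=.dynsym", "--", path],
                         stdout=subprocess.PIPE, universal_newlines=True)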
    for line in p.stdout.readlines():
        line = line.strip()
        if RE_SYMBOL_LINE.match(line):
            split_line = RE_SPACES.split(line)
            # Num(0) Value(1) Size(2) Type(3) Bind(4) Vis(5) Ndx(6) Name(7)
            if len(split_line) < 8:
                continue
            value = int(split_line[1], 16)
            binding = split_line[4]
            ndx = split_line[6]
            name = split_line[7]
            
            if ndx == 'UNDEF':
                continue
            if binding == 'LOCAL':
                continue
            default = False
            if "@@" in name:
                default = True
                name, version = name.split("@@")
            elif "@" in name:
                name, version = name.split("@")
            else:
                version = None
            if version is None:
                yield Symbol(name, None, value)
            else:
                yield Symbol(name, Version(version, default), value)
    if p.wait() != 0:
        raise IOError(
            "readelf failed with exit status {}".format(p.returncode))

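def check_file(path):
    """Print foo/__libc_foo pairs in PATH whose symbol values differ."""
    with open(path, "rb") as dso:
        # Skip anything that is not an ELF file.  The bytes literal also
        # compares correctly on Python 3.
        if dso.read(4) != b"\177ELF":
            return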
    
    libc_prefix = {}
    no_prefix = {}
    for sym in get_symbols(path):
        if sym.version and sym.version.name == "GLIBC_PRIVATE":
            continue
        if sym.name.startswith("__libc_"):
            libc_prefix[sym.name] = sym
        else:
            no_prefix[sym.name] = sym

    for np_sym in no_prefix.values():
        libc_sym = libc_prefix.get("__libc_" + np_sym.name, None)
        if libc_sym is None:
            continue
        if libc_sym.value != np_sym.value:
            print("{} {} {}".format(path, np_sym, libc_sym))

for path in sys.argv[1:]:
    check_file(path)
