This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: Evolution of ELF symbol management
- From: Florian Weimer <fweimer at redhat dot com>
- To: Joseph Myers <joseph at codesourcery dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Mon, 21 Nov 2016 16:35:36 +0100
- Subject: Re: Evolution of ELF symbol management
- Authentication-results: sourceware.org; auth=none
- References: <9727f95a-df3d-ec11-8c1d-9b7ea6cbcaac@redhat.com> <alpine.DEB.2.20.1610181558590.22314@digraph.polyomino.org.uk> <2e86a3a6-3ad3-6834-4c6c-64836a956dbd@redhat.com> <alpine.DEB.2.20.1610251523320.4454@digraph.polyomino.org.uk>
On 10/25/2016 05:37 PM, Joseph Myers wrote:
> On Tue, 25 Oct 2016, Florian Weimer wrote:
>
>>> There are a few existing __libc_foo exports at public symbol versions. Do
>>> all those satisfy the rule that where both foo and __libc_foo exist, the
>>> latest version of foo and the latest version of __libc_foo are aliases or
>>> otherwise have the same semantics? (It would seem very confusing for old
>>> and new __libc_* symbols to follow different rules in that regard.)
>>
>> I found a few symbols that differ in the exported version. The unprefixed
>> symbol has a regular version, and the prefixed one appears as GLIBC_PRIVATE.
>> These are:
>>
>>   clntudp_bufcreate
>>   fork
>>   longjmp
>>   pread
>>   pwrite
>>   secure_getenv
>>   siglongjmp
>>   system
>>   vfork
>
> My concern is mainly about __libc_* symbols at public versions, not
> GLIBC_PRIVATE, since we can freely change the ABIs for __libc_* at
> GLIBC_PRIVATE if those are confusing.

I put together the attached Python script to check for collisions. It
reports anything that is not UNDEF or GLIBC_PRIVATE and where the
__libc_-prefixed and non-prefixed symbols have different values. It
reports some mismatches in libasan, so I think it works. It does not
flag anything for glibc on i386 and x86_64 with current master.
The script has a hard-coded path to elfutils readelf; it needs Mark's
recent addition of the --symbols=SECTION argument.
So I think we are good on this front.
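
The comparison the script performs can be sketched on in-memory data, without calling readelf. This is only an illustration of the rule described above: the symbol names, versions, and addresses in SAMPLE are made up, and find_mismatches is a hypothetical helper, not part of the attached script.

```python
import collections

Symbol = collections.namedtuple("Symbol", "name version value")


def find_mismatches(symbols):
    """Return (plain, prefixed) pairs whose symbol values differ.

    Symbols at GLIBC_PRIVATE are skipped, mirroring the rule that only
    public versions matter for this check.
    """
    libc_prefix = {}
    no_prefix = {}
    for sym in symbols:
        if sym.version == "GLIBC_PRIVATE":
            continue
        if sym.name.startswith("__libc_"):
            libc_prefix[sym.name] = sym
        else:
            no_prefix[sym.name] = sym

    mismatches = []
    for plain in no_prefix.values():
        prefixed = libc_prefix.get("__libc_" + plain.name)
        if prefixed is not None and prefixed.value != plain.value:
            mismatches.append((plain, prefixed))
    return mismatches


# Made-up symbol table for illustration only.
SAMPLE = [
    Symbol("foo", "GLIBC_2.2.5", 0x1000),
    Symbol("__libc_foo", "GLIBC_2.2.5", 0x1000),    # alias: same value
    Symbol("bar", "GLIBC_2.2.5", 0x2000),
    Symbol("__libc_bar", "GLIBC_2.2.5", 0x2040),    # differs: reported
    Symbol("baz", "GLIBC_2.2.5", 0x3000),
    Symbol("__libc_baz", "GLIBC_PRIVATE", 0x3080),  # private: ignored
]

if __name__ == "__main__":
    for plain, prefixed in find_mismatches(SAMPLE):
        print(plain, prefixed)
```

Only the bar/__libc_bar pair is reported here: foo and __libc_foo share a value, and __libc_baz is filtered out before the comparison because it is at GLIBC_PRIVATE.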
>> This is less relevant for functions in non-standard headers (which
>> applications would not include accidentally), but if we add something to
>> <stdio.h> (under _GNU_SOURCE) which is ripe for collisions, we need to somehow
>> make sure that a user-defined function of the same name does not end up
>> interposing the alias.
>
> Same name and type, that is; if the type is wrong and the header
> declaration is visible, a compile-time error will occur.

Good point. We could add artificial transparent unions to arguments to
make it harder to write a matching definition, even with current GCC
versions.

> There is always the option of having the installed headers be generated
> files, so the source tree has .h.in files that contain some sort of
> annotations for use by a special glibc-specific preprocessor that does
> things the C preprocessor cannot - converting something that looks like a
> C macro call (say) into function declarations, __REDIRECT calls - and
> function-like macro definitions. Of course then you need to get those
> headers generated at an early stage in the glibc build.

Interesting idea.
We could generate a different set of such headers for internal glibc use
if required. This could allow us to compile more parts of glibc as
standard C sources, without mangling of public symbols, which would help
with things like unit testing and fuzz testing.
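
A minimal sketch of what such a header generator might look like follows. Everything about the input format is an assumption: the GLIBC_FUNCTION annotation is a hypothetical syntax invented for this example, not an existing glibc convention, and redirecting to a __libc_-prefixed alias is just one of the naming options under discussion in this thread. The output uses the __REDIRECT macro in the style already found in installed glibc headers.

```python
import re

# Hypothetical annotation for a .h.in file (not an existing glibc format):
#   GLIBC_FUNCTION(return-type, name, (parameter-list))
ANNOTATION = re.compile(
    r"^GLIBC_FUNCTION\((?P<ret>[^,]+),\s*(?P<name>\w+),\s*"
    r"\((?P<params>.*)\)\)$")


def expand_line(line):
    """Expand one annotated line into a __REDIRECT declaration.

    Lines that do not match the hypothetical annotation are returned
    unchanged, so ordinary C text passes through untouched.
    """
    match = ANNOTATION.match(line.strip())
    if match is None:
        return line
    return "extern {ret} __REDIRECT ({name}, ({params}), __libc_{name});".format(
        ret=match.group("ret").strip(),
        name=match.group("name"),
        params=match.group("params").strip())


def generate_header(template_text):
    """Turn the text of a .h.in template into the text of the header."""
    return "\n".join(expand_line(line) for line in template_text.splitlines())


if __name__ == "__main__":
    template = "\n".join([
        "#include <features.h>",
        "GLIBC_FUNCTION(int, frobnicate, (int fd, const char *name))",
    ])
    print(generate_header(template))
```

A second generator mode for internal builds could emit plain declarations instead of __REDIRECT ones from the same template, which is the variation suggested above.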

>> I think this leads to the question whether we should prefer __ over __libc_
>> after all because as part of fixing the glibc-internal linknamespace issues,
>> we often added a __ symbol with a public version (but sometimes a
>
> We shouldn't have added them with public versions, just internally (and
> only at GLIBC_PRIVATE if needed by a separate library from the
> definition).

It's a bit too late for that, unfortunately.
Florian
#!/usr/bin/python

import collections
import re
import subprocess
import sys

Version = collections.namedtuple("Version", "name default")
Symbol = collections.namedtuple("Symbol", "name version value")

RE_SYMBOL_LINE = re.compile(r"^\d+: ")
RE_SPACES = re.compile(r"\s+")
READELF = "/home/fweimer/src/ext/elfutils/e/src/readelf"

def get_symbols(path):
    p = subprocess.Popen([READELF, "--symbols=.dynsym", "--", path],
                         stdout=subprocess.PIPE)
    for line in p.stdout.readlines():
        line = line.strip()
        if RE_SYMBOL_LINE.match(line):
            split_line = RE_SPACES.split(line)
            # Num(0) Value(1) Size(2) Type(3) Bind(4) Vis(5) Ndx(6) Name(7)
            if len(split_line) < 8:
                continue
            value = int(split_line[1], 16)
            binding = split_line[4]
            ndx = split_line[6]
            name = split_line[7]
            if ndx == 'UNDEF':
                continue
            if binding == 'LOCAL':
                continue
            default = False
            if "@@" in name:
                default = True
                name, version = name.split("@@")
            elif "@" in name:
                name, version = name.split("@")
            else:
                version = None
            if version is None:
                yield Symbol(name, None, value)
            else:
                yield Symbol(name, Version(version, default), value)
    if p.wait() != 0:
        raise IOError(
            "readelf failed with exit status {}".format(p.returncode))

def check_file(path):
    with open(path, "rb") as dso:
        if dso.read(4) != "\177ELF":
            return
    libc_prefix = {}
    no_prefix = {}
    for sym in get_symbols(path):
        if sym.version and sym.version.name == "GLIBC_PRIVATE":
            continue
        if sym.name.startswith("__libc_"):
            libc_prefix[sym.name] = sym
        else:
            no_prefix[sym.name] = sym
    for np_sym in no_prefix.values():
        libc_sym = libc_prefix.get("__libc_" + np_sym.name, None)
        if libc_sym is None:
            continue
        if libc_sym.value != np_sym.value:
            print path, np_sym, libc_sym

for path in sys.argv[1:]:
    check_file(path)
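
The name handling in get_symbols relies on the symbol-versioning notation in which "@@" marks the default version of a symbol and a single "@" marks a non-default one. As a small standalone sketch of that step (split_versioned_name is a hypothetical helper, and the example names and versions are illustrative rather than taken from a real library):

```python
def split_versioned_name(field):
    """Split a dynamic-symbol name field into (name, version, is_default).

    "name@@VERSION" is the default version, "name@VERSION" is a
    non-default version, and a bare name carries no version at all.
    """
    if "@@" in field:
        # Check "@@" first: a plain "@" test would also match it.
        name, version = field.split("@@", 1)
        return name, version, True
    if "@" in field:
        name, version = field.split("@", 1)
        return name, version, False
    return field, None, False


if __name__ == "__main__":
    # Illustrative inputs only.
    for field in ("foo@@GLIBC_2.2.5", "foo@GLIBC_2.0", "foo"):
        print(split_versioned_name(field))
```

Testing for "@@" before "@" matters because every default-version name also contains a single "@".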