This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.


Re: ELF linking question related to symbol collisions


On 11/20/2013 10:13 PM, Carlos O'Donell wrote:
> On Wed, Nov 20, 2013 at 8:22 AM, Florian Weimer <fweimer@redhat.com> wrote:
>> I've got a program which links (indirectly) to two DSOs which define the
>> same function.  Is it guaranteed that ld.so resolves a symbol reference to
>> the topologically closest definition (from its own dependency graph), or
>> will ld.so pick a definition more or less at random?
>
> To be clear:
>
> Program -> lib1.so -> lib1a.so (defines foo)
>             \--> lib2.so -> lib2a.so (defines foo)
>
> Call sequence is: Program->lib1.so (some function)-> foo (which foo?)
>
> Program was built with `-l1 -l2' (very important because it sequences DT_NEEDED)

Thanks for your explanation.

> In this case the topological sort results in the following flat
> sequence (on x86-64):
> /lib64/ld-linux-x86-64.so.2
> /lib64/libc.so.6
> ./lib1a.so
> ./lib2a.so
> ./lib1.so
> ./lib2.so
>
> Thus the answer to "which foo?" is "lib1a.so's foo."

And lib2.so will get the same foo?  Ugh.

> It's the closest definition from the *program* not ld.so, but since
> ld.so is always the first dependency then it can be correct to say
> this also.

I was hoping that ld.so picks the closest definition from the referencing library, so that lib1.so would get the definition from lib1a.so, and lib2.so would end up with the one from lib2a.so. That would scale a little bit better despite the lack of global namespace management.
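
To make the difference concrete, here is a small Python model of the two behaviours. It is only a sketch and not how ld.so is implemented: it assumes the global lookup scope is a breadth-first walk of the DT_NEEDED lists starting at the program, and it omits ld.so and libc. The names NEEDED, DEFINES, resolve_global and resolve_local are made up for this illustration.

```python
from collections import deque

# Hypothetical dependency graph mirroring the example in this thread.
# Each list is ordered like the object's DT_NEEDED entries.
NEEDED = {
    "Program": ["lib1.so", "lib2.so"],
    "lib1.so": ["lib1a.so"],
    "lib2.so": ["lib2a.so"],
    "lib1a.so": [],
    "lib2a.so": [],
}
# Symbols each object defines (illustrative data only).
DEFINES = {"lib1a.so": {"foo"}, "lib2a.so": {"foo"}}


def breadth_first_scope(root, needed):
    """Objects reachable from root, in breadth-first DT_NEEDED order."""
    order, seen, queue = [], {root}, deque([root])
    while queue:
        obj = queue.popleft()
        order.append(obj)
        for dep in needed.get(obj, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return order


def resolve(symbol, scope, defines):
    """First object in scope that defines symbol, or None."""
    for obj in scope:
        if symbol in defines.get(obj, ()):
            return obj
    return None


def resolve_global(symbol, needed, defines, program="Program"):
    # Assumed model of the behaviour described above: one scope rooted
    # at the program, shared by every referencing object.
    return resolve(symbol, breadth_first_scope(program, needed), defines)


def resolve_local(symbol, referrer, needed, defines):
    # The alternative I was hoping for: the search starts at the
    # library that contains the reference.
    return resolve(symbol, breadth_first_scope(referrer, needed), defines)


if __name__ == "__main__":
    for lib in ("lib1.so", "lib2.so"):
        print(lib,
              "global:", resolve_global("foo", NEEDED, DEFINES),
              "local:", resolve_local("foo", lib, NEEDED, DEFINES))
```

In this model both lib1.so and lib2.so bind foo to lib1a.so under the global scope, while the per-library variant would give lib2.so the definition from lib2a.so.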

The backstory on my question is this. I mistook an embedded copy of the json-glib library for a copy of json-c, a totally different library which also uses the json_object_ prefix for some of its functions. It turns out that there is just one colliding symbol, json_object_get_type.

So I set out to find programs (f4) which link to both json-c (f1) and json-glib (f2), and also link to something (f3) that references the json_object_get_type function.

SELECT DISTINCT f4.name AS toplevel, f3.name AS json_object_get_type
  FROM symboldb.file f1
  JOIN symboldb.elf_closure ec1 ON f1.file_id = ec1.needed
  CROSS JOIN symboldb.file f2
  JOIN symboldb.elf_closure ec2
    ON f2.file_id = ec2.needed AND ec1.file_id = ec2.file_id
  JOIN symboldb.elf_closure ec3 ON ec3.file_id = ec2.file_id
  JOIN symboldb.file f3
    ON ec3.file_id = f3.file_id OR ec3.needed = f3.file_id
  JOIN symboldb.elf_reference er ON f3.contents_id = er.contents_id
  JOIN symboldb.file f4 ON ec3.file_id = f4.file_id
  JOIN symboldb.package p ON f4.package_id = p.package_id
  JOIN symboldb.package_set_member psm ON p.package_id = psm.package_id
  WHERE f1.name = '/usr/lib64/libjson-c.so.2.0.1'
  AND f2.name = '/usr/lib64/libjson-glib-1.0.so.0.1600.0'
  AND er.name = 'json_object_get_type'
  AND psm.set_id = symboldb.package_set('Fedora/19/x86_64');

I'm not sure how well the table will be preserved, but here it is:

                  toplevel                  |     json_object_get_type
--------------------------------------------+-------------------------------
 /usr/bin/gnome-control-center              | /usr/lib64/libpulse.so.0.15.3
 /usr/lib64/gnome-shell/libgnome-shell.so   | /usr/lib64/libpulse.so.0.15.3
 /usr/lib64/empathy/libempathy-gtk-3.8.4.so | /usr/lib64/libpulse.so.0.15.3
 /usr/lib64/cinnamon/libcinnamon.so         | /usr/lib64/libpulse.so.0.15.3
 /usr/bin/gnome-shell                       | /usr/lib64/libpulse.so.0.15.3
 /usr/libexec/empathy-auth-client           | /usr/lib64/libpulse.so.0.15.3
 /usr/bin/empathy-accounts                  | /usr/lib64/libpulse.so.0.15.3
 /usr/bin/cinnamon                          | /usr/lib64/libpulse.so.0.15.3
 /usr/libexec/empathy-call                  | /usr/lib64/libpulse.so.0.15.3
 /usr/bin/empathy-debugger                  | /usr/lib64/libpulse.so.0.15.3
 /usr/bin/gnome-boxes                       | /usr/lib64/libpulse.so.0.15.3
 /usr/bin/empathy                           | /usr/lib64/libpulse.so.0.15.3
 /usr/libexec/empathy-chat                  | /usr/lib64/libpulse.so.0.15.3
(13 rows)

So we have several desktop applications that have an ambiguous reference to json_object_get_type, via the pulseaudio library.

The trouble is that this is fairly difficult to detect. Static analysis misses collisions introduced by dlopen and dlsym.

> Warning! The answer changes if you link with `-l2 -l1' because it changes
> the DT_NEEDED ordering which changes the order in which the graph
> is traversed.

Thanks for the warning. I should record the order of DT_NEEDED elements in the database. I also need to reflect this in the elf_closure table in some way, to model the ld.so behavior more accurately.

If I understand things correctly, this unpredictability means that symbol collisions are always bad, even if they are working at present, because a change in the dependency graph could interpose a different definition in the future.
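
One way to flag such fragile references could look like the following Python sketch. It rests on the same simplifying assumption as before (a breadth-first walk of ordered DT_NEEDED lists, which is a model and not ld.so's code), and it only permutes the program's own DT_NEEDED list. The function names and the sample data, including the extra symbol bar, are invented for illustration.

```python
from collections import deque
from itertools import permutations


def lookup_scope(root, needed):
    """Breadth-first walk over ordered DT_NEEDED lists (simplified model)."""
    order, seen, queue = [], {root}, deque([root])
    while queue:
        obj = queue.popleft()
        order.append(obj)
        for dep in needed.get(obj, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return order


def winners_by_link_order(symbol, root, needed, defines):
    """Map each ordering of root's DT_NEEDED list to the winning object."""
    results = {}
    for order in permutations(needed[root]):
        trial = dict(needed)
        trial[root] = list(order)  # e.g. `-l1 -l2' versus `-l2 -l1'
        scope = lookup_scope(root, trial)
        results[order] = next(
            (obj for obj in scope if symbol in defines.get(obj, set())), None)
    return results


def fragile_symbols(root, needed, defines):
    """Symbols whose winning definition depends on root's link order."""
    scope = lookup_scope(root, needed)
    symbols = set().union(*(defines.get(obj, set()) for obj in scope))
    fragile = {}
    for sym in sorted(symbols):
        winners = winners_by_link_order(sym, root, needed, defines)
        if len(set(winners.values())) > 1:
            fragile[sym] = winners
    return fragile


# Hypothetical data: foo collides, bar has a single definition.
needed = {
    "Program": ["lib1.so", "lib2.so"],
    "lib1.so": ["lib1a.so"],
    "lib2.so": ["lib2a.so"],
}
defines = {"lib1a.so": {"foo", "bar"}, "lib2a.so": {"foo"}}

if __name__ == "__main__":
    for sym, winners in fragile_symbols("Program", needed, defines).items():
        for order, winner in winners.items():
            print(sym, " ".join(order), "->", winner)
```

Trying every permutation is factorial in the number of direct dependencies, so this is only practical for small DT_NEEDED lists. A real check would also have to account for dlopen, symbol versioning and visibility, none of which this model covers.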

> Beware that if you introduce cycles the problem becomes non-deterministic
> and depends on where you break the cycle. We have several bugs open at
> the moment against glibc to make the cycle breaking deterministic and to
> enable testing for millions of permutations of N cycles to double check
> that the present code does the right thing. We'd appreciate any contributions
> in this area as the dynamic linker code is in my opinion in need of refactoring
> and simplification to enable future development.

I'm a bit worried that we're facing scalability issues at the distribution level because we lack proper namespace management. I've seen that a number of libraries do not export internal symbols, which is good, but I think we either need some global namespace management (which will be difficult for APIs with fairly general names, such as MAPI), a different linking algorithm which provides better encapsulation (reducing accidental symbol interposition), or a more aggressive move towards symbol versioning as a namespace management tool. Or maybe something else altogether. This is a really complicated topic, and I don't think it can be considered in isolation from ld.so performance improvements.

This is not mere speculation: we have already had symbol collisions that impacted customers.

--
Florian Weimer / Red Hat Product Security Team

