This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug dynamic-link/17645] RFE: Improve performance of dynamic loader for deeply nested DSO dependencies.


https://sourceware.org/bugzilla/show_bug.cgi?id=17645

--- Comment #5 from Paulo Andrade <paulo.cesar.pereira.de.andrade at gmail dot com> ---
(In reply to Carlos O'Donell from comment #4)
> (In reply to Paulo Andrade from comment #1)
> > Created attachment 7972 [details]
> > Proposed patch
> > 
> > This patch implements a simple stable topological sort that moves
> > an entry at most once, by keeping track of, and resetting the weight
> > of the entries after every move. It breaks loops by choosing the
> > ones that appear first with the lowest weight, that is, less
> > dependencies.
> > If there is a loop, the "c" variable in the "sort" function will
> > be larger than 0.
> 
> At the high-level your patch is almost ready.
> 
> You refactor the sorting code out into a static function that the compiler
> can inline and you use the same pattern in all places where sorting happens.
> This is good.

  The code is small enough that this way (static functions) should be
better. But it also cannot be really shared, without extra parameters,
as there are 3 slightly different variants of the loop.

> You are missing two important things:
> 
> * More code comments. You should comment each sort routine with a high-level
> comment about what you're doing and why. This is particularly important for
> the weight functions.

  Ok. I will work on that. A quick explanation is somewhat like this:
"""
  The algorithm implements a simple stable topological sort.

  The algorithm is expected to be fast and break cycles in a reasonable
way. If there are no cycles, it does a plain topological sort, that
does not reorder the input, that is, it is stable.

  Every link_map instance has a l_idx field, that in special cases,
currently only when a DSO is being unloaded, can be less than zero.
If the l_idx is not less than zero, and not in the dl_sort_fini call
that already initializes the l_idx entries, the sort function initializes
the l_idx field of every link_map pointer to an unique identifier.

  The sort is done in-place, by decrementing the insert position (the
"k" variable) in the maps vector. Sorting is done by swapping an entry
with the current insert position, and decrementing the insert position.

  The "sort" function allocates in the stack the "seen" vector, that
works as a boolean flag for every link_map, to know if it has been
already moved. Moving, actually swapping, is done at most once, and not
done if it is already in the proper position.

  The "sort" function also allocates in stack the "w" vector, that
is a counter of how many, not self and not already "seen" direct
references exist to a link_map entry.

  The "weight" function resets the "w" entries at every call. Note
that other than the first call, every call reduces the number of
entries by one, as the last moved link_map pointer does not need
to be verified again.

  In the "weight" function, once a link_map entry l_initfini vector
does not find any dependency, either the entry is a self dependency,
the object is being unloaded or the dependency has been moved
(in which case a cyclic dependency was just broken), the "w" entry
for it will be 0.

  Note that the "w" vector is indexed by position in the input/output
maps vector, while the "seen" vector is indexed by the "l_idx" field
of the "struct link_map" representing the DSO.

  Every loop, after calling the "weight" function, the "sort" searches
all "w" entries for the first entry matching the "c" variable. The "c"
variable is initialized as zero, and reset to zero after an entry
is moved. If it scans all "w" entries and none is equal to "c", it
increments "c"; if "c" is larger than zero it means there is a loop,
in which case it chooses the first one with the lowest weight.

  Once an entry is moved, it is marked as "seen", and if it was
part of a cycle, the cycle is now broken, as it will no longer
influence its direct dependencies.

  Recognized possible problems:
o The algorithm is not expected to scale well beyond a few thousand
  entries, due to the need to reinitialize the "w" vector.
o If there are multiple "solutions", it will frequently not match
  sorting done by the previous algorithm. But, it guarantees an
  stable sorting, that is, does not change input order.
o It does not attempt to be smart when finding loops, a better solution
  probably would be to break the smallest loop first, when there are
  multiple entries with the same weight, instead of choosing the first
  entry.
"""

> * Tests. The existing set of tests are insufficient to cover the changes you
> are proposing. While it seems unfair, and I understand that, this kind of
> change has not been undertaken because of the lack of testing. You need to
> add more tests. The *best* testing is actually a test which
> deterministically generates a sequence of permutations of linked objects and
> verifies they are sorted correctly by the algorithms in question. This way
> we can show that the sorting works for a large set of object dependency
> orderings. When we get a bug report, and we will, we can enlarge the set of
> automated testing to include the case the bug report claims and see what
> went wrong.

  I will work on test cases. So far my tests were glibc "make check" and
running rawhide and rhel-7 with the patch. I am particularly interested
in knowing if there is a test case for this commit:
https://sourceware.org/git/?p=glibc.git;a=blame;f=elf/dl-fini.c;h=c35577565eb66ac241365c05efe7c7fb791e0df2#l102
line 102, because it is another special variant of the loop, and what I
did to address it is to loop over the entries, and increase the "w"
vector of the dependencies, to ensure the dependencies are not moved
before the entry, but on very complex cycles, and if having conflicting
rules, it cannot guarantee it will really " preserve a link time dependency".
But I suspect the same is true for the current algorithm; worse if the
link time dependencies generate conflicting dependencies in cycles.

> If you can cover those two things, then you're ready to post the patch to
> libc-alpha for more formal review.

  Ok. I followed wiki instructions and sent the first version to
libc-alpha, but will now wait for a more concrete situation based
on bugzilla.

> This looks really good though and you're making great progress in a
> difficult part of the dynamic loader.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]