This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Understanding why getaddrinfo_a and __gai_enqueue_request use recursive locks


On Tue, Sep 26, 2017 at 2:06 AM, Will Hawkins <whh8b@virginia.edu> wrote:
> On Tue, Sep 26, 2017 at 12:42 AM, Carlos O'Donell <carlos@redhat.com> wrote:
>> On 09/25/2017 05:00 PM, Will Hawkins wrote:
>>> On Mon, Sep 25, 2017 at 12:13 PM, Carlos O'Donell <carlos@redhat.com> wrote:
>>>> On 09/23/2017 08:44 AM, Will Hawkins wrote:
>>>>> This is what confused me -- thank you for the confirmation that I was
>>>>> not going crazy! After an hour of trying to figure out how the code
>>>>> did not deadlock every time, I finally looked at the
>>>>> initialization/definition of the mutex and saw that it was recursive.
>>>>
>>>> We call this a belt-and-suspenders approach, particularly if there were
>>>> multiple callers in the past that might or might no have taken the lock.
>>>>
>>>> In this case it looks like there is just one caller so the appropriate
>>>> patch would have to be:
>>>>
>>>> * Remove locking from helper function.
>>>> * Add comments to helper function explaining that lock X must be held
>>>>   before calling.
>>>> * Add comments in caller stating that lock must be taken before calling
>>>>   helper function.
>>>> * Look to see if have a reasonable test that exercises this code path,
>>>>   and if not add one (I think there are tests already that cover this
>>>>   area of code).
>>>>
>>>
>>> There appears to be a related test in the file ga_test.c in resolv/.
>>> However, I can't figure out how the check/tests target is building
>>> this file. In fact, I can't get the file to build.
>>
>> The file is not being built because it's not listed in one of the
>> test targets.
>>
>> It *could* be added to tests (normal tests), or tests-internal (tests
>> that need access to libc internals), or xtests (tests that need special
>> access, usually root).
>>
>> However, it's not clear that this test was ever designed to be run
>> in the glibc test harness. It looks like an example of how to use the
>> interfaces, but is not a test itself.
>
> Thank you for your reply!
>
> What are you saying is that, because this tests: target does not have
> any recipe associated with it, it becomes just another way to add a
> prerequisite for the tests: target whose recipe is, ultimately,
> defined in the "master" Makefile? An example of the recipe here:
> https://www.gnu.org/software/make/manual/html_node/Multiple-Rules.html
>
>>
>>> The target that seems to control building this file is guarded by two
>>> conditionals build-shared and have-thread-library. I have verified
>>> that those flags are both set in my build environment. With that
>>> control passed, it looks like it is supposed to only then use the
>>> ga_test as the only test target. This appears to be confimed by the
>>> comments in the Changelog (Changelog.old/Changelog.12:5501 and commit
>>> log (e10546cb67fe0b5366fede3661b6c15363e4f54f) for when that file/test
>>> was incorporated into the repository.
>>>
>>> The target, then, looks like this:
>>>
>>> $(objpfx)ga_test: $(objpfx)libanl.so $(shared-thread-library)
>>>
>>> which, in turn, ensures that libanl.so is built.
>>
>> Right. The tests target depends on this, which then depends on these
>> two files, and that overrides the normal build process for ga_test.c,
>> and just ensures that libanl.so and libpthread.so are built.
>>
>>> However, that target has no other actions. I would presume that
>>> somewhere there is a pattern matching target that will build this, but
>>> I'm too stupid to find it.
>>
>> No, you are not too stupid, you are correct, the file ga_test.c is never
>> built, and looks like it was never designed to be run as a test.
>>
>> The targets are bogus though.
>>
>> Can you test removing them and ga_test.c?
>>
>> diff --git a/resolv/Makefile b/resolv/Makefile
>> index ec7e4fd..8a822e0 100644
>> --- a/resolv/Makefile
>> +++ b/resolv/Makefile
>> @@ -101,10 +101,6 @@ routines                += $(libnss_dns-routines) $(libresolv-routines)
>>  static-only-routines    += $(libnss_dns-routines) $(libresolv-routines)
>>  endif
>>
>> -ifeq (yesyes,$(build-shared)$(have-thread-library))
>> -tests: $(objpfx)ga_test
>> -endif
>> -
>>  ifeq ($(run-built-tests),yes)
>>  ifneq (no,$(PERL))
>>  tests-special += $(objpfx)mtrace-tst-leaks.out
>> @@ -134,8 +130,6 @@ $(objpfx)libnss_dns.so: $(objpfx)libresolv.so
>>  # The asynchronous name lookup code needs the thread library.
>>  $(objpfx)libanl.so: $(shared-thread-library)
>>
>> -$(objpfx)ga_test: $(objpfx)libanl.so $(shared-thread-library)
>> -
>>  $(objpfx)tst-res_hconf_reorder: $(libdl) $(shared-thread-library)
>>  tst-res_hconf_reorder-ENV = RESOLV_REORDER=on
>>
>> ---
>>
>
> I am testing this as we speak but don't think that I will know the
> answer before I head to sleep tonight. I will respond as soon as my
> tests have completed their execution.

Testing of this 'patch' is complete. I applied the change then used
the instructions on
https://sourceware.org/glibc/wiki/Testing/Testsuite#Testing_just_one_test
to test. Here's how I did it:

>From the build directory, I did
rm -f resolv/*.out
make subdirs=resolv check

and everything ran the way I expected (the same number of test
failures and passes).

Thanks again for all your help!
Will


>
>> I think you could write a much better test than ga_test.c
>> to get better test coverage for your suggested lock changes.
>>
>> Then to add the test all you need to do is plunk a tst-foo.c into
>> resolv/ and add it to the tests variable.
>>
>> Example:
>>
>> diff --git a/resolv/Makefile b/resolv/Makefile
>> index ec7e4fd..48dc25e 100644
>> --- a/resolv/Makefile
>> +++ b/resolv/Makefile
>> @@ -53,6 +53,7 @@ tests += \
>>    tst-resolv-network \
>>    tst-resolv-res_init-multi \
>>    tst-resolv-search \
>> +  tst-foo
>>
>>  # These tests need libdl.
>>  ifeq (yes,$(build-shared))
>> ---
>>
>> That is enough to build and run tst-foo from tst-foo.c in the
>> resolv subdir, and you'd use the support/README-test.c as the
>> template.
>>
>
> As with everything that you've said so far, this makes sense. The
> ga_test.c code did not seem to be an effective test for this. I only
> went down this rabbit hold because it was the only "test" that I could
> find that seemed to be even remotely related to the getaddrinfo_a.
> When I realized that even that was not being built, I started chasing
> my tail trying to answer the questions above.
>
> As before, thank you!
>
> Will
>
>> --
>> Cheers,
>> Carlos.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]