This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.
Re: dead-lock in glibc
- From: Joël Krähemann <jkraehemann at gmail dot com>
- To: jkraehemann-guest at users dot alioth dot debian dot org
- Cc: "Carlos O'Donell" <carlos at systemhalted dot org>, "libc-help at sourceware dot org" <libc-help at sourceware dot org>, Torvald Riegel <triegel at redhat dot com>
- Date: Fri, 31 Mar 2017 23:07:11 +0200
- Subject: Re: dead-lock in glibc
- Authentication-results: sourceware.org; auth=none
- References: <CA+Owze40Onq_uZs2wOjY=O5Xv3D75Ce_b7Sf5qEjMZ-bAnW_wA@mail.gmail.com> <CAE2sS1gXkrLAZf2o54QSkE_fqFMrSd987nP=QYRe=GQEdq26_w@mail.gmail.com> <CA+Owze6vtqJ4jURD2H4fouw5izePVaQ9iun2LCLQ+HqwVvkvWw@mail.gmail.com> <CAE2sS1iF1ua0w9379zm-nMToTxQfVJfTxa78uMgs6z=LEqy5GA@mail.gmail.com> <CA+Owze6nxNpB+FWAQfu_6duy0wEA1n+K5mY1C6KAZEO1-dn4eQ@mail.gmail.com> <CA+Owze6h=O+dw9eCE7LauRozK6upbYLQsY=6_whrAGmh_-BDnw@mail.gmail.com>
- Reply-to: jkraehemann-guest at users dot alioth dot debian dot org
Hi
The mutex was handled incorrectly here: unlock() was called once, and then unlock() was called again.
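A minimal sketch of how a double unlock like this can be made visible, assuming a POSIX threads implementation such as glibc. The helper name double_unlock_result is hypothetical, and the sketch uses a plain error-checking mutex rather than the mutex setup in GSequencer, which is not shown in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper (not a glibc function): lock an error-checking
   mutex once, unlock it twice, and return the result of the second
   unlock. With a PTHREAD_MUTEX_NORMAL mutex the second unlock would be
   undefined behavior instead of a reported error. */
static int double_unlock_result(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int rc, second;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&mutex, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    rc = pthread_mutex_lock(&mutex);        /* we own the mutex */
    assert(rc == 0);
    rc = pthread_mutex_unlock(&mutex);      /* correct unlock */
    assert(rc == 0);
    second = pthread_mutex_unlock(&mutex);  /* the erroneous second unlock */

    pthread_mutex_destroy(&mutex);
    (void)rc;
    return second;
}
```

With an error-checking mutex the second unlock should return EPERM, because the calling thread no longer owns the mutex. Checking the return value of every unlock call is one way to catch this kind of mistake early.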
Bests,
Joël
On Fri, Mar 31, 2017 at 10:35 PM, Joël Krähemann <jkraehemann@gmail.com> wrote:
> Hi
>
> I just ran the test again, and it hung at a different time.
>
> Bests,
> Joël
>
>
> On Thu, Mar 16, 2017 at 7:30 AM, Joël Krähemann <jkraehemann@gmail.com> wrote:
>> Hi Carlos
>>
>> Thank you for the hints. If you need additional information please let me know.
>>
>> regards,
>> Joël
>>
>>
>> On Thu, Mar 16, 2017 at 2:54 AM, Carlos O'Donell
>> <carlos@systemhalted.org> wrote:
>>> On Wed, Mar 15, 2017 at 4:35 PM, Joël Krähemann <jkraehemann@gmail.com> wrote:
>>>> * libc6 2.24-9
>>>
>>>> Could it be that I was trying to do a recursive lock on a non-recursive mutex?
>>>> I was playing 64 beats with the notation editor of GSequencer in an infinite
>>>> loop. It suddenly aborted after approximately 3 to 4 minutes of playback.
>>>
>>> No. The asserts are intended to indicate internal consistency is violated.
>>>
>>> Recursively locking a non-recursive mutex should lead to the thread
>>> getting stuck forever, but not an assert.
>>>
>>>>>> gsequencer: ../nptl/pthread_mutex_lock.c:349:
>>>>>> __pthread_mutex_lock_full: Assertion `INTERNAL_SYSCALL_ERRNO (e,
>>>>>> __err) != EDEADLK || (kind != PTHREAD_MUTEX_ERRORCHECK_NP && kind !=
>>>>>> PTHREAD_MUTEX_RECURSIVE_NP)' failed.
>>>>>> Aborted
>>>
>>> We've had a failure in the futex syscall, but that should not by
>>> itself trigger an assert.
>>>
>>> The failure was either "no thread found" or "deadlock".
>>>
>>> The assert triggers when we get "deadlock" from the kernel but the
>>> mutex was error-checking or recursive. Internally we don't ever expect
>>> to get "deadlock" from the kernel for these kinds of mutexes, so
>>> getting it indicates an algorithmic problem.
>>>
>>> It's an algorithmic problem because earlier code should have detected
>>> we owned the mutex in the recursive case, bumped the ownership
>>> counter, and returned zero.
>>>
>>> It's an algorithmic problem because earlier code should have detected
>>> we owned the mutex in the error-checking case, and should have
>>> returned EDEADLK without making any futex syscalls.
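A minimal sketch of the two self-relock cases described above, assuming a POSIX threads implementation such as glibc. The helper name relock_result is hypothetical and not part of any library.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper (not a glibc function): lock a mutex of the given
   type twice from one thread and return the result of the second lock.
   Pass only PTHREAD_MUTEX_ERRORCHECK or PTHREAD_MUTEX_RECURSIVE; with
   PTHREAD_MUTEX_NORMAL the second lock would block forever. */
static int relock_result(int type)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int rc, second;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, type);
    assert(rc == 0);
    rc = pthread_mutex_init(&mutex, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    rc = pthread_mutex_lock(&mutex);      /* first lock: we now own it */
    assert(rc == 0);
    second = pthread_mutex_lock(&mutex);  /* relock by the current owner */

    /* A successful recursive relock needs one extra unlock. */
    if (second == 0)
        pthread_mutex_unlock(&mutex);
    pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    (void)rc;
    return second;
}
```

Under these assumptions, relock_result(PTHREAD_MUTEX_RECURSIVE) should return 0 and relock_result(PTHREAD_MUTEX_ERRORCHECK) should return EDEADLK. Per the explanation above, both outcomes are expected to come from the ownership check in the library, not from a "deadlock" error reported by the kernel.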
>>>
>>> So we didn't own the mutex and an attempt to acquire it determined it
>>> was locked by someone else (not us), and then the kernel returned
>>> EDEADLK, which doesn't make sense because we didn't own it to begin
>>> with!
>>>
>>> It points to a kernel or glibc issue with PI mutexes.
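For context, a minimal sketch of how an error-checking PI (priority-inheritance) mutex can be set up and relocked by its owner. The helper name pi_relock_result is hypothetical, and this is not the GSequencer code, which is not shown in this thread. It only illustrates the normal, expected path and does not reproduce the assertion failure.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper (not a glibc function): create an error-checking
   priority-inheritance mutex, lock it twice from one thread, and return
   the result of the second lock. If the mutex cannot be created (for
   example ENOTSUP on a system without PI support), return that error. */
static int pi_relock_result(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int rc, second;

    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_mutex_lock(&mutex);      /* first lock: we now own it */
    assert(rc == 0);
    second = pthread_mutex_lock(&mutex);  /* relock by the current owner */

    /* Release the extra lock if the relock unexpectedly succeeded. */
    if (second == 0)
        pthread_mutex_unlock(&mutex);
    pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    (void)rc;
    return second;
}
```

On a system with PI mutex support, the second lock should return EDEADLK to the caller as an ordinary error code. The assertion in the report concerns a different situation, where the kernel itself reports a deadlock for a mutex the thread did not appear to own.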
>>>
>>> Cheers,
>>> Carlos.