This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug nptl/21374] __pthread_cond_destroy deadlock on glibc 2.25


https://sourceware.org/bugzilla/show_bug.cgi?id=21374

Adhemerval Zanella <adhemerval.zanella at linaro dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |adhemerval.zanella at linaro dot org

--- Comment #4 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
That seems to be exactly what the application is doing, based on the example
provided [1].  Using master, here is the backtrace of all threads:

Thread 4 (LWP 27793):
#0  0x00007ffff78e3683 in futex_wait_cancelable (private=<optimized out>,
expected=0, futex_word=0x555556211808) at
../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x555556211810,
cond=0x5555562117e0) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x5555562117e0, mutex=0x555556211810) at
pthread_cond_wait.c:655
#3  0x00007fffed148c7c in
std::condition_variable::wait(std::unique_lock<std::mutex>&) () from
/home/azanella/anaconda3/lib/python3.6/site-packages/torch/lib/libshm.so
#4  0x00007fffed902b53 in
std::condition_variable::wait<torch::autograd::ReadyQueue::pop_back()::__lambda0>
(__p=..., __lock=..., this=0x5555562117e0)
   from
/home/azanella/anaconda3/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so
#5  torch::autograd::ReadyQueue::pop_back (this=this@entry=0x555556211790) at
torch/csrc/autograd/engine.cpp:80
#6  0x00007fffed904d23 in torch::autograd::Engine::thread_main
(this=this@entry=0x7fffee17ed00 <engine>, queue=...)
   from
/home/azanella/anaconda3/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so
#7  0x00007fffed91589a in PythonEngine::thread_main (this=0x7fffee17ed00
<engine>, queue=...)
   from
/home/azanella/anaconda3/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so
#8  0x00007fffd1e1b870 in ?? () from
/home/azanella/anaconda3/lib/python3.6/site-packages/torch/lib/../../../../libstdc++.so.6
#9  0x00007ffff78dd455 in start_thread (arg=0x7fffc4456700) at
pthread_create.c:455
#10 0x00007ffff6cd3e5f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:97

Thread 3 (LWP 27792):
#0  0x00007ffff78e3683 in futex_wait_cancelable (private=<optimized out>,
expected=0, futex_word=0x55555621132c) at
../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x555556211330,
cond=0x555556211300) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x555556211300, mutex=0x555556211330) at
pthread_cond_wait.c:655
#3  0x00007fffed148c7c in
std::condition_variable::wait(std::unique_lock<std::mutex>&) () from
/home/azanella/anaconda3/lib/python3.6/site-packages/torch/lib/libshm.so
#4  0x00007fffed902b53 in
std::condition_variable::wait<torch::autograd::ReadyQueue::pop_back()::__lambda0>
(__p=..., __lock=..., this=0x555556211300)
   from
/home/azanella/anaconda3/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so
#5  torch::autograd::ReadyQueue::pop_back (this=this@entry=0x5555562112b0) at
torch/csrc/autograd/engine.cpp:80
#6  0x00007fffed904d23 in torch::autograd::Engine::thread_main
(this=this@entry=0x7fffee17ed00 <engine>, queue=...)
   from
/home/azanella/anaconda3/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so
#7  0x00007fffed91589a in PythonEngine::thread_main (this=0x7fffee17ed00
<engine>, queue=...)
   from
/home/azanella/anaconda3/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so
#8  0x00007fffd1e1b870 in ?? () from
/home/azanella/anaconda3/lib/python3.6/site-packages/torch/lib/../../../../libstdc++.so.6
#9  0x00007ffff78dd455 in start_thread (arg=0x7fffc4c57700) at
pthread_create.c:455
#10 0x00007ffff6cd3e5f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:97

Thread 2 (LWP 27791):
#0  0x00007ffff6cd53f8 in accept4 (fd=9, addr=..., addr_len=0x7fffc5457e58,
flags=524288) at ../sysdeps/unix/sysv/linux/accept4.c:40
#1  0x00007fffc5863496 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2  0x00007fffc5856cbd in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3  0x00007fffc5863e88 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4  0x00007ffff78dd455 in start_thread (arg=0x7fffc5458700) at
pthread_create.c:455
#5  0x00007ffff6cd3e5f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:97

Thread 1 (LWP 26868):
#0  0x00007ffff78e31c9 in futex_wait (private=<optimized out>, expected=12,
futex_word=0x555556211324) at ../sysdeps/unix/sysv/linux/futex-internal.h:61
#1  futex_wait_simple (private=<optimized out>, expected=12,
futex_word=0x555556211324) at ../sysdeps/nptl/futex-internal.h:135
#2  __pthread_cond_destroy (cond=0x555556211300) at pthread_cond_destroy.c:54
#3  0x00007fffed90175e in torch::autograd::ReadyQueue::~ReadyQueue
(this=0x5555562112b0, __in_chrg=<optimized out>)
   from
/home/azanella/anaconda3/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so
#4  std::default_delete<torch::autograd::ReadyQueue>::operator()
(this=<optimized out>, __ptr=0x5555562112b0) at
torch/csrc/autograd/engine.cpp:67

It looks like thread 3 and thread 4 are each waiting on a condition variable,
and thread 1 calls pthread_cond_destroy on the one thread 3 is still waiting on
(cond=0x555556211300).  Also, based on the PyTorch pull request referenced in
the bug report, it does indeed look like an application issue [2].
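
For illustration only, here is a minimal sketch of a teardown order that avoids
destroying a condition variable while a thread is still blocked on it (POSIX
leaves that case undefined).  WorkQueue, shutdown() and run_and_tear_down() are
hypothetical names invented for this sketch; this is not PyTorch's ReadyQueue
and not necessarily how the referenced pull request addresses the problem.  The
idea is to set a stop flag under the mutex, wake every waiter, join the worker
threads, and only then let the destructor run.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical work queue used only for this sketch.
class WorkQueue {
 public:
  void push(int item) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      items_.push_back(item);
    }
    not_empty_.notify_one();
  }

  // Blocks until an item is available or shutdown() has been called.
  // Returns false once the queue is shut down and fully drained.
  bool pop(int& out) {
    std::unique_lock<std::mutex> lock(mutex_);
    not_empty_.wait(lock, [this] { return stopping_ || !items_.empty(); });
    if (items_.empty()) return false;
    out = items_.front();
    items_.pop_front();
    return true;
  }

  // Wakes every waiter so that no thread stays blocked on the condition
  // variable by the time the queue is destroyed.
  void shutdown() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    not_empty_.notify_all();
  }

 private:
  std::mutex mutex_;
  std::condition_variable not_empty_;
  std::deque<int> items_;
  bool stopping_ = false;
};

// Starts `workers` consumers, feeds them `count` items, then tears the
// queue down safely.  Returns the number of items processed.
int run_and_tear_down(int workers, int count) {
  assert(workers > 0);
  std::atomic<int> processed{0};
  {
    WorkQueue queue;
    std::vector<std::thread> threads;
    for (int i = 0; i < workers; ++i) {
      threads.emplace_back([&queue, &processed] {
        int item = 0;
        while (queue.pop(item)) ++processed;
      });
    }
    for (int i = 0; i < count; ++i) queue.push(i);
    queue.shutdown();                  // 1. tell the waiters to leave
    for (auto& t : threads) t.join();  // 2. wait until they have left
  }                                    // 3. only now is the queue destroyed
  return processed.load();
}
```

The important part is the ordering in run_and_tear_down(): the queue's
destructor runs only after every worker has been joined, so the underlying
pthread_cond_destroy call never sees a blocked waiter.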

If you have a well-defined example that triggers this exact issue, it would be
helpful; otherwise, the debugging above indicates you are relying on undefined
behavior, and I will close this bug.

[1]
https://discuss.pytorch.org/t/archlinux-using-variable-backwards-appears-to-hang-program-indefinitely/1675
[2] https://github.com/pytorch/pytorch/pull/1243

