This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug nptl/12683] Race conditions in pthread cancellation


https://sourceware.org/bugzilla/show_bug.cgi?id=12683

Dan Searle <dan at censornet dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dan at censornet dot com

--- Comment #25 from Dan Searle <dan at censornet dot com> ---
I think we have stubmled upon this bug, or something related to it. Can someone
please confirm I'm on the right track here?

We have a multithreaded server application which calls recv() and poll() from
async cancellable threads, each thread handles a single connection with a
master thread accpeting new connections and adding them to a job queue.

More and more often now we are seeing the server lock up and on inspection two
or more threads seem deadlocked in some race condition inside libc recv() and
or poll().

One example here shows two back traces from gdb from the two threads that
seemed deadlocked chewing 100% CPU:

Thread 1 bt:
#0  __pthread_disable_asynccancel () at
../nptl/sysdeps/unix/sysv/linux/x86_64/cancellation.S:98
#1  0x00007f895ba987fd in __libc_recv (fd=0, fd@entry=33,
buf=buf@entry=0x7cada02b, n=n@entry=1024, flags=1537837035,
    flags@entry=16384) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:35
#2  0x000000000040ec54 in recv (__flags=16384, __n=1024, __buf=0x7cada02b,
__fd=33)
    at /usr/include/x86_64-linux-gnu/bits/socket2.h:44
[snip]

Thread 2 bt:
#0  0x00007f895ba987eb in __libc_recv (fd=fd@entry=31,
buf=buf@entry=0x7ca5e02b, n=n@entry=1024, flags=-1, flags@entry=16384)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:33
#1  0x000000000040ec54 in recv (__flags=16384, __n=1024, __buf=0x7ca5e02b,
__fd=31)
    at /usr/include/x86_64-linux-gnu/bits/socket2.h:44
[snip]

There can be more than two threads involved, but I'm unsure if it can happen
with just one thread locked up, but it's always inside recv() or poll() and
sometimes in __pthread_disable_asynccancel() within either of those.

Could I work around this problem by changing the threads to syncronmous
cancellable or try to work around the need to cancel the treads at all?

-- 
You are receiving this mail because:
You are on the CC list for the bug.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]