This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug nptl/12683] Race conditions in pthread cancellation
- From: "dan at censornet dot com" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Thu, 15 Jan 2015 13:20:17 +0000
- Subject: [Bug nptl/12683] Race conditions in pthread cancellation
- Auto-submitted: auto-generated
- References: <bug-12683-131 at http dot sourceware dot org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=12683
Dan Searle <dan at censornet dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |dan at censornet dot com
--- Comment #25 from Dan Searle <dan at censornet dot com> ---
I think we have stumbled upon this bug, or something related to it. Can someone
please confirm I'm on the right track here?
We have a multithreaded server application which calls recv() and poll() from
asynchronously cancellable threads. Each thread handles a single connection,
with a master thread accepting new connections and adding them to a job queue.
We are seeing the server lock up more and more often, and on inspection two or
more threads appear to be stuck in some race condition inside libc recv()
and/or poll().
Here is one example: gdb backtraces from the two threads that appeared to be
deadlocked while using 100% CPU:
Thread 1 bt:
#0 __pthread_disable_asynccancel () at
../nptl/sysdeps/unix/sysv/linux/x86_64/cancellation.S:98
#1 0x00007f895ba987fd in __libc_recv (fd=0, fd@entry=33,
buf=buf@entry=0x7cada02b, n=n@entry=1024, flags=1537837035,
flags@entry=16384) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:35
#2 0x000000000040ec54 in recv (__flags=16384, __n=1024, __buf=0x7cada02b,
__fd=33)
at /usr/include/x86_64-linux-gnu/bits/socket2.h:44
[snip]
Thread 2 bt:
#0 0x00007f895ba987eb in __libc_recv (fd=fd@entry=31,
buf=buf@entry=0x7ca5e02b, n=n@entry=1024, flags=-1, flags@entry=16384)
at ../sysdeps/unix/sysv/linux/x86_64/recv.c:33
#1 0x000000000040ec54 in recv (__flags=16384, __n=1024, __buf=0x7ca5e02b,
__fd=31)
at /usr/include/x86_64-linux-gnu/bits/socket2.h:44
[snip]
There can be more than two threads involved, and I'm unsure whether it can
happen with just one thread locked up. It is always inside recv() or poll(),
and sometimes in __pthread_disable_asynccancel() within either of those.
Could I work around this problem by changing the threads to synchronous
(deferred) cancellation, or by removing the need to cancel the threads at all?
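To make the second option concrete, here is a minimal sketch of a worker that
is never cancelled and is instead woken through a pipe that it polls alongside
its connection. Everything named here (struct worker, worker_main,
worker_start, worker_stop, the wake pipe) is hypothetical and not taken from
our server, and I have not confirmed that this, or switching to
PTHREAD_CANCEL_DEFERRED via pthread_setcanceltype(), actually avoids the
problem described in this bug.

```c
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-connection state; these names are illustrative only. */
struct worker {
    int conn_fd;       /* connection socket handled by this thread */
    int wake_fd[2];    /* pipe: [0] polled by the worker, [1] written by the master */
    long bytes_seen;   /* total bytes received on conn_fd */
    pthread_t tid;
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;
    char buf[1024];
    int old;

    /* Assumption: the master never needs pthread_cancel(), so turn it off. */
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old);
    (void)old;

    for (;;) {
        struct pollfd fds[2] = {
            { .fd = w->conn_fd,    .events = POLLIN },
            { .fd = w->wake_fd[0], .events = POLLIN },
        };

        if (poll(fds, 2, -1) < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        if (fds[1].revents)          /* master asked us to stop */
            break;
        if (fds[0].revents) {
            ssize_t n = recv(w->conn_fd, buf, sizeof buf, 0);
            if (n <= 0)              /* peer closed the connection, or error */
                break;
            w->bytes_seen += n;
        }
    }
    return NULL;
}

/* Start a worker for conn_fd. Returns 0 on success, -1 on failure. */
static int worker_start(struct worker *w, int conn_fd)
{
    w->conn_fd = conn_fd;
    w->bytes_seen = 0;
    if (pipe(w->wake_fd) != 0)
        return -1;
    if (pthread_create(&w->tid, NULL, worker_main, w) != 0) {
        close(w->wake_fd[0]);
        close(w->wake_fd[1]);
        return -1;
    }
    return 0;
}

/* Ask the worker to exit and wait for it; replaces pthread_cancel(). */
static int worker_stop(struct worker *w)
{
    char c = 0;
    int rc;

    if (write(w->wake_fd[1], &c, 1) != 1)
        return -1;
    rc = pthread_join(w->tid, NULL);
    close(w->wake_fd[0]);
    close(w->wake_fd[1]);
    return rc == 0 ? 0 : -1;
}
```

The master would call worker_start() when it hands a connection to a thread and
worker_stop() where it currently calls pthread_cancel(). The worker only leaves
its loop at points it chooses, so it is never interrupted inside recv() or
poll().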