This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Ping[2]: [PATCH] Fix sporadic failure in tst-eintr1 test case


On 04/10/2013 09:52 AM, Siddhesh Poyarekar wrote:
> On Wed, Apr 10, 2013 at 09:27:10AM +0200, Florian Weimer wrote:
>> I ran into this on current Fedora 18.
>
> I'm surprised it took so long :)
>
>> This smells like a bug in our implementation.  Can we fix this in
>> glibc?  Any pointers?
>>
>> pthread_join actually deallocating resources seems fairly important
>> to me as a quality-of-implementation issue, irrespective of what the
>> standard says.
>
> It's not a fault with pthread_join and I don't think this can be fixed
> with glibc.  The core problem here is the latency between the kernel
> notifying the pthread_join'er about the thread exit and the actual
> reaping of the thread where the latter is what reduces NPROC.  The
> test case goes from joining to spawning the new thread faster than the
> kernel is able to reap existing threads, resulting in a net increase
> in NPROC.

My concern is that if we can't fix the test case to reliably recover from this scenario, applications will have trouble doing so as well. It is somewhat similar to deferring to garbage collection to close file descriptors.

I expect that freeing the kernel resources will become more and more expensive in relative terms, so this will eventually become visible with threading frameworks which do not perform thread pooling.
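
To make the recovery question concrete, here is a minimal sketch of what an application (or the test) might do, assuming pthread_create reports the exhausted NPROC limit as EAGAIN and that a short wait gives the kernel time to reap the joined threads. The helper name create_thread_with_retry, the worker noop_thread, and the backoff values are hypothetical; none of this is taken from glibc or from tst-eintr1.

```c
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Trivial worker used only for illustration; it returns its argument.  */
static void *
noop_thread (void *arg)
{
  return arg;
}

/* Hypothetical helper, not a glibc interface.  Call pthread_create and,
   while it fails with EAGAIN, sleep briefly and try again, on the
   assumption that the kernel has not yet reaped threads that were
   already joined.  Returns 0 on success or the last pthread_create
   error.  If max_attempts is not positive, no attempt is made and
   EAGAIN is returned.  */
static int
create_thread_with_retry (pthread_t *thr, void *(*fn) (void *), void *arg,
                          int max_attempts)
{
  struct timespec delay = { 0, 1000000 };  /* Assumed starting delay: 1 ms.  */
  int ret = EAGAIN;

  for (int i = 0; i < max_attempts; ++i)
    {
      ret = pthread_create (thr, NULL, fn, arg);
      if (ret != EAGAIN)
        break;                  /* Success, or an error a retry won't fix.  */

      nanosleep (&delay, NULL); /* Give the kernel time to reap.  */
      if (delay.tv_nsec < 100000000)
        delay.tv_nsec *= 2;     /* Back off; stays well below one second.  */
    }

  return ret;
}
```

A caller would replace a bare pthread_create (&thr, NULL, fn, arg) with create_thread_with_retry (&thr, fn, arg, 10). This only papers over the latency; it cannot tell a transient shortage from a genuinely exhausted limit, which is why I would still prefer a real fix.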

--
Florian Weimer / Red Hat Product Security Team

