This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug nptl/17707] nptl_db: terminated and joined threads


https://sourceware.org/bugzilla/show_bug.cgi?id=17707

--- Comment #2 from Pedro Alves <palves at redhat dot com> ---
I've investigated this some more.

I noticed that the thread's state is not actually TD_THR_ACTIVE just after the
thread is joined, before the thread is removed from the thread list, here,
in the code I pasted before:

nptl/pthread_join.c:
pthread_join ()
{
...
  if (__glibc_likely (result == 0))
    {
      /* We mark the thread as terminated and as joined.  */
      pd->tid = -1;
...
      /* Free the TCB.  */
      __free_tcb (pd);
    }


But, I _am_ seeing TD_THR_ACTIVE threads with pd->tid == -1.
Turns out that nothing in __free_tcb clears pd->tid.  So later on, when a new
thread reuses the old thread's tcb/stack, the new thread will start out with
tid==-1 (reused from the old thread), up until the kernel actually starts the
new clone and fills in tid (CLONE_CHILD_SETTID), and it's _that_ thread that
has TD_THR_ACTIVE state.  I don't think a new state for when the thread is
already listed in the thread list but doesn't have a kernel clone associated
yet could help here, as a debugger can always attach between glibc changing the
thread state and the kernel filling in the clone's tid.
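To make the window concrete, here is a minimal model of the sequence described above.  It is only a sketch: struct fake_pd and the fake_* functions are hypothetical stand-ins, not glibc's real struct pthread or its code paths, and the on_list flag is a simplification of thread-list membership.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two pieces of per-thread state that matter
   here.  This is NOT glibc's real struct pthread layout.  */
struct fake_pd
{
  int tid;      /* Kernel thread ID, or a stale value.  */
  int on_list;  /* Nonzero while a debugger can see the thread listed.  */
};

/* Models the join path quoted above: mark the thread as joined, then
   release the TCB for reuse.  As in the description, nothing resets tid
   on the way out.  */
void
fake_join (struct fake_pd *pd)
{
  assert (pd != NULL);
  pd->tid = -1;
  pd->on_list = 0;  /* Simplification: cached for reuse, tid untouched.  */
}

/* Models reusing a cached TCB for a new thread, before the kernel has
   stored the new thread's ID.  */
void
fake_reuse (struct fake_pd *pd)
{
  assert (pd != NULL);
  pd->on_list = 1;  /* Visible again, still carrying the old tid.  */
}

/* Models the kernel filling in the ID (the CLONE_CHILD_SETTID step).  */
void
fake_kernel_settid (struct fake_pd *pd, int new_tid)
{
  assert (pd != NULL);
  pd->tid = new_tid;
}
```

Between fake_reuse and fake_kernel_settid, the model has a listed thread whose tid is still -1, which is the state I'm seeing from the debugger side.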

This made me wonder what happens if a detached thread's tcb/stack is reused.
Or, if a new stack is allocated for a new thread, instead of reused, and gdb
lists threads before the kernel spawns the new clone.  In that case, the
thread's tid field starts out as 0.  So I thought that just like GDB can see
threads with tid=-1, it should find them with tid=0 as well.  But, turns
out it doesn't, because nptl_db/td_thr_get_info.c:td_thr_get_info has this:

  /* Initialization which are the same in both cases.  */
  infop->ti_ta_p = th->th_ta_p;
  infop->ti_lid = tid == 0 ? ps_getpid (th->th_ta_p->ph) : (uintptr_t) tid;
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  infop->ti_traceme = report_events != 0;

Eh.  So ti_lid (same as pd->tid inside the inferior) is never returned as zero.
Instead, for threads that are just being created, GDB is told that their
kernel thread ID is the overall thread group id.  But this is wrong.  This can
well confuse GDB if it decides to refresh its own thread's state cache (given
NPTL's 1:1 model, gdb only keeps track of threads by their kernel ID...)

(I'm guessing that the intent here was that tid == 0 indicates that this
is the main thread and the pthread library isn't fully initialized yet, and so
the tgid would be correct.)
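
The effect of that conditional can be isolated in a small sketch.  The helper name reported_lid is hypothetical, and the value that ps_getpid would return is passed in as tgid so the example has no dependency on a real proc-service handle; the cast in the original is dropped for simplicity.

```c
/* Hypothetical helper mirroring the ti_lid expression quoted above.
   tid  - the value read from the inferior's pd->tid
   tgid - stands in for what ps_getpid () would return  */
long
reported_lid (long tid, long tgid)
{
  /* A zero tid is replaced by the thread group ID; every other value,
     including the stale -1, is passed through unchanged.  */
  return tid == 0 ? tgid : tid;
}
```

With tgid 100, a tid of 0 comes back as 100, which is the same answer the main thread gives, so under this model the caller has no way to tell a thread that is still being created from the thread group leader.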
