This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: GDB hangs on kill or quit (after following a fork child, not detaching from the parent)
- From: teawater <teawater at gmail dot com>
- To: "Pedro Alves" <pedro at codesourcery dot com>
- Cc: gdb-patches at sourceware dot org, "Michael Snyder" <msnyder at vmware dot com>
- Date: Fri, 19 Dec 2008 13:16:12 +0800
- Subject: Re: GDB hangs on kill or quit (after following a fork child, not detaching from the parent)
- References: <200812122113.57018.pedro@codesourcery.com>
Hi,
This patch is for the multiprocess branch, right?
The code in fix_killall.diff:
+  /* Also add an entry for the child fork. */
+  fp = find_fork_pid (child_pid);
+  if (!fp)
+    fp = add_fork (child_pid);
+  fork_save_infrun_state (fp, 0);
is close to this hunk from linux-nat-multiprocess-focus.txt in
http://sourceware.org/ml/gdb-patches/2008-12/msg00001.html:
+  /* Retain child fork in ptrace (stopped) state. */
+  fp = find_fork_pid (child_pid);
+  if (!fp)
+    fp = add_fork (child_pid);
+  fork_save_infrun_state (fp, 0);
So I think the patches in
http://sourceware.org/ml/gdb-patches/2008-12/msg00001.html and
http://sourceware.org/ml/gdb-patches/2008-12/msg00058.html may help
linux-nat work better on this branch.
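
For readers without the tree at hand, the "look up the fork entry, add it if
missing" idiom shared by the two hunks can be sketched stand-alone roughly as
below. This is only an illustration: struct fork_entry, the list, and the
sketch_* functions are simplified stand-ins I made up here, not the real
definitions from linux-fork.c, and the infrun state saving is left out.

```c
#include <stdlib.h>
#include <sys/types.h>

/* Simplified stand-in for a fork-list entry (an assumption for this
   sketch, NOT the real structure used by linux-fork.c).  */
struct fork_entry
{
  pid_t pid;
  int num;
  struct fork_entry *next;
};

static struct fork_entry *fork_list = NULL;
static int highest_fork_num = 0;

/* Return the entry for PID, or NULL if PID is not listed.  */
struct fork_entry *
sketch_find_fork_pid (pid_t pid)
{
  struct fork_entry *fp;

  for (fp = fork_list; fp != NULL; fp = fp->next)
    if (fp->pid == pid)
      return fp;
  return NULL;
}

/* Unconditionally add a new entry for PID at the head of the list.
   Returns NULL if the allocation fails.  */
struct fork_entry *
sketch_add_fork (pid_t pid)
{
  struct fork_entry *fp = calloc (1, sizeof (*fp));

  if (fp == NULL)
    return NULL;
  fp->pid = pid;
  fp->num = ++highest_fork_num;
  fp->next = fork_list;
  fork_list = fp;
  return fp;
}

/* The idiom from both hunks: find the entry, create it only if it is
   missing, so that calling this twice never lists a pid twice.  */
struct fork_entry *
sketch_ensure_fork (pid_t pid)
{
  struct fork_entry *fp = sketch_find_fork_pid (pid);

  if (!fp)
    fp = sketch_add_fork (pid);
  return fp;
}
```

Calling sketch_ensure_fork twice with the same pid returns the same entry,
which is why both patches can add the child without worrying about whether
some other path has already listed it.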
Thanks,
Hui
On Sat, Dec 13, 2008 at 05:13, Pedro Alves <pedro@codesourcery.com> wrote:
> [ Michael, you're the forks man. CCing you in case you see an issue with
> the attached patch? ]
>
> GDB hangs if you do:
>
> > ./gdb -ex "set follow-fork-mode child" -ex "set detach-on-fork off" ./testsuite/gdb.base/foll-fork
> GNU gdb (GDB) 6.8.50.20081212-cvs
> (gdb) start
> Temporary breakpoint 1 at 0x40054f: file ../../../src/gdb/testsuite/gdb.base/foll-fork.c, line 21.
> Starting program: /home/pedro/gdb/baseline/build/gdb/testsuite/gdb.base/foll-fork
>
> Temporary breakpoint 1, main () at ../../../src/gdb/testsuite/gdb.base/foll-fork.c:21
> 21 int v = 5;
> (gdb) n
> During symbol reading, incomplete CFI data; unspecified registers (e.g., rax) at 0x40054b.
> 23 pid = fork ();
> (gdb)
> [Switching to process 14900]
> 24 if (pid == 0)
> (gdb) kill
> Kill the program being debugged? (y or n) y
> <HANGS HERE>
>
> The same will happen if you issue 'quit' instead of 'kill', as quitting
> tries to 'kill' the inferior.
>
> When there are forks involved, linux_nat_kill calls into linux_fork_killall
> to do the killing. But, when following a fork child, and not
> detaching from the parent, we defer adding the child fork to the
> list of forks (which is confusing IMHO, see below), so linux_fork_killall
> misses killing it. Then we hang waiting for it to die, which never
> happens.
>
> "kill"
> -> linux_nat_kill
> -> linux_fork_killall.
>
> This kills all forks listed. At this point, since only the parent was listed,
> the child was left alive.
>
> After killing, we call target_mourn_inferior, which goes:
>
> -> linux_nat_kill
> -> linux_nat_mourn_inferior
> -> inf_ptrace_mourn_inferior
>
> Now, inferior_ptid was pointing at the child (we followed it),
> and since we didn't kill it, we hang here:
>
> 184 /* Clean up a rotting corpse of an inferior after it died. */
> 185
> 186 static void
> 187 inf_ptrace_mourn_inferior (struct target_ops *ops)
> 188 {
> 189 int status;
> 190
> 191 /* Wait just one more time to collect the inferior's exit status.
> 192 Do not check whether this succeeds though, since we may be
> 193 dealing with a process that we attached to. Such a process will
> 194 only report its exit status to its original parent. */
> (top-gdb)
> 195 waitpid (ptid_get_pid (inferior_ptid), &status, 0);
> ^^^^^^^
>
> #0 0x00007fb6a45364a5 in waitpid () from /lib/libc.so.6
> #1 0x00000000005df07b in inf_ptrace_mourn_inferior (ops=0xaf4ce0) at ../../src/gdb/inf-ptrace.c:195
> #2 0x0000000000477c55 in linux_nat_mourn_inferior (ops=0xaf4ce0) at ../../src/gdb/linux-nat.c:3194
> #3 0x0000000000543d82 in target_mourn_inferior () at ../../src/gdb/target.c:1899
> #4 0x0000000000477c09 in linux_nat_kill () at ../../src/gdb/linux-nat.c:3180
> #5 0x00000000004622b6 in kill_command (arg=0x0, from_tty=1) at ../../src/gdb/inflow.c:602
> ...
>
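
The hang described in the quoted text can be reproduced in miniature outside
of GDB. The sketch below is only an analogy under loud assumptions: it uses
plain fork/kill/waitpid with SIGKILL instead of ptrace, and spawn_idle_child,
kill_listed and unlisted_child_survives are names invented for this example.
It keeps two children, lists only one of them for killing, and then polls the
other with WNOHANG. A result of 0 from that poll means the process has not
exited, so a blocking waitpid on it (as in inf_ptrace_mourn_inferior) would
wait forever.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that does nothing but sleep until it is killed.
   Returns the child's pid in the parent, or -1 if fork failed.  */
static pid_t
spawn_idle_child (void)
{
  pid_t pid = fork ();

  if (pid == 0)
    {
      for (;;)
        pause ();
    }
  return pid;
}

/* Kill and reap every pid in LIST.  Hypothetical stand-in for a
   "kill all listed forks" step; anything not in LIST is untouched.  */
static void
kill_listed (const pid_t *list, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      int status;

      kill (list[i], SIGKILL);
      waitpid (list[i], &status, 0);
    }
}

/* Returns 1 if the unlisted child is still alive after the listed
   one has been killed, 0 if it is not, and -1 if fork failed.  */
int
unlisted_child_survives (void)
{
  int status;
  int still_alive;
  pid_t listed = spawn_idle_child ();
  pid_t unlisted = spawn_idle_child ();

  if (listed < 0 || unlisted < 0)
    {
      /* Clean up whichever child did get created.  */
      if (listed > 0)
        kill_listed (&listed, 1);
      if (unlisted > 0)
        kill_listed (&unlisted, 1);
      return -1;
    }

  /* Only one of the two processes is on the "fork list".  */
  kill_listed (&listed, 1);

  /* Non-blocking check: 0 means the child exists and has not exited.
     A blocking waitpid here would never return.  */
  still_alive = (waitpid (unlisted, &status, WNOHANG) == 0);

  /* Do not leave the stray process behind.  */
  kill_listed (&unlisted, 1);
  return still_alive;
}
```

Adding the second pid to the array passed to kill_listed is the analogue of
what the patch does by listing the child fork.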
> =================
>
> The attached patch fixes it by also adding the child to the fork
> list in this case.
>
> Also, since we're now adding the child, one bit of special casing
> from linux-fork.c can be removed, as in the patch. The 'error' call
> in linux-fork.c that I'm removing is really dead code, as all callers
> do the same check.
>
> This is the current output of info forks, when following a parent,
> and when following a child:
>
> ./gdb -ex "set follow-fork-mode parent" -ex "set detach-on-fork off" ./testsuite/gdb.base/foll-fork
> ...
> (gdb) info forks
> 1 process 15396 at 0x7ffff789ac4b, <fork>
> * 0 process 15393 (main process) at 0x40055e, file foll-fork.c, line 24
>
> ./gdb -ex "set follow-fork-mode child" -ex "set detach-on-fork off" ./testsuite/gdb.base/foll-fork
> ...
> (gdb) info forks
> 1 process 15319 at 0x7ffff789ac4b, <fork>
>
> Notice how the 'following the child' case is a bit convoluted:
>
> - Only the parent is listed, but it isn't that obvious that fork 1
> is the parent.
> - There's no indication of which fork is current.
>
> If you do:
> (gdb) fork 1
> Switching to process 16458
> #0 0x00007ffff789ac4b in fork () from /lib/libc.so.6
> (gdb) info forks
> 2 process 16461 at 0x40055e, file foll-fork.c, line 24
> * 1 process 16458 at 0x7ffff789ac4b, <fork>
> (gdb)
>
> Voila! - the child fork was added (fork 2). I find this confusing.
>
> =================
>
> This is the output after the patch is applied:
>
> > ./gdb -ex "set follow-fork-mode child" -ex "set detach-on-fork off" ./testsuite/gdb.base/foll-fork
> ...
> (gdb) info forks
> * 2 process 15934 at 0x40055e, file foll-fork.c, line 24
> 1 process 15927 at 0x7ffff789ac4b, <fork>
>
> > ./gdb -ex "set follow-fork-mode parent" -ex "set detach-on-fork off" ./testsuite/gdb.base/foll-fork
> ...
> (gdb) info forks
> 1 process 15979 at 0x7ffff789ac4b, <fork>
> * 0 process 15974 (main process) at 0x40055e, file foll-fork.c, line 24
>
> A bit more uniform, and the hang bug goes away, of course. That special
> casing to have a fork number 0 could go away too, IMVHO.
>
> Tested on x86_64-linux-gnu, no regressions.
>
> --
> Pedro Alves
>