This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [RFA] Checkpoint: wait the defunct process when delete it
- From: Pedro Alves <pedro at codesourcery dot com>
- To: gdb-patches at sourceware dot org
- Cc: Hui Zhu <teawater at gmail dot com>, Michael Snyder <msnyder at vmware dot com>
- Date: Tue, 11 May 2010 00:30:09 +0100
- Subject: Re: [RFA] Checkpoint: wait the defunct process when delete it
- References: <i2odaef60381005082323jf45a1b90h4f76e7767cfae9bc@mail.gmail.com>
On Sunday 09 May 2010 07:23:15, Hui Zhu wrote:
> I found that when we delete the checkpoint process, it keep defunct.
> This is because the parent process is still running and didn't wait
> it.
> So I add a wait_ptid function after ptrace kill.
You're assuming inferior_ptid is the parent process
of the checkpoint fork, but I don't believe that is always
true. E.g., if you do
(gdb) checkpoint
(gdb) checkpoint
(gdb) checkpoint
(gdb) info checkpoints
3 process 15353 at 0x457d43, file gdb.c, line 28
2 process 15352 at 0x457d43, file gdb.c, line 28
1 process 15351 at 0x457d43, file gdb.c, line 28
* 0 Thread 0x7ffff7fcc6f0 (LWP 15348) (main process) at 0x457d43, file gdb.c, line 28
(gdb) restart 1
...
(gdb) delete checkpoint 2
At this point, inferior_ptid will be process 15351, but that
is not the parent of 15352, the process you're killing. 15348
is.
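To illustrate the point outside of GDB, here is a minimal standalone sketch (plain POSIX, not GDB code; the helper names are made up for this example). It assumes default SIGCHLD handling and that nothing is ptracing the processes. A sibling's waitpid on the defunct process fails with ECHILD, and only the real parent can reap it:

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits at once.  Until its
   parent reaps it, the child stays around as a defunct process.  */
pid_t
spawn_exiting_child (void)
{
  pid_t pid = fork ();

  if (pid == 0)
    _exit (0);
  return pid;	/* Child's pid in the parent, -1 if fork failed.  */
}

/* Hypothetical helper: fork a sibling of TARGET and let it try to
   reap TARGET.  Returns 1 if the sibling's waitpid failed with
   ECHILD, 0 if it did not, and -1 on an unexpected error.  */
int
sibling_wait_fails (pid_t target)
{
  int status;
  pid_t sibling;

  assert (target > 0);
  sibling = fork ();
  if (sibling < 0)
    return -1;
  if (sibling == 0)
    {
      /* In the sibling: TARGET is not our child.  */
      pid_t r = waitpid (target, NULL, WNOHANG);

      _exit (r == -1 && errno == ECHILD ? 1 : 0);
    }
  if (waitpid (sibling, &status, 0) != sibling || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}

/* Reap TARGET from its real parent.  Returns 1 on success.  */
int
parent_wait_succeeds (pid_t target)
{
  return waitpid (target, NULL, 0) == target;
}
```

In the session above, 15351 plays the role of the sibling and 15348 the role of the parent, so the waitpid call would have to run in 15348.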
> +static int
> +wait_ptid (ptid_t ptid)
> +{
I'd rename this to call_waitpid, similarly to the call_lseek function
already present in the file. You may want to reimplement your
function similarly to how call_lseek is implemented, too. Your call.
> + struct objfile *waitpid_objf;
> + struct value *waitpid_fn = NULL;
> + struct value *argv[4];
> + struct gdbarch *gdbarch = get_current_arch ();
> +
> + /* Get the waitpid_fn. */
> + if (lookup_minimal_symbol ("waitpid", NULL, NULL) != NULL)
> + waitpid_fn = find_function_in_inferior ("waitpid", &waitpid_objf);
> + if (!waitpid_fn)
> + if (lookup_minimal_symbol ("_waitpid", NULL, NULL) != NULL)
"_waitpid" here,
> + waitpid_fn = find_function_in_inferior ("waitpid", &waitpid_objf);
but "waitpid" here?
You could also merge those two nested 'if's into a single condition, like:
if (lookup_minimal_symbol ("waitpid", NULL, NULL) != NULL)
waitpid_fn = find_function_in_inferior ("waitpid", &waitpid_objf);
if (!waitpid_fn && lookup_minimal_symbol ("_waitpid", NULL, NULL) != NULL)
waitpid_fn = find_function_in_inferior ("_waitpid", &waitpid_objf);
> + /* Get the argv. */
> + argv[0] = value_from_longest (builtin_type (gdbarch)->builtin_int,
> + PIDGET (ptid));
> + argv[1] = value_from_longest (builtin_type (gdbarch)->builtin_int, 0);
> + argv[2] = value_from_longest (builtin_type (gdbarch)->builtin_int, 0);
> + argv[3] = 0;
The second argument of waitpid is a pointer, not an integer. From
`man waitpid':
pid_t waitpid(pid_t pid, int *status, int options);
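For reference, a small standalone sketch of the two usual ways to fill that second argument in plain C (again not GDB code; the helper names are hypothetical). You can pass the address of an int to receive the status, or a null pointer if the status is not needed:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with CODE.  */
pid_t
spawn_child_with_exit_code (int code)
{
  pid_t pid;

  assert (code >= 0 && code <= 255);
  pid = fork ();
  if (pid == 0)
    _exit (code);
  return pid;	/* Child's pid in the parent, -1 if fork failed.  */
}

/* Reap PID and return its exit code, passing a real int pointer as
   the second argument.  Returns -1 on failure or abnormal exit.  */
int
reap_exit_code (pid_t pid)
{
  int status = 0;

  if (waitpid (pid, &status, 0) != pid)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}

/* Reap PID without caring about the status: a null pointer is a
   valid second argument.  Returns 1 on success.  */
int
reap_ignoring_status (pid_t pid)
{
  return waitpid (pid, NULL, 0) == pid;
}
```

Either way the argument has pointer type, which is what the value built for argv[1] should reflect.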
> + if (call_function_by_hand (waitpid_fn, 3, argv) == 0)
> + return -1;
This doesn't work in non-stop/async modes if the parent
is presently running, since the inferior call needs a stopped
thread to run in. Maybe just take the easy route for now, and
add an is_stopped check, bailing out if not stopped?
> + if (wait_ptid (ptid))
> + error (_("Unable to wait pid %s"), target_pid_to_str (ptid));
--
Pedro Alves