This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [PATCH 2/6] Introduce throw_ptrace_error
- From: Pedro Alves <palves at redhat dot com>
- To: Mark Kettenis <mark dot kettenis at xs4all dot nl>
- Cc: gdb-patches at sourceware dot org
- Date: Sun, 08 Mar 2015 21:48:12 +0000
- Subject: Re: [PATCH 2/6] Introduce throw_ptrace_error
- References: <1425671886-7798-1-git-send-email-palves at redhat dot com> <1425671886-7798-3-git-send-email-palves at redhat dot com> <201503062103 dot t26L3tef004332 at glazunov dot sibelius dot xs4all dot nl> <54FA1EB3 dot 2050706 at redhat dot com> <201503082029 dot t28KToYr022852 at glazunov dot sibelius dot xs4all dot nl>
On 03/08/2015 08:29 PM, Mark Kettenis wrote:
> I think your interpretation of ESRCH is too Linux-centric. You're
> once again duct-taping around the Linux kernel's woefully
> insufficient threads debugging capabilities.
Nice.
> It really should not be
> possible for a thread to just disappear without the debugger being
> notified. Do I sound like a broken record?
Sorry, but yes, you do. ;-)
The debugger is notified. It's just a fact that a process can
die (and become a zombie) even while it was _stopped_ under
ptrace control. That's a race you can't prevent, only cope with.
I found NetBSD 5.1 in the GCC compile farm, and I see ESRCH
there too:
-bash-4.2$ uname -a
NetBSD gcc70.fsffrance.org 5.1 NetBSD 5.1 (GENERIC) #0: Sat Nov 6 13:19:33 UTC 2010 builds@b6.netbsd.org:/home/builds/ab/netbsd-5-1-RELEASE/amd64/201011061943Z-obj/home/builds/ab/netbsd-5-1-RELEASE/src/sys/arch/amd64/compile/GENERIC amd64
-bash-4.2$ gdb ./foo
GNU gdb 6.5
...
(gdb) start
Breakpoint 1 at 0x400894: file foo.c, line 5.
Starting program: /home/palves/foo
main () at foo.c:5
5 return 0;
(gdb) p getpid ()
$1 = 24557
(gdb) shell kill -9 24557
(gdb) c
Continuing.
ptrace: No such process.
(gdb)
But even if some ptrace-based OS uses a different errno
for that (which I doubt), we can just tweak throw_ptrace_error
(a centralized place, yay!) to look for a different
errno value. So what does OpenBSD's ptrace return
in the test above?
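
To illustrate the "centralized place" argument, here is a minimal sketch in C of how a single function could map the errno from a failed ptrace call to an error kind. The enum and the function name classify_ptrace_errno are hypothetical and are not taken from the patch. Only the use of ESRCH for "the process is gone" comes from the discussion above.

```c
#include <errno.h>

/* Hypothetical error kinds for illustration; not GDB's actual names.  */
enum ptrace_error_kind
{
  PTRACE_ERROR_GENERIC,     /* Any other ptrace failure.  */
  PTRACE_ERROR_TARGET_GONE  /* The traced process no longer exists.  */
};

/* Map the errno left behind by a failed ptrace call to an error kind.
   If some OS reported a vanished process with a different errno, this
   switch would be the only place that needs another case.  */
static enum ptrace_error_kind
classify_ptrace_errno (int saved_errno)
{
  switch (saved_errno)
    {
    case ESRCH:
      return PTRACE_ERROR_TARGET_GONE;
    default:
      return PTRACE_ERROR_GENERIC;
    }
}
```

With this shape, supporting an OS that uses another errno value would mean adding one case label here rather than touching each ptrace call site.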
> I think at this point the right approach is to make
> linux_resume_one_lwp() call ptrace() directly instead of calling down
> into the inf_ptrace_resume(). That way you can simply check errno in
> the place where it matters.
No, your "simply" is not as simple as you imply. There can be any number
of ptrace calls that fail before the PT_CONTINUE in inf_ptrace_resume
is reached. And whether to ignore the error should be left to some
caller higher up on the call chain. That was the _whole point_ of this
fuller fix, as I explained throughout the series.
E.g., the ptrace call that fails can be the one that tries to write
debug registers to the inferior, normal registers, reading the auxv,
any memory read/write, whatever. Today, any ptrace error that throws
ends up in the generic perror_with_name. After the series, such errors
end up in throw_ptrace_error instead, a single place where we can add
more context info to the error thrown. How is that a bad thing?
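
The control flow described above could be sketched roughly as follows, in plain C with setjmp/longjmp standing in for GDB's exception machinery. Every name here (throw_ptrace_failure, fake_write_debug_registers, resume_target) is a hypothetical stand-in, and the failure is simulated instead of coming from a real ptrace call. The point is only that a low-level step fails through one shared error path, and a caller higher up decides whether the error can be ignored.

```c
#include <errno.h>
#include <setjmp.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record of the last failure; not a GDB type.  */
struct ptrace_failure
{
  int saved_errno;
  char message[128];
};

static jmp_buf catch_point;
static struct ptrace_failure last_failure;

/* The single place where every simulated ptrace failure ends up.
   Extra context (here, what was being attempted) is added once.  */
static void
throw_ptrace_failure (const char *what, int saved_errno)
{
  last_failure.saved_errno = saved_errno;
  snprintf (last_failure.message, sizeof last_failure.message,
            "%s: %s", what, strerror (saved_errno));
  longjmp (catch_point, 1);
}

/* Stand-in for one of the low-level steps that run before the actual
   continue request.  A nonzero FAIL_ERRNO simulates a failed call.  */
static void
fake_write_debug_registers (int fail_errno)
{
  if (fail_errno != 0)
    throw_ptrace_failure ("writing debug registers", fail_errno);
}

/* Higher-level caller: it alone decides which errors to ignore.
   Returns 0 if resumed, 1 if the target was gone (treated as
   ignorable here), -1 for any other error.  */
static int
resume_target (int fail_errno)
{
  if (setjmp (catch_point) != 0)
    return last_failure.saved_errno == ESRCH ? 1 : -1;

  fake_write_debug_registers (fail_errno);
  return 0;
}
```

In this sketch the low-level helper never inspects errno itself. The decision to treat ESRCH as ignorable lives only in resume_target, so other callers would be free to handle the same failure differently.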
Thanks,
Pedro Alves