This is the mail archive of the archer@sourceware.org mailing list for the Archer project.

Re: ptrace improvement ideas

From: Roland McGrath <roland at redhat dot com>
To: Jan Kratochvil <jan dot kratochvil at redhat dot com>
Cc: Project Archer <archer at sourceware dot org>, Oleg Nesterov <oleg at redhat dot com>
Date: Mon, 7 Feb 2011 17:58:44 -0800 (PST)
Subject: Re: ptrace improvement ideas
References: <20110203223905.D0C77180081@magilla.sf.frob.com> <20110207211129.GA23277@host1.dyn.jankratochvil.net>

> the majority of the overhead should be solved by PTRACE_O_INHERIT,

Ok, good.  That leaves the questions of all the possible corner issues that
using it would entail.  So let's try to think of all those.

> an idea for
> a next optimization is to replace reading+parsing of "/proc/PID/xxx" text
> files by a single syscall for that small binary value such as for:
> 
> 	linux_proc_pending_signals (/proc/PID/status) -> now 2x sigset_t
> 	-> PTRACE_IS_SIGNUM_PENDING or PTRACE_GET_PENDING_SIGSET etc.
> 	(GDB is only interested in SIGINT from that sigset_t now.)

I'm surprised that GDB wants to know such a thing.  It makes me suspect
there is a deeper problem to which looking that up is a workaround.
Can you explain?
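
For reference, the kind of lookup being discussed can be sketched as
below.  This is only an illustration and not GDB's
linux_proc_pending_signals; the helper names mask_has_signal and
proc_signal_pending are hypothetical, and the sketch assumes the
"SigPnd:" and "ShdPnd:" lines of /proc/PID/status each carry a hex
mask of at most 64 signals.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: return 1 if the bit for SIGNO is set in the hex
   mask text HEX (as printed after "SigPnd:"), else 0.
   Assumption: the mask fits in 64 bits.  */
static int mask_has_signal(const char *hex, int signo)
{
    assert(signo >= 1 && signo <= 64);
    unsigned long long mask = strtoull(hex, NULL, 16);
    return (int) ((mask >> (signo - 1)) & 1ULL);
}

/* Hypothetical helper: return 1 if SIGNO is pending for PID, either
   thread-specific (SigPnd) or process-wide (ShdPnd), 0 if it is not,
   and -1 if the status file cannot be read.  */
int proc_signal_pending(pid_t pid, int signo)
{
    char path[64], line[256];
    int pending = 0;

    snprintf(path, sizeof path, "/proc/%d/status", (int) pid);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "SigPnd:", 7) == 0
            || strncmp(line, "ShdPnd:", 7) == 0)
            pending |= mask_has_signal(line + 7, signo);
    }
    fclose(f);
    return pending;
}
```

A caller interested only in SIGINT would use something like
proc_signal_pending (pid, SIGINT), which is the open, read, and parse
of a text file that the proposed single request would replace.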

> 	linux_nat_core_of_thread_1 (/proc/%d/task/%ld/stat)
> 	-> PTRACE_GET_CPUCORE -> long return value as the CPU #

I had no idea GDB cared about such a thing at all.  Why does it?

> Also there could be PTRACE_SET_TGID_DEBUGREG to set debug registers for all
> the TIDs of a PID (=TGID), even for those that are not stopped.

IMHO the legacy style of directly setting the (now virtual anyway) hardware
registers for debugging features should just die, and not be given any new
crutches.

> There is already AFAIK some abstraction of DR registers inside the kernel so maybe
> userland could get access to this abstraction to resolve these two issues.

Indeed, there is now a layer called hw_breakpoint, with in-kernel APIs that
are largely machine-independent.  The legacy arch-specific ptrace interface
for x86 is implemented on top of that (not purely so, but rather supported
by that infrastructure as special cases).  I think Oleg knows the details
of that stuff better than I do at this point.

I think the desirable approach is to figure out a new interface (ptrace
extension, presumably) to use those new facilities directly from user
space.  It should be possible for such a new interface to be largely
machine-independent too.  Perhaps Oleg can make some suggestions for this.

> > * PTRACE_O_INHERIT
> [...]
> > Its effect is that clones of the tracee inherit the
> > ptrace attachedness and option settings of their parent.
> 
> It must explicitly require debug register (hw watchpoint) inheritance,
> which used to happen but no longer does in recent upstream kernels
> (NOTABUG RH#660003).

That is a good point and one I had not been thinking of, though now that
you remind me, I was aware of the issue before.  It's my feeling that the
right way to approach this is to focus on a new set of interfaces built on
the kernel's hw_breakpoint facility (that infrastructure may well need
extending to deal better with userland).  Those can be defined with
the inheritance and process-wide sharing issues in mind, so that if we then
add PTRACE_O_INHERIT they would mesh well to serve GDB's needs nicely.  I
think that trying to define PTRACE_O_INHERIT first in a way that has new
specific semantics interacting with fuzzily-defined arch-dependent issues
like the x86's legacy debug register behavior would be a bad route.

> > To get this information reliably,
> > the debugger needs to use the waitid call instead of waitpid/wait4.
> 
> [nitpick] or PTRACE_GETSIGINFO after waitpid, as GDB does.

This is a misunderstanding.  PTRACE_GETSIGINFO relates to the information
about a signal being delivered to a tracee.  What I'm talking about is the
information that goes with the SIGCHLD that gdb itself receives when a
tracee event/stop occurs.  This has nothing to do with any tracee's signal
details.  It has to do with how the kernel reports the tracee event to gdb.
Since multiple SIGCHLD signals do not queue with independent siginfo_t
(only SIGRTMIN+n signals do), using waitid is the only reliable way to get
that information.  It is the same information that arrives with the SIGCHLD
associated with a tracee event, but it is detail about the wait result you
are getting, not detail about the tracee status.
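
To make the distinction concrete, here is a minimal sketch of
collecting a tracee stop with waitid, so that the siginfo_t describing
the wait result itself is available.  The function names
traced_child_first_stop and discard_child are hypothetical.  Only
fields that exist today (si_pid, si_code, si_status) are shown; the
si_tgid field is a proposal and is not present in any current
siginfo_t.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork a child that asks to be traced and then
   stops itself, and collect that stop with waitid.  Returns the
   child's PID, or -1 on failure.  On success *INFO describes the wait
   result: si_pid is the child, si_code is normally CLD_TRAPPED for a
   traced child, and si_status is the stop signal.  */
pid_t traced_child_first_stop(siginfo_t *info)
{
    assert(info != NULL);

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        raise(SIGSTOP);   /* Stop so the parent sees an event.  */
        _exit(0);
    }

    memset(info, 0, sizeof *info);
    if (waitid(P_PID, child, info, WEXITED | WSTOPPED) != 0) {
        kill(child, SIGKILL);
        waitpid(child, NULL, 0);
        return -1;
    }
    return child;
}

/* Kill and reap a child obtained from traced_child_first_stop.  */
void discard_child(pid_t child)
{
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
}
```

Nothing here asks the tracee about its own signal state, which is what
PTRACE_GETSIGINFO would do.  The siginfo_t filled in by waitid is
purely the description of the event as reported to the waiting
process.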

The point of using waitid is that you can see the new si_tgid field, and
hence receive both the thread-specific ID and the process-wide TGID for a
tracee that you haven't seen before.  Otherwise, PTRACE_O_INHERIT results
in spontaneous reports for new IDs that you know nothing about and have no
way to associate with where they came from.  The si_tgid idea addresses the
problem for the case of new threads (standard NPTL threads, that
is--CLONE_THREAD thread creations in the kernel's terms).  It doesn't help
at all for other kinds of creations, such as a normal fork or vfork.  That
is why I suggested it might actually not be desirable to have
PTRACE_O_INHERIT apply to all new creations, but instead make it limited to
CLONE_THREAD creations.  I'm interested in your thoughts on the issue of
how GDB deals with the first report of an ID it hasn't seen before.  With
current kernels, the only such situation that's possible is the brief race
between a new PTRACE_O_TRACE{CLONE,FORK,VFORK} child reporting its first
stop, and its parent reporting the PTRACE_EVENT_{CLONE,FORK,VFORK} stop for
that child's creation (at which time PTRACE_GETEVENTMSG tells you the
association between that parent's creation attempt and the new child).
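
The existing mechanism described above can be sketched roughly as
follows.  This is a simplified illustration, not how GDB is
structured; the function name fork_event_ids_match is hypothetical.
It sets PTRACE_O_TRACEFORK, then accepts the new child's first stop
and the parent's PTRACE_EVENT_FORK stop in whichever order they
arrive, and checks that PTRACE_GETEVENTMSG names the same ID that
showed up unannounced.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo.  Returns 1 if the ID reported by
   PTRACE_GETEVENTMSG equals the previously unknown ID that reported a
   stop, 0 if they differ, -1 if the setup failed.  */
int fork_event_ids_match(void)
{
    int status;
    pid_t tracee = fork();
    if (tracee < 0)
        return -1;
    if (tracee == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) != 0)
            _exit(1);
        raise(SIGSTOP);          /* Let the tracer set its options.  */
        if (fork() == 0)
            _exit(0);            /* The new child being reported.  */
        _exit(0);
    }

    if (waitpid(tracee, &status, 0) != tracee || !WIFSTOPPED(status))
        return -1;
    if (ptrace(PTRACE_SETOPTIONS, tracee, NULL,
               (void *) (long) PTRACE_O_TRACEFORK) != 0) {
        kill(tracee, SIGKILL);
        waitpid(tracee, NULL, 0);
        return -1;
    }
    ptrace(PTRACE_CONT, tracee, NULL, NULL);

    pid_t announced = -1;  /* ID named by the parent's event stop.  */
    pid_t unknown = -1;    /* ID that reported a stop on its own.  */
    while (announced == -1 || unknown == -1) {
        pid_t pid = waitpid(-1, &status, __WALL);
        if (pid < 0)
            break;
        if (!WIFSTOPPED(status))
            continue;
        if (pid == tracee
            && status >> 8 == (SIGTRAP | (PTRACE_EVENT_FORK << 8))) {
            unsigned long msg = 0;
            ptrace(PTRACE_GETEVENTMSG, tracee, NULL, &msg);
            announced = (pid_t) msg;
        } else if (pid != tracee) {
            unknown = pid;       /* May arrive before the event stop.  */
        } else {
            ptrace(PTRACE_CONT, pid, NULL, NULL);
        }
    }

    /* Clean up: kill everything we traced and reap it.  */
    kill(tracee, SIGKILL);
    if (unknown > 0)
        kill(unknown, SIGKILL);
    if (announced > 0 && announced != unknown)
        kill(announced, SIGKILL);
    while (waitpid(-1, NULL, __WALL) > 0)
        continue;

    if (announced == -1 || unknown == -1)
        return -1;
    return announced == unknown;
}
```

The "unknown" branch is the race in question: a tracer has to be
prepared to remember an ID it cannot yet explain until the parent's
event stop supplies the association.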


Thanks,
Roland

Follow-Ups:
- Re: ptrace improvement ideas
  - From: Jan Kratochvil
- Re: ptrace improvement ideas
  - From: Oleg Nesterov

References:
- ptrace improvement ideas
  - From: Roland McGrath
- Re: ptrace improvement ideas
  - From: Jan Kratochvil
