This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: RFA [PATCH v4] Implement 'catch syscall' for gdbserver (was Re: RFA [PATCH v3] Implement 'catch syscall' for gdbserver)


Thanks for the in depth comments (sorry for still not getting all
style aspects ok).

I can handle the (easy) comments in the coming days.

However, there are a few not easy comments on which I have
questions/(long) feedback below ...


On Thu, 2013-10-03 at 19:40 +0100, Pedro Alves wrote:

> Please don't ever send back ffffffff as meaning -1.
> Numbers in the RSP are variable length hex.  You can't assume the
> host side interprets them as two's complement 32-bit integers.  If
> sending a negative number, the '-' sign needs to be there explicitly
> in the packet, like '-1'.  Perhaps it would even be better to leave
> -1 as implementation detail, and at the protocol level send something
> else, like "unknown" or just "-", or nothing.
> 
> > (do we really want that to be part of the protocol spec ?
> > This is really a very special case caused by a user "strange"
> > manipulation of forcing to use the packet QCatchSyscalls).
> 
> Yes, please specify how that should work.  It can happen, so
> better define it.  It should also be possible for the server to be
> dumb, and have GDB itself figure out the syscall number.  GDB
> could read the registers itself, just like gdbserver is currently
> doing.  I'd assume the -ENOSYS trick might not always work on
> all archs, for instance?
I agree "unknown" or "-" or something like that is cleaner.
For what concerns the -ENOSYS nasty trick: no, this does not work on
all archs. I quickly hacked ppc64 gdbserver, and there the
nasty trick is to compare sysret and orig_r3 : if identical, it is
a syscall entry.
All these logics are all today partially wrong:
* The "flip/flop" logic in native GDB becomes completely lost
  in case of enable/disable e.g. between entry and exit).
* x86 gdbserver  has a minor bug (I think): a syscall "exit" from
  a not implemented syscall is seen as an entry,
* the hacky ppc64 trick is wrong in case the sysret of the previous
  syscall is equal to the r3 argument of the new syscall.

In summary: differentiate syscall entry from exit is currently
fully correct nowhere (the less bad being gdbserver x86, which is almost
always correct and "repairs" itself). ppc64 is not always correct,
but also repairs "itself". The flip/flop logic, once broken, confuses
all entry/exits until further notice.

I think we need to have something better e.g. in common/linux-ptrace.c
which would maintain the correct expected state using maybe the
following behaviour described in the ptrace man :
       Syscall-enter-stop and  syscall-exit-stop  are  indistinguishable  from
       each  other  by  the  tracer.   The  tracer  needs to keep track of the
       sequence of ptrace-stops in order to  not  misinterpret  syscall-enter-
       stop  as  syscall-exit-stop  or  vice versa.  The rule is that syscall-
       enter-stop is always followed by syscall-exit-stop,  PTRACE_EVENT  stop
       or  the  tracee's  death;  no  other  kinds of ptrace-stop can occur in
       between.
It is however not clear to me how the above works if e.g. gdb attaches
to a process which is blocked in a syscall, and then does PTRACE_SYSCALL
Will GDB get an entry for such a restarted syscall ?
Or will it get only an exit ?

I was intending to improve (and factorise) this logic after the RSP
protocol patch, but I have to admit I do not understand the above man.
I first intended to make a regtest showing the bug in the native
implementation.


> > +QCatchSyscalls:1 [;SYSNO]...
> > +QCatchSyscalls:0
> > +  Enable ("QCatchSyscalls:1") or disable ("QCatchSyscalls:0")
> > +  catching syscalls from the inferior process.
> 
> So, "catch syscall" is per-inferior/process on the GDB side, but
> this always sets the catchpoints on all processes.  Was that
> intended?
As I understand now (after the hint about add-inferior you gave
on IRC), gdbserver supports multi process.
However, it is not clear to me how gdbserver properly processes
e.g. the packet QPassSignals.
The QCatchSyscalls packet is just following the QCatchSyscalls packet.

So, if I understand properly the comment, the QCatchSyscalls packet
should modify a list of syscalls to maintain per process
i.e. in the inferiors.h struct process_info ?

To set the process to be used for a QCatchSyscalls packet, gdb will
then first have to send a Hg packet.

Do I correctly understand the comment (and the way GDB will or should
use Hg to indicate the inferior for the next QCatchSyscalls) ?



> The whole 'SIGTRAP | 0x80' thing is super silly.  The ideal way
> would be for the kernel to tell us.  See:
> 
>  https://sourceware.org/gdb/wiki/LinuxKernelWishList
> 
> "A PTRACE_O_SYSCALL(_ENTER|_EXIT) ptrace option, along with
>  PTRACE_EVENT_SYSCALL(_ENTER|_EXIT).  PTRACE_SYSCALL/sysgood doesn't leave
>  any way to trace syscalls when single-stepping.  Separate syscall enter
>  and exit events would just be nice, to avoid having to keep track of
>  enter/exit - the kernel knows, might as well give us the info. (While
>  at it, a way to filter which syscalls trigger events would be nice to
>  avoid unnecessary slowing down the inferior until it calls the syscall the user
>  is interested in (you can catch specific syscalls with "catch syscall foosyscall";
>  gdb currently filters after the fact), though that's obviously a
>  bit more work.)"
> 
> Want to work on that?  It'd be great...
Well, I will first concentrate on trying to make a correct GDB patch
and properly follow the GDB coding style rules.


> 
> Another idea would be, in addition to supporting syscall_entry
> and syscall_exit stop replies, also support plain "syscall" (and
> a new TARGET_WAITKIND_SYSCALL), meaning, "I don't know if it was
> an entry or an exit, you decide." (just like SYSCALL_SIGTRAP), and then
> we'd leave GDB to decide.  GDB will end up fetching registers
> off the target anyway, so I think I'd end up costing 0 in terms of
> performance.
> 
> Then when we have real PTRACE_EVENT_SYSCALL(_ENTER|_EXIT), we'd
> use syscall_entry/syscall_exit.
> 
> (not completely sure about this).
Not sure also. Adding this in the protocol before we know it is needed
is not ideal. However, it looks to me that not adding it now means
that a GDB supporting only syscall enter/exit will have a problem
with a newer gdbserver that would give back the 3rd reason.


> 
> How portable is this -ENOSYS thing?  What does strace do?
-ENOSYS is documented in man ptrace, but only for x86.
strace uses -ENOSYS for x86, and (from what I understood reading
the code) cannot differentiate entry/exit (in all cases) for other
archs.
I suspect strace uses for these archs a flip/flop which might
suffer from the same bug/weakness as the native GDB flip/flop.

> 
> 
> BTW, it'd be very very nice if GDB and GDBserver didn't end
> up with different mechanisms for this.  We've been trying to
> make the gdb and gdbserver backends more similar, with the end
> goal of merging them, and more points of divergence don't really
> help that.
Yes, I agree. I was intending to look more in depth to that after
the RSP patch.

> > +# This shall be updated whenever QCatchSyscalls packet support is implemented
> > +# on some gdbserver architecture.
> > +if { [is_remote target]
> > +     && ![istarget "x86_64-*-linux*"] 
> > +     && ![istarget "i\[34567\]86-*-linux*"] } {
> > +    continue
> > +}
> 
> This isn't strictly correct, as it's quite concievable that
> e.g., a non-x86 valgrind (or some other random target) gets
> support for these packets before our own corresponding gdbserver
> port gets it.  Ideally we'd probe for support somehow, like by
> looking for some GDB error/warning in the first "catch syscall", or
> some such.
Effectively, at Valgrind side, it will be supported for all
archs out of the box (as Valgrind has already an arch independent
syscall interception, and properly identifies entry from exit).
However, up to now, I never dared to run the gdb testsuite with
Valgrind gdbserver.


> - I think that if you set a catch syscall, and then disconnect
> in a way that gdbserver leaves the target running (disconnected
> tracing, or if we have any persistent commands, see server.c), we'll end up
> with the program held stopped as soon as the program calls a
> syscall.  I think at least handle_target_event should be updated.
I will need to dig into that.
> 
> - linux-nat.c:linux_handle_syscall_trap has some hairy stuff
> when "stopping" is true, due to syscall restarts.  I was
> a little surprised that the gdbserver patch didn't need
> anything of the sort, but then again, I don't recall whether
> that bits is covered by the testsuite.
> I'm wondering whether this -ENOSYS mechanism would allow
> getting rid of all that, of if something is missing on the
> gdbserver side.
I believe the hairy stuff for native is to implement the flip/flop
logic, but this logic is broken when enabling/disabling.
-ENOSYS does not suffer from this problem (see above), but is
arch dependent.
So, I believe the current x86 gdbserver -ENOSYS is better/simpler,
but for other archs, it seems such tricks do not exist or are
not working as well (e.g. ppc64).

I am not sure how to continue with this problem of differentiating
entry/exit, together with the GDB 7.7 deadline.

I see 3 approaches:
1. implement the QCatchSyscalls packet as it is now
   (with only entry/exit), and -ENOSYS for x86,
    and see later what to do better for other archs or native GDB.
2. Same as 1, but we add the 3rd stop reasons
         syscall_no_idea_if_it_is_an_entry_or_return
   so as to have the protocol ready for archs less easy than x86.
3. work longer on the patch, first trying to have a proper
   way to differentiate entry/return in both native GDB and
   gdbserver, and preferrably arch independent.
   (unclear if this is possible. If not, we fallback to
    option 2 or 1).
   (or possibly, the bug in the GDB flip/flop is easy to fix ?
    then the flip/flop should be used by gdbserver also)

I guess 1 and 2 could be done on time for GDB 7.7, but I will not have
enough evening/WE time to finish 3 in let's say the next two weeks
(there is soon a valgrind 3.9.0 to go out, for which I have a few
things still to do).

Philippe



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]