This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: utrace syscall arguments


> I feel positive I am missing something completely here!

Oh, wait.  We are talking about ia64.  So it probably actually can go
forward and backward in time.  We need to get the ia64 kernel people on
this, for sure.  I can see the potential rip in the fabric of spacetime,
but I don't really understand all the physics involved much at all.  The
corners of this might well differ in RHEL5 vs the vanilla ia64 kernel.

I'm looking at arch/ia64/kernel/ptrace.c:ia64_syscall_get_set_arguments in
the vanilla kernel.  This gets the arguments off the kernel stack RBS.  It
does not take into account TIF_RESTORE_RSE.  If someone has done "writeback",
i.e. called user_regset.writeback in the kernel, or stopped for ptrace
(which does that or equivalent), then these words have all been copied to
the user stack RBS (plain normal user memory).  This user memory is where
strace knows to look.  When strace wants to change a syscall argument (-f
does this for clone2 syscalls), it pokes this memory.  The "writeback"
machinery has set TIF_RESTORE_RSE (at ptrace stop time, before the poke
could happen).  That makes the kernel copy this user stack data back to the
kernel stack RBS area before looking at those syscall arguments.
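To make that sequence concrete, here is a toy model in plain C.  It is only a sketch of the data flow as I describe it above, not kernel code: struct toy_task and every toy_* function are made-up names, the six-word argument array is an assumption, and the restore_rse field merely stands in for TIF_RESTORE_RSE.

```c
#include <stdbool.h>
#include <string.h>

#define NARGS 6   /* assumed argument count for the model */

/* Toy model only: illustrative names, not kernel structures. */
struct toy_task {
    long kernel_rbs[NARGS];  /* argument words on the kernel stack RBS */
    long user_rbs[NARGS];    /* the same words once flushed to user memory */
    bool restore_rse;        /* stands in for TIF_RESTORE_RSE */
};

/* "writeback": copy kernel RBS -> user memory, and flag for a later reload. */
static void toy_writeback(struct toy_task *t)
{
    memcpy(t->user_rbs, t->kernel_rbs, sizeof t->user_rbs);
    t->restore_rse = true;
}

/* What a tracer like strace does to change an argument: poke user memory. */
static void toy_poke_user(struct toy_task *t, int i, long val)
{
    t->user_rbs[i] = val;
}

/* A getter that reads the kernel RBS and ignores the flag. */
static long toy_get_arg_kernel_only(const struct toy_task *t, int i)
{
    return t->kernel_rbs[i];
}

/* Resume path: if flagged, reload user memory -> kernel RBS, clear the flag. */
static void toy_resume(struct toy_task *t)
{
    if (t->restore_rse) {
        memcpy(t->kernel_rbs, t->user_rbs, sizeof t->kernel_rbs);
        t->restore_rse = false;
    }
}
```

In this model, calling toy_writeback(), then toy_poke_user(), then toy_get_arg_kernel_only() returns the old word; only after toy_resume() does the kernel-side copy show the poked value.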

The kernel checks TIF_RESTORE_RSE after entry tracing and after exit
tracing.  That means if anything in the tracing callback caused "writeback"
(kernel RBS -> user memory), resumption after tracing will reload (user
memory -> kernel RBS).  In the vanilla kernel, "tracing" is ptrace stop,
which does the writeback.  

RHEL5 ptrace is a utrace engine whose callbacks call regset->writeback.
So if you are another utrace engine called before ptrace's (or you are
the only engine), writeback has not happened.  If you are called after
ptrace's engine, then the writeback has happened.  When ptrace is there,
the user memory words, as possibly modified between now and when you
resume after QUIESCE, are what will actually be the syscall's arguments.
But ia64_syscall_get_set_arguments will look at the kernel RBS words
instead.  (There is also the issue of these user memory words you want
to look at being modified after your callback, during quiescence, so you
are not there to see the eventual values.  That is in fact the subject
of Renzo Davoli's concern over on utrace-devel, so look there for that.)
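The ordering dependence can be sketched in a few lines of C.  This is a hypothetical model, not the utrace API: toy_report() simply runs callbacks in array order, toy_ptrace_engine stands in for the ptrace engine's writeback, and toy_observer_engine is any other engine that records what it saw.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: illustrative names, not the utrace interface. */
struct toy_state {
    bool written_back;   /* has some engine done the "writeback" yet? */
};

typedef void (*toy_engine_cb)(struct toy_state *st, void *data);

struct toy_engine {
    toy_engine_cb cb;
    void *data;
};

/* Stand-in for the ptrace engine: its callback performs the writeback. */
static void toy_ptrace_engine(struct toy_state *st, void *data)
{
    (void)data;
    st->written_back = true;
}

/* Stand-in for another engine: records whether writeback had happened. */
static void toy_observer_engine(struct toy_state *st, void *data)
{
    bool *saw_writeback = data;
    *saw_writeback = st->written_back;
}

/* Run the callbacks in array order (the ordering rule is an assumption). */
static void toy_report(struct toy_state *st,
                       const struct toy_engine *engines, size_t n)
{
    for (size_t i = 0; i < n; i++)
        engines[i].cb(st, engines[i].data);
}
```

With the observer placed before the ptrace stand-in, it sees written_back as false; placed after, it sees true.  That is the whole difference between the two cases described above.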

If writeback has happened, then syscall_{get,set}_arguments() really
ought to be working with the new data.  It could do user memory access
if TIF_RESTORE_RSE is set.  Or it could synchronize back first (user
memory -> kernel RBS, clearing TIF_RESTORE_RSE) before looking.  However,
the latter is possibly questionable, since someone who called
regset->writeback expects the user memory to be the authoritative state
that will be used by the process on resume, and might be examining it
asynchronously.
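The two options can be compared in a toy model.  Again this is only a sketch under assumptions: struct toy_task and both getters are hypothetical names, restore_rse stands in for TIF_RESTORE_RSE, and plain array reads stand in for real user memory access.

```c
#include <stdbool.h>
#include <string.h>

#define NARGS 6   /* assumed argument count for the model */

/* Toy model only: illustrative names, not kernel structures. */
struct toy_task {
    long kernel_rbs[NARGS];  /* argument words on the kernel stack RBS */
    long user_rbs[NARGS];    /* the words after writeback, in user memory */
    bool restore_rse;        /* stands in for TIF_RESTORE_RSE */
};

/* Option 1: if the flag is set, read the user memory copy directly. */
static long toy_get_arg_flag_aware(const struct toy_task *t, int i)
{
    return t->restore_rse ? t->user_rbs[i] : t->kernel_rbs[i];
}

/* Option 2: synchronize back first, then read the kernel RBS.
 * This clears the flag, so user memory stops being authoritative. */
static long toy_get_arg_sync_first(struct toy_task *t, int i)
{
    if (t->restore_rse) {
        memcpy(t->kernel_rbs, t->user_rbs, sizeof t->kernel_rbs);
        t->restore_rse = false;
    }
    return t->kernel_rbs[i];
}
```

In the model, option 1 leaves the flag alone, so a later poke to user memory is still picked up.  Option 2 clears the flag, so a poke made after the call is silently ignored, which is the hazard for anyone still treating user memory as the authoritative state.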

In fact, I can't see how any angle of this could explain your failure
mode.  In your case, nobody is changing any values and AFAIK the copy
from kernel RBS doesn't disturb the old data there, so the answers
ia64_syscall_get_set_arguments gives you ought to be right regardless.
But it is a really murky pile of issues, very close to what you are doing.


Thanks,
Roland

