This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: Hitachi djprobe mechanism
hi,
So the assumption here is that:
1) we are dealing with non-optimized code.
Sorry ? Where did I say that ?
2) we are dealing with gcc generated code.
I did not say that either.
Here is the algorithm proposed:
1) have a function which can tell you the length of an instruction based
on a pointer to the start of the instruction. This is pretty horrible to
get right on x86 but it is quite possible and my sample code shows this.
2) have a function which can tell you if an instruction is one of:
- a direct or indirect call
- a ret
- a direct relative or absolute jump
- an indirect relative or absolute jump
3) input of the algorithm is the start and end address of a function.
For each instruction located between start and end, execute 4, 5, 6, and 7
4) for each direct or indirect call, mark the following instruction as a
block boundary
5) for each ret, mark the following instruction as a block boundary
6) for each direct relative or absolute jump, mark the following
instruction as a block boundary
7) for each indirect relative or absolute jump, mark the function as
non-parseable.
8) once you have executed 3 and if you have not stumbled upon 7), you
have a list of all the instructions which are basic block boundaries
which means you have solved the problem. end of story.
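
Steps 3 to 8 above can be sketched roughly as follows. This is only an
illustration, not the sample code mentioned in step 1: decode() is a
hypothetical toy decoder that knows a handful of 32-bit x86 opcodes and
merges steps 1 and 2 into one call, and the names scan_function, insns,
boundaries and parseable are made up for this example. A boundary that
would fall on the end address is not recorded, which is an assumption of
the sketch.

```python
CALL, RET, DIRECT_JMP, INDIRECT_JMP, OTHER = "call", "ret", "jmp", "ijmp", "other"

def decode(code, off):
    """Toy stand-in for steps 1 and 2: return (length, kind) of one instruction.

    Only a few 32-bit x86 opcodes are handled; anything else raises ValueError.
    """
    op = code[off]
    if op == 0xC3:
        return 1, RET                      # ret
    if op == 0xE8:
        return 5, CALL                     # call rel32
    if op == 0xE9:
        return 5, DIRECT_JMP               # jmp rel32
    if op == 0xEB:
        return 2, DIRECT_JMP               # jmp rel8
    if op in (0x90, 0x55, 0x5D):
        return 1, OTHER                    # nop, push ebp, pop ebp
    if op == 0xB8:
        return 5, OTHER                    # mov eax, imm32
    if op in (0x89, 0xFF):
        modrm = code[off + 1]
        if modrm >> 6 != 3:                # only register-direct forms here
            raise ValueError("addressing form not handled by the toy decoder")
        if op == 0x89:
            return 2, OTHER                # mov r32, r32
        reg = (modrm >> 3) & 7
        if reg == 2:
            return 2, CALL                 # call r32 (FF /2)
        if reg == 4:
            return 2, INDIRECT_JMP         # jmp r32 (FF /4)
    raise ValueError("opcode not handled by the toy decoder: %#x" % op)

def scan_function(code, start, end):
    """Walk [start, end) once and collect block boundaries (steps 3 to 8)."""
    insns = []            # (address, length) of every instruction seen
    boundaries = set()    # addresses of instructions that start a new block
    parseable = True      # cleared when step 7 is hit
    off = start
    while off < end:
        length, kind = decode(code, off)
        insns.append((off, length))
        nxt = off + length
        if kind in (CALL, RET, DIRECT_JMP):
            # steps 4, 5, 6: the following instruction is a block boundary
            if nxt < end:
                boundaries.add(nxt)
        elif kind == INDIRECT_JMP:
            # step 7: indirect jump, give up on this function
            parseable = False
        off = nxt
    return insns, boundaries, parseable

if __name__ == "__main__":
    # push ebp; mov ebp, esp; call rel32; mov eax, 1; pop ebp; ret
    code = bytes([0x55, 0x89, 0xE5, 0xE8, 0, 0, 0, 0,
                  0xB8, 1, 0, 0, 0, 0x5D, 0xC3])
    insns, boundaries, parseable = scan_function(code, 0, len(code))
    print(sorted(boundaries), parseable)
```

A real implementation would need a complete instruction-length decoder,
which is the hard part on x86; the loop itself stays this small.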
If you have hit 7), you can only place probes on instructions that are
5 bytes big. Otherwise, you can place probes anywhere in blocks bigger
than 5 bytes.
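
To make the placement rule concrete, here is a small sketch. It assumes
the scan has already produced a list of (address, length) pairs, a set of
block boundary addresses and a parseable flag; those names, and
probe_sites itself, are hypothetical. It also reads "5 bytes big" as
"at least 5 bytes long" and treats a probe as fitting when at least 5
bytes remain before the next block boundary. Both readings are
assumptions of the example.

```python
PROBE_LEN = 5  # the 5 bytes a probe needs, as stated in the text

def probe_sites(insns, boundaries, parseable):
    """Return the instruction addresses where a probe could be placed.

    insns      -- list of (address, length), sorted by address
    boundaries -- set of addresses that start a new basic block
    parseable  -- False if an indirect jump was found in the function
    """
    if not insns:
        return []
    if not parseable:
        # Fallback: only instructions that can hold the probe by themselves.
        return [addr for addr, length in insns if length >= PROBE_LEN]
    func_end = insns[-1][0] + insns[-1][1]
    cuts = sorted(boundaries) + [func_end]
    sites = []
    for addr, _length in insns:
        # The first boundary strictly after addr is the end of its block.
        block_end = next(c for c in cuts if c > addr)
        if block_end - addr >= PROBE_LEN:
            sites.append(addr)
    return sites

if __name__ == "__main__":
    # A 15-byte function with one block boundary at offset 8.
    insns = [(0, 1), (1, 2), (3, 5), (8, 5), (13, 1), (14, 1)]
    print(probe_sites(insns, {8}, True))
    print(probe_sites(insns, {8}, False))
```

With the block information, more instruction starts qualify than in the
fallback case, which is the whole point of finding the boundaries.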
None of the items presented above rely on code being generated by gcc or
specific optimization levels being used.
Whilst it's unlikely that compilers other than gcc are used, it's not
impossible - e.g. Intel's IA64 compiler. And the likelihood of non-gcc
compilers increases when we consider user-space probes. But also it is
Placing probes in userspace will simply increase the probability that you
have to fall back to the 5-bytes-per-instruction mechanism, because a lot
of userspace code is built with -fPIC, which increases the probability of
finding indirect jumps.
Should you be interested in these probabilities, I can come up quite
easily with definite numbers on a number of linux-standard applications.
Which applications are you interested in ?
Are we able to guard against these exceptions automatically, or do we have
The detection of the "bad case" (i.e., indirect jumps) is automatic and
inherent to the algorithm proposed, which means that the fallback to
5-byte instructions is automatic.
[snip]
Not sure about that. I think I can find an example of C code for which it
is impossible to determine the function boundaries from the assembler code,
but which looks perfectly reasonable from the C perspective.
Oh, well, of course, you can do that. Detecting function boundaries is
really hard. However, one of the major assumptions here is that you have
access to the debugging information which gives you these function
boundaries. If this assumption is not valid, then I don't think the idea
of parsing basic block boundaries is reasonable for the application you
are interested in.
Mathieu