This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.

Re: Hitachi djprobe mechanism

From: mathieu lacage <Mathieu dot Lacage at sophia dot inria dot fr>
To: Richard J Moore <richardj_moore at uk dot ibm dot com>
Cc: systemtap at sources dot redhat dot com
Date: Sun, 09 Oct 2005 18:47:37 +0200
Subject: Re: Hitachi djprobe mechanism
References: <OF9AF3E881.D016BF14-ON41257094.007761F8-41257094.00788843@uk.ibm.com>

hi,

> So the assumption here is that: 1) we are dealing with non-optimized code.

Sorry? Where did I say that?

> 2) we are dealing with gcc generated code.

I did not say that either.

Here is the algorithm proposed:

1) Have a function which can tell you the length of an instruction based on a pointer to the start of the instruction. This is pretty horrible to get right on x86 but it is quite possible and my sample code shows this.
2) Have a function which can tell you if an instruction is one of:
   - a direct or indirect call
   - a ret
   - a direct relative or absolute jump
   - an indirect relative or absolute jump
3) The input of the algorithm is the start and end address of a function. For each instruction located between start and end, execute 4, 5, 6, and 7.
4) For each direct or indirect call, mark the following instruction as a block boundary.
5) For each ret, mark the following instruction as a block boundary.
6) For each direct relative or absolute jump, mark the following instruction as a block boundary.
7) For each indirect relative or absolute jump, mark the function as non-parseable.
8) Once you have executed 3, and if you have not stumbled upon 7), you have a list of all the instructions which are basic block boundaries, which means you have solved the problem. End of story.
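The steps above can be sketched roughly as follows. This is only an illustration, not the sample code mentioned in step 1: `decode` is a hypothetical toy decoder that understands a handful of one-byte x86 opcodes (a real one must deal with prefixes, ModRM/SIB, immediates and so on), and the names `decode` and `scan_function` are made up for this sketch. The sketch also assumes that an instruction ending exactly at the function end has no "following instruction" to mark.

```python
# Instruction kinds from step 2 (names are illustrative only).
CALL, RET, JUMP, INDIRECT_JUMP, OTHER = "call", "ret", "jump", "indirect_jump", "other"

def decode(code, off):
    """Toy stand-in for steps 1 and 2: return (length, kind) at code[off]."""
    op = code[off]
    if op == 0xE8:                          # call rel32
        return 5, CALL
    if op == 0xC3:                          # ret
        return 1, RET
    if op == 0xE9:                          # jmp rel32
        return 5, JUMP
    if op == 0xEB or 0x70 <= op <= 0x7F:    # jmp rel8 / jcc rel8
        return 2, JUMP
    if op == 0xFF:                          # group 5: only register forms here
        modrm = code[off + 1]
        reg = (modrm >> 3) & 7
        if modrm >> 6 == 3 and reg == 2:    # call r32 (indirect call)
            return 2, CALL
        if modrm >> 6 == 3 and reg == 4:    # jmp r32 (indirect jump)
            return 2, INDIRECT_JUMP
    if op == 0x90 or 0x50 <= op <= 0x5F:    # nop, push/pop r32
        return 1, OTHER
    if 0xB8 <= op <= 0xBF:                  # mov r32, imm32
        return 5, OTHER
    raise ValueError("toy decoder cannot handle byte 0x%02x" % op)

def scan_function(code, start, end):
    """Steps 3 to 8: return (block_boundaries, parseable) for code[start:end]."""
    boundaries = []
    parseable = True
    off = start
    while off < end:
        length, kind = decode(code, off)
        nxt = off + length
        if kind in (CALL, RET, JUMP):
            if nxt < end:                   # steps 4, 5, 6
                boundaries.append(nxt)
        elif kind == INDIRECT_JUMP:         # step 7
            parseable = False
        off = nxt
    return boundaries, parseable

# push ebp; call +0; mov eax,1; jmp +1; nop; pop ebp; ret
example = bytes([0x55,
                 0xE8, 0, 0, 0, 0,
                 0xB8, 1, 0, 0, 0,
                 0xEB, 0x01,
                 0x90,
                 0x5D,
                 0xC3])
# nop; jmp eax; ret
with_indirect_jump = bytes([0x90, 0xFF, 0xE0, 0xC3])
```

For `example`, `scan_function(example, 0, len(example))` marks offsets 6 (after the call) and 13 (after the jmp) and reports the function as parseable; for `with_indirect_jump` it reports the function as non-parseable.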

If you have hit 7), you can only place probes on 5-byte instructions. Otherwise, you can place probes anywhere in blocks bigger than 5 bytes.
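One possible reading of this rule, as a sketch: assume the probe overwrites `JMP_LEN` = 5 bytes starting at the probed instruction, that in the fallback case an instruction qualifies when it is at least that long, and that in the parseable case a site qualifies when at least 5 bytes remain before the next block boundary. The function name `probe_sites` and the input layout (a list of `(offset, length)` pairs plus a list of boundary offsets) are assumptions made for this illustration.

```python
JMP_LEN = 5  # assumed size of the jump written over the probed code

def probe_sites(insns, boundaries, parseable):
    """Return offsets where a probe could be placed.

    insns:      non-empty list of (offset, length), in address order
    boundaries: offsets that start a new basic block
    parseable:  False if an indirect jump was seen in the function
    """
    if not insns:
        return []
    if not parseable:
        # Fallback: the jump must fit inside a single instruction.
        return [off for off, length in insns if length >= JMP_LEN]
    end = insns[-1][0] + insns[-1][1]
    cuts = sorted(set(boundaries) | {end})
    sites = []
    for off, _length in insns:
        # First block boundary (or function end) strictly after this offset.
        limit = next(c for c in cuts if c > off)
        if limit - off >= JMP_LEN:
            sites.append(off)
    return sites

# Hypothetical function: instruction offsets/lengths and its block boundaries.
insns = [(0, 1), (1, 5), (6, 5), (11, 2), (13, 1), (14, 1), (15, 1)]
boundaries = [6, 13]
```

With these inputs, `probe_sites(insns, boundaries, True)` yields offsets 0, 1 and 6, while the fallback `probe_sites(insns, boundaries, False)` yields only 1 and 6, the two instructions that are 5 bytes long.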

None of the items presented above rely on code being generated by gcc or specific optimization levels being used.

> Whilst it's unlikely that compilers other than gcc are used, it's not impossible - e.g. Intel's IA64 compiler. And the likelihood of non-gcc compilers increases when we consider user-space probes. But also it is

Placing probes in userspace will simply increase the probability that you have to fall back to the 5-bytes-per-instruction mechanism, because a lot of userspace code is built with -fPIC, which increases the probability of finding indirect jumps.

Should you be interested in these probabilities, I can quite easily come up with concrete numbers for a number of standard Linux applications. Which applications are you interested in?

> Are we able to guard against these exceptions automatically, or do we have

The detection of the "bad case" (i.e., indirect jumps) is automatic and inherent to the proposed algorithm, which means that the fallback to 5-byte instructions is automatic.

[snip]

> Not sure about that. I think I can find an example of c-code for which it is impossible to determine the function boundaries from the assembler code, but looks perfectly reasonable from the C perspective.

Oh, well, of course, you can do that. Detecting function boundaries is really hard. However, one of the major assumptions here is that you have access to the debugging information which gives you these function boundaries. If this assumption is not valid, then I don't think the idea of parsing basic block boundaries is reasonable for the application you are interested in.

Mathieu

References:
- Re: Hitachi djprobe mechanism
  - From: Richard J Moore
