This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: [RFC] fix kallsyms to allow discrimination of local symbols
- From: James Bottomley <James dot Bottomley at HansenPartnership dot com>
- To: "Frank Ch. Eigler" <fche at redhat dot com>
- Cc: linux-kernel <linux-kernel at vger dot kernel dot org>, systemtap at sourceware dot org
- Date: Mon, 21 Jul 2008 22:53:08 -0500
- Subject: Re: [RFC] fix kallsyms to allow discrimination of local symbols
- References: <1216676595.3433.80.camel@localhost.localdomain> <y0mprp68zg9.fsf@ton.toronto.redhat.com>
On Mon, 2008-07-21 at 21:44 -0400, Frank Ch. Eigler wrote:
> James Bottomley <James.Bottomley@HansenPartnership.com> writes:
>
> > [...] Fix all of this by prefixing local symbols with the actual C
> > file name they occur in separated by '|' (I had to use '|' since ':'
> > is already in use for module prefixes in kallsyms lookups. [...]
> > Comments?
>
> Can we take some time to review how we got here?
>
>
> - You disprefer systemtap's use of an established, non-deprecated API
> for placing kernel probes. (We calculate addresses by a mixture of
> elf-analysis and runtime user-space lookup means. That's partly
> since kallsyms_lookup was unexported over our objections.) There is
> nothing outright broken (e.g. incorrect numbers) with what systemtap
> has been doing for years.
You mean embedding half a megabyte of symbols simply so you can avoid
the inconvenience of using a kernel API? Yes, I think it's ...
suboptimal.
> - You argue that symbols+offset kprobing is better. We can see that,
> in some sense, but ...
>
> - I explain that we are used to final address calculating, as we'll
> have to do that regardless for user-space probes. Plus we need to
> work with kernels that predate the symbol+offset kprobe api
> extension. So this change would not simplify systemtap in any way.
> You do not respond.
There is no current userspace infrastructure, since utrace still isn't
in the kernel, so you're predicating this argument on an event which
hasn't happened.
Even assuming utrace is accepted, embedding the symbol table of every
user space process in the probes is still daft. It's this constant
assertion that "it must be done my way" that's causing such a drag on
the open source process. For instance, the obvious way to me of doing
this would be to map the user space stack into the systemtap runtime and
unwind it from there instead of vectoring it into the kernel.
> - I offer _stext+offset (for the kernel) and (.text*)+offset (for
> modules) kprobes: basically to use the "better" symbol+offset
> kprobes api, but use the same single reference addresses we already
> do, and leaving just the final addition to the kernel. You do not
> respond materially.
I thought this and subsequent emails addressed the points pretty well:
http://marc.info/?l=linux-kernel&m=121632572409118
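For readers following the thread, the difference between "nearest symbol+offset" and "_stext+offset" can be sketched as below. This is only an illustration: the symbol table, its addresses, and the helper names `to_symbol_offset` and `to_address` are hypothetical and are not taken from systemtap or the kernel.

```python
import bisect

# Hypothetical symbol table: (address, name) pairs sorted by address.
# The addresses are made up for illustration.
SYMBOLS = [
    (0xffffffff80200000, "_stext"),
    (0xffffffff80200400, "do_fork"),
    (0xffffffff80200900, "sys_open"),
]


def to_symbol_offset(addr, base=None):
    """Express addr as (symbol, offset).

    With base given, the offset is always relative to that one symbol
    (the "_stext+offset" style). Otherwise the nearest preceding symbol
    is used (the "nearest symbol+offset" style).
    """
    if base is not None:
        base_addr = {name: a for a, name in SYMBOLS}[base]
        return base, addr - base_addr
    addrs = [a for a, _ in SYMBOLS]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        raise ValueError("address lies below the first symbol")
    sym_addr, name = SYMBOLS[i]
    return name, addr - sym_addr


def to_address(symbol, offset):
    """The final addition that would be left to the kernel."""
    return {name: a for a, name in SYMBOLS}[symbol] + offset


probe = 0xffffffff80200410
print(to_symbol_offset(probe))            # nearest-symbol form
print(to_symbol_offset(probe, "_stext"))  # single-reference form
```

Both forms resolve back to the same address; they differ only in which symbol the offset is measured from.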
> - You argue that it cannot be just any symbol+offset ... but the actual
> nearest symbol+offset. But that doesn't work for local symbols. So
> you fix that to the nearest globally visible symbol+offset. But this
> requires:
> - yet more extra work and code from systemtap
I'm afraid that's how open source development works ... you iterate to
find the best solution.
> - extension to the kernel build system, and kallsyms runtime data to
> fix the current local-symbol-ambiguity problem
Finding weaknesses in APIs and fixing them is what it's all about.
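As a rough illustration of the local-symbol ambiguity and of the proposed 'file|symbol' prefix, the lookup below refuses a bare name that matches several local symbols and accepts it once the file name is given. The table contents, the file names, and the helpers `parse_symbol` and `lookup` are hypothetical; only the '|' and ':' separators come from the proposal quoted earlier in this message.

```python
def parse_symbol(spec):
    """Split "[module:][file|]name" into (module, filename, name)."""
    module, _, rest = spec.rpartition(":")
    filename, _, name = rest.rpartition("|")
    return (module or None, filename or None, name)


# Hypothetical table: two local symbols share the name "do_open".
TABLE = {
    (None, "fs/open.c", "do_open"): 0x1000,
    (None, "drivers/foo.c", "do_open"): 0x2000,
    (None, None, "sys_open"): 0x3000,
}


def lookup(spec):
    """Return the address for spec, or raise if it is missing or ambiguous."""
    module, filename, name = parse_symbol(spec)
    matches = [
        addr
        for (m, f, n), addr in TABLE.items()
        if m == module and n == name and (filename is None or f == filename)
    ]
    if len(matches) != 1:
        raise LookupError(f"{spec!r} matched {len(matches)} symbols")
    return matches[0]


print(hex(lookup("fs/open.c|do_open")))
print(hex(lookup("sys_open")))
```

A bare `lookup("do_open")` raises `LookupError` here, which is the ambiguity the file-name prefix is meant to remove.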
> - storage of all that new file name data in permanent unswappable
> kernel data (>>100kB, if done by simply prefixing local symbol names
> with file names).
I'd check my facts before making assertions. The kernel symbol table is
stored in a compressed form that actually eliminates most of these
repetitions.
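To make the size argument concrete, here is a toy estimate of how a shared token for a repeated file-name prefix reduces storage. It is not the kernel's actual kallsyms compression: the one-byte token, the NUL-terminated layout, and the sample names are all assumptions chosen for illustration.

```python
def naive_size(names):
    """Bytes needed if every prefixed name is stored in full (+1 terminator)."""
    return sum(len(n) + 1 for n in names)


def tokenized_size(names):
    """Bytes needed if each distinct "file|" prefix is stored once and every
    symbol refers to it through a one-byte token (assumes at most 256
    distinct prefixes)."""
    prefixes = {}
    body = 0
    for n in names:
        prefix, sep, sym = n.rpartition("|")
        if sep:
            prefixes.setdefault(prefix + sep, len(prefixes))
            body += 1  # one-byte token standing in for the prefix
        body += len(sym) + 1  # symbol name plus terminator
    table = sum(len(p) + 1 for p in prefixes)
    return table + body


# Hypothetical symbol names in the proposed 'file|symbol' form.
NAMES = [
    "fs/open.c|do_open",
    "fs/open.c|helper",
    "fs/open.c|cleanup",
    "sys_open",
]

print(naive_size(NAMES), tokenized_size(NAMES))
```

The more local symbols share a file, the smaller the per-symbol cost of the prefix becomes under such a scheme.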
> - possible further complications related to filename string matching
Any substantiation of that?
> - You have yet to invent a scheme to allow offloading *data* address
> calculations to the kernel. Without that (and perhaps more),
> systemtap will *still* have to fetch the same base _stext etc.
> addresses at run time that it currently does -- even if it did not
> use them to compute kprobes addresses.
That would be because I haven't actually started looking at this one
yet. Of course, that would make it a great starting point for others
who wished to help.
> In total, this path would end up with both systemtap and the kernel
> more complex, larger and a bit slower too.
Really? I count the reduction of the probe modules from 500kB to 50kB a
worthwhile saving. I don't even see where anything became larger.
> Does that still seem an
> acceptable cost, just to get systemtap to change its preferred kprobes
> api?
James