This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.

Re: x86_64 RIP-relative addressing bug

From: Roland McGrath <roland at redhat dot com>
To: Jim Keniston <jkenisto at us dot ibm dot com>
Cc: SystemTAP <systemtap at sources dot redhat dot com>,Prasanna Panchamukhi <ppancham at in dot ibm dot com>
Date: Mon, 28 Feb 2005 19:31:58 -0800
Subject: Re: x86_64 RIP-relative addressing bug

Thanks for that fine elucidation of the issues.  I think you have covered
the range of attacks available to us very well.  However, I am quite
skeptical of your conclusions.  

At first blush, I am pretty turned off by all the "outside" approaches.
That is, ones that require user-level probe preparation to "get it right"
in some detailed way.  The complication and fragility introduced into the
user-kernel interface, and more specifically into the interface that modules
use to reach the kprobes management core, is a cause for great concern.  I think this
easily overshadows the implementation issues of the low-level approaches,
which I'll discuss in a moment.  This path brings significant new risks.
The offline analysis and code-tweaking tools involved will use a lot of
code (albeit much of it existing code) and be pretty complex.  I'm not just
concerned with bugs in the code analysis per se, but with the great variety
of relatively simple errors in the probe generation/management tools that
will have potentially disastrous impact.  It is of course already the case
that e.g. a probe compiled for the wrong kernel binary can have terrible
effects if it causes an int3 to be inserted in the middle of an
instruction, and perhaps more insidious effects if a probe does any data
modification (and so, if inserted at a valid instruction other than the one
intended, could misuse register values to find data locations to clobber).
But I fear that incorrect RIP-relative addressing fixup could be more
likely to introduce subtle data corruption rather than just inducing a
crash.  (The same risk exists with a misplaced int3 byte clobbering any
other addressing mode's register selection byte or displacement bytes.)

Moreover, I think the layering of functionality we have now is a good
thing.  That is, kprobes is (on x86) a robust generic facility for
inserting a probe "at any reasonable instruction" and having it work.
I think this should be a goal of its own for x86-64 kprobes as well.
To put it bluntly, I have a pretty strong "just do kprobes right" position.
The motivations for this view are not strictly within the scope of the current
systemtap project, but I'll put it out there as a personal priority.

As to the low-level issues, there are two components: detection, and fixup.
In brief, I think your assessment of the difficulty is overly pessimistic,
and I offer the supposition that we can in fact do the "best" solution.
The problem is intricate but not vast, and I think the combination of the
robustness goal I just mentioned and the sheer inelegance of the interfaces
required for the avoid-the-problem approaches, obligates us to give it the
old college try before falling over ourselves to avoid thinking about it.

Detection is the key element, really.  My asserted goal of a robust, simple
facility, not intrinsically requiring arcane knowledge of the particular
instruction being instrumented, makes detection mandatory: if you ask
kprobes to insert a probe at an instruction boundary and it tells you it
did so, that instruction ought to get executed with the proper effects.
Detection alone, with kprobes simply refusing to insert a probe on a
RIP-relative instruction, would be a marked improvement on the status quo.

I really think the notion that it would take "hundreds of lines" of code to
decode x86-64 instructions adequately to identify RIP-relative ones,
overstates the complexity of the problem.  The encoding is hairy, but it's
not that hairy.  There is plenty of experience with decoding it.  The
intimate knowledge required for doing so is in the book in front of me.  In
considering this complexity, it's important to recognize that it needn't be
bulletproof (though I am claiming that it's not desperately hard to make it
so).  We're only concerned with the instructions the compiler produces and
that really appear in kernel code.  We can do the objdump|grep on kernel
text to identify every RIP-relative instruction, and point the detector
code at each one to verify that it catches them all.  We can even use
objdump to tell us the insn boundaries, and then point it at every other
instruction to verify it has no false positives.  Whether it's as hard as
you suspect it is, or as doable as I suppose it is, if we have some code
that we think does it, we can certainly achieve confidence that it does or
doesn't do it adequately for the needs we can envisage.
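
As a rough illustration of the core of the detection test described above: in 64-bit mode, a ModRM byte with mod == 00 and rm == 101 selects RIP + disp32 addressing.  The sketch below assumes the caller has already skipped any prefixes and opcode bytes and is looking at the ModRM byte; finding that byte is the part that needs real decoding and is not shown.  The function name is hypothetical and is not taken from the kprobes source.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustrative name, not an existing kprobes function).
 *
 * ModRM layout: mod = bits 7:6, reg = bits 5:3, rm = bits 2:0.
 * In 64-bit mode, mod == 00 with rm == 101 means RIP + disp32.
 * The reg field does not affect the addressing mode, so mask it out.
 */
bool modrm_is_rip_relative(uint8_t modrm)
{
    return (modrm & 0xC7) == 0x05;
}
```

For example, the bytes 48 8b 05 <disp32> (a mov from disp32(%rip) into %rax) carry ModRM 0x05 and would be flagged, while 8b 45 08 (a mov from 0x8(%rbp) into %eax) carries ModRM 0x45 and would not.  A predicate like this could be checked against objdump output in the way suggested above.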

Given detection, we come to fixup (instruction adjustment).  There is only
one form of the RIP-relative addressing mode, which uses a signed 32-bit
displacement.  The only issue that arises is if the distance from the
instruction copy's location to the target address exceeds 2GB.  Rewriting
the instruction to use a precomputed 64-bit address instead is between
difficult and impossible (literally, depending on the instruction); in some
cases it would have to be rewritten to use a scratch register, with the
attendant hassles of that.  It's better if you can just locate the scratch
area for instruction copies within +/-2GB of the code into which
probes are being inserted.  Currently x86-64 kprobes uses vmalloc space for
the instruction copies, which is far away from the region containing the
kernel's code.  However, the kernel code and all loaded modules' code are
all placed within a region smaller than the 2GB cutoff.  So, kprobes could try
to find free pages within that range to allocate and make executable for
this purpose.  Another idea is to take advantage of the fact that modules
are always loaded into this close region, and require a module registering
a probe to provide some scratch space in its own executable code segment.
The scratch space would need to be written at probe insertion, exactly the
same time you are modifying the text anyway.
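
The displacement adjustment discussed above can be sketched as follows.  This is a minimal sketch under stated assumptions, not the kprobes implementation: it assumes the instruction length and the original disp32 have already been extracted by the detection step, and the function and parameter names are hypothetical.  The effective target is the address of the next instruction plus the displacement, so the copy needs a displacement that reaches the same target from its new location, and the fixup is only possible if that value still fits in a signed 32-bit field.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustrative only).
 *
 * orig_addr: address of the probed instruction in kernel text
 * copy_addr: address of the out-of-line copy of that instruction
 * insn_len:  length of the instruction in bytes
 * old_disp:  disp32 encoded in the original instruction
 *
 * Returns true and stores the adjusted displacement in *new_disp if the
 * copy can still reach the original target; returns false if the copy
 * is too far away for a signed 32-bit displacement.
 */
bool rip_rel_fixup(uint64_t orig_addr, uint64_t copy_addr,
                   unsigned int insn_len, int32_t old_disp,
                   int32_t *new_disp)
{
    /* RIP-relative target: address of the next instruction + disp32. */
    uint64_t target = orig_addr + insn_len + (uint64_t)(int64_t)old_disp;

    /* Displacement needed from the end of the copied instruction. */
    int64_t delta = (int64_t)(target - (copy_addr + insn_len));

    if (delta < INT32_MIN || delta > INT32_MAX)
        return false;

    *new_disp = (int32_t)delta;
    return true;
}
```

With a copy placed 1MB above the original instruction, the adjusted displacement is simply the old one minus 1MB.  With a copy placed in a distant region, such as the vmalloc area mentioned above, the range check fails, which is the case that motivates allocating the scratch area near the kernel and module text.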

I hope these thoughts give you some encouragement that we can achieve a very
satisfying result without any really monumental effort.


Thanks,
Roland

Follow-Ups:
- Re: x86_64 RIP-relative addressing bug
  - From: Jim Keniston

References:
- x86_64 RIP-relative addressing bug
  - From: Jim Keniston
