This is the mail archive of the
mailing list for the binutils project.
Re: [x86-64 psABI] RFC: Extend x86-64 PLT entry to support MPX
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Roland McGrath <roland at hack dot frob dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>, GCC Development <gcc at gcc dot gnu dot org>, Binutils <binutils at sourceware dot org>, "Girkar, Milind" <milind dot girkar at intel dot com>, "Kreitzer, David L" <david dot l dot kreitzer at intel dot com>
- Date: Thu, 25 Jul 2013 10:11:35 -0700
- Subject: Re: [x86-64 psABI] RFC: Extend x86-64 PLT entry to support MPX
- References: <CAMe9rOp=1v38F_aV-pbv50YOGSEr_ju+byZP1L_G_h4bm5Ad3w at mail dot gmail dot com> <20130724233621 dot DA6942C08C at topped-with-meat dot com>
On Wed, Jul 24, 2013 at 4:36 PM, Roland McGrath <roland at hack dot frob dot com> wrote:
> I've read through the MPX spec once, but most of it is still not very
> clear to me. So please correct any misconceptions. (HJ, if you answer
> any or all of these questions in your usual style with just, "It's not a
> problem," I will find you and I will kill you. Explain!)
> Will an MPX-using binary require an MPX-supporting dynamic linker to run
Yes. But you may lose MPX protection in the MPX library, since the bound
registers are cleared on the first call with lazy binding:
MPX code -> PLT -> ld.so -> PLT -> MPX library
> Those are the background questions to help me understand better.
> Now, to your specific questions.
> Now, assuming we are talking about a uniform PLT in each object, there
> is the question of whether to use a new PLT layout everywhere, or only
> when linking an object with some input files that use MPX.
I am proposing the uniform PLT in each object. That was my first
> * My initial reaction was to say that we should just change it
> unconditionally to keep things simple: use new linker, get new format,
> end of story. Simplicity is good.
This is my thinking also.
> * But, doubling the size of PLT entries means more i-cache pressure. If
> cache lines are 64 bytes, then today you fit four entries into a cache
> line. Assuming PLT entries are more used than unused, this is a good
> thing. Reducing that to two entries per cache line means twice as
> many i-cache misses if you hit a given PLT frequently (with even
> distribution of which entries you actually use--at any rate, it's
> "more" even if it's not "twice as many"). Perhaps this is enough cost
> in real-world situations to be worried about. I really don't know.
> * As I mentioned before, there are things floating around that think
> they know the size of PLT entries. Realistically, there will be
> plenty of people using new tools to build binaries but not using MPX
> at all, and these people will give those binaries to people who have
> old tools. In the case of someone running an old objdump on a new
> binary, they would see bogus foo@plt pseudo-symbols and be misled and
> confused. Not to mention the unknown unknowns, i.e. other things that
> "know" the size of PLT entries that we don't know about or haven't
> thought of here. It's just basic conservatism not to perturb things
> for these people who don't care about or need anything related to MPX
> at all.
We can investigate if the old objdump can deal with PLT entry size
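
To make the two concerns above concrete, here is a minimal sketch in C. It
assumes the sizes mentioned in the thread (64-byte cache lines, four current
entries per line, so 16 bytes per entry, and 32 bytes after doubling). The
function names and the flat "uniform entry size" model are illustrative
assumptions only; they are not taken from binutils or objdump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes from the discussion: 64-byte cache lines, 16-byte PLT entries
   today, 32-byte entries if the entry size is doubled. */
enum { CACHE_LINE = 64, PLT_ENTRY_OLD = 16, PLT_ENTRY_NEW = 32 };

/* How many whole PLT entries fit in one cache line. */
size_t entries_per_line(size_t line_size, size_t entry_size)
{
    assert(entry_size > 0);
    return line_size / entry_size;
}

/* Hypothetical helper: map an address inside the PLT to a slot number,
   the way a tool that hard-codes the entry size might do when it
   synthesizes foo@plt names. A real tool also has to account for the
   reserved first entry; this sketch ignores that. */
size_t plt_slot(uint64_t plt_start, uint64_t addr, size_t entry_size)
{
    assert(entry_size > 0 && addr >= plt_start);
    return (size_t)((addr - plt_start) / entry_size);
}
```

With these definitions, entries_per_line(CACHE_LINE, PLT_ENTRY_OLD) is 4 and
entries_per_line(CACHE_LINE, PLT_ENTRY_NEW) is 2. An address 0x40 bytes into
the PLT is slot 2 with 32-byte entries, but a tool that still assumes 16-byte
entries computes slot 4 and would label it with the wrong symbol.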
> How a relocatable object is marked so that the linker knows whether its
> code is MPX-compatible at link time and how a DSO/executable is marked
> so that the dynamic linker knows at runtime are two separate subjects.
> For relocatable objects, I don't think there is really any precedent for
> using ELF notes to tell the linker things. It seems much nicer if the
We have been using the .note.GNU-stack section at link time for a long time.
> linker continues to treat notes completely normally, i.e. appending
> input files' same-named note sections together like with any other named
> section rather than magically recognizing and swallowing certain notes.
> OTOH, the SHT_GNU_ATTRIBUTES mechanism exists for exactly this sort of
> purpose and is used on other machines for very similar sorts of issues.
> There is both precedent and existing code in binutils to have the linker
> merge attribute sections from many input files together in a fashion
> aware of the semantics of those sections, and to have those attributes
> affect the linker's behavior in machine-specific ways. I think you have
> to make a very strong case to use anything other than SHT_GNU_ATTRIBUTES
> for this sort of purpose in relocatable objects.
> For linked objects, there are a couple of obvious choices. They all require
> that the linker have special knowledge to create the markings. One
> option is a note. We use .note.ABI-tag for a similar purpose in libc,
> but I don't know of any precedent for the linker synthesizing notes.
> The most obvious choice is e_flags bits. That's what other machines use
> to mark ABI variants. There are no bits assigned for x86 yet. There
> are obvious limitations to using e_flags, in that it's part of the
> universal ELF psABI rather than something with vendor extensibility
> built in like notes have, and in that there are only 32 bits available
> to assign rather than being a wholly open-ended format like notes. But
> using e_flags is certainly simpler to synthesize in the linker and
> simpler to recognize in the dynamic linker than a note format. I think
> you have to make at least a reasonable (objective) case to use a note
> rather than e_flags, though I'm certainly not firmly against a note.
My main concerns are that e_flags isn't very extensible and that
old tools may not be able to handle it properly. A note
section is backward compatible. Given that MPX instructions are
NOPs on older hardware, it is safe for old tools to ignore the note.
If we use the note section in linked objects, it is more consistent
to also use it in relocatable files. We just need to dump the note
section to get the MPX info for both relocatable files and linked objects.
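
As a rough illustration of what "dump the note section" involves, the sketch
below walks a buffer of ELF note records and looks for one by name and type.
The header layout (n_namesz, n_descsz, n_type, then name and descriptor) is
the standard ELF note format. The 4-byte padding is an assumption that holds
for the traditional GNU notes, and no MPX note name or type has been defined
in this thread, so any values passed in are purely hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as the ELF note header in <elf.h>; redefined here
   so the sketch builds on any host. */
typedef struct {
    uint32_t n_namesz;   /* length of name, including the trailing NUL */
    uint32_t n_descsz;   /* length of the descriptor */
    uint32_t n_type;     /* note type, interpreted per name */
} NoteHdr;

/* Round up to a 4-byte boundary (assumed note padding). */
#define ALIGN4(x) (((x) + 3u) & ~(size_t)3u)

/* Return a pointer to the descriptor of the first note matching
   name/type, or NULL if none is found or the buffer is truncated. */
const unsigned char *find_note(const unsigned char *buf, size_t len,
                               const char *name, uint32_t type,
                               uint32_t *descsz)
{
    size_t off = 0;
    size_t want;

    assert(buf != NULL && name != NULL);
    want = strlen(name) + 1;

    while (len - off >= sizeof(NoteHdr)) {
        NoteHdr h;
        size_t name_off, desc_off, next;

        memcpy(&h, buf + off, sizeof h);
        name_off = off + sizeof h;
        desc_off = name_off + ALIGN4((size_t)h.n_namesz);
        next = desc_off + ALIGN4((size_t)h.n_descsz);
        if (next > len)
            break;                      /* truncated record */

        if (h.n_type == type && h.n_namesz == want &&
            memcmp(buf + name_off, name, want) == 0) {
            if (descsz != NULL)
                *descsz = h.n_descsz;
            return buf + desc_off;
        }
        off = next;                     /* unknown notes are skipped */
    }
    return NULL;
}
```

The same routine works on the contents of a note section from a relocatable
file or a linked object, and a consumer that does not recognize a note simply
skips over it, which is the backward-compatibility property argued for above.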
> Finally, you've only mentioned x86-64. The hardware details apply about
> the same to x86-32 AFAICT. If this is something that we'll eventually
> want to do for x86-32 as well, then I think we should at least hash out
> the plan for x86-32 fairly thoroughly before committing to a plan for
> x86-64 (even if the actual implementation for x86-32 lags). Probably
> it's all much the same and working it through for x86-32 won't give us
> any pause in our x86-64 plans, but we won't know until we actually do it.
For ia32, my question is whether MPX should be supported for functions
with the regparm attribute. If not, there is no problem with the PLT,
since bound registers won't be used for passing bounds for
pointers passed in registers and the PLT isn't used for function returns.
If we want to support MPX for functions with the regparm attribute,
we will run into the same issue as x86-64. My preference is not
to support MPX functions with the regparm attribute.