This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 6/6] Compile sched_getaffinity.c with -fno-builtin-memset


On Fri, 14 Aug 2015, H.J. Lu wrote:

> On Fri, Aug 14, 2015 at 8:27 AM, Joseph Myers <joseph@codesourcery.com> wrote:
> > On Fri, 14 Aug 2015, H.J. Lu wrote:
> >
> >> Since sched_getaffinity.c calls memset which may not be inlined, we
> >> should compile it with -fno-builtin-memset so that the internal hidden
> >> memset is called without PLT.
> >>
> >> OK for master?
> >
> > memset is called all over the place.  I don't like hardcoding special
> > options for particular files based on calls to memset, and especially not
> > based on what some particular compiler version happens to do with a
> > particular call.  Why doesn't the call to libc_hidden_builtin_proto
> > (memset) in include/string.h suffice?  Can you devise some set of
> > declarations / toplevel asms to put in a header that will ensure memset
> > calls go via a hidden alias (ideally with the compiler knowing it's
> > hidden, not just the assembler / linker, so the compiler can do any
> > optimizations based on it being an intra-library call)?
> 
> libc_hidden_builtin_proto (memset)
> 
> has no impact on memset inlined by compiler.

Well, I see references to __GI_memset, not memset, in sched_getaffinity.os 
for both x86 and x86_64 (building with GCC 5).  So it seems to be working 
OK for me.  It's true some of those references use PLT-generating 
relocations, but those relocations shouldn't actually generate PLT entries 
when they are relocations against symbols not exported from the library - 
the linker should convert them to direct calls.  If they do generate PLT 
entries, it sounds like a linker optimization is missing.

Apart from any missing linker optimizations, the compiler should know when 
calls are to hidden functions, as such calls may be more efficient than 
calls for which only the linker determines that no PLT entry is needed.  On 32-bit 
x86, that means the compiler knowing it doesn't need to set %ebx up to 
point to the GOT as it would for a call through the PLT.  I don't know 
whether this gets optimized for built-in function calls using 
libc_hidden_builtin_proto, but if it doesn't, that seems like a missed 
optimization that should be addressed in the compiler not in glibc.
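
To illustrate the mechanism under discussion, here is a minimal sketch of 
redirecting calls to a hidden alias, in the general style of glibc's 
hidden-prototype macros.  It is not the actual glibc macro expansion.  The 
names demo_fill and demo_clear_mask are hypothetical stand-ins (a real 
memset is not used, so that the example links outside glibc), and the 
sketch assumes GCC or a compatible compiler.

```c
#include <stddef.h>

/* Public-style prototype, as an installed header might declare it.
   demo_fill is a hypothetical stand-in for memset.  */
void *demo_fill (void *dst, int c, size_t n);

/* Redeclare the function with an assembler name and hidden visibility.
   From here on, every reference to demo_fill in this translation unit
   uses the symbol __GI_demo_fill, and the compiler itself knows that
   the symbol cannot be preempted from outside the library.  */
extern __typeof (demo_fill) demo_fill
  __asm__ ("__GI_demo_fill") __attribute__ ((visibility ("hidden")));

/* The definition is emitted under the hidden name __GI_demo_fill.
   (glibc would additionally export a public alias; this sketch
   deliberately omits that step.)  */
void *
demo_fill (void *dst, int c, size_t n)
{
  unsigned char *p = dst;
  for (size_t i = 0; i < n; i++)
    p[i] = (unsigned char) c;
  return dst;
}

/* An intra-library caller: the call below refers to the hidden
   symbol, so no PLT entry should be required for it.  Returns the
   number of bytes that are still nonzero after clearing.  */
size_t
demo_clear_mask (unsigned char *mask, size_t len)
{
  demo_fill (mask, 0, len);

  size_t nonzero = 0;
  for (size_t i = 0; i < len; i++)
    if (mask[i] != 0)
      nonzero++;
  return nonzero;
}
```

Compiling this as position-independent code and inspecting the generated 
assembly or the relocations in the object file should show whether the 
call in demo_clear_mask refers to __GI_demo_fill directly.  The exact 
output depends on the compiler version and target.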

-- 
Joseph S. Myers
joseph@codesourcery.com

