This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [RFC] improve tls access for tolower table and errno


On 06/06/2015 02:40 PM, Ondřej Bílka wrote:
> Hi, as I mentioned before, an inline strcasecmp would be problematic
> because it needs a call to tolower, which is suboptimal.
> 
> On architectures with a TLS register you don't need a call for TLS
> access, but let's start small.

Making the offset part of the ABI (like we do for the stack canary) has
been discussed before:

<https://sourceware.org/ml/libc-alpha/2015-03/msg00132.html>
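
For comparison, here is a minimal sketch of what a fixed ABI offset looks like from C. It assumes x86-64 Linux with glibc, where the stack protector canary sits at %fs:0x28; the helper names are hypothetical, and other targets simply return 0.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: x86-64 (LP64) Linux with glibc, where the stack protector
   canary lives at the fixed ABI offset %fs:0x28.  Hypothetical helper.  */
static uintptr_t
read_stack_guard (void)
{
#if defined (__x86_64__) && defined (__LP64__) && defined (__linux__)
  uintptr_t guard;
  /* A single load: no function call and no run-time offset lookup.  */
  __asm__ ("mov %%fs:0x28, %0" : "=r" (guard));
  return guard;
#else
  return 0;  /* Target not covered by this sketch.  */
#endif
}

/* The canary is set up once at startup, so two reads must agree.  */
static void
check_stack_guard_stable (void)
{
  assert (read_stack_guard () == read_stack_guard ());
}
```

Because the offset is a compile-time constant, no constructor or global offset variable is involved.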

> A sample implementation would be following, where should I add
> initializer and is there other way to get %fs than assembly?
> 
> #include <errno.h>
> #include <stdio.h>
> 
> static long __errno_offset;
> __attribute__((constructor))
> void  get_offset ()
> {
>   char *offset;
>   char *location = (char *) &errno;
>   __asm__ ("mov %%fs:0, %0" : "=r" (offset));
> 
>   __errno_offset = location - offset;
> }
> 
> static __always_inline 
> int *
> __ep()
> {
>   char *__offset;
>   __asm__ ("mov %%fs:0, %0" : "=r" (__offset));
> 
>   return (int *)(__offset + __errno_offset);
> }
> 
> #define errno2 (*__ep())

Constructor functions in header files are a nightmare.  C++ has
something similar for <iostream>, and the overhead from that is
substantial.  Many projects ban inclusion of <iostream> as a result.

The problem remains that errno is mostly used on error paths and

  call __errno_location
  movl (%rax), %eax

is much shorter than

  movq __errno_offset(%rip), %rax
  movq %fs:0, %rdx
  movl (%rax, %rdx), %eax

(7 versus 19 bytes).  On paths which are supposed to be executed rarely,
this increase in code size is not desirable.  There might be some wins
because less spilling is needed, but this seems rather theoretical because
in most cases, the __errno_location call only clobbers registers which have
already been clobbered by the preceding function call that failed.
Therefore, I don't expect wins on this front, either.
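
As a rough illustration of the call-based path, the sketch below reads errno through an explicit __errno_location call. It assumes glibc (or musl) on Linux, where <errno.h> declares __errno_location and defines errno in terms of it; the helper names are hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Assumption: glibc or musl on Linux, where <errno.h> declares
   __errno_location () and errno expands to (*__errno_location ()).  */

/* Hypothetical helper: the C equivalent of
   "call __errno_location; movl (%rax), %eax".  */
static int
errno_via_call (void)
{
  return *__errno_location ();
}

/* Hypothetical helper: stores VALUE in errno and checks that the macro
   and the explicit call refer to the same object.  Returns 1 if so.  */
static int
errno_paths_agree (int value)
{
  errno = value;
  assert (&errno == __errno_location ());
  return errno_via_call () == value;
}
```

Calling errno_paths_agree (ERANGE) after a failed call exercises the same access an ordinary error path performs.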

With the thread locale, performance concerns are different, but the
constructor issue is still valid.

Furthermore, future C++ versions may make caching the addresses of
thread-local variables invalid, so we should wait until the fate of
resumable functions and coroutines is decided, and what shape they take.
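
The hazard of a cached address can be sketched with plain threads, under the assumption that a resumed coroutine might continue on a different thread than the one that cached the pointer. This is only an analogy in C with POSIX threads, not the C++ proposal itself; all names are hypothetical, and older glibc versions need -pthread when linking.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Address of errno as seen by the thread that cached it.  */
static int *cached_errno;

/* Thread body: records whether the cached address is this thread's errno.  */
static void *
check_cached (void *result)
{
  assert (cached_errno != NULL);
  *(int *) result = (cached_errno == &errno);
  return NULL;
}

/* Returns 1 if the address cached by the caller is still valid on a
   second thread, 0 if it is stale, -1 if no thread could be created.  */
static int
cached_address_valid_on_other_thread (void)
{
  pthread_t thread;
  int valid = -1;

  cached_errno = &errno;  /* Cache in the calling thread.  */
  if (pthread_create (&thread, NULL, check_cached, &valid) != 0)
    return -1;
  pthread_join (thread, NULL);
  return valid;
}
```

The function returns 0 because both threads are alive at the same time and each has its own errno, so a pointer cached before a switch to another thread would refer to the wrong object.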

-- 
Florian Weimer / Red Hat Product Security

