This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC/PATCH] ARM: VDSO support
- From: Nathan Lynch <Nathan_Lynch at codesourcery dot com>
- To: Joseph Myers <joseph at codesourcery dot com>
- Cc: <libc-alpha at sourceware dot org>
- Date: Tue, 7 Apr 2015 16:19:23 -0500
- Subject: Re: [RFC/PATCH] ARM: VDSO support
- References: <1428081934-22419-1-git-send-email-nathan_lynch at codesourcery dot com> <alpine dot DEB dot 2 dot 10 dot 1504071643550 dot 20250 at digraph dot polyomino dot org dot uk>
On 04/07/2015 11:46 AM, Joseph Myers wrote:
> On Fri, 3 Apr 2015, Nathan Lynch wrote:
>
>> This patch adds support for the ARM VDSO to glibc. I have run make
>> check on OMAP5 using kernels with and without the VDSO, with no new
>> failures.
>
> How does this compare to the implementations for other architectures?
One consideration that informed this implementation for ARM is that the
timestamp-related APIs (gettimeofday, clock_gettime) rely on an
architecture extension that is implemented on only a subset of CPUs.
CPUs such as Cortex-A9, which lack this extension, can be expected to
remain in wide use for some time. So I have paid particular attention
to the cost imposed by this code on systems where the VDSO is _not_
useful for those APIs. (On such systems the kernel presents a VDSO, but
lookups of the gettimeofday and clock_gettime symbols return NULL.)
I referred specifically to the recent x86 VDSO work when writing this.
Unlike the support that was added for the 32-bit x86 VDSO, however,
this code introduces a test and branch into the system call path rather
than an unconditional dispatch through a function pointer. This seems
to impose the least overhead. For example, on i.MX6 (Cortex-A9), where I
measured gettimeofday() to take 560-570ns, the overhead introduced by
unconditional dispatch was ~30ns, while the proposed patch adds only ~15ns.
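To make the two dispatch styles concrete, here is a minimal sketch in
plain C. It is not the code from the patch: every name in it (time_fn,
syscall_path, vdso_path, set_vdso_available, and so on) is hypothetical,
and the "VDSO" and "syscall" paths are stand-in functions that return
fixed values. It shows only the shape of the control flow, not the
measured cost difference.

```c
#include <stddef.h>

/* Hypothetical signature standing in for a time-query entry point. */
typedef int (*time_fn)(long *out);

/* Stand-in for the real system call path (assumption for the sketch). */
static int syscall_path(long *out) { *out = 100; return 0; }

/* Stand-in for a VDSO-provided fast path (assumption for the sketch). */
static int vdso_path(long *out) { *out = 200; return 0; }

/* Style 1: test and branch.  The pointer stays NULL when the symbol
   lookup fails, and the caller falls through to the syscall path. */
static time_fn vdso_fn_or_null = NULL;

int get_time_branch(long *out)
{
    if (vdso_fn_or_null != NULL)
        return vdso_fn_or_null(out);
    return syscall_path(out);
}

/* Style 2: unconditional dispatch.  The pointer is always valid and
   defaults to the syscall wrapper, so every call is indirect. */
static time_fn dispatch_fn = syscall_path;

int get_time_indirect(long *out)
{
    return dispatch_fn(out);
}

/* Simulates the result of a startup-time symbol lookup. */
void set_vdso_available(int available)
{
    vdso_fn_or_null = available ? vdso_path : NULL;
    dispatch_fn = available ? vdso_path : syscall_path;
}
```

In the first style, a system without a usable VDSO pays for one load
and one compare before the direct call to the syscall path; in the
second, it pays for an indirect call on every invocation.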
If I can employ ifunc as Adhemerval suggested, I expect the overhead
could be reduced further or eliminated.
I avoided implementing INLINE_VSYSCALL for ARM partly out of the
perhaps-mistaken belief that it would require assembly (both ARM and
Thumb versions of INLINE_VSYSCALL_NCS), and partly to avoid preprocessor
token pasting, which defeats grep (__vdso_##name). But if
INLINE_VSYSCALL is the preferred method, I can use that; I have an older
version of the patch that uses it.
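As a small illustration of the grep concern, consider the sketch below.
It does not reproduce glibc's INLINE_VSYSCALL; the demo_vdso_ prefix,
the DEMO_VDSO_CALL macro, and the two stub functions are all
hypothetical names made up for the example.

```c
/* Hypothetical stand-ins; the demo_vdso_ prefix is illustrative only
   and the functions just return fixed values. */
static int demo_vdso_gettimeofday(void) { return 1; }
static int demo_vdso_clock_gettime(void) { return 2; }

/* Token pasting builds the identifier during preprocessing, so the
   full name demo_vdso_gettimeofday never appears at the call site. */
#define DEMO_VDSO_CALL(name) demo_vdso_##name()

/* Call through the pasting macro: only "gettimeofday" is visible here. */
int call_pasted(void)
{
    return DEMO_VDSO_CALL(gettimeofday);
}

/* Explicit call: the full identifier is visible to a text search. */
int call_explicit(void)
{
    return demo_vdso_clock_gettime();
}
```

Searching the source for demo_vdso_gettimeofday finds only its
definition, not the use in call_pasted, whereas searching for
demo_vdso_clock_gettime finds both the definition and the call.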
> What are the points on which the VDSO interfaces vary between
> architectures?
The vdso(7) page from the Linux man-pages project lists the VDSO
interfaces by architecture (see the "ARCHITECTURE-SPECIFIC NOTES"
section) and seems to be accurate and complete except for the omission
of tile:
http://man7.org/linux/man-pages/man7/vdso.7.html
There is a lot of variation (even in the soname), but it is common to
provide at least gettimeofday and clock_gettime.
> It would seem desirable for the code to be factored so
> that the ARM code only contains the minimal set of things that are
> necessarily ARM-specific (for example, just declaring the ARM-specific
> choices before including architecture-independent files shared by all
> architectures that can get these functions from a VDSO).
OK. While I agree that seems desirable, I think it's worth emphasizing
that, unlike on the other architectures, VDSO acceleration of the
time-related syscalls on ARM cannot be considered the usual case, as
discussed above. I don't want to pessimize the "fallback" syscall case
on ARM unnecessarily.