This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: memcmp-sse4.S EqualHappy bug
- From: Torvald Riegel <triegel at redhat dot com>
- To: Andrea Arcangeli <aarcange at redhat dot com>
- Cc: Szabolcs Nagy <nsz at port70 dot net>, libc-alpha at sourceware dot org, "H.J. Lu" <hongjiu dot lu at intel dot com>
- Date: Thu, 18 Jun 2015 16:23:35 +0200
- Subject: Re: memcmp-sse4.S EqualHappy bug
- Authentication-results: sourceware.org; auth=none
- References: <20150617172903 dot GC4317 at redhat dot com> <20150617185952 dot GE22285 at port70 dot net> <20150617210612 dot GB14955 at redhat dot com> <1434621040 dot 5250 dot 212 dot camel at localhost dot localdomain> <20150618124900 dot GD14955 at redhat dot com>
On Thu, 2015-06-18 at 14:49 +0200, Andrea Arcangeli wrote:
> On Thu, Jun 18, 2015 at 11:50:40AM +0200, Torvald Riegel wrote:
> > Nonetheless, at least the C++ committee is working on specifying a means
> > to access non-atomic-typed data atomically (but requiring all concurrent
> > accesses to also be atomic then (e.g., by using the same means)).
>
> On a side note in the kernel we already do this to access not-atomic
> regions "safely" (or as safe as feasible today) in C:
>
> static __always_inline void __read_once_size(const volatile void *p, void *res, int size)
> {
> switch (size) {
> case 1: *(__u8 *)res = *(volatile __u8 *)p; break;
> case 2: *(__u16 *)res = *(volatile __u16 *)p; break;
> case 4: *(__u32 *)res = *(volatile __u32 *)p; break;
> case 8: *(__u64 *)res = *(volatile __u64 *)p; break;
> default:
> barrier();
> __builtin_memcpy((void *)res, (const void *)p, size);
> barrier();
> }
> }
>
> static __always_inline void __write_once_size(volatile void *p, void *res, int size)
> {
> switch (size) {
> case 1: *(volatile __u8 *)p = *(__u8 *)res; break;
> case 2: *(volatile __u16 *)p = *(__u16 *)res; break;
> case 4: *(volatile __u32 *)p = *(__u32 *)res; break;
> case 8: *(volatile __u64 *)p = *(__u64 *)res; break;
> default:
> barrier();
> __builtin_memcpy((void *)p, (const void *)res, size);
> barrier();
> }
> }
>
> #define READ_ONCE(x) \
> ({ union { typeof(x) __val; char __c[1]; } __u; __read_once_size(&(x), __u.__c, sizeof(x)); __u.__val; })
>
> #define WRITE_ONCE(x, val) \
> ({ typeof(x) __val = (val); __write_once_size(&(x), &__val, sizeof(__val)); __val; })
>
> As you can see if __builtin_memcpy starts to leverage the pure C
> semantics (like not copying everything if the data changes at the
> start of the memory buffer so triggering undefined beahvior), things
> would get "interesting" as far as the kernel is concerned. I mean we
> depend on memcpy to explicitly not behave like memcmp is behaving now.
There are a few differences to your memcmp test case though. If
barrier() is a compiler barrier, this should "remove" prior knowledge
the compiler has about memory. Second, you access volatile-qualified
data, which at least requires to do the accesses byte-wise or similar.
Does the memcmp test case also fail when you acess volatiles?
Note that I'm not saying that this is guaranteed to work. It's still UB
if there is a data race. However, in practice, this might just continue
to work.
> For non-atomic regions, we can read/write it fine with
> READ_ONCE/WRITE_ONCE, we just lost a way to do memcmp in userland on
> non-atomic regions.
Userland programs should really try to transition to atomics and the
C11/C++11 memory model or just the __atomic builtins, if possible.