This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: memcmp-sse4.S EqualHappy bug
- From: OndÅej BÃlka <neleai at seznam dot cz>
- To: Andrea Arcangeli <aarcange at redhat dot com>
- Cc: Torvald Riegel <triegel at redhat dot com>, Szabolcs Nagy <nsz at port70 dot net>, libc-alpha at sourceware dot org, "H.J. Lu" <hongjiu dot lu at intel dot com>
- Date: Thu, 18 Jun 2015 15:46:33 +0200
- Subject: Re: memcmp-sse4.S EqualHappy bug
- Authentication-results: sourceware.org; auth=none
- References: <20150617172903 dot GC4317 at redhat dot com> <20150617185952 dot GE22285 at port70 dot net> <20150617210612 dot GB14955 at redhat dot com> <1434621040 dot 5250 dot 212 dot camel at localhost dot localdomain> <20150618124900 dot GD14955 at redhat dot com>
On Thu, Jun 18, 2015 at 02:49:00PM +0200, Andrea Arcangeli wrote:
> On Thu, Jun 18, 2015 at 11:50:40AM +0200, Torvald Riegel wrote:
> > Nonetheless, at least the C++ committee is working on specifying a means
> > to access non-atomic-typed data atomically (but requiring all concurrent
> > accesses to also be atomic then (e.g., by using the same means)).
>
> On a side note in the kernel we already do this to access not-atomic
> regions "safely" (or as safe as feasible today) in C:
>
> static __always_inline void __read_once_size(const volatile void *p, void *res, int size)
> {
> switch (size) {
> case 1: *(__u8 *)res = *(volatile __u8 *)p; break;
> case 2: *(__u16 *)res = *(volatile __u16 *)p; break;
> case 4: *(__u32 *)res = *(volatile __u32 *)p; break;
> case 8: *(__u64 *)res = *(volatile __u64 *)p; break;
> default:
> barrier();
> __builtin_memcpy((void *)res, (const void *)p, size);
> barrier();
> }
> }
>
> static __always_inline void __write_once_size(volatile void *p, void *res, int size)
> {
> switch (size) {
> case 1: *(volatile __u8 *)p = *(__u8 *)res; break;
> case 2: *(volatile __u16 *)p = *(__u16 *)res; break;
> case 4: *(volatile __u32 *)p = *(__u32 *)res; break;
> case 8: *(volatile __u64 *)p = *(__u64 *)res; break;
> default:
> barrier();
> __builtin_memcpy((void *)p, (const void *)res, size);
> barrier();
> }
> }
>
> #define READ_ONCE(x) \
> ({ union { typeof(x) __val; char __c[1]; } __u; __read_once_size(&(x), __u.__c, sizeof(x)); __u.__val; })
>
> #define WRITE_ONCE(x, val) \
> ({ typeof(x) __val = (val); __write_once_size(&(x), &__val, sizeof(__val)); __val; })
>
> As you can see if __builtin_memcpy starts to leverage the pure C
> semantics (like not copying everything if the data changes at the
> start of the memory buffer so triggering undefined beahvior), things
> would get "interesting" as far as the kernel is concerned. I mean we
> depend on memcpy to explicitly not behave like memcmp is behaving now.
>
> For non-atomic regions, we can read/write it fine with
> READ_ONCE/WRITE_ONCE, we just lost a way to do memcmp in userland on
> non-atomic regions.
>
As I wrote before it wouldn't work. You need create something like READ_ONCE2(x,y)
to get both arguments simultaneously, otherwise I could create race with
(memcmp(x,x+1,1)) and other thread atomically changing both characters
by first writing 1, then 256.