This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: memcmp-sse4.S EqualHappy bug


On Fri, 2015-06-19 at 13:08 -0400, Simo Sorce wrote:
> It is true you identify 2 different issues, but I am not interested in
> either, I merely care about failing safe.
> 
> On Fri, 2015-06-19 at 17:32 +0200, Torvald Riegel wrote:
> > I see two separate issues here.  First, where do we draw the line, and
> > what do we guarantee.  I strongly believe that programs must not have
> > data races, and that they should use atomics or other synchronization
> > properly (this doesn't mean locks, but relaxed memory order atomics,
> > for example).
> 
> Programs should not have data races, true, but sometimes they do,
> failing safe makes 'bad' programs not be worse.
> 
> > Second, do we try to keep buggy programs working.  If it has no cost
> > to do so (e.g., like it might be true in this case), then doing that
> > can help to trigger less errors.  But that doesn't mean the buggy
> > programs should get fixed eventually.
> 
> Sure, but sometimes it is hard to define something as a bug.
> 
> In the specific case Andrea reported (which I pondered on), the race was
> clearly there, but also clearly the underlying memory was never
> identical at any given point, yet memcmp reported 0 which means: the
> whole memory area was checked and matched. This was clearly not the
> case, memcmp didn't even check the whole memory area, and returned 0.

This is allowed if something is undefined behavior.  A return value of 0
only has the specified meaning if there's no undefined behavior
elsewhere.

> Now, if the 2 memory areas ever where (even for a nanosecond) identical,
> then returning zero would have been perfectly acceptable.

How do you know that when there are data races?  Making this assumption
requires making assumptions about how the modification happened.  For
example, if the modifications used sequential code too, this could
potentially write temporary values.

Second, you say "even for a nanosecond", and that seems to say to you're
thinking about some kind of total order of events.  This isn't
necessarily the case though; even if you'd use relaxed memory-order
atomic stores and loads to access the array, what a the memcpy using the
atomics would see would not necessarily be ordered as if the stores
would happen one at a time, or in some way that can be totally ordered.

> During a race
> you can't know what the result is.

The standard "specifies" undefined behavior, which allows for much more
than not knowing what the result is.  It could we conforming to burn
your computer, for example ;) 

> But a function that supposedly checks
> a data area for equality and instead stops short and return "equal" seem
> just wrong, and it is certainly unexpected.

I do understand that it could be surprising behavior, especially if
thinking at the level of what a straight-forward implementation may or
may not do.  But we have to draw the line somewhere to enable compiler
optimizations, and we don't want to describe compiler optimizations in
detail nor reason about that detail, so the big hammer of undefined
behavior is a sensible choice.

> I understand that the cause for that is the program is operating on part
> of the area currently and it shouldn't but still, I would expect memcmp
> to at least always go through the whole area it is expected to check and
> not stop short on a fluke and return a "random" result.

I am sympathetic to this viewpoint, but a random result, or undefined
behavior, is really not that unrealistic if you consider more complex C
code with data races, that may be subject to more compiler optimization
than one would expect for memcmp or such.

Just to be clear, I don't want to be simply nitpicking here, nor say
that the standard is the only true design choice, or something like
that.  I just think that it's dangerous to start considering some data
races to be "benign" because that's a truly slippery slope and the
reasoning required for that can be really complicated.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]