This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH v2] malloc: Fix list_lock/arena lock deadlock [BZ #19182]
- From: Torvald Riegel <triegel at redhat dot com>
- To: Florian Weimer <fweimer at redhat dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Mon, 21 Dec 2015 16:21:23 +0100
- Subject: Re: [PATCH v2] malloc: Fix list_lock/arena lock deadlock [BZ #19182]
- Authentication-results: sourceware.org; auth=none
- References: <56389C10 dot 6060001 at redhat dot com> <56642691 dot 7060806 at redhat dot com> <1449852593 dot 27311 dot 98 dot camel at localhost dot localdomain> <566F1114 dot 1060706 at redhat dot com> <1450380373 dot 26597 dot 29 dot camel at localhost dot localdomain> <56740C24 dot 90901 at redhat dot com>
On Fri, 2015-12-18 at 14:37 +0100, Florian Weimer wrote:
> On 12/17/2015 08:26 PM, Torvald Riegel wrote:
> > On Mon, 2015-12-14 at 19:57 +0100, Florian Weimer wrote:
> >> On 12/11/2015 05:49 PM, Torvald Riegel wrote:
> >>> I would provide more detail on the synchronization of read-only
> >>> traversals of the list. list_lock seems to be used to enforce mutual
> >>> exclusion between modification operations to the list (including
> >>> necessary reading/traversal of the list, so you can't go in without a
> >>> lock, figure out what you need to change, and then grab list_lock). For
> >>> traversal-only / write-op synchronization, atomic operations are used.
> >>> This is the reason why write ops do not need to use acquire fences when
> >>> traversing the list. The acquire fences are missing in reused_arena,
> >>> and ptmalloc_[un]lock_all[2], it seems.
> >>
> >> Yeah. reused_arena also has a data race on the local static variable
> >> (and this one could actually be harmful).
> >
> > The lack of acquire fences is harmful. Or do you mean that this piece
> > of code might be more likely to cause code-generation-related problems
> > triggered by data races?
>
> I'm worried that the exit condition used for the loop does not trigger
> reliably and that two or more threads could loop endlessly in
> reused_arena. I don't know if it can happen in practice.
I think eventually, a thread would pick up the most recent values, and
given that the arenas stick around forever, the list navigation bug
might not lead to an error. However, when a new arena is added, a
thread may read uninitialized values for the elements of the arena struct
(because the pointer to the arena is published with mo_release but not
read with mo_acquire).
> > I think a FIXME would be good; we know there is a bug, so we should just
> > document that.
>
> The attached version of the patch points out the data races I know of.
> I also adjusted the comments in struct malloc_state in malloc.c.
OK. The changes should be enough to make the issue stand out.
> > I'm not sure whether my question came across, so let me ask again: When
> > the lock is released temporarily, we still hold a pointer to the arena.
> > Until we reacquire the lock, what can happen to the arena? Can the
> > arena and its memory be deleted or similar, and the memory be reused
> > for something else?
>
> Arenas are allocated once and then stick around until the process exits.
Okay, that's important, and it seems to be required for correctness.
> The locking during arena selection is only there to select an
> uncontended arena. Releasing an arena lock and reacquiring it changes
> the way arenas are selected and means that another thread is more likely
> to select the same arena.
>
> I'm not sure why we even perform arena selection in parallel. The risk
> is that the lock probing detects the additional selection process, but
> not actual arena usage (for which the acquired lock might be a poor
> indicator anyway).
Agreed, based on your description of the situation.
Overall, this patch looks good to me. Thanks!