This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: pthread_rwlock and writers preference

From: chrubis at suse dot cz
To: Torvald Riegel <triegel at redhat dot com>
Cc: Xiaoguang Wang <wangxg dot fnst at cn dot fujitsu dot com>, libc-alpha at sourceware dot org
Date: Thu, 22 May 2014 15:42:59 +0200
Subject: Re: pthread_rwlock and writers preference
Authentication-results: sourceware.org; auth=none
References: <533D4314 dot 1000306 at cn dot fujitsu dot com> <1399481873 dot 32485 dot 14490 dot camel at triegel dot csb>

Hi!
> I think the LTP test is invalid.  The fundamental problem is that thread
> A cannot detect whether thread B has blocked on a lock but not yet
> acquired the lock:
> * For it to block and wait for another thread C to unlock, C must hold a
> read lock (or write lock, depending on what kind of locking by B we want
> to detect).
> * We can detect C's presences, but we can't detect whether B is already
> blocked or doing something else before actually attempting to block.
> * B does not communicate anywhere that it is blocked.
> * POSIX does not specify a maximum time that B is allowed to actually
> start blocking.  Without that, one cannot make time-based assumptions.
> 
> The LTP test is making a time-based assumption, so I think it's invalid.
> 
> At first sight, this argument may look like trying hard to avoid what
> POSIX requires -- but if you think more about it, you'll notice that
> this is just an inherent property of how asynchronous systems work.
> There's a reason why things like liveness consider eventual progress,
> not bounded progress.
> 
> One could argue that the realtime / priority scheduling provisions that
> the test is trying to use would reveal whether B has run already and
> should be blocked: If we have a thread D with lowest priority that spins
> until B signals that it's about to start blocking, then if D sees that
> signal and is running, it could conclude that B doesn't need a CPU and
> thus has blocked.  However, that requires an upper limit on available
> resources, which isn't really practical to assume or enforce.  Perhaps
> more importantly, this assumes that there can be no other reason at all
> that B could block; this would limit, for example, the blocking that B
> is allowed to do internally (e.g., when implementing a function call,
> malloc, ...).  This also seems impractical.

I more or less agree with this. And I sometimes wonder if there is
something similar to Godel's incompleteness for API specification (you
cannot test it all with the assertions from specification).

However for practical purposes you have to deal with this somehow.
Adding timeouts and expecting the code to finish in reasonable time is
one of the usuall tradeoffs.

Or in shorter version, you can pick any two of:

1. testcases proven correct
2. complete coverage
3. sticking only to assertion from specification

Open Posix Testsuite requires the third one by definition which makes
writing testcases harder.

> > And there were some clarification requests for Austin Group about this issue.
> > see:
> > 
> > http://austingroupbugs.net/view.php?id=720
> > http://austingroupbugs.net/view.php?id=722
> > 
> > According to the discussion results, it seems that write locks should be implemented to
> > take precedence before the read locks. 
> 
> I don't see this as the outcome.  The current POSIX spec (with the
> changes resulting from 720 and 722) states that unless Thread Execution
> Scheduling applies, the implementation can decide whether it blocks
> readers or writers:
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_rwlock_rdlock.html
> 
> > But actually in glibc, if a reader has owned the lock and a writers is waiting on the lock,
> > now if another reader tries to get this lock, it will succeed. This is not correct behavior
> > according to pthread_rwlock_rdlock()'s manpage.
> > 
> > Would anyone help to confirm whether is this a glibc bug or should be fixed? Thanks.
> 
> If any of the threads involved is not running under SCHED_FIFO or
> SCHED_RR, glibc's behavior is allowed by the specification, I think.

It looks to me like this applies to SCHED_SPORADIC as well but apart
from that I agree.

> Otherwise, it does seem to violate the POSIX requirements, but one can
> argue whether those are practical to implement.  If a thread executing
> rdlock sees that there is a waiting wrlock thread, it would have to
> check whether it already acquired the rwlock (ie, whether this is a
> recursive rdlock); only if it does it can ignore the waiter.  This
> requires more space than available in a realistic rwlock data structure,
> so does require memory allocation.  I believe we can implement that, but
> it will likely be slower, especially when actually doing recursive
> rdlock calls or several nested rdlocks to separate rwlock instances.

I this consensus from all glibc maintainers?

If so, can we close the bug with this reason and I will be happy to
disable these tests on Linux referencing the glibc bug.

> In summary, I think the case of starving a high-priority writer that
> POSIX tries to disallow is realistic, and it's possible to implement at
> a cost.  OTOH, one can't write a reliable test for the expected
> behavior, which doesn't exactly support that we need a change.

Again I seems that we are in agreement here. Can proceed with concrete
steps and solve the bug? :)

-- 
Cyril Hrubis
chrubis@suse.cz

References:
- Re: pthread_rwlock and writers preference
  - From: Torvald Riegel

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]