This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug nptl/23183] tst-robustpi4 test failure on s390x


https://sourceware.org/bugzilla/show_bug.cgi?id=23183

Stefan Liebler <stli at linux dot ibm.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |stli at linux dot ibm.com

--- Comment #1 from Stefan Liebler <stli at linux dot ibm.com> ---
Created attachment 11071
  --> https://sourceware.org/bugzilla/attachment.cgi?id=11071&action=edit
Reduced testcase which runs without the glibc-testsuite

I could reproduce this failure. I've used today's glibc-upstream sources
without special CFLAGS.
The kernels used are 4.14/4.16/4.17.

Here are some observations:
tst-robustpi4.c includes tst-robust1.c with:
#define ENABLE_PI 1       (thus the mutex is a PTHREAD_MUTEX_ROBUST_NP with the
PTHREAD_PRIO_INHERIT protocol)
#define NOT_CONSISTENT 1  (thus pthread_mutex_init is called in each round)

For each round, the test creates the tf-thread, which locks the mutex and is
then canceled by the main-thread.
Then the main-thread locks the mutex via pthread_mutex_lock and expects
EOWNERDEAD.
This call sometimes triggers the assertion.
If the round is odd, the tf-thread is joined before this call to
pthread_mutex_lock.

According to the generated coredumps, the assertion occurs only if the round is
even
=> Thus there was no pthread_join before the main-thread locks m1!

But the coredump contains only the main-thread, not the tf-thread.
=> At least at the time of coredump generation, the tf-thread had already
exited.

The following assertion has triggered:
/* ESRCH can happen only for non-robust PI mutexes where
   the owner of the lock died.  */
assert (INTERNAL_SYSCALL_ERRNO (e, __err) != ESRCH || !robust);
As robust = 16, the syscall returned ESRCH even though the mutex is robust.
Is the comment "for non-robust PI mutexes" correct?

(gdb) p/x m1
$4 = {__data = {__lock = 0xc0000000 = FUTEX_OWNER_DIED | FUTEX_WAITERS,
    __count = 0x1, __owner = 0xebc3, __nusers = 0x1, __kind = 0xb0,
    __spins = 0x0, __elision = 0x0,
    __list = {__prev = 0x3ffadcff9f0, __next = 0x3ffadcff9f0}},


I've extracted the relevant calls and adjusted the test so that pthread_join is
never used before the main-thread locks m1, and increased the number of rounds.
On s390x, I've seen the assertion more often on a z/VM guest than when running
directly on an LPAR!
On x86_64, I've used a KVM guest with Fedora 28. If I only run the test, I
don't see this assertion.
But if I e.g. build glibc while the testcase is running, this assertion is also
triggered on x86_64!


I've used uprobes and set up the following uprobe events with a bash script of
mine (the add_sym_tp function adds a new tracepoint in a specific
shared library - in this case libpthread.so):
# Mark new round when tf calls pthread_setcancelstate:
# Note: this is also called by assert!
#000000000000f908 <pthread_setcancelstate> clfi %r2,1
#000000000000f90e <pthread_setcancelstate+0x6> jh 000000000000f992 <pthread_setcancelstate+0x8a>
#000000000000f912 <pthread_setcancelstate+0xa> stmg     %r10,%r15,80(%r15)
#000000000000f918 <pthread_setcancelstate+0x10> ear     %r4,%a0
#000000000000f91c <pthread_setcancelstate+0x14> sllg    %r4,%r4,32
#000000000000f922 <pthread_setcancelstate+0x1a> ear     %r4,%a1
#-> 000000000000f926 <pthread_setcancelstate+0x1e> l    %r2,264(%r4)
#r4=volatile struct pthread *self = THREAD_SELF;
add_sym_tp libpthread pthread_setcancelstate+0x1e "NEWROUNDTFtid=+208(%r4):x32"

# pthread_cancel() called by main-thread and I see the thread-id
#000000000000f7e8 <pthread_cancel> l    %r1,208(%r2)
#-> 000000000000f7ec <pthread_cancel+0x4> cijnh %r1,0,000000000000f8b2 <pthread_cancel+0xca>
#  if (INVALID_TD_P (pd)) => r1=tid
add_sym_tp libpthread pthread_cancel+0x4 "TFtid=%r1:x32"

# call of syscall exit-thread (see nptl/pthread_create.c
# at the end of START_THREAD_DEFN() { ... __exit_thread (); }!
#-> 0000000000007c12 <start_thread+0x172> svc   1
# r2 is the first argument of the syscall.
# I just need something to mark this event with EXITTHREAD!
add_sym_tp libpthread start_thread+0x172 "EXITTHREAD=%r2:u32"

# __pthread_mutex_lock_full() after the compare-and-swap
# and before/after the futex(FUTEX_LOCK_PI) syscall which returns ESRCH!
#int newval = id;
## ifdef NO_INCR
#newval |= FUTEX_WAITERS;
## endif
#oldval = atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
#newval, 0);
add_sym_tp libpthread __pthread_mutex_lock_full+0x8e "CSOLDVAL=%r2:x32 NEWVAL=%r10:x32"
#0000000000009eda <__pthread_mutex_lock_full+0x8a> cs   %r2,%r10,0(%r11)
#-> 0000000000009ede <__pthread_mutex_lock_full+0x8e> cije %r2,0,000000000000a002 <__pthread_mutex_lock_full+0x1b2>
#0000000000009ee4 <__pthread_mutex_lock_full+0x94> cije %r9,0,000000000000a144 <__pthread_mutex_lock_full+0x2f4>
#0000000000009eea <__pthread_mutex_lock_full+0x9a> lghi %r3,6
#0000000000009eee <__pthread_mutex_lock_full+0x9e> lgr  %r2,%r11
#0000000000009ef2 <__pthread_mutex_lock_full+0xa2> lghi %r4,1
#0000000000009ef6 <__pthread_mutex_lock_full+0xa6> lghi %r5,0
#-> 0000000000009efa <__pthread_mutex_lock_full+0xaa> svc  238
#-> 0000000000009efc <__pthread_mutex_lock_full+0xac> lhi  %r3,-4096
#0000000000009f00 <__pthread_mutex_lock_full+0xb0> clrjnh %r2,%r3,0000000000009f12 <__pthread_mutex_lock_full+0xc2>
#if (oldval != 0)
#{
# /* The mutex is locked.  The kernel will now take care of
#    everything.  */
# int private = (robust
#    ? PTHREAD_ROBUST_MUTEX_PSHARED (mutex)
#    : PTHREAD_MUTEX_PSHARED (mutex));
#    INTERNAL_SYSCALL_DECL (__err);
#int e = INTERNAL_SYSCALL (futex, __err, 4, &mutex->__data.__lock,
#                             __lll_private_flag (FUTEX_LOCK_PI,
#                                                 private), 1, 0);
# .... assert (INTERNAL_SYSCALL_ERRNO (e, __err) != ESRCH || !robust);
add_sym_tp libpthread __pthread_mutex_lock_full+0xaa "SVCFUTEXUADDR=%r2:x64 OP=%r3:u32 MEMVAL=+0(%r11):x32 VAL=%r4:x32 TIMEOUT=%r5:u32"
add_sym_tp libpthread __pthread_mutex_lock_full+0xac "SVCFUTEXRET=%r2:x32 MEMVAL=+0(%r11):x32"



Here is an excerpt of the trace of the attached test-program,
which never joins the thread before the main-thread locks the mutex.
Note that the last round is the one with the assertion:
# tracer: nop
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
           <...>-37926 [000] d... 358524.031410:
libpthread_pthread_setcancelstate_0x1e: (0x3ff9908f926) NEWROUNDTFtid=0x9426
           <...>-37926 [000] d... 358524.031411:
libpthread___pthread_mutex_lock_full_0x8e: (0x3ff99089ede) CSOLDVAL=0x0
NEWVAL=0x9426
           <...>-37715 [004] d... 358524.031418: libpthread_pthread_cancel_0x4:
(0x3ff9908f7ec) TFtid=0x9426
           <...>-37715 [004] d... 358524.031422:
libpthread___pthread_mutex_lock_full_0x8e: (0x3ff99089ede) CSOLDVAL=0x9426
NEWVAL=0x9353
           <...>-37715 [004] d... 358524.031423:
libpthread___pthread_mutex_lock_full_0xaa: (0x3ff99089efa)
SVCFUTEXUADDR=0x10030d0 OP=6 MEMVAL=0x9426 VAL=0x1 TIMEOUT=0
           <...>-37926 [000] d... 358524.031426: libpthread_start_thread_0x172:
(0x3ff99087c12) EXITTHREAD=0
           <...>-37715 [004] d... 358524.031433:
libpthread___pthread_mutex_lock_full_0xac: (0x3ff99089efc) SVCFUTEXRET=0x0
MEMVAL=0xc0009353

# ... all successful rounds look like the one before / after this comment.
# Note: the lock has the value: FUTEX_OWNER_DIED | FUTEX_WAITERS | <TID of
main-thread>

           <...>-37927 [000] d... 358524.031443:
libpthread_pthread_setcancelstate_0x1e: (0x3ff9908f926) NEWROUNDTFtid=0x9427
           <...>-37927 [000] d... 358524.031444:
libpthread___pthread_mutex_lock_full_0x8e: (0x3ff99089ede) CSOLDVAL=0x0
NEWVAL=0x9427
           <...>-37715 [004] d... 358524.031450: libpthread_pthread_cancel_0x4:
(0x3ff9908f7ec) TFtid=0x9427
           <...>-37715 [004] d... 358524.031454:
libpthread___pthread_mutex_lock_full_0x8e: (0x3ff99089ede) CSOLDVAL=0x9427
NEWVAL=0x9353
           <...>-37715 [004] d... 358524.031455:
libpthread___pthread_mutex_lock_full_0xaa: (0x3ff99089efa)
SVCFUTEXUADDR=0x10030d0 OP=6 MEMVAL=0x9427 VAL=0x1 TIMEOUT=0
           <...>-37927 [000] d... 358524.031458: libpthread_start_thread_0x172:
(0x3ff99087c12) EXITTHREAD=0
           <...>-37715 [004] d... 358524.031465:
libpthread___pthread_mutex_lock_full_0xac: (0x3ff99089efc) SVCFUTEXRET=0x0
MEMVAL=0xc0009353

# In the failing round, sys_exit is already called before sys_futex.
# Note: the lock has the value: FUTEX_OWNER_DIED | FUTEX_WAITERS | 0
#       and sys_futex is returning ESRCH!
           <...>-37928 [000] d... 358524.031476:
libpthread_pthread_setcancelstate_0x1e: (0x3ff9908f926) NEWROUNDTFtid=0x9428
           <...>-37928 [000] d... 358524.031478:
libpthread___pthread_mutex_lock_full_0x8e: (0x3ff99089ede) CSOLDVAL=0x0
NEWVAL=0x9428
           <...>-37715 [004] d... 358524.031484: libpthread_pthread_cancel_0x4:
(0x3ff9908f7ec) TFtid=0x9428
           <...>-37928 [000] d... 358524.031491: libpthread_start_thread_0x172:
(0x3ff99087c12) EXITTHREAD=0
           <...>-37715 [004] d... 358524.031494:
libpthread___pthread_mutex_lock_full_0x8e: (0x3ff99089ede) CSOLDVAL=0x9428
NEWVAL=0x9353
           <...>-37715 [004] d... 358524.031495:
libpthread___pthread_mutex_lock_full_0xaa: (0x3ff99089efa)
SVCFUTEXUADDR=0x10030d0 OP=6 MEMVAL=0x9428 VAL=0x1 TIMEOUT=0
           <...>-37715 [004] d... 358524.031505:
libpthread___pthread_mutex_lock_full_0xac: (0x3ff99089efc)
SVCFUTEXRET=0xfffffffd MEMVAL=0xc0000000

# assert() is also calling pthread_setcancelstate:
           <...>-37715 [004] d... 358524.031509:
libpthread_pthread_setcancelstate_0x1e: (0x3ff9908f926) NEWROUNDTFtid=0x9353



But looking at the kernel sources, it seems that ESRCH is a valid return value
of sys_futex (though I am no kernel expert):

sys_futex (FUTEX_LOCK_PI):
<ksrc>/kernel/futex.c: futex_lock_pi()
-> futex_lock_pi_atomic()
{
...
        /*
         * First waiter. Set the waiters bit before attaching ourself to
         * the owner. If owner tries to unlock, it will be forced into
         * the kernel and blocked on hb->lock.
         */
        newval = uval | FUTEX_WAITERS;
        ret = lock_pi_update_atomic(uaddr, uval, newval);
        if (ret)
                return ret;

# FUTEX_WAITERS flag is set. Now we race with sys_exit().

        /*
         * If the update of the user space value succeeded, we try to
         * attach to the owner. If that fails, no harm done, we only
         * set the FUTEX_WAITERS bit in the user space variable.
         */
        return attach_to_pi_owner(uval, key, ps);
}

attach_to_pi_owner()
{
...
        p = futex_find_get_task(pid);
        if (!p)
                return -ESRCH;
# Here is one ESRCH.
...
        /*
         * We need to look at the task state flags to figure out,
         * whether the task is exiting. To protect against the do_exit
         * change of the task flags, we do this protected by
         * p->pi_lock:
         */
        raw_spin_lock_irq(&p->pi_lock);
        if (unlikely(p->flags & PF_EXITING)) {
                /*
                 * The task is on the way out. When PF_EXITPIDONE is
                 * set, we know that the task has finished the
                 * cleanup:
                 */
                int ret = (p->flags & PF_EXITPIDONE) ? -ESRCH : -EAGAIN;
# Here is another one.

                raw_spin_unlock_irq(&p->pi_lock);
                put_task_struct(p);
                return ret;
        }
}

/*
 * Process a futex-list entry, check whether it's owned by the
 * dying task, and do notification if so:
 */
int handle_futex_death(u32 __user *uaddr, struct task_struct *curr, int pi)
{
        u32 uval, uninitialized_var(nval), mval;

retry:
        if (get_user(uval, uaddr))
                return -1;

        if ((uval & FUTEX_TID_MASK) == task_pid_vnr(curr)) {
                /*
                 * Ok, this dying thread is truly holding a futex
                 * of interest. Set the OWNER_DIED bit atomically
                 * via cmpxchg, and if the value had FUTEX_WAITERS
                 * set, wake up a waiter (if any). (We have to do a
                 * futex_wake() even if OWNER_DIED is already set -
                 * to handle the rare but possible case of recursive
                 * thread-death.) The rest of the cleanup is done in
                 * userspace.
                 */
                mval = (uval & FUTEX_WAITERS) | FUTEX_OWNER_DIED;
# In the assertion case, I see exactly this pattern.
# Note: FUTEX_WAITERS is already set - see futex_lock_pi_atomic().
                /*
                 * We are not holding a lock here, but we want to have
                 * the pagefault_disable/enable() protection because
                 * we want to handle the fault gracefully. If the
                 * access fails we try to fault in the futex with R/W
                 * verification via get_user_pages. get_user() above
                 * does not guarantee R/W access. If that fails we
                 * give up and leave the futex locked.
                 */
                if (cmpxchg_futex_value_locked(&nval, uaddr, uval, mval)) {
                        if (fault_in_user_writeable(uaddr))
                                return -1;
                        goto retry;
                }
                if (nval != uval)
                        goto retry;

                /*
                 * Wake robust non-PI futexes here. The wakeup of
                 * PI futexes happens in exit_pi_state():
                 */
                if (!pi && (uval & FUTEX_WAITERS))
                        futex_wake(uaddr, 1, 1, FUTEX_BITSET_MATCH_ANY);
        }
        return 0;
}




sys_exit():
<ksrc>/kernel/exit.c: do_exit()
{
        ...

        exit_signals(tsk);  /* sets PF_EXITING */
        /*
         * Ensure that all new tsk->pi_lock acquisitions must observe
         * PF_EXITING. Serializes against futex.c:attach_to_pi_owner().
         */
        smp_mb();

        ...

        exit_mm();
# This calls exit_robust_list() - see futex.c - which walks along the userspace
# robust-list of the exited thread and calls handle_futex_death() - see futex.c.

        /*
         * We can do this unlocked here. The futex code uses this flag
         * just to verify whether the pi state cleanup has been done
         * or not. In the worst case it loops once more.
         */
        tsk->flags |= PF_EXITPIDONE;
}



I think ESRCH is a valid case here, so glibc must not trigger the assertion,
right?
Shall it also add the tid of the calling thread to the lock-value in this case?

-- 
You are receiving this mail because:
You are on the CC list for the bug.
