This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug dynamic-link/22745] New: _nptl_setxid can loop forever if a dlmopen namespace tries to initialise pthreads after the main namespace does
- From: "vivek at collabora dot com" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Wed, 24 Jan 2018 21:45:26 +0000
- Subject: [Bug dynamic-link/22745] New: _nptl_setxid can loop forever if a dlmopen namespace tries to initialise pthreads after the main namespace does
- Auto-submitted: auto-generated
https://sourceware.org/bugzilla/show_bug.cgi?id=22745
Bug ID: 22745
Summary: _nptl_setxid can loop forever if a dlmopen namespace
tries to initialise pthreads after the main namespace
does
Product: glibc
Version: 2.24
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: dynamic-link
Assignee: unassigned at sourceware dot org
Reporter: vivek at collabora dot com
Target Milestone: ---
Created attachment 10759
--> https://sourceware.org/bugzilla/attachment.cgi?id=10759&action=edit
Test case - build two executables. One uses dlmopen and triggers the lock up,
the other uses dlopen and does not.
I stumbled upon this while testing pulseaudio in conjunction with
dlmopen: pulseaudio seems to lock up very soon after it starts.
A bit of digging with strace and gdb shows that when it locks up
it does so inside setresuid. A bit more digging indicates that the
code is looping forever here:
__nptl_setxid (cmdp=0xffffd9d8) at allocatestack.c:1105
+list
1103
1104 /* Now the list with threads using user-allocated stacks. */
1105 list_for_each (runp, &__stack_user)
1106 {
1107 struct pthread *t = list_entry (runp, struct pthread, list);
1108 if (t == self)
1109 continue;
1110
1111 setxid_mark_thread (cmdp, t);
1112 }
For some reason, list_for_each never terminates.
If I disable the dlmopen code path then the following holds at that
point in the code:
Breakpoint 6, __nptl_setxid (cmdp=0xffffd9e8) at allocatestack.c:1105
1105 list_for_each (runp, &__stack_user)
+bt
#0 __nptl_setxid (cmdp=0xffffd9e8) at allocatestack.c:1105
#1 0xf7b96162 in __GI___setresuid (ruid=1000, euid=1000, suid=1000)
at ../sysdeps/unix/sysv/linux/i386/setresuid.c:29
#2 0x5655b7f0 in pa_drop_root ()
#3 0x56558a6e in main ()
Digging into __stack_user:
+p __stack_user
$1 = {next = 0xf73a48a0, prev = 0xf73a48a0}
+p &__stack_user
$2 = (list_t *) 0xf7d1d1a4 <__stack_user>
+p (&__stack_user)->next
$3 = (struct list_head *) 0xf73a48a0
+p (&__stack_user)->next->next
$4 = (struct list_head *) 0xf7d1d1a4 <__stack_user>
+p (&__stack_user)->next->next->next
$5 = (struct list_head *) 0xf73a48a0
We find a circular linked list that leads back to &__stack_user.
Since list_for_each is invoked as list_for_each(…, &__stack_user),
the for loop it implements will terminate, allowing setresuid
to proceed.
// ============================================================================
Note: The definition of list_for_each is this:
# define list_for_each(pos, head) \
for (pos = (head)->next; pos != (head); pos = pos->next)
// ============================================================================
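A minimal sketch of why the loop terminates in the dlopen case, using simplified stand-ins for glibc's list_t and the macro quoted above. The function names count_entries and healthy_list_demo are made up for illustration and are not part of glibc:

```c
#include <stddef.h>

/* Simplified stand-in for glibc's list_t (an assumption for this sketch). */
typedef struct list_head
{
  struct list_head *next;
  struct list_head *prev;
} list_t;

/* Same shape as the macro quoted in the note above. */
#define list_for_each(pos, head) \
  for (pos = (head)->next; pos != (head); pos = pos->next)

/* Count the entries on a list.  This only returns because following
   ->next eventually leads back to HEAD.  */
size_t
count_entries (list_t *head)
{
  size_t n = 0;
  list_t *pos;

  list_for_each (pos, head)
    ++n;
  return n;
}

/* Rebuild the healthy state seen in the dlopen run: one node whose
   next pointer leads straight back to the head.  */
size_t
healthy_list_demo (void)
{
  list_t head, node;

  head.next = head.prev = &node;
  node.next = node.prev = &head;
  return count_entries (&head);
}
```

With one node on the list, the loop body runs once and the walk stops as soon as pos equals the head again.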
Now let's examine the same case with the dlmopen call back in place:
Breakpoint 6, __nptl_setxid (cmdp=0xffffd9d8) at allocatestack.c:1105
1105 list_for_each (runp, &__stack_user)
⋮
+p __stack_user
$1 = {next = 0xf76eeb60, prev = 0xf76eeb60}
+p &__stack_user
$2 = (list_t *) 0xf7d8f1a4 <__stack_user>
+p (&__stack_user)->next
$3 = (struct list_head *) 0xf76eeb60
+p (&__stack_user)->next->next
$4 = (struct list_head *) 0xf71391a4
+p (&__stack_user)->next->next->next
$5 = (struct list_head *) 0xf76eeb60
We can see we have a circular linked list, as before, but it does
_not_ contain the element supplied as the head to list_for_each:
We're going to loop forever.
// ============================================================================
Next let's try and figure out when/where this happens.
Setting various breakpoints and watches we uncover the following:
+run
Starting program: /usr/bin/pulseaudio --start
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
Breakpoint 1, __pthread_initialize_minimal_internal () at nptl-init.c:290
290 {
+break allocatestack.c:1105
Breakpoint 6 at 0xf7d78b2c: file allocatestack.c, line 1105.
+watch __stack_user
Hardware watchpoint 7: __stack_user
+watch __stack_user.next
Hardware watchpoint 8: __stack_user.next
+cont
Continuing.
Hardware watchpoint 7: __stack_user
Old value = {next = 0x0, prev = 0x0}
New value = {next = 0xf7d8f1a4 <__stack_user>, prev = 0x0}
Hardware watchpoint 8: __stack_user.next
Old value = (struct list_head *) 0x0
New value = (struct list_head *) 0xf7d8f1a4 <__stack_user>
__pthread_initialize_minimal_internal () at nptl-init.c:377
377 list_add (&pd->list, &__stack_user);
+cont
Continuing.
Hardware watchpoint 7: __stack_user
Old value = {next = 0xf7d8f1a4 <__stack_user>, prev = 0x0}
New value = {next = 0xf7d8f1a4 <__stack_user>, prev = 0xf76eeb60}
list_add (head=<optimized out>, newp=0xf76eeb60) at ../include/list.h:64
64 head->next = newp;
+cont
Continuing.
Hardware watchpoint 7: __stack_user
Old value = {next = 0xf7d8f1a4 <__stack_user>, prev = 0xf76eeb60}
New value = {next = 0xf76eeb60, prev = 0xf76eeb60}
Hardware watchpoint 8: __stack_user.next
Old value = (struct list_head *) 0xf7d8f1a4 <__stack_user>
New value = (struct list_head *) 0xf76eeb60
__pthread_initialize_minimal_internal () at nptl-init.c:381
381 THREAD_SETMEM (pd, report_events, __nptl_initial_report_events);
+cont
Continuing.
Breakpoint 2, __pthread_init_static_tls (map=0x5657e040) at allocatestack.c:1210
1210 {
// ============================================================================
// At this point we step to the end of __pthread_init_static_tls and set
// an extra watch point on the address currently holding &__stack_user
// ============================================================================
+p __stack_user.next
$1 = (struct list_head *) 0xf76eeb60
+p __stack_user.next->next
$2 = (struct list_head *) 0xf7d8f1a4 <__stack_user> ← STILL GOOD
+watch __stack_user.next->next
Hardware watchpoint 9: __stack_user.next->next
+s
// And here it is:
Hardware watchpoint 9: __stack_user.next->next
Old value = (struct list_head *) 0xf7d8f1a4 <__stack_user>
New value = (struct list_head *) 0xf71391a4 ← >>>>> GONE WRONG HERE <<<<<
0xf7121c83 in ?? ()
// Hm, an unknown address scribbling on __stack_user.
+call calloc(1, sizeof(Dl_info))
$3 = (void *) 0x56574d18
+call dladdr(0xf7121c83, $3)
$4 = 1
+p *(Dl_info *)$3
$5 = {dli_fname = 0x565755b8 "/lib/i386-linux-gnu/libpthread.so.0",
dli_fbase = 0xf711d000,
dli_sname = 0xf711f617 "__pthread_initialize_minimal",
dli_saddr = 0xf7121be0}
// Well that can't be right, can it? gdb should have figured out the name
// of 0xf7121c83, not said ?? - let's work out the address in the other
// direction:
+p __pthread_initialize_minimal
$6 = {<text variable, no debug info>} 0xf7d77be0 <__pthread_initialize_minimal_internal>
+call dladdr(0xf7d77be0, $3)
$8 = 1
+p *(Dl_info *)$3
$10 = {dli_fname = 0xf7fd4d70 "/lib/i386-linux-gnu/libpthread.so.0",
dli_fbase = 0xf7d73000,
dli_sname = 0xf7d75617 "__pthread_initialize_minimal",
dli_saddr = 0xf7d77be0 <__pthread_initialize_minimal_internal>}
// ============================================================================
Aha! Same DSO, different base address. So the ?? instance of
__pthread_initialize_minimal_internal was from the _other_ copy of
libpthread, inside the dlmopen namespace - the one gdb doesn't know how
to inspect.
PS: for completeness, I went back and followed the __stack_user linked list
at the "GONE WRONG HERE" point, just to be sure:
+p __stack_user
$1 = {next = 0xf76eeb60, prev = 0xf76eeb60}
+p __stack_user.next
$2 = (struct list_head *) 0xf76eeb60
+p __stack_user.next->next
$3 = (struct list_head *) 0xf71391a4
+p __stack_user.next->next->next
$4 = (struct list_head *) 0xf71391a4
+p __stack_user.next->next->next->next
$5 = (struct list_head *) 0xf71391a4
So the linked list definitely doesn't contain &__stack_user any more.
// ============================================================================
Apologies for the exegesis: It seems to me that the copy of glibc
(specifically libpthread) in the private namespace has somehow managed to
scribble on the linked list headed by __stack_user, overwriting a key address.
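The following is a sketch of one way the pointers seen in the trace could arise. It is an assumption, not something confirmed in this report: each copy of libpthread has its own __stack_user, but both insert the same thread descriptor's list member. The list_add body mirrors the pointer updates in glibc's include/list.h without the write barrier; head_reachable and simulate_double_init are hypothetical names for illustration only:

```c
#include <stddef.h>

/* Simplified stand-in for glibc's list_t (an assumption for this sketch). */
typedef struct list_head
{
  struct list_head *next;
  struct list_head *prev;
} list_t;

/* Same pointer updates as glibc's list_add, minus the write barrier. */
void
list_add (list_t *newp, list_t *head)
{
  newp->next = head->next;
  newp->prev = head;
  head->next->prev = newp;
  head->next = newp;
}

/* Follow at most LIMIT links from HEAD.  Return 1 if the walk gets
   back to HEAD, 0 if it does not within LIMIT steps.  */
int
head_reachable (const list_t *head, size_t limit)
{
  const list_t *pos = head->next;

  for (size_t i = 0; i < limit; ++i, pos = pos->next)
    if (pos == head)
      return 1;
  return 0;
}

/* Model two namespaces, each with its own list head, both adding the
   SAME node.  Returns -1 if the first insert already broke the list,
   otherwise whether the first head is still reachable afterwards.  */
int
simulate_double_init (void)
{
  list_t main_stack_user = { &main_stack_user, &main_stack_user };
  list_t private_stack_user = { &private_stack_user, &private_stack_user };
  list_t pd_list;  /* stands in for the thread descriptor's list member */

  list_add (&pd_list, &main_stack_user);
  if (!head_reachable (&main_stack_user, 16))
    return -1;

  /* Second initialisation: overwrites pd_list.next and pd_list.prev. */
  list_add (&pd_list, &private_stack_user);
  return head_reachable (&main_stack_user, 16);
}
```

After the second list_add, the first head still points at the node, but the node's next pointer now leads to the second head, which points back at the node. That matches the shape of the $3/$4/$5 values in the dlmopen run, and a list_for_each starting from the first head would never see its own head again.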
Is my analysis correct? Is there something I could or should have done to
avoid this?
A while ago (https://sourceware.org/ml/libc-help/2018-01/msg00002.html)
I suggested a dlmopen flag RTLD_UNIQUE or similar which would cause the
existing mapping of the target library in the main namespace/link-map to be
re-used instead of creating a new one: I believe this would prevent this
problem (and others detailed in that message) from occurring - any thoughts?