This is the mail archive of the frysk@sources.redhat.com mailing list for the frysk project.



remote unwinding of libunwind


Noticing that quite a lot of stack-unwinding code has been checked into CVS, I spent some time playing with it. The test results are quite satisfactory: it can now get function names in dynamically loaded libraries and extract source and line information when available. It is also starting to support multi-threaded unwinding.

But I also noticed a few small problems. The first one came up while I was playing with Kyle's code. It can step / unwind both threads now, but it seems the unwinder swallows some frames for its own consumption. :-) Looking at the unwind session below, you will notice that four frames are reported for each thread. But in fact, there are six frames in each. You can see this from the pstack output.

$ ./unwinddebug
Enter the PID of the main therad: 8297
Assuming second thread is pid 8298
Tracing main thread!
Frames of pid 8297:

found frame 0
0000000000bfb402                                  (sp=00000000bfe87ba4)
found frame 1
0000000008048893 main+0x10e                       (sp=00000000bfe87d70)
found frame 2
0000000000c2e724 __libc_start_main+0xdc           (sp=00000000bfe87dd0)
found frame 3
0000000008048521 _start+0x21                      (sp=00000000bfe87e40)

Trace Depth = 4

Tracing second thread!
Frames of pid 8298:

found frame 0
0000000000bfb402 +0x21                            (sp=00000000b7eef264)
found frame 1
00000000080486b6 thread1+0x77                     (sp=00000000b7eef430)
found frame 2
0000000000db440b start_thread+0xa9                (sp=00000000b7eef460)
found frame 3
0000000000ce1b7e __clone+0x5e                     (sp=00000000b7eef4d0)

Trace Depth = 4

$ pstack 8297
Thread 2 (Thread -1209074784 (LWP 8298)):
#0  0x00bfb402 in __kernel_vsyscall ()
#1  0x00ca3f16 in __nanosleep_nocancel () from /lib/libc.so.6
#2  0x00ca3d3b in sleep () from /lib/libc.so.6
#3  0x080486b6 in thread1 ()
#4  0x00db440b in start_thread () from /lib/libpthread.so.0
#5  0x00ce1b7e in clone () from /lib/libc.so.6
Thread 1 (Thread -1209071296 (LWP 8297)):
#0  0x00bfb402 in __kernel_vsyscall ()
#1  0x00ca3f16 in __nanosleep_nocancel () from /lib/libc.so.6
#2  0x00ca3d3b in sleep () from /lib/libc.so.6
#3  0x08048893 in main ()
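
To make the gap concrete, here is a minimal sketch that compares the two PC lists for LWP 8298 and reports which addresses the unwinder skipped. The function name find_swallowed_frames and the array names are made up for illustration; the addresses are copied from the two outputs above.

```c
#include <stddef.h>

/*
 * Copy every PC that appears in `reference` but not in `unwound` into
 * `missing` (which must have room for n_reference entries) and return
 * how many were copied. Hypothetical helper, not part of libunwind.
 */
size_t find_swallowed_frames(const unsigned long *reference, size_t n_reference,
                             const unsigned long *unwound, size_t n_unwound,
                             unsigned long *missing)
{
    size_t n_missing = 0;

    for (size_t i = 0; i < n_reference; i++) {
        int found = 0;
        for (size_t j = 0; j < n_unwound; j++) {
            if (reference[i] == unwound[j]) {
                found = 1;
                break;
            }
        }
        if (!found)
            missing[n_missing++] = reference[i];
    }
    return n_missing;
}

/* PCs for LWP 8298 as printed by pstack. */
static const unsigned long pstack_thread2[] = {
    0x00bfb402, 0x00ca3f16, 0x00ca3d3b, 0x080486b6, 0x00db440b, 0x00ce1b7e
};

/* PCs for pid 8298 as printed by unwinddebug. */
static const unsigned long unwinddebug_thread2[] = {
    0x00bfb402, 0x080486b6, 0x00db440b, 0x00ce1b7e
};
```

Called with these two arrays, it reports 0x00ca3f16 (__nanosleep_nocancel) and 0x00ca3d3b (sleep), the two frames directly above __kernel_vsyscall. The same two addresses are missing from the main thread's trace.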


The second one showed up while I was playing with Tromey's fdtrace:


# ./frysk/bindir/fdtrace /home/woodzltc/fdtrace/Closer2
bad close() call at:
val = 0; in function: null (<Unknown file> at line 0)
val = 134513583; in function: doit2 (/home/woodzltc/AboutFrame/libunwind/fdtrace/Closer2.c at line 9)
val = 134513607; in function: main (/home/woodzltc/AboutFrame/libunwind/fdtrace/Closer2.c at line 13)
val = 12773156; in function: __libc_start_main (Unknown file at line 0)
val = 134513409; in function: _start (Unknown file at line 0)
bad close() call at:
val = 0; in function: null (<Unknown file> at line 0)
val = 134513583; in function: doit2 (/home/woodzltc/AboutFrame/libunwind/fdtrace/Closer2.c at line 9)
val = 134513607; in function: main (/home/woodzltc/AboutFrame/libunwind/fdtrace/Closer2.c at line 13)
val = 12773156; in function: __libc_start_main (Unknown file at line 0)
val = 134513409; in function: _start (Unknown file at line 0)

The address of the first frame seems to be 0, and "doit()" and "close()" were swallowed as well.
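
The "val" numbers appear to be frame addresses printed in decimal, which makes them hard to compare with the hex addresses in the other traces. A small sketch for converting them follows; parse_val and print_val_as_hex are hypothetical helper names, not part of fdtrace.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse a decimal "val" string from the fdtrace output into *pc.
 * Returns 0 on success, -1 if the string is not a plain decimal number.
 */
int parse_val(const char *decimal, unsigned long *pc)
{
    char *end = NULL;
    unsigned long value = strtoul(decimal, &end, 10);

    if (end == decimal || *end != '\0')
        return -1;
    *pc = value;
    return 0;
}

/* Print the value in the zero-padded hex form that pstack uses. */
void print_val_as_hex(const char *decimal)
{
    unsigned long pc;

    if (parse_val(decimal, &pc) == 0)
        printf("%s = 0x%08lx\n", decimal, pc);
    else
        printf("%s = <not a decimal number>\n", decimal);
}
```

For example, 12773156 is 0x00c2e724, the same __libc_start_main+0xdc address that shows up in the unwinddebug session, and 134513583 (doit2) is 0x080483af.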

Has anyone noticed these problems before? Is there any work under way to improve this?


BTW, I also observed that libunwind has only two test cases for remote unwinding. That is far from enough, IMO. Stack unwinding has many different scenarios, especially remote unwinding. We have no way to be sure how it works in these scenarios if we have not tested them. So I expect there are other problems somewhere that we have not noticed yet.


My two cents: we need to write many more test cases to evaluate how libunwind works in various scenarios: single-threaded and multi-threaded, normal operation and abnormal operation (signal frame, exception handler, or non-local jump)... It would be even better if we could also extract backtrace information from core dumps.
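
As a rough idea of what one such test target could look like, here is a sketch that combines a signal frame with a non-local jump. It is only the target side: it does not call libunwind, and every name in it (run_signal_longjmp_scenario, on_sigusr1, and so on) is hypothetical. A remote unwinder would be attached from another process while the handler is active.

```c
#define _POSIX_C_SOURCE 200809L

#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf jump_target;
static volatile sig_atomic_t handler_ran = 0;

/*
 * Signal handler: while it runs, the stack holds a signal frame between
 * this function and the interrupted code. It then leaves through a
 * non-local jump instead of returning normally.
 */
static void on_sigusr1(int signo)
{
    (void)signo;
    handler_ran = 1;
    siglongjmp(jump_target, 1);
}

/*
 * Returns 1 if control came back through the non-local jump, 0 if
 * raise() returned normally, and -1 if the handler could not be set up.
 */
int run_signal_longjmp_scenario(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* Save the signal mask too, so SIGUSR1 is unblocked after the jump. */
    if (sigsetjmp(jump_target, 1) != 0)
        return 1;

    raise(SIGUSR1);
    return 0;
}
```

A real test would pause inside the handler (for example with a sleep) so the unwinder has time to attach, and would have a multi-threaded variant as well.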

Regards
- Wu Zhou

