This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Troubles with debug info, using systemtap on debian.


I built my own 2.6.31 kernel with:
make-kpkg --initrd --revision 1 --append-to-version -jknight-1-amd64 kernel_image kernel_headers kernel_debug


I have kernel-package version 12.025.

And I installed all 3 debs that it created.

Therefore, I had directories on my filesystem that look like this:
Original compile directory: /usr/src/linux-source-2.6.31
Kernel mods installed in: /lib/modules/2.6.31-jknight-1-amd64/
Debug data installed in: /usr/lib/debug/lib/modules/2.6.31-jknight-1-amd64/


/lib/modules/2.6.31-jknight-1-amd64/build got created as a symlink to: /usr/src/linux-source-2.6.31


Systemtap was working fine for symbols in vmlinux, but segfaulted when trying to probe modules. For example, even this simplest of scripts segfaulted in the translator:


probe module("autofs4").function("autofs4_fill_super") {}


Failed with this backtrace:


#0 0x00002b0e7568e34f in memmove () from /lib/libc.so.6
#1 0x00002b0e749cdf7c in elf64_xlatetof (dest=0x7fff515977d0, src=0x7fff51597800,
encode=<value optimized out>) at elf32_xlatetof.c:118
#2 0x00002b0e747aeb0e in relocate (offset=49, addend=0x7fff51597940, rtype=<value optimized out>,
symndx=11) at relocate.c:436
#3 0x00002b0e747af238 in relocate_section (ehdr=<value optimized out>, shstrndx=<value optimized out>,
reloc_symtab=<value optimized out>, scn=0x291d020, shdr=0x7fff515979d0, tscn=0x291cf68,
debugscn=false, partial=true) at relocate.c:501
#4 0x00002b0e747af741 in __libdwfl_relocate_section (mod=0x2908f60, relocated=0x291cbb0,
relocscn=0x291d020, tscn=0x291cf68, partial=<value optimized out>) at relocate.c:632
#5 0x00002b0e747b04a6 in dwfl_module_address_section (mod=0x2908f60, address=<value optimized out>,
bias=0x7fff51597ed8) at derelocate.c:399
#6 0x000000000046d2f5 in dump_unwindsyms (m=0x2908f60, userdata=<value optimized out>,
name=<value optimized out>, base=65536, arg=0x7fff51598330) at translate.cxx:4730
#7 0x00002b0e747b1677 in dwfl_getmodules (dwfl=0x28cb170, callback=0x46c560 <dump_unwindsyms>,
arg=0x7fff51598330, offset=2) at dwfl_getmodules.c:103
#8 0x0000000000469f66 in emit_symbol_data (s=@0x7fff515990f0) at translate.cxx:4970
#9 0x000000000046c041 in translate_pass (s=@0x7fff515990f0) at translate.cxx:5273
#10 0x000000000041062f in main (argc=2, argv=0x7fff5159aeb8) at main.cxx:1231


Adding --ignore-vmlinux --ignore-dwarf didn't cause the crash to go away.

Eventually, I figured out that it was finding debug data from a strange location:

/lib/modules/2.6.31-jknight-1-amd64/build/debian/linux-image-2.6.31-jknight-1-amd64-dbg/usr/lib/debug/lib/modules/2.6.31-jknight-1-amd64/kernel/fs/autofs4/autofs4.ko

(I found that via, at that backtrace, "f 6; print *m").
Okay, I thought, that's odd. Let me just remove the "build" symlink, so that hopefully it finds the debug data from the installed kernel-debug package. Well, that failed, because the files there are apparently expected to be called *.ko.debug, but I had a file called:
/usr/lib/debug/lib/modules/2.6.31-jknight-1-amd64/kernel/fs/autofs4/autofs4.ko
instead. So, I symlinked it to be called autofs4.ko.debug.
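
The same rename-by-symlink workaround could be applied to a whole debug tree rather than one module. The following is only a minimal sketch under that assumption: the helper name `link_debug_names` is hypothetical, it is not part of systemtap or elfutils, and it simply adds a NAME.ko.debug symlink next to every NAME.ko it finds.

```python
import os


def link_debug_names(debug_root):
    """Create NAME.ko.debug symlinks next to each NAME.ko under debug_root.

    Hypothetical helper for illustration only. Returns the sorted list of
    symlink paths that were created.
    """
    created = []
    for dirpath, _dirnames, filenames in os.walk(debug_root):
        for name in filenames:
            if not name.endswith(".ko"):
                continue
            link = os.path.join(dirpath, name + ".debug")
            if os.path.lexists(link):
                continue  # leave anything that is already there alone
            # Relative target, so the link still resolves if the tree moves.
            os.symlink(name, link)
            created.append(link)
    return sorted(created)
```

It would be called on the debug directory, e.g. `link_debug_names("/usr/lib/debug/lib/modules/2.6.31-jknight-1-amd64")`, and needs write permission there.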


Note that autofs4.ko there is the same file (same md5sum) as the one it found and crashed with above in /lib/modules/../build.
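
For reference, the "same md5sum" check can be sketched as below. This is just an illustration of comparing two files by MD5 digest; the function names `md5_of` and `same_contents` are made up for the example.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True if both files have the same MD5 digest."""
    return md5_of(path_a) == md5_of(path_b)
```

Here the two arguments would be the autofs4.ko under /usr/lib/debug and the copy under the build/debian packaging directory.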

And, it still crashed. But, now, in a different place!!!

#0 0x00002b4884c2f34f in memmove () from /lib/libc.so.6
#1 0x00002b4883f6ef7c in elf64_xlatetof (dest=0x7fff49c3cff0, src=0x7fff49c3d020,
encode=<value optimized out>) at elf32_xlatetof.c:118
#2 0x00002b4883d4fb0e in relocate (offset=47, addend=0x7fff49c3d160, rtype=<value optimized out>,
symndx=179) at relocate.c:436
#3 0x00002b4883d50238 in relocate_section (ehdr=<value optimized out>, shstrndx=<value optimized out>,
reloc_symtab=<value optimized out>, scn=0x2a160b0, shdr=0x7fff49c3d1f0, tscn=0x2a15ff8,
debugscn=false, partial=true) at relocate.c:501
#4 0x00002b4883d50898 in __libdwfl_relocate (mod=0x2a511f0, debugfile=0x2a15db0,
debug=<value optimized out>) at relocate.c:609
#5 0x00002b4883d539e8 in dwfl_module_getelf (mod=0x2a511f0, loadbase=0x7fff49c3d6e0)
at dwfl_module_getelf.c:76
#6 0x000000000046cf79 in dump_unwindsyms (m=0x2a511f0, userdata=<value optimized out>,
name=0x2b488f606b8f "autofs4_direct_root_inode_operations", base=65536, arg=0x7fff49c3db40)
at translate.cxx:4475
#7 0x00002b4883d52677 in dwfl_getmodules (dwfl=0x19b9440, callback=0x46c560 <dump_unwindsyms>,
arg=0x7fff49c3db40, offset=2) at dwfl_getmodules.c:103
#8 0x0000000000469f66 in emit_symbol_data (s=@0x7fff49c3e900) at translate.cxx:4970
#9 0x000000000046c041 in translate_pass (s=@0x7fff49c3e900) at translate.cxx:5273
#10 0x000000000041062f in main (argc=2, argv=0x7fff49c406c8) at main.cxx:1231


Eventually, after a bit of flailing, I decided to put the build symlink back, but remove all the temporary packaging build directories:
rm -rf /usr/src/linux-source-2.6.31/debian/linux*

Now, stap found the debuginfo in:
/lib/modules/2.6.31-jknight-1-amd64/build/fs/autofs4/autofs4.ko

That is the file actually generated by the kernel build process, unmangled by debian packaging scripts. And, then it worked! Without segfaulting, hooray!
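
To see every copy of a module that exists under a directory (and therefore which files a lookup could be picking up), something like the sketch below would do. It is not a reimplementation of how stap or elfutils actually search; it only lists files by name. The function name `find_module_files` and its parameters are hypothetical.

```python
import os


def find_module_files(root, module, suffixes=(".ko", ".ko.debug"),
                      follow_links=True):
    """List files under root named MODULE.ko or MODULE.ko.debug.

    follow_links=True makes os.walk descend through directory symlinks
    such as "build"; beware of symlink cycles when enabling it.
    """
    wanted = {module + suffix for suffix in suffixes}
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=follow_links):
        for name in filenames:
            if name in wanted:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

For example, `find_module_files("/lib/modules/2.6.31-jknight-1-amd64", "autofs4")` would list both the copy under build/fs/autofs4 and any leftover copies under build/debian, if they exist.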


So, some questions, at the end of all this:
1) Surely --ignore-dwarf --ignore-vmlinux should've caused systemtap to not use libelf to find and parse the dwarf debug info?


2) Why did stap find the debug data at such a strange path in /lib/modules/.../build/debian/.... Does it do something like traverse every file, recursively, under the modules directory until it finds one it likes? That's quite...odd. I noticed that even if I renamed "build" to "build.foo", it *STILL* looked in there.

3) The debian kernel's debuginfo does "objcopy --only-keep-debug"... That seems like it shouldn't cause systemtap to blow up, but it does. I guess that's a known bug?

4) Why does it blow up *differently* depending on whether it found the file in /usr/lib/debug or /lib/modules?

5) Whose bug is it that systemtap doesn't look for /usr/lib/debug/.../autofs4.ko, but only autofs4.ko.debug?
Apparently this is a difference between debian and Fedora. Fedora systems append .debug, Debian systems do not. My guess: debian should be patching their copy of elfutils to not append ".debug"? But maybe that's an upstream bug, and it should try both by default (or something). I dunno.


Someone else discovered the ".debug" issue in another program:
http://www.visophyte.org/rev_control/patches/chronicle-recorder/debian-usr-lib-debug-support.patch
And here's the debian reference about how to install debuginfo:
http://www.debian.org/doc/developers-reference/best-pkging-practices.html#bpp-dbg


I guess all these except the first are probably bugs in elfutils, not systemtap, so perhaps I should be reporting it there instead. But despite what you might think, I actually have no clue about any of this crap: any clue you might infer from the above has all been gained by random flailing over the course of the last couple hours. So I figure it's safer to report here first, and redirect if requested. :)


James

