One part of the breakage is the bundled-elfutils multilib bug mentioned on the mailing list. Another part is staprun/insmod errors (Fedora 8, up-to-date kernel, etc.):
[12919.839449] Systemtap Error at _stp_transport_init:274 failed to initialize modules
I am not able to test systemtap scripts on ppc systems because of the above issue, hence I am raising the severity of the bug. Thanks, Srinivasa DS
I'll take a crack at reorganizing this code.
*** Bug 6405 has been marked as a duplicate of this bug. ***
Frank,

This is with respect to the error "Systemtap Error at _stp_transport_init:274 failed to initialize modules". The error is produced by the patch below:

"http://sources.redhat.com/git/gitweb.cgi?p=systemtap.git;a=commitdiff;h=fa670082537aea7f090bc8dcfab69ac5f62546bc;hp=073b6ba57a498c3c97426f6f6d0666f1f5eb30d4"

These are my observations; I hope they help you reorganize the code.

1) On my system the symbols __start and _stext have the same address (this may not be the case on all architectures). But emit_symbol_data() or emit_symbol_data_from_debuginfo() picks only symbols whose addresses differ and puts them in stap-symbols.h.

cat /proc/kallsyms
c000000000000000 T .__start
c000000000000000 T _stext

Hence stap-symbols.h does not contain the _stext symbol, and systemtap fails.

vim /tmp/staptouTBk/stap-symbols.h
struct _stp_symbol _stp_kernel_symbols [] = {
  { 0xc000000000000000, ".__start" },
  { 0xc000000000000060, ".__secondary_hold" },
  { 0xc0000000000044f8, ".slb_miss_realmode" },

Thanks
Srinivasa DS
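To make the observation above concrete, here is a minimal userspace sketch of the failure mode. It assumes the generator keeps only the first name seen at each address, which is what the stap-symbols.h excerpt suggests; it is not the actual emit_symbol_data() code. The struct and function names (sym, dedup_by_addr, find_by_name) are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of the generated symbol table. */
struct sym {
    unsigned long long addr;
    const char *name;
};

/* Copy `in` (assumed sorted by address) into `out`, keeping only the
   first symbol at each address. Returns the number of entries kept. */
size_t dedup_by_addr(const struct sym *in, size_t n, struct sym *out)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (kept > 0 && out[kept - 1].addr == in[i].addr)
            continue; /* same address as the previous entry: dropped */
        out[kept++] = in[i];
    }
    return kept;
}

/* Linear lookup by name; returns NULL when the name is absent. */
const struct sym *find_by_name(const struct sym *tab, size_t n,
                               const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}
```

Fed the three kallsyms entries from this comment (.__start, _stext, .__secondary_hold), the deduplicated table keeps .__start and drops _stext, so a later lookup of "_stext" by name returns NULL even though the address itself is present.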
Created attachment 2728 [details]
Patch to search for the "__start" symbol in case of PPC systems.

Systemtap searches for the "_stext" symbol in the symbol table during initialization of the systemtap module. Since the addresses of the _stext and __start symbols are the same, only the __start symbol is added to the symbol table on ppc, and that causes systemtap to fail on ppc. This patch searches for the __start symbol instead of _stext on ppc systems and hence solves the problem.

Thanks
Srinivasa DS
*** Bug 6510 has been marked as a duplicate of this bug. ***
Frank, Is it possible to revert aaf2af3e3b0c159a64609c82811662d7253c3a96 till the unwind related problems are fixed, since that is the cause of quite a few failures recently? Ananth
Please give me a few more days to try to fix this stuff. Because of the way the unwind branch was built, it would be difficult to unroll the code partway. If it helps you get stuff done in the interim, please commit/push the patch from comment #5.
(In reply to comment #8)
>
> If it helps you get stuff done in the interim, please
> commit/push the patch from comment #5.
>

Frank,

There are 2 issues here, and we have a patch for the first one.

1) The "_stp_transport_init:274 failed to initialize modules" problem. The patch attached in comment #5 solves it temporarily.

2) Compilation error messages displayed when a simple systemtap script is executed:

"
/usr/local/share/systemtap/runtime/transport/symbols.c:407: error: dereferencing pointer to incomplete type
/usr/local/share/systemtap/runtime/transport/symbols.c:425: error: dereferencing pointer to incomplete type
/usr/local/share/systemtap/runtime/transport/symbols.c:427: error: dereferencing pointer to incomplete type
/usr/local/share/systemtap/runtime/transport/symbols.c:456: error: dereferencing pointer to incomplete type
/usr/local/share/systemtap/runtime/transport/symbols.c:457: error: dereferencing pointer to incomplete type
/usr/local/share/systemtap/runtime/transport/symbols.c:460: error: dereferencing pointer to incomplete type"

We don't have a fix for this problem. So fixing the first issue doesn't resolve the problem completely.

Thanks
Srinivasa DS
/usr/local/share/systemtap/runtime/transport/symbols.c:407: error: dereferencing pointer to incomplete type
/usr/local/share/systemtap/runtime/transport/symbols.c:425: error: dereferencing pointer to incomplete type
/usr/local/share/systemtap/runtime/transport/symbols.c:427: error: dereferencing pointer to incomplete type

We can solve these compilation problems by extending the autoconf-module-nsection.c file as below and protecting the "attr" variable with STAPCONF_MODULE_NSECTIONS in runtime/transport/symbols.c.

#include <linux/module.h>

struct module *x;

void foo (void)
{
  (void) x->sect_attrs->nsections;
  (void) x->sect_attrs->attrs;
}
(In reply to comment #10)
> we can solve these compilation problems by extending autoconf-module-nsection.c
> file like below and protecting "attr" variable with STAPCONF_MODULE_NSECTIONS in
> runtime/transport/symbols.c.
>
> #include <linux/module.h>
>
> struct module *x;
>
> void foo (void)
> {
>   (void) x->sect_attrs->nsections;
>   (void) x->sect_attrs->attrs;
> }

This method provides a check for the structure and its member. But I found there are several references like ->attr and ->grp. We need to find replacements for them, or figure out other ways to bypass the two structures.

Another workaround is to include the definitions in symbols.c or sym.h.

#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,26)
struct module_sect_attr
{
  struct module_attribute mattr;
  char *name;
  unsigned long address;
};

struct module_sect_attrs
{
  struct attribute_group grp;
  unsigned int nsections;
  struct module_sect_attr attrs[0];
};
#endif

Of course, it is not very graceful.
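For readers unfamiliar with the version guard in the workaround above, the following is a small userspace sketch of how the comparison evaluates. The macro uses the same arithmetic as KERNEL_VERSION in <linux/version.h> on 2.6-era kernels, but is renamed so it compiles outside the kernel tree; SKETCH_KERNEL_VERSION and needs_local_sect_attr_defs are hypothetical names, not part of systemtap or the kernel.

```c
/* Same arithmetic as the 2.6-era KERNEL_VERSION macro; renamed
   (hypothetical name) so this compiles without kernel headers. */
#define SKETCH_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Returns 1 when a guard of the form
   "#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,26)" would be true
   for the given version code, 0 otherwise. */
int needs_local_sect_attr_defs(long version_code)
{
    return version_code >= SKETCH_KERNEL_VERSION(2, 6, 26);
}
```

With this encoding, 2.6.25 falls below the threshold and 2.6.26 or later meets it, so the local struct definitions would only be compiled on the newer kernels.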
Created attachment 2742 [details]
Patch to fix compilation errors found on execution of systemtap scripts

Frank,

This is just an interim fix. I applied this patch and executed the systemtap tests on ppc and x86_64 systems on the latest kernel.
This is still broken on powerpc. It looks like libdw is 32-bit while the runtime needs to be 64-bit. I even tried ifdef-ing it out using STP_USE_DWARF_UNWINDER, but that still doesn't help. Can this either be fixed, or the code be disabled until it works seamlessly across architectures, please?
commit 8928443 should defang this particular issue - leaving the unwinder in place, but not feeding it with data from staprun/stapio.
(In reply to comment #14)
> commit 8928443 should defang this particular issue - leaving the unwinder
> in place, but not feeding it with data from staprun/stapio.

We encountered one build issue while building systemtap. Investigating further...

Thanks
Srinivasa DS
Verified build and tested again. Works fine. Thanks Frank! Ananth