This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
catching c++ memory handling errors
- From: "Frank Ch. Eigler" <fche at redhat dot com>
- To: systemtap at sources dot redhat dot com
- Date: Mon, 1 Nov 2010 21:56:20 -0400
- Subject: catching c++ memory handling errors
Hi -
FYI, the following little script helped me pinpoint a logic error.
valgrind and glibc were not as specific: they only reported that a
double free occurred, without enough symbolic context to figure out why.
This first script prints every call to a destructor defined in two
of the messiest stap source files, along with the object's this pointer:
# stap -e 'probe process("./stap").function("*::~*@tapsets.cxx"),
                 process("./stap").function("*::~*@dwflpp.cxx") {
             printf("%s %p\n",probefunc(),$this)
           }' -c './stap TRIGGER_ERROR'
[...]
dwarf_query::~dwarf_query 0x7fffcedc7f70
base_query::~base_query 0x40f090
dwflpp::~dwflpp 0x2eed660
module_info::~module_info 0x2ef0310
dwflpp::~dwflpp 0x2f06420
module_info::~module_info 0x2ef0310
Note the duplicated destructor call: module_info::~module_info runs
twice with the same this pointer (0x2ef0310). (In other programs, it's
possible for the same memory region to be reallocated later, in which
case a repeated address would not be an error. In this part of stap,
that is not an issue.)
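On longer runs the repeated line is easy to miss by eye. As a rough
sketch (the function name find_duplicate_dtors is made up here, and it
assumes the output keeps the "name address" shape produced by the
printf above), a small awk filter can print any destructor/address pair
the second time it shows up:

```shell
# Hypothetical helper, not part of stap: reads "destructor-name address"
# lines on stdin and prints each pair once, at its second occurrence.
find_duplicate_dtors() {
    # NF == 2 and the 0x... check skip unrelated output such as "[...]".
    # seen[$0]++ == 1 is true only the second time an identical line appears.
    awk 'NF == 2 && $2 ~ /^0x[0-9a-f]+$/ && seen[$0]++ == 1'
}
```

Piping the first script's output through find_duplicate_dtors would, for
the six lines shown above, print "module_info::~module_info 0x2ef0310".
The same caveat about address reuse applies: the filter cannot tell a
double destruction from a fresh object living at a recycled address.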
A bonus analysis step shows the pretty-printed form of the object
being deleted, along with the stack tracebacks of the destructor
calls, but only when duplication is detected:
# stap -e 'probe process("./stap").function("*::~*@tapsets.cxx"),
                 process("./stap").function("*::~*@dwflpp.cxx") {
             if ($this in seen) {
               printf("\n\n%s %p duplicate: \n%s\n%s\n --- vs. ---\n%s\n%s\n",
                      probefunc(),$this,value[$this],seen[$this],
                      $this$$, sprint_ubacktrace())
             }
             seen[$this]=sprint_ubacktrace()
             value[$this]=$this$$
           }
           global seen, value' -DMAXSTRINGLEN=4096 -c './stap TRIGGER_ERROR' | c++filt
[...]
module_info::~module_info 0x355b310 duplicate:
{.mod=0x355b0b0, .name="o?=P[, .elf_path={._M_dataplus={._M_p="/usr/lib/debug/lib/modules/2.6.34.7-61.fc13.x86_64/vmlinux"}}, .addr=18446744071578845184, .bias=0, .sym_table=0x0, .dwarf_status=0, .symtab_status=0}
module_info::~module_info()+0xf [stap]
dwflpp::~dwflpp()+0x9e [stap]
void delete_map<std::map<std::basic_string<[...]
dwarf_builder::dwarf_build_no_more(bool)+0x21 [stap]
dwarf_builder::build_no_more(systemtap_session&)+0x31 [stap]
match_node::build_no_more(systemtap_session&)+0xbe [stap]
match_node::build_no_more(systemtap_session&)+0x42 [stap]
match_node::build_no_more(systemtap_session&)+0x42 [stap]
match_node::build_no_more(systemtap_session&)+0x42 [stap]
semantic_pass_symbols(systemtap_session&)+0x620 [stap]
semantic_pass(systemtap_session&)+0x42 [stap]
passes_0_4(systemtap_session&)+0x158f [stap]
main+0xb1 [stap]
0x32a2c1ec5d
--- vs. ---
{.mod=0x35585d0, .name="o?=P[, .elf_path={._M_dataplus={._M_p="/usr/lib/debug/lib/modules/2.6.34.7-61.fc13.x86_64/vmlinux"}}, .addr=18446744071578845184, .bias=0, .sym_table=0x0, .dwarf_status=0, .symtab_status=0}
module_info::~module_info()+0xf [stap]
dwflpp::~dwflpp()+0x9e [stap]
void delete_map<std::map<std::basic_string<[...]
dwarf_builder::dwarf_build_no_more(bool)+0x21 [stap]
match_node::build_no_more(systemtap_session&)+0xbe [stap]
match_node::build_no_more(systemtap_session&)+0x42 [stap]
match_node::build_no_more(systemtap_session&)+0x42 [stap]
match_node::build_no_more(systemtap_session&)+0x42 [stap]
semantic_pass_symbols(systemtap_session&)+0x620 [stap]
semantic_pass(systemtap_session&)+0x42 [stap]
passes_0_4(systemtap_session&)+0x158f [stap]
main+0xb1 [stap]
0x32a2c1ec5d
Bottom line: we now know the type, content, and context of the objects
being repeatedly destroyed. Fixing the problem is left as an exercise.
- FChE