This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Slow handling of C++ symbol names


When debugging some C++ program with gdb (both versions 5.3 and 6.0) it takes a long time to load the debugging information and get a command-line prompt. The particular example that demonstrates this problem is monotone (http://www.venge.net/monotone/) which makes use of the boost libraries (http://www.boost.org/). gdb loading monotone with the associated debugging information takes over 30 seconds to get the initial command-line prompt on a 1.7GHz Athlon with 512MB or DRAM. monotone is not a small program, over 3MB of debugging information. I tooks some measurements with OProfile to find out where gdb 5.3 was spending its time. 3/4 of the time is spent in dyn_string_substring and __GI_strncpy, the column below the '%'. I recompiled gdb with '-pg' to find that this is due to the C++ name demangling. The highlights of gprof output are shown below. I looked at the symbols for monotone/boost in the executable, some of them are VERY long and for monotone/boost they average 147 characters in length. An example of ONE symbol over 2KB is at the end of this mail (there are others like that). I suspect that some of the algorithms used in gdb may not expect such long symbol names and be part of the reason for the slow startup, e.g. work O(length_of_string ^2). Has anyone else noticed this type of problem before?

[wcohen@slingshot monotone-0.7]$ oprofpp -l -r -k /usr/bin/gdb|more
Cpu type: Athlon
Cpu speed was (MHz estimation) : 1687.55
Counter 0 counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a
unit mask of 0x00 (No unit mask) count 843500
vma samples % symbol name image name
081e20c0 36781 55.9424 dyn_string_substring /usr/lib/debug/usr/bin/gdb.debug
00077290 14743 22.4235 __GI_strncpy /lib/tls/libc-2.3.2.so
080f5e40 3434 5.22297 msymbol_hash_iw /usr/lib/debug/usr/bin/gdb.debug
00021dc0 1748 2.65864 __ctype_b_loc /lib/tls/libc-2.3.2.so
00076990 1603 2.4381 __GI_strcpy /lib/tls/libc-2.3.2.so
081e1e10 1417 2.1552 dyn_string_insert /usr/lib/debug/usr/bin/gdb.debug



The profile information from gprof.


Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 88.41     18.46    18.46   354132     0.00     0.00  dyn_string_substring
  5.84     19.68     1.22    19299     0.00     0.00  msymbol_hash_iw
...


0.00 0.00 220617 demangle_class_enum_type <cycle 2> [423]
0.00 0.00 95 demangle_function_type <cycle 2> [482]
-----------------------------------------------
0.00 0.74 14108/354132 demangle_name <cycle 2> [29]
0.00 1.04 19908/354132 demangle_type_ptr <cycle 2> [28]
0.00 4.55 87228/354132 demangle_prefix <cycle 2> [24]
0.01 12.16 232888/354132 demangle_type <cycle 2> [23]
[21] 88.6 0.01 18.49 354132 substitution_add [21]
18.46 0.01 354132/354132 dyn_string_substring [22]
0.00 0.01 354132/979183 dyn_string_new [40]
-----------------------------------------------
18.46 0.01 354132/354132 substitution_add [21]
[22] 88.5 18.46 0.01 354132 dyn_string_substring [22]
0.01 0.01 354132/4003799 dyn_string_resize [35]
-----------------------------------------------
13904 demangle_type <cycle 2> [23]
55 demangle_operator_name <cycle 2>



Find out how many symbols monotone has $ nm monotone|wc 23623 70853 3657239



Some of symbol names from the boost library are really, really, really long. Here is an example that is over 2K in length:

08145572 W _ZN5boost6spirit12subrule_listINS0_14subrule_parserILi0ENS0_8sequenceINS3_INS3_INS3_INS0_6actionINS0_7subruleILi1ENS0_15closure_contextI17rcs_admin_closureEEEEN7phoenix5actorINSA_9compositeINSA_9assign_opENSB_INSC_INSA_21member_var_ptr_actionI9rcs_admin8rcs_fileEENSB_INSA_14closure_memberILi0ENSA_7closureISG_NSA_5nil_tESK_EEEEEESK_SK_SK_EEEENSB_INSA_8argumentILi0EEEEESK_SK_EEEEEENS0_11alternativeINS0_11kleene_starINS4_INS5_ILi3ENS6_I17rcs_delta_closureEEEENSB_INSC_INSA_26member_function_ptr_actionIvSG_RK9rcs_deltaSK_SK_SK_EESN_SS_SK_SK_EEEEEEEENS3_INSX_INS5_ILi12ENS0_14parser_contextEEEEES19_EEEEEENSW_INS5_ILi2ENS6_I16rcs_desc_closureEEEENS3_IS1C_S1I_EEEEEENSX_INS4_INS5_ILi4ENS6_I21rcs_deltatext_closureEEEENSB_INSC_INS11_IvSG_RK13rcs_deltatextSK_SK_SK_EESN_SS_SK_SK_EEEEEEEEEENS0_10contiguousINSX_INS0_5chlitIcEEEEEEEENS6_I16rcs_file_closureEEEENS1_INS2_ILi10ES20_S1A_EENS1_INS2_ILi14ENS0_8positiveINSW_INS0_12digit_parserES20_EEEES1A_EENS1_INS2_ILi13ENS0_10differenceINS0_12graph

_parserENS0_5chsetIcEEEES1A_EENS1_INS2_ILi12ENS3_INS3_INS5_ILi8ENS6_I14string_closureEEEENSX_INS5_ILi9ES2K_EEEEEENS5_ILi10ES1A_EEEES1A_EENS1_INS2_ILi9ENSW_INSW_INSW_IS2L_NS5_ILi7ES2K_EEEENS5_ILi5ES2K_EEEES2P_EES2K_EENS1_INS2_ILi4ENS3_INS3_INS3_INS3_INS4_IS2S_NSB_INSC_ISD_NSB_INSC_INSE_ISsS1P_EENSB_INSI_ILi0ENSJ_IS1P_SK_SK_EEEEEESK_SK_SK_EEEESS_SK_SK_EEEEEENS0_6strlitIPKcEEEENS4_IS2U_S35_EEEENSW_IS3A_NS3_IS1C_S3A_EEEEEES3C_EES1N_EENS1_INS2_ILi2ENS3_IS3A_NS4_IS2U_NSB_INSC_ISD_NSB_INSI_ILi0ENSJ_ISsSK_SK_EEEEEESS_SK_SK_EEEEEEEES1H_EENS1_INS2_ILi3ENS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS4_IS2S_NSB_INSC_ISD_NSB_INSC_INSE_ISsS12_EENSB_INSI_ILi0ENSJ_IS12_SK_SK_EEEEEESK_SK_SK_EEEESS_SK_SK_EEEEEES3A_EES3Z_EES2P_EES3A_EENS4_IS2L_S3Y_EEEES2P_EES3A_EENS0_8optionalIS44_EEEES2P_EES3A_EENSX_INS4_IS2S_NSB_INSC_INS11_IvS12_RKSsSK_SK_SK_EES3U_SS_SK_SK_EEEEEEEEEES2P_EES3A_EENS48_IS3Z_EEEES2P_EESZ_EENS1_INS2_ILi1ENS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS3_INS

3_INS3_INS3_INS3_INS3_INS3_IS3A_NS4_IS2S_NSB_INSC_ISD_NSB_INSC_INSE_ISsSF_EENSB_INSI_ILi0ENSJ_ISF_SK_SK_EEEEEESK_SK_SK_EEEESS_SK_SK_EEEEEEEES2P_EENS48_IS51_EEEES3A_EENSX_INS4_IS2L_S4Y_EEEEEES2P_EES3A_EENSX_INS3_INS3_INS5_ILi6ES2K_EENS5_ILi11ES1A_EEEES2S_EEEEEES2P_EES3A_EENSX_INS3_INS3_IS2L_S5B_EES2S_EEEEEES2P_EENS48_INS3_IS3A_S2P_EEEEEENS48_INS3_INS3_IS3A_NS4_IS2U_S4Y_EEEES2P_EEEEEES5T_EES8_EENS0_5nil_tEEEEEEEEEEEEEEEEEEEEC1ERKS26_RKS66_

Doing a little computation on the wc of the nm output to figure out the average length of each symbol. Account for the 8 character address and 1 character symbol type information:

(3657239 char - (8 char address + 1 char type) * 23623)/23623= 146 char


-Will



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]