This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.



Re: dwarflint --stats


21.07.2010 03:58, Roland McGrath wrote:
> Jakub was interested in sampling some DWARF data to compare what one
> compiler vs another is doing with some broad statistics of semantic
> interest.  The first particular thing to measure is how much location/value
> information we are getting for variables and parameters.

This is now on the dwarf branch.  Use it like this:
dwarflint --check=locstats binary

> (Double-check the cases with no DW_AT_location or
> DW_AT_const_value at all to see if there is some other exclusion from
> "might have a location" that I'm overlooking at the moment.)

I'm getting a lot of misses from the following areas:
- artificial variables/parameters (but some of them do have a location)
- descendants of DIEs that have DW_AT_inline == "inlined" or
"declared_inlined" (but many of them do have a location)
- descendants of DIEs that are DW_TAG_inlined_subroutine (but there are
so many of them that that's no wonder.  Ignoring those cuts off about
90% of variable/parameter DIEs)

There's a bunch of options for dumping and ignoring groups of DIEs, so
one can e.g. do this:
dwarflint --check=locstats --locstats:ignore=artificial,inlined
dwarflint --check=locstats --locstats:dump=no_coverage

The dumps (that's not the statistics-gathering part, just the debugging 
one) list the whole DIE stack so that one has some context to ponder:

dumping no_coverage DIE
DIE 2730c compile_unit
     producer="GNU C++ 4.4.4 20100630 (Red Hat 4.4.4-10)"
     language=C_plus_plus
     name="../../elfutils/dwarflint/dwarflint.cc"
     comp_dir="...
     low_pc=0
     entry_pc=0
     ranges=<0x407790-0x40872e,0x408730-0x4087fb,...
     stmt_list=[{11 dirs}, {432 line entries}]
  DIE 373bf subprogram
      specification=[0x2843e]
      inline=declared_inlined
   DIE 373d3 lexical_block
    DIE 373d5 variable
        name="__ret"
        decl_file="/usr/lib/gcc/...
        decl_line=122
        type=[0x2a2ce]


> Globals (with DW_AT_external) and statics (without) are identified by
> having a non-list DW_AT_location that is a singleton DW_OP_addr expression.
> Perhaps (optionally?)  exclude these from the tally entirely, so they don't
> dilute the cumulative percentages.

That's --locstats:ignore=single-addr
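
To make the criterion concrete, here is a minimal sketch of the test,
not dwarflint's actual code.  It assumes a hypothetical model in which
a non-list DW_AT_location is a Python list of (opcode name, operand)
tuples, and None stands for "location list or no attribute at all":

```python
def is_single_addr(location):
    """Return True for a non-list location that is exactly one DW_OP_addr.

    `location` is a hypothetical model: a list of (opcode_name, operand)
    tuples for a plain location expression, or None when the DIE has a
    location list or no DW_AT_location.
    """
    return (location is not None
            and len(location) == 1
            and location[0][0] == "DW_OP_addr")


print(is_single_addr([("DW_OP_addr", 0x601040)]))   # global or static
print(is_single_addr([("DW_OP_fbreg", -20)]))       # frame-based local
```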

> Otherwise, it's a location list.  So, first find the "scope" ranges set for
> this DIE.  That is, if the DIE itself has a DW_AT_start_scope that is a
> rangelistptr, then exactly that is the set.  If the DIE itself has
> DW_AT_{ranges,high_pc} then that's the set (but it won't).  Otherwise, look
> back up parent DIEs until one has ranges/high_pc.  If the variable DIE has
> a DW_AT_start_scope that is a constant, then exclude the portion of the
> range before it (see DWARFv4 3.3.8.2 item 11).

I used the c++/dwarf header to get the scope.  Does it (or will it)
take the DW_AT_start_scope adjustments into account, or is that
something that the client should take care of themselves?
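
For reference, the scope lookup described in the quoted paragraph could
be sketched as follows.  This is an illustration only and does not use
the c++/dwarf interface: DIEs are modelled as hypothetical dicts with
optional "ranges" (a list of half-open (low, high) pairs, with
low_pc/high_pc already folded in), "start_scope" and "parent" keys.
Treating a constant DW_AT_start_scope as an offset from the lowest
address of the enclosing scope is an assumption of the sketch:

```python
def scope_ranges(die):
    """Return the list of (low, high) address ranges that form the scope."""
    start_scope = die.get("start_scope")

    # A rangelistptr DW_AT_start_scope is the scope set, exactly.
    if isinstance(start_scope, list):
        return list(start_scope)

    # Otherwise use the DIE's own ranges, or the nearest ancestor's.
    node = die
    while node is not None and not node.get("ranges"):
        node = node.get("parent")
    if node is None:
        return []
    ranges = sorted(node["ranges"])

    # A constant DW_AT_start_scope cuts off the leading part of the scope
    # (assumed here to be an offset from the lowest scope address).
    if isinstance(start_scope, int):
        cut = ranges[0][0] + start_scope
        ranges = [(max(lo, cut), hi) for lo, hi in ranges if hi > cut]
    return ranges


subprogram = {"ranges": [(0x1000, 0x1100)], "parent": None}
block = {"parent": subprogram}
variable = {"parent": block, "start_scope": 0x10}
print([(hex(lo), hex(hi)) for lo, hi in scope_ranges(variable)])
```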

> Finally, count up the cumulative bytes covered by the scope set and those
> covered by the location set.  Tally the ratio of those two counts as a
> percentage.  Perhaps produce a scatter plot with x axis the location/scope
> rounded to an integer percentage and y axis the percentage of the DIEs
> considered whose ratio is x.  And perhaps say min/max/avg(/median?) ratios
> seen.

Currently it emits a table like this:
cov%	samples	cumul
0..10	44269/52%	44269/52%
11..20	665/0%	44934/53%
21..30	910/1%	45844/54%
31..40	1273/1%	47117/55%
41..50	1425/1%	48542/57%
51..60	1326/1%	49868/59%
61..70	2102/2%	51970/61%
71..80	2317/2%	54287/64%
81..90	3549/4%	57836/68%
91..100	26684/31%	84520/100%
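
The coverage figure behind each sample could be computed roughly like
this.  It is a sketch under stated assumptions rather than the code on
the branch: ranges are half-open (low, high) pairs, location ranges are
clipped to the scope so the result never exceeds 100, and the
percentage is truncated to an integer:

```python
def merge(ranges):
    """Merge overlapping or adjacent half-open ranges, dropping empty ones."""
    merged = []
    for lo, hi in sorted(ranges):
        if lo >= hi:
            continue
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def covered_bytes(ranges):
    """Count the bytes covered by a set of possibly overlapping ranges."""
    return sum(hi - lo for lo, hi in merge(ranges))


def coverage_percent(scope, locations):
    """Percentage of scope bytes that have a location, or None if no scope."""
    scope = merge(scope)
    total = covered_bytes(scope)
    if total == 0:
        return None
    # Clip every location range to every scope range; empty pieces are
    # discarded by merge().
    clipped = [(max(lo, s_lo), min(hi, s_hi))
               for lo, hi in locations
               for s_lo, s_hi in scope]
    return 100 * covered_bytes(clipped) // total


# A 256-byte scope with two overlapping location ranges covering 128 bytes.
print(coverage_percent([(0x100, 0x200)], [(0x100, 0x140), (0x130, 0x180)]))
```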

One can adjust the way it's tabulated by setting the stops manually.
The table above is produced with the default rule,
--locstats:tabulate=10:10 (first stop at 10, then one stop every 10
percentage points).  E.g. --locstats:tabulate=0,99 gives this:
0	14330/26%	14330/26%
1..99	17890/32%	32220/59%
100	22002/40%	54222/100%
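
The tabulation itself can be sketched as below.  The rule syntax is
only inferred from the two examples above (comma-separated stops, with
"first:step" meaning a stop at first and then one every step points),
so the parser is a guess rather than the option's real grammar.  Each
stop is taken as the inclusive upper bound of a bucket, 100 is always
the last stop, and percentages are truncated, which is consistent with
the figures in both tables:

```python
def parse_stops(rule):
    """Turn a rule such as "10:10" or "0,99" into a sorted list of stops."""
    stops = set()
    for item in rule.split(","):
        if ":" in item:
            first, step = (int(x) for x in item.split(":"))
            stops.update(range(first, 101, step))
        else:
            stops.add(int(item))
    stops.add(100)  # the last bucket always ends at 100
    return sorted(s for s in stops if 0 <= s <= 100)


def tabulate(coverages, rule="10:10"):
    """Return table rows "range<TAB>samples/pct<TAB>cumul/pct" as strings."""
    total = len(coverages) or 1  # avoid dividing by zero on empty input
    rows, cumul, low = [], 0, 0
    for stop in parse_stops(rule):
        n = sum(1 for c in coverages if low <= c <= stop)
        cumul += n
        label = str(stop) if low == stop else f"{low}..{stop}"
        rows.append(f"{label}\t{n}/{100 * n // total}%"
                    f"\t{cumul}/{100 * cumul // total}%")
        low = stop + 1
    return rows


for row in tabulate([0, 0, 5, 50, 100], "0,99"):
    print(row)
```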

> Another variant would be to also tally what portion of available locations
> is mutable vs immutable (or distribution of the ratios, or whatever).  That
> is, DW_AT_const_value is immutable.  A location expression is immutable if
> it ends in DW_OP_implicit_value or DW_OP_stack_value.  If an expression
> uses DW_OP_{,bit_}piece, then it can be partially mutable and partially
> immutable.  You can probably just choose arbitrarily to count those on one
> side or the other, or perhaps tally them as a separate third statistic.

That's not yet there.
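
If it gets added, the classification could look something like this
sketch.  The three category names, and the choice to count any
expression using DW_OP_piece or DW_OP_bit_piece as a separate third
group, are assumptions taken from the suggestion quoted above; a
location expression is modelled as a plain list of opcode names:

```python
IMMUTABLE_ENDINGS = {"DW_OP_implicit_value", "DW_OP_stack_value"}
PIECE_OPS = {"DW_OP_piece", "DW_OP_bit_piece"}


def classify(ops, has_const_value=False):
    """Classify one variable's value as immutable, mutable, piecewise or none.

    `ops` is a hypothetical model of a location expression: a list of
    opcode names, empty when there is no expression.
    """
    if has_const_value:
        return "immutable"          # DW_AT_const_value
    if not ops:
        return "none"
    if any(op in PIECE_OPS for op in ops):
        return "piecewise"          # may be partly mutable, partly not
    if ops[-1] in IMMUTABLE_ENDINGS:
        return "immutable"
    return "mutable"


print(classify(["DW_OP_breg5", "DW_OP_stack_value"]))
print(classify(["DW_OP_fbreg"]))
print(classify(["DW_OP_reg0", "DW_OP_piece", "DW_OP_reg1", "DW_OP_piece"]))
```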

PM
