This is the mail archive of the
elfutils-devel@sourceware.org
mailing list for the elfutils project.
Re: dwarflint --stats
- From: Petr Machata <pmachata at redhat dot com>
- To: elfutils-devel at lists dot fedorahosted dot org
- Date: Fri, 17 Sep 2010 18:32:29 +0200
- Subject: Re: dwarflint --stats
21.07.2010 03:58, Roland McGrath wrote:
> Jakub was interested in sampling some DWARF data to compare what one
> compiler vs another is doing with some broad statistics of semantic
> interest. The first particular thing to measure is how much location/value
> information we are getting for variables and parameters.
This is now on the dwarf branch. Use it like this:
dwarflint --check=locstats binary
> (Double-check the cases with no DW_AT_location or
> DW_AT_const_value at all to see if there is some other exclusion from
> "might have a location" that I'm overlooking at the moment.)
I'm getting a lot of misses from the following areas:
- artificial variables/parameters (but some of them do have location)
- descendants of DIEs that have DW_AT_inline == "inlined" or
"declared_inlined" (but many of them do have location)
- descendants of DIEs that are DW_TAG_inlined_subroutine (but there are so
many of them that that's no wonder; ignoring those cuts off about 90%
of variable/parameter DIEs)
There's a bunch of options for dumping and ignoring groups of DIEs, so
one can e.g. do this:
dwarflint --check=locstats --locstats:ignore=artificial,inlined
dwarflint --check=locstats --locstats:dump=no_coverage
The dumps (that's not the statistics-gathering part, just the debugging
one) list the whole DIE stack so that one has some context to ponder:
dumping no_coverage DIE
DIE 2730c compile_unit
  producer="GNU C++ 4.4.4 20100630 (Red Hat 4.4.4-10)"
  language=C_plus_plus
  name="../../elfutils/dwarflint/dwarflint.cc"
  comp_dir="...
  low_pc=0
  entry_pc=0
  ranges=<0x407790-0x40872e,0x408730-0x4087fb,...
  stmt_list=[{11 dirs}, {432 line entries}]
DIE 373bf subprogram
  specification=[0x2843e]
  inline=declared_inlined
DIE 373d3 lexical_block
DIE 373d5 variable
  name="__ret"
  decl_file="/usr/lib/gcc/...
  decl_line=122
  type=[0x2a2ce]
> Globals (with DW_AT_external) and statics (without) are identified by
> having a non-list DW_AT_location that is a singleton DW_OP_addr expression.
> Perhaps (optionally?) exclude these from the tally entirely, so they don't
> dilute the cumulative percentages.
That's --locstats:ignore=single-addr
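For illustration, here is a minimal sketch of how a singleton DW_OP_addr
expression could be recognized from the raw expression bytes. This is not
the dwarflint code; the function name and the address_size parameter are
made up for the example. It relies only on DW_OP_addr being opcode 0x03
with a single operand of one target address.

```python
DW_OP_addr = 0x03  # opcode value from the DWARF specification


def is_single_addr(expr: bytes, address_size: int) -> bool:
    """Return True if expr is exactly one DW_OP_addr operation.

    expr is the raw block of a non-list DW_AT_location; address_size is
    the size of a target address in bytes (hypothetical parameter, taken
    from the CU header in a real reader).
    """
    # One opcode byte followed by exactly one address and nothing else.
    return len(expr) == 1 + address_size and expr[0] == DW_OP_addr
```

An expression that starts with DW_OP_addr but continues with further
operations is longer than 1 + address_size bytes, so it is not counted.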
> Otherwise, it's a location list. So, first find the "scope" ranges set for
> this DIE. That is, if the DIE itself has a DW_AT_start_scope that is a
> rangelistptr, then exactly that is the set. If the DIE itself has
> DW_AT_{ranges,high_pc} then that's the set (but it won't). Otherwise, look
> back up parent DIEs until one has ranges/high_pc. If the variable DIE has
> a DW_AT_start_scope that is a constant, then exclude the portion of the
> range before it (see DWARFv4 3.3.8.2 item 11).
I used the c++/dwarf header to get the scope. Does it (or will it) take
the DW_AT_start_scope adjustments into account, or is that something
that the client should take care of themselves?
> Finally, count up the cumulative bytes covered by the scope set and those
> covered by the location set. Tally the ratio of those two counts as a
> percentage. Perhaps produce a scatter plot with x axis the location/scope
> rounded to an integer percentage and y axis the percentage of the DIEs
> considered whose ratio is x. And perhaps say min/max/avg(/median?) ratios
> seen.
Currently it emits a table like this:
cov%     samples    cumul
0..10    44269/52%  44269/52%
11..20   665/0%     44934/53%
21..30   910/1%     45844/54%
31..40   1273/1%    47117/55%
41..50   1425/1%    48542/57%
51..60   1326/1%    49868/59%
61..70   2102/2%    51970/61%
71..80   2317/2%    54287/64%
81..90   3549/4%    57836/68%
91..100  26684/31%  84520/100%
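Each sample in the cov% column is one DIE's coverage percentage. A rough
sketch of that computation follows. It assumes half-open [low, high)
address ranges and truncation to a whole percent. The helper names are
hypothetical, and clipping the location ranges to the scope is an
assumption, not necessarily what dwarflint does.

```python
def normalize(ranges):
    """Sort half-open [lo, hi) ranges and merge overlapping/adjacent ones."""
    merged = []
    for lo, hi in sorted(r for r in ranges if r[0] < r[1]):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def total_bytes(ranges):
    """Number of bytes covered by a normalized range list."""
    return sum(hi - lo for lo, hi in ranges)


def intersect(a, b):
    """Intersect two normalized range lists."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # Advance whichever range ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def coverage_percent(scope, location):
    """Whole-percent share of scope bytes that have a location.

    Returns None when the scope is empty, so such DIEs can be skipped.
    """
    scope = normalize(scope)
    scope_size = total_bytes(scope)
    if scope_size == 0:
        return None
    covered = intersect(scope, normalize(location))
    return 100 * total_bytes(covered) // scope_size
```

For example, a scope of [(0, 100), (200, 300)] with a single location
range (50, 250) has 100 of its 200 bytes covered, i.e. 50%.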
One can adjust the way it's tabulated by setting the stops manually.
The table above is produced with the default rule of
--locstats:tabulate=10:10 (first stop at 10, then a stop every 10
percentage points). E.g. --locstats:tabulate=0,99 gives this:
0      14330/26%  14330/26%
1..99  17890/32%  32220/59%
100    22002/40%  54222/100%
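The stops can be read as inclusive upper bounds of the rows. Below is a
small sketch of that bucketing. The rule grammar is inferred from the two
examples above, not taken from the dwarflint source: it is assumed to be
a comma-separated list of stops, where START:STEP expands to START,
START+STEP, and so on, and where a final stop at 100 is implied. The
function names are hypothetical. Percentages are truncated, which is what
the sample tables suggest.

```python
from bisect import bisect_left


def parse_stops(rule):
    """Expand a rule such as "10:10" or "0,99" into sorted stops."""
    stops = set()
    for item in rule.split(","):
        if ":" in item:
            start, step = (int(x) for x in item.split(":"))
            if step <= 0:
                raise ValueError("step must be positive")
            stops.update(range(start, 101, step))
        else:
            stops.add(int(item))
    stops.add(100)  # assumed: the last row always ends at 100
    return sorted(s for s in stops if 0 <= s <= 100)


def tabulate(percentages, rule="10:10"):
    """Bucket integer coverage values (0..100) by the given stops.

    Returns rows of (label, samples, samples%, cumulative, cumulative%).
    """
    stops = parse_stops(rule)
    counts = [0] * len(stops)
    for p in percentages:
        # Index of the first stop that is >= p.
        counts[bisect_left(stops, p)] += 1
    total = sum(counts)
    rows = []
    cumul = 0
    lo = 0
    for stop, n in zip(stops, counts):
        cumul += n
        label = str(stop) if lo == stop else f"{lo}..{stop}"
        pct = 100 * n // total if total else 0
        cumul_pct = 100 * cumul // total if total else 0
        rows.append((label, n, pct, cumul, cumul_pct))
        lo = stop + 1
    return rows
```

With the rule "0,99" this yields the three rows 0, 1..99 and 100, so
DIEs with no coverage at all and DIEs with full coverage are counted
separately from everything in between.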
> Another variant would be to also tally what portion of available locations
> is mutable vs immutable (or distribution of the ratios, or whatever). That
> is, DW_AT_const_value is immutable. A location expression is immutable if
> it ends in DW_OP_implicit_value or DW_OP_stack_value. If an expression
> uses DW_OP_{,bit_}piece, then it can be partially mutable and partially
> immutable. You can probably just choose arbitrarily to count those on one
> side or the other, or perhaps tally them as a separate third statistic.
That's not yet there.
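For reference, a minimal sketch of how the classification described in
the quoted paragraph might look. It is purely illustrative and not part
of dwarflint. It assumes the expression has already been decoded into a
list of operation names, and it picks the "separate third statistic"
option for expressions that mix mutable and immutable pieces. The
function name and the result labels are made up.

```python
IMMUTABLE_ENDINGS = {"DW_OP_implicit_value", "DW_OP_stack_value"}
PIECE_OPS = {"DW_OP_piece", "DW_OP_bit_piece"}


def classify(ops):
    """Classify a decoded location expression by mutability.

    ops is a list of operation names in order (assumed input format).
    Returns "immutable", "mutable", "mixed" or "unavailable".
    """
    # Split the expression at each DW_OP_piece / DW_OP_bit_piece.
    pieces, current = [], []
    for op in ops:
        if op in PIECE_OPS:
            pieces.append(current)
            current = []
        else:
            current.append(op)
    pieces.append(current)  # trailing ops, or the whole piece-less expression

    # An empty piece describes no location, so it is left out of the tally.
    kinds = {p[-1] in IMMUTABLE_ENDINGS for p in pieces if p}
    if not kinds:
        return "unavailable"
    if kinds == {True}:
        return "immutable"
    if kinds == {False}:
        return "mutable"
    return "mixed"
```

A DW_AT_const_value would be counted as immutable without looking at any
expression at all.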
PM