This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: Updated patch adding line number enumeration support to statement probe

From: BR Chrisman <brchrisman at gmail dot com>
Cc: systemtap at sourceware dot org
Date: Tue, 3 Jun 2014 10:46:44 -0700
Subject: Re: Updated patch adding line number enumeration support to statement probe
Authentication-results: sourceware.org; auth=none
References: <CAN4=B2kFBfT3WKsyaVshcco-2=7dikggEGt0O7X=e+J3E_qGAw at mail dot gmail dot com> <1844543191 dot 19591707 dot 1401808203426 dot JavaMail dot zimbra at redhat dot com>

On Tue, Jun 3, 2014 at 8:10 AM, Jonathan Lebon <jlebon@redhat.com> wrote:
> Hi Brian,
>
> Thanks for the patch. I had some difficulties applying it. Can you make
> sure that your mail client does not modify whitespaces? If you use the
> git send-mail command you should have no issues.

I'll verify my mailing before sending up the next patch.  I've had
issues with this before, my apologies.

>
> Your patch looks OK overall. There are some things however that need
> attention:
>
>> @@ -38,7 +38,7 @@ struct symbol_table;
>>  struct base_query;
>>  struct external_function_query;
>>
>> -enum lineno_t { ABSOLUTE, RELATIVE, RANGE, WILDCARD };
>> +enum lineno_t { ABSOLUTE, RELATIVE, RANGE, WILDCARD, ENUMERATED };
>>  enum info_status { info_unknown, info_present, info_absent };
>>
>>  // module -> cu die[]
>
> You've eliminated any way to have a RANGE lineno type, but it is still
> in the lineno_t enum and there are still references to it in many places
> (like iterate_over_labels()). These need to be cleaned up as well to
> instead handle ENUMERATED.

I cleaned up all the RANGE stuff and will include in another patch.

>
>> @@ -1086,9 +1086,9 @@ void
>>  dwarf_query::parse_function_spec(const string & spec)
>>  {
>>    lineno_type = ABSOLUTE;
>> -  linenos[0] = linenos[1] = 0;
>> -
>> -  size_t src_pos, line_pos, dash_pos, scope_pos;
>> +  linenos.push_back(0);
>> +  linenos.push_back(0);
>> +  size_t src_pos, line_pos, scope_pos;
>>
>>    // look for named scopes
>>    scope_pos = spec.rfind("::");
>
> You seed the linenos vector with two '0' elements, but then never add
> the parsed linenos for ABSOLUTE or RELATIVE linenos. So stap thinks that
> the user entered lineno 0. I suspect that's what's causing all the failures
> in statement.exp.

The seed values were intended to mimic the removed line:
linenos[0] = linenos[1] = 0;

I tried testing with those seeds removed, which failed as well, but
I'm confused about how/where ABSOLUTE/RELATIVE specifications are
parsed.

The two statement.exp test cases which are failing are:
FAIL: statement (ENUMERATED and RANGE - single func - expected 3 probes, got 0)
FAIL: statement (ENUMERATED and RANGE - wild func - expected 3 probes, got 0)
...
                ===  Summary ===
# of expected passes            32
# of unexpected failures        2


>
>> @@ -1135,18 +1135,26 @@ dwarf_query::parse_function_spec(const string & spec)
>> lex_cast<int>(spec.substr(line_pos + 1));
>> +                // try to parse N, N-M, or N,M,O,P, or combination
>> thereof...
>> +                if (spec.find_first_of(",-", line_pos + 1) != string::npos)
>> +                  {
>> +                    lineno_type = ENUMERATED;
>> +                    linenos.clear();
>> +                    string dash_delimited,num;
>> +                    std::stringstream
>> comma_delimited(spec.substr(line_pos + 1));
>> +                    vector<int> tmp;
>> +                    while (std::getline(comma_delimited, dash_delimited,
>> ','))
>> +                      {
>> +                        tmp.clear();
>> +                        stringstream dash_delimited_stream(dash_delimited);
>> +                        while (std::getline(dash_delimited_stream, num,
>> '-'))
>> +                          tmp.push_back(lex_cast<int>(num));
>> +                        if (tmp.size() > 1)
>> +                            for (int i = tmp.front(); i <= tmp.back(); i++)
>> +                                linenos.push_back(i);
>
> For parsing the spec, I think it would be easier to tokenize on commas,
> and then check the resulting tokens one by one for either a single lineno
> or a range (maybe even factor that part out into its own function). There's
> tokenize() in util.cxx which is useful for this.

I noticed that I misinterpreted what the code was doing with the range before.
This patch simply translates it into an enumeration.
Even using binary_search on an enumeration, a search of a large range
would be inefficient.
I'll have to reformulate this.  I'm not sure it helps to go as far as
interval tree stuff, but tracking ranges properly, rather than
smashing them into enumerations, makes sense to me.

>
>> @@ -1192,6 +1200,10 @@ dwarf_query::parse_function_spec(const string & spec)
>>                clog << linenos[0] << " - " << linenos[1];
>>                break;
>>
>> +            case ENUMERATED:
>> +              clog << linenos[0] << ", ..., " << *(linenos.end());
>> +              break;
>> +
>>              case WILDCARD:
>>                clog << "*";
>>                break;
>
> For testing and debugging, it'd be nice if we actually properly print the
> contents of the linenos vector instead of '...'. It could just print all the
> linenos in the vector, although it'd be nice if it recognized ranges as well.
> (Also, you're dereferencing linenos.end()... did you mean linenos.back()?).
>
> Finally, is there any protection against repeating linenos? E.g. ":5-9,7-10"?
> Maybe linenos ought to be a set instead. Or maybe we should explicitly check
> for this and error out if it happens.

Yeah, it seems like we need to check against a series of items which
could be either individual line numbers or ranges.  This will also
solve the output case as we can just parrot out the items.
I kind-of see where this can change.  Enumerating the range explicitly
was my gut response, but it could be a surprise performance reduction
for a large range in a large file of source.
I'll try to get this all cleaned up in the next day or two.

thanks for your feedback,
Brian

>
>
> Cheers,
>
> Jonathan

Follow-Ups:
- Re: Updated patch adding line number enumeration support to statement probe
  - From: Jonathan Lebon

References:
- Updated patch adding line number enumeration support to statement probe
  - From: BR Chrisman
- Re: Updated patch adding line number enumeration support to statement probe
  - From: Jonathan Lebon

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]