This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Updated patch adding line number enumeration support to statement probe


On Tue, Jun 3, 2014 at 8:10 AM, Jonathan Lebon <jlebon@redhat.com> wrote:
> Hi Brian,
>
> Thanks for the patch. I had some difficulties applying it. Can you make
> sure that your mail client does not modify whitespaces? If you use the
> git send-mail command you should have no issues.

I'll verify my mailing before sending up the next patch.  I've had
issues with this before, my apologies.

>
> Your patch looks OK overall. There are some things however that need
> attention:
>
>> @@ -38,7 +38,7 @@ struct symbol_table;
>>  struct base_query;
>>  struct external_function_query;
>>
>> -enum lineno_t { ABSOLUTE, RELATIVE, RANGE, WILDCARD };
>> +enum lineno_t { ABSOLUTE, RELATIVE, RANGE, WILDCARD, ENUMERATED };
>>  enum info_status { info_unknown, info_present, info_absent };
>>
>>  // module -> cu die[]
>
> You've eliminated any way to have a RANGE lineno type, but it is still
> in the lineno_t enum and there are still references to it in many places
> (like iterate_over_labels()). These need to be cleaned up as well to
> instead handle ENUMERATED.

I cleaned up all the RANGE stuff and will include in another patch.

>
>> @@ -1086,9 +1086,9 @@ void
>>  dwarf_query::parse_function_spec(const string & spec)
>>  {
>>    lineno_type = ABSOLUTE;
>> -  linenos[0] = linenos[1] = 0;
>> -
>> -  size_t src_pos, line_pos, dash_pos, scope_pos;
>> +  linenos.push_back(0);
>> +  linenos.push_back(0);
>> +  size_t src_pos, line_pos, scope_pos;
>>
>>    // look for named scopes
>>    scope_pos = spec.rfind("::");
>
> You seed the linenos vector with two '0' elements, but then never add
> the parsed linenos for ABSOLUTE or RELATIVE linenos. So stap thinks that
> the user entered lineno 0. I suspect that's what's causing all the failures
> in statement.exp.

The seed values were intended to mimic the removed line:
linenos[0] = linenos[1] = 0;

I tried testing with those seeds removed, which failed as well, but
I'm confused about how/where ABSOLUTE/RELATIVE specifications are
parsed.

The two statement.exp test cases which are failing are:
FAIL: statement (ENUMERATED and RANGE - single func - expected 3 probes, got 0)
FAIL: statement (ENUMERATED and RANGE - wild func - expected 3 probes, got 0)
...
                ===  Summary ===
# of expected passes            32
# of unexpected failures        2


>
>> @@ -1135,18 +1135,26 @@ dwarf_query::parse_function_spec(const string & spec)
>> lex_cast<int>(spec.substr(line_pos + 1));
>> +                // try to parse N, N-M, or N,M,O,P, or combination
>> thereof...
>> +                if (spec.find_first_of(",-", line_pos + 1) != string::npos)
>> +                  {
>> +                    lineno_type = ENUMERATED;
>> +                    linenos.clear();
>> +                    string dash_delimited,num;
>> +                    std::stringstream
>> comma_delimited(spec.substr(line_pos + 1));
>> +                    vector<int> tmp;
>> +                    while (std::getline(comma_delimited, dash_delimited,
>> ','))
>> +                      {
>> +                        tmp.clear();
>> +                        stringstream dash_delimited_stream(dash_delimited);
>> +                        while (std::getline(dash_delimited_stream, num,
>> '-'))
>> +                          tmp.push_back(lex_cast<int>(num));
>> +                        if (tmp.size() > 1)
>> +                            for (int i = tmp.front(); i <= tmp.back(); i++)
>> +                                linenos.push_back(i);
>
> For parsing the spec, I think it would be easier to tokenize on commas,
> and then check the resulting tokens one by one for either a single lineno
> or a range (maybe even factor that part out into its own function). There's
> tokenize() in util.cxx which is useful for this.

I noticed that I misinterpreted what the code was doing with the range before.
This patch simply translates it into an enumeration.
Even using binary_search on an enumeration, a search of a large range
would be inefficient.
I'll have to reformulate this.  I'm not sure it helps to go as far as
interval tree stuff, but tracking ranges properly, rather than
smashing them into enumerations, makes sense to me.

>
>> @@ -1192,6 +1200,10 @@ dwarf_query::parse_function_spec(const string & spec)
>>                clog << linenos[0] << " - " << linenos[1];
>>                break;
>>
>> +            case ENUMERATED:
>> +              clog << linenos[0] << ", ..., " << *(linenos.end());
>> +              break;
>> +
>>              case WILDCARD:
>>                clog << "*";
>>                break;
>
> For testing and debugging, it'd be nice if we actually properly print the
> contents of the linenos vector instead of '...'. It could just print all the
> linenos in the vector, although it'd be nice if it recognized ranges as well.
> (Also, you're dereferencing linenos.end()... did you mean linenos.back()?).
>
> Finally, is there any protection against repeating linenos? E.g. ":5-9,7-10"?
> Maybe linenos ought to be a set instead. Or maybe we should explicitly check
> for this and error out if it happens.

Yeah, it seems like we need to check against a series of items which
could be either individual line numbers or ranges.  This will also
solve the output case as we can just parrot out the items.
I kind-of see where this can change.  Enumerating the range explicitly
was my gut response, but it could be a surprise performance reduction
for a large range in a large file of source.
I'll try to get this all cleaned up in the next day or two.

thanks for your feedback,
Brian

>
>
> Cheers,
>
> Jonathan


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]