This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 2/3] Unit test name-component bounds searching directly


On 11/20/2017 03:16 AM, Simon Marchi wrote:

>> diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
>> index b08e81c..53548ca 100644
>> --- a/gdb/dwarf2read.c
>> +++ b/gdb/dwarf2read.c
>> @@ -255,11 +255,24 @@ struct mapped_index
>>    /* The name_component table (a sorted vector).  See name_component's
>>       description above.  */
>>    std::vector<name_component> name_components;
>> +  /* How NAME_COMPONENTS is sorted.  */
>> +  enum case_sensitivity name_components_casing;
> 
> Missing newline above?
> 
> Why is it important to store this in the structure now, rather than

The alternative was to either pass around a name_cmp pointer to the new
methods or to have both new methods look at the "case_sensitive_on" global.
I played with both directions, and given the TOCTOU-like issue you mentioned
below too, it felt like saving the order that the vector is sorted in
was a better direction.

> 
> It's not really related to this patch, but it made me think about this edge
> case.  Given that we build the vector only once, isn't it a problem if the
> user switches case sensitivity during a debug session?  For example, the vector
> is built with case sensitivity one:
> 
> - func_A
> - func_B
> - func_a
> - func_b
> 
> Then we look to complete "func_a": we'll get func_a.  The user then switches
> "set case-sensitive off".  If we look to complete "func_a" in that vector,
> we will search (with lower_bound) with a different sorting criterion that the
> one that was used for sorting, which I guess will give funny results.  I
> haven't actually tried, I'm just speculating.
> 
> If that's indeed a problem, keeping the case_sensitivity that was used to
> sort the vector could be useful, since if we detect that it changed, we
> can re-sort it.

Exactly.

Last I tried, when I was writing this originally, I
recall that performance drops with "set case-sensitive off",
naturally because strcasecmp has to do more work than strcmp
(i.e., call tolower before comparison).  I'm wondering whether we
could/should still always do case-insensitive sorting, as that may help
mixed language scenarios where one of the languages is case
insensitive, like Ada (Ada doesn't support indexes today).  Maybe
we could do something about the performance hit, by say,
avoiding strcasecmp by instead using safe-type.h/TOLOWER, and
also lowercasing the search string only once (with strcasecmp,
we're lowercasing it for each and every comparison).  Maybe with
that the performance hit is less significant.  But maybe not, considering
that strcmp/strncmp tend to be implemented in assembly making use
of vectorization, etc.  If the accel table stored the the strings in
lowercase form already, then we could use the faster strcmp always.
However, the disadvantage is that the current table only stores offsets into
exiting strings, and saving lowercased version of the strings require
deep copies of the symbol names, which may have a noticeable
memory cost...  Trade offs, trade offs...

> Otherwise, the patch LGTM.

Thanks,
Pedro Alves


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]