This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][v2] Support .debug_macro


Mark Wielaard <mjw@redhat.com> writes:

> On Fri, 2014-09-12 at 20:03 +0200, Petr Machata wrote:
>> - dwarf_getmacros is left to support only .debug_macinfo.  The only
>>   problematic opcode really is DW_MACINFO_vendor_ext
>
> You could suggest that the standard reserves that value to enable
> consumers to use the same parsers for both.

That's worth a try.

> Could you also include a small testcase? That could then also be used as
> an example. I don't think it is immediately clear how you would use the
> dual style die/off accessors. So an test/example would be helpful.

I posted a test case patch with my previous submission.  That's still
active, but I didn't ping it until this one gets through:

        https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-September/004143.html

> And I was wondering if having a readelf --debug-dump=decodedmacro using
> this interface was useful. The raw macro dump doesn't handle transparent
> includes (you don't have to implement that, but maybe it is useful to do
> if you do decide to include a test/example anyway).

Sure, why not.  I actually think readelf should just use libdw/libelf
interfaces instead of duplicating the decoding logic, though that's not
what you are proposing.

> Why you decide to a include the version and line_offset interfaces on
> Dwarf_Macro? Isn't that a CU DIE concept? (See also below, maybe I just
> misunderstand the data representation.)

Both version and line offset belong to macro units (not CU's, each macro
unit may set its own .debug_line offset, and macro units have their own
version).  But there's a question of how to get to that value.  You need
version to interpret the opcode, and you need .debug_line offset to
interpret file names.

One alternative is to have an interface like this:

	extern ptrdiff_t dwarf_getmacros_off (Dwarf *dbg, Dwarf_Off macoff,
					      int (*callback) (Dwarf_Macro *, void *),
					      void *arg, ptrdiff_t token,
        	                              unsigned int *versionp,
                	                      Dwarf_Off *line_offp);

... which would set *VERSIONP and *LINE_OFFP before invoking CALLBACK.
Then use it like this:

	struct data { some stuff; unsigned int version; Dwarf_Off line_off; };

	int callback (Dwarf_Macro *macro, void *user) {
	  // Cast user to struct data, use version and line_off.
	}

        struct data data = {...};
	dwarf_getmacros_off (dbg, off, callback, &data, token,
			     &data.version, &data.line_off);

But arranging this exchange smells of reinventing the wheel every time
you need to use the interface, and the interface is not extensible (if
we need to pass one more thing, we need to add one more parameter).

You could also extend the callback signature and just pass them down
directly.  That's much better in my opinion, but again has the
extensibility problem.

For this reason I thought it best to just carry these per-unit variables
with macros.  That addresses both the extensibility (for each new thing
that you need to pass around, you add one more private field, and one
more public interface), as well as the don't-repeat-yourself concerns.
The obvious downside is that you'll never know what these values are for
macro-less macro unit, but for macro-less unit you don't care.

Yet another possibility is to have a type like Dwarf_Macro_Unit, akin to
Dwarf_CU, which could be accessed from macro, or possibly be passed
together with macro to the callback, and which would carry these things,
and possibly more--like the prototype table.  I didn't think it
necessary to go to these lengths.

>> +static inline Dwarf_Macro_Op_Table *
>> +get_macinfo_table (void)
>> +{
>> +  static Dwarf_Macro_Op_Table *ret = NULL;
>> +  if (unlikely (ret == NULL))
>> +    ret = init_macinfo_table ();
>> +  return ret;
>> +}
>
> The statics here make me a little uncomfortable. What if the user is
> working with multiple Dwarfs at the same time iterating through the
> simultaneously (maybe to compare the macros each defines in their CUs).
> Won't they overwrite each others Macro_Op_Table?

This is a singleton macro prototype table for .debug_macinfo (not
.debug_macro) section.  It's the same in all Dwarf's.  The reason it's
there is just for uniformness of decoding both types of sections.  So
there's no problem if you have more than one Dwarf.

One problem this could cause however is that in a multi-threaded
application, there's a race between the threads for initialization of
this global resource.  If this is deemed a problem, we could do this:

__thread Dwarf_Macro_Op_Table *ret = NULL;
if (unlikely (ret == NULL))
  {
    lock;
    static Dwarf_Macro_Op_Table *global_table = NULL;
    if (global_table == NULL)
      global_table = init_macinfo_table ();
    ret = global_table;
    unlock;
  }

This is essentially double-checked locking, but that's not thread-safe,
so I make ret thread-local, and always lock before touching the global
resource.  That should be both thread-safe, have reasonable performance,
and not waste memory by storing one table for each thread.

Initially I dismissed this problem because we all know that libdw is
thread unsafe anyway.  But here we potentially get thread-unsafe
behavior even if we touch different Dwarf's, which might be unexpected.
May be worth reconsidering.

>> -      switch (opcode)
>> +      /* A fake CU with bare minimum data to fool dwarf_formX into
>> +	 doing the right thing with the attributes that we put
>> +	 out.  */
>> +      Dwarf_CU fake_cu = {
>> +	.dbg = dbg,
>> +	.version = 4,
>> +	.offset_size = table->is_64bit ? 8 : 4,
>> +      };
>
> Do we want/need to match the version here to match the CU DIE this macro
> table came from? Might be hard to get at at this point though. Maybe
> match it to the Macro_Op_Table version?

The problem is that some macro units don't have a related CU per se.
They could be accessed through offset from other macro units.  There
typically will be more than one CU that these independent units
logically belong to, otherwise this transparent inclusion scheme
wouldn't be advantageous.

But I guess we should match the CU version to whatever corresponds to
the .debug_macro version that this macro comes from (the two versions
can in theory evolve independently.  It's hard to imagine something like
that happening, but there could be a version of Dwarf that doesn't
change .debug_info at all, but does change .debug_macro).

>> +      Dwarf_Macro macro = {
>> +	.line_offset = table->line_offset,
>> +	.version = table->version,
>> +	.opcode = opcode,
>> +	.nargs = proto->nforms,
>> +	.attributes = attributes,
>> +      };
>
> Couldn't you just store the table, opcode and attributes in Dwarf_Macro?
> The rest can then be retrieved from the associated table.

Isn't that what I'm doing?  I guess I don't understand what you mean.

> Although it is common for callback based functions in libdw I would
> explicitly say that the Dwarf_Macro is only valid during the callback
> and no information can be retrieved using the pointer after the callback
> returns.

OK, agreed.

Thanks,
PM

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]