This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][v2] Support .debug_macro


On Fri, 2014-09-12 at 20:03 +0200, Petr Machata wrote:
> - This code is based on the following proposal:
> 	http://www.dwarfstd.org/ShowIssue.php?issue=110722.1
> 
> - dwarf_getmacros is left to support only .debug_macinfo.  The only
>   problematic opcode really is DW_MACINFO_vendor_ext, that the new
>   standard lacks, and that could in theory be misinterpreted by extant
>   code as whatever operand a vendor decides to map to 0xff.

Since the standard is not completed yet, and I don't think anybody used
vendor extensions (especially not the last one 0xff and they are
versioned anyway, version 5 is not yet really out there). You could
suggest that the standard reserves that value to enable consumers to use
the same parsers for both. Would it help if the standard says
DW_MACRO_hi_user 0xef and DW_MACRO_reserver_do_not_use 0xff?
Maybe you can suggest that on dwarf-discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
http://dir.gmane.org/gmane.comp.standards.dwarf

> - Instead, two principial iteration interfaces are provided for access
>   to new-style sections: dwarf_getmacros_die and dwarf_getmacros_off.
>   And a suite of new helper interfaces for analysis of individual
>   opcodes: dwarf_macro_version, dwarf_macro_line_offset,
>   dwarf_macro_getparamcnt, dwarf_macro_param.

Could you also include a small testcase? That could then also be used as
an example. I don't think it is immediately clear how you would use the
dual style die/off accessors. So an test/example would be helpful.

And I was wondering if having a readelf --debug-dump=decodedmacro using
this interface was useful. The raw macro dump doesn't handle transparent
includes (you don't have to implement that, but maybe it is useful to do
if you do decide to include a test/example anyway).

The usage of Dwarf_Attribute to return the param value is a nice idea.

Why you decide to a include the version and line_offset interfaces on
Dwarf_Macro? Isn't that a CU DIE concept? (See also below, maybe I just
misunderstand the data representation.)

> - The existing interfaces dwarf_macro_opcode, dwarf_macro_param1 and
>   dwarf_macro_param2 remain operational for old- as well as new-style
>   Dwarf macro sections, if applicable.

Nice.

> +static void
> +build_table (Dwarf_Macro_Op_Table *table,
> +	     Dwarf_Macro_Op_Proto op_protos[static 255])
> +{
> +  unsigned ct = 0;
> +  for (unsigned i = 1; i < 256; ++i)
> +    if (op_protos[i - 1].forms != NULL)
> +      table->table[table->opcodes[i - 1] = ct++] = op_protos[i - 1];
> +    else
> +      table->opcodes[i] = 0xff;
> +}
> +
> +#define MACRO_PROTO(NAME, ...)					\
> +  Dwarf_Macro_Op_Proto NAME = ({				\
> +      static const uint8_t proto[] = {__VA_ARGS__};		\
> +      (Dwarf_Macro_Op_Proto) {sizeof proto, proto};		\
> +    })
> +
> +static Dwarf_Macro_Op_Table *
> +init_macinfo_table (void)
> +{
> +  MACRO_PROTO (p_udata_str, DW_FORM_udata, DW_FORM_string);
> +  MACRO_PROTO (p_udata_udata, DW_FORM_udata, DW_FORM_udata);
> +  MACRO_PROTO (p_none);
> +
> +  Dwarf_Macro_Op_Proto op_protos[255] =
> +    {
> +      [DW_MACINFO_define - 1] = p_udata_str,
> +      [DW_MACINFO_undef - 1] = p_udata_str,
> +      [DW_MACINFO_vendor_ext - 1] = p_udata_str,
> +      [DW_MACINFO_start_file - 1] = p_udata_udata,
> +      [DW_MACINFO_end_file - 1] = p_none,
> +    };
> +
> +  static Dwarf_Macro_Op_Table table;
> +  memset (&table, 0, sizeof table);
> +
> +  build_table (&table, op_protos);
> +  return &table;
> +}
> +
> +static inline Dwarf_Macro_Op_Table *
> +get_macinfo_table (void)
> +{
> +  static Dwarf_Macro_Op_Table *ret = NULL;
> +  if (unlikely (ret == NULL))
> +    ret = init_macinfo_table ();
> +  return ret;
> +}

The statics here make me a little uncomfortable. What if the user is
working with multiple Dwarfs at the same time iterating through the
simultaneously (maybe to compare the macros each defines in their CUs).
Won't they overwrite each others Macro_Op_Table?

> +static Dwarf_Macro_Op_Table *
> +get_table_for_offset (Dwarf *dbg, Dwarf_Word macoff,
> +		      const unsigned char *readp,
> +		      const unsigned char *const endp)
> +{
> +  Dwarf_Macro_Op_Table fake = { .offset = macoff };
> +  Dwarf_Macro_Op_Table **found = tfind (&fake, &dbg->macro_ops,
> +					macro_op_compare);
> +  if (found != NULL)
> +    return *found;
> +
> +  const unsigned char *startp = readp;
> +
> +  /* Request at least 3 bytes for header.  */
> +  if (readp + 3 > endp)
> +    {
> +    invalid_dwarf:
> +      __libdw_seterrno (DWARF_E_INVALID_DWARF);
> +      return NULL;
> +    }
>  
> -  Elf_Data *d = die->cu->dbg->sectiondata[IDX_debug_macinfo];
> -  if (unlikely (d == NULL) || unlikely (d->d_buf == NULL))
> +  uint16_t version = read_2ubyte_unaligned_inc (dbg, readp);
> +  if (version != 4)
> +    {
> +      __libdw_seterrno (DWARF_E_INVALID_VERSION);
> +      return NULL;
> +    }
> +
> +  uint8_t flags = *readp++;
> +  bool is_64bit = (flags & 0x1) != 0;
> +
> +  Dwarf_Off line_offset = (Dwarf_Off) -1;
> +  if ((flags & 0x2) != 0)
> +    {
> +      line_offset = read_addr_unaligned_inc (is_64bit ? 8 : 4, dbg, readp);
> +      if (readp > endp)
> +	goto invalid_dwarf;
> +    }
> +
> +  /* """The macinfo entry types defined in this standard may, but
> +     might not, be described in the table""".
> +
> +     I.e. these may be present.  It's tempting to simply skip them,
> +     but it's probably more correct to tolerate that a producer tweaks
> +     the way certain opcodes are encoded, for whatever reasons.  */
> +
> +  MACRO_PROTO (p_udata_str, DW_FORM_udata, DW_FORM_string);
> +  MACRO_PROTO (p_udata_strp, DW_FORM_udata, DW_FORM_strp);
> +  MACRO_PROTO (p_udata_udata, DW_FORM_udata, DW_FORM_udata);
> +  MACRO_PROTO (p_secoffset, DW_FORM_sec_offset);
> +  MACRO_PROTO (p_none);
> +
> +  Dwarf_Macro_Op_Proto op_protos[255] =
> +    {
> +      [DW_MACRO_GNU_define - 1] = p_udata_str,
> +      [DW_MACRO_GNU_undef - 1] = p_udata_str,
> +      [DW_MACRO_GNU_define_indirect - 1] = p_udata_strp,
> +      [DW_MACRO_GNU_undef_indirect - 1] = p_udata_strp,
> +      [DW_MACRO_GNU_start_file - 1] = p_udata_udata,
> +      [DW_MACRO_GNU_end_file - 1] = p_none,
> +      [DW_MACRO_GNU_transparent_include - 1] = p_secoffset,
> +      /* N.B. DW_MACRO_undef_indirectx, DW_MACRO_define_indirectx
> +	 should be added when 130313.1 is supported.  */
> +    };
> +
> +  if ((flags & 0x4) != 0)
> +    {
> +      unsigned count = *readp++;
> +      for (unsigned i = 0; i < count; ++i)
> +	{
> +	  unsigned opcode = *readp++;
> +
> +	  Dwarf_Macro_Op_Proto e;
> +	  get_uleb128 (e.nforms, readp); // XXX checking
> +	  e.forms = readp;
> +	  op_protos[opcode] = e;
> +
> +	  readp += e.nforms;
> +	  if (readp > endp)
> +	    {
> +	      __libdw_seterrno (DWARF_E_INVALID_DWARF);
> +	      return NULL;
> +	    }
> +	}
> +    }
> +
> +  size_t ct = 0;
> +  for (unsigned i = 1; i < 256; ++i)
> +    if (op_protos[i - 1].forms != NULL)
> +      ++ct;
> +
> +  /* We support at most 0xfe opcodes defined in the table, as 0xff is
> +     a value that means that given opcode is not stored at all.  But
> +     that should be fine, as opcode 0 is not allocated.  */
> +  assert (ct < 0xff);
> +
> +  size_t macop_table_size = offsetof (Dwarf_Macro_Op_Table, table[ct]);
> +
> +  Dwarf_Macro_Op_Table *table = libdw_alloc (dbg, Dwarf_Macro_Op_Table,
> +					     macop_table_size, 1);
> +
> +  *table = (Dwarf_Macro_Op_Table) {
> +    .offset = macoff,
> +    .line_offset = line_offset,
> +    .header_len = readp - startp,
> +    .version = version,
> +    .is_64bit = is_64bit,
> +  };
> +  build_table (table, op_protos);
> +
> +  Dwarf_Macro_Op_Table **ret = tsearch (table, &dbg->macro_ops,
> +					macro_op_compare);
> +  if (unlikely (ret == NULL))
> +    {
> +      __libdw_seterrno (DWARF_E_NOMEM);
> +      return NULL;
> +    }
> +
> +  return *ret;
> +}
> +
> +static ptrdiff_t
> +read_macros (Dwarf *dbg, Dwarf_Macro_Op_Table *table, int secindex,
> +	     Dwarf_Off macoff, int (*callback) (Dwarf_Macro *, void *),
> +	     void *arg, ptrdiff_t offset)
> +{
> +  Elf_Data *d = dbg->sectiondata[secindex];
> +  if (unlikely (d == NULL || d->d_buf == NULL))
>      {
>        __libdw_seterrno (DWARF_E_NO_ENTRY);
>        return -1;
>      }
>  
> -  if (offset == 0)
> +  if (unlikely (macoff >= d->d_size))
>      {
> -      /* Get the appropriate attribute.  */
> -      Dwarf_Attribute attr;
> -      if (INTUSE(dwarf_attr) (die, DW_AT_macro_info, &attr) == NULL)
> -	return -1;
> +      __libdw_seterrno (DWARF_E_INVALID_DWARF);
> +      return -1;
> +    }
>  
> -      /* Offset into the .debug_macinfo section.  */
> -      Dwarf_Word macoff;
> -      if (INTUSE(dwarf_formudata) (&attr, &macoff) != 0)
> -	return -1;
> +  const unsigned char *const startp = d->d_buf + macoff;
> +  const unsigned char *const endp = d->d_buf + d->d_size;
>  
> -      offset = macoff;
> +  if (table == NULL)
> +    {
> +      table = get_table_for_offset (dbg, macoff, startp, endp);
> +      if (table == NULL)
> +	return -1;
>      }
> -  if (unlikely (offset > (ptrdiff_t) d->d_size))
> -    goto invalid;
>  
> -  const unsigned char *readp = d->d_buf + offset;
> -  const unsigned char *readendp = d->d_buf + d->d_size;
> +  if (offset == 0)
> +    offset = table->header_len;
>  
> -  if (readp == readendp)
> -    return 0;
> +  assert (offset >= 0);
> +  assert (offset < endp - startp);
> +  const unsigned char *readp = startp + offset;
>  
> -  while (readp < readendp)
> +  while (readp < endp)
>      {
>        unsigned int opcode = *readp++;
> -      unsigned int u128;
> -      unsigned int u128_2 = 0;
> -      const char *str = NULL;
> -      const unsigned char *endp;
> +      if (opcode == 0)
> +	/* Nothing more to do.  */
> +	return 0;
> +
> +      unsigned int idx = table->opcodes[opcode - 1];
> +      if (idx == 0xff)
> +	{
> +	  __libdw_seterrno (DWARF_E_INVALID_OPCODE);
> +	  return -1;
> +	}
> +
> +      Dwarf_Macro_Op_Proto *proto = &table->table[idx];
>  
> -      switch (opcode)
> +      /* A fake CU with bare minimum data to fool dwarf_formX into
> +	 doing the right thing with the attributes that we put
> +	 out.  */
> +      Dwarf_CU fake_cu = {
> +	.dbg = dbg,
> +	.version = 4,
> +	.offset_size = table->is_64bit ? 8 : 4,
> +      };

Do we want/need to match the version here to match the CU DIE this macro
table came from? Might be hard to get at at this point though. Maybe
match it to the Macro_Op_Table version?

> +      Dwarf_Macro macro = {
> +	.line_offset = table->line_offset,
> +	.version = table->version,
> +	.opcode = opcode,
> +	.nargs = proto->nforms,
> +	.attributes = attributes,
> +      };

Couldn't you just store the table, opcode and attributes in Dwarf_Macro?
The rest can then be retrieved from the associated table.

>  /* Call callback function for each of the macro information entry for
> -   the CU.  */
> +   the CU.  Similar in operation to dwarf_getmacros_die, but only
> +   works for .debug_macinfo style macro entries (those where
> +   dwarf_macro_version returns 0.)  */
>  extern ptrdiff_t dwarf_getmacros (Dwarf_Die *cudie,
>  				  int (*callback) (Dwarf_Macro *, void *),
> -				  void *arg, ptrdiff_t offset)
> -     __nonnull_attribute__ (2);
> +				  void *arg, ptrdiff_t token)
> +     __nonnull_attribute__ (2);
> +
> +/* Iterate through the macro section referenced by CUDIE and call
> +   CALLBACK for each macro information entry.  Keeps iterating while
> +   CALLBACK returns DWARF_CB_OK.  If the callback returns
> +   DWARF_CB_ABORT, it stops iterating and returns a continuation
> +   token, which can be used to restart the iteration at the point
> +   where it ended.  Returns -1 for errors or 0 if there are no more
> +   macro entries.
> +
> +   If MACOFFP is non-NULL, offset to macro section referenced by CUDIE
> +   is stored to *MACOFFP.  That can later be passed together with a
> +   continuation token to dwarf_getmacros_off.
> +
> +   If this is called with non-0 TOKEN (i.e. it is a continuation
> +   call), MACOFFP shall be either NULL, or point to a location that
> +   contains the same section offset as was at *MACOFFP of the call
> +   that the passed-in continuation token came from.  */
> +extern ptrdiff_t dwarf_getmacros_die (Dwarf_Die *cudie, Dwarf_Off *macoffp,
> +				      int (*callback) (Dwarf_Macro *, void *),
> +				      void *arg, ptrdiff_t token)
> +     __nonnull_attribute__ (3);
> +
> +/* This is similar in operation to dwarf_getmacros_die, but selects
> +   the section to iterate through by offset instead of by CU.  This
> +   can be used for handling DW_MACRO_GNU_transparent_include's or
> +   similar opcodes, or for continuation calls seeded with MACOFF and
> +   TOKEN that came back from dwarf_getmacros_die (or a previous call
> +   to dwarf_getmacros_off).  Note that with TOKEN of 0, this will
> +   always iterate through DW_AT_GNU_macros style section.  */
> +extern ptrdiff_t dwarf_getmacros_off (Dwarf *dbg, Dwarf_Off macoff,
> +				      int (*callback) (Dwarf_Macro *, void *),
> +				      void *arg, ptrdiff_t token)
> +  __nonnull_attribute__ (3);

Although it is common for callback based functions in libdw I would
explicitly say that the Dwarf_Macro is only valid during the callback
and no information can be retrieved using the pointer after the callback
returns.

Thanks,

Mark

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]