This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]
Re: [RFA]: LOC_COMPUTED + abstracted dwarf2 evaluator, again

From: Daniel Berlin <dberlin at dberlin dot org>
To: Jim Blandy <jimb at redhat dot com>
Cc: gdb-patches at sources dot redhat dot com
Date: Mon, 8 Jul 2002 19:24:17 -0400 (EDT)
Subject: Re: [RFA]: LOC_COMPUTED + abstracted dwarf2 evaluator, again
On 8 Jul 2002, Jim Blandy wrote:

> 
> Daniel Berlin <dberlin@dberlin.org> writes:
> > For yer convenience,  I rediffed it against the current gdb sources.
> 
> Thanks!
> 
> > I'll update the changelog dates if it ever gets reviewed or
> > approved.
> 
> I do appreciate your re-posting the patch.  I have been trying to be
> more prompt with reviews in the last few weeks, and not fall behind.
> 
> ChangeLog entries like this don't work:
> 
> > 	(Simple cases of *.c containing LOC_): Handle LOC_COMPUTED and
> > 	LOC_COMPUTED_ARG. 
> 
> Nobody reads a ChangeLog from beginning to end; they search it.  So
> even when the list of files and functions is a dozen times longer than
> the actual ChangeLog entry itself, it's best to actually list all the
> files explicitly, with no `*-tdep.c', `mips-{tdep,nat}.c', at all.
> It takes time and it's boring, but I think it's a good rule.

Okeydokey.

> 
> > Slightly updated accompanying text (from the april submission):
> > "
> > The only real things left to do is give some *real* description in 
> > locexpr_describe_location
> 
> Like something that prints out the expression?  Yeah, that would be
> nice, but it can get involved if you've got branches.  Maybe you can
> steal code from readelf; that prints location expressions.
I know, but the problem is pretty printing
> 
> > add tracepoint support
> 
> In general, you're looking at a Dwarf -> Agent Expression compiler.
> Yahoo.  I don't know if anyone's even using the tracepoint stuff; it
> seems okay to me to leave this unsupported for now.  (Although it
> seems to me that tracepoints would be The Thing for debugging the
> kernel.  Just write a kernel module that logs stuff to a buffer
> visible via /proc, write a custom gdbserver, and away you go,
> debugging page fault handlers.)
> 
> > and remove the dwarf2cfi.c evaluator .  None of these should be
> > blockers to committing it or reviewing it.
> 
> No, I agree.
> 
> > It would also be nice to use location expressions in *all* cases 
> > (currently, it only does it for variables) so you can get rid of 
> > decode_loc_desc, but i'll leave that for the next patch.
> 
> Makes sense.
> 
> > There are a few lines that are too long (<10, i *think*), sorry about 
> > that, i couldn't see an easy way to break them up and have them make 
> > sense.
> 
> Well, this does need to be corrected before you commit.  Come on, the
> rest of us all do it, I think you can handle the agony.  :)

It's not that I'm not going to do it, it's that i haven't figured out how 
to make it look nice.

You end up with it (gdb_indent) wanting to indent it like:


bob (fred, 5 *
     f (5 + 9 * 
        8, 7),
        5)

Or something equally silly (IE it makes it very hard to parse)

> 
> > Splitting this patch into loc_computed addition, and dwarf2 implementation 
> > of loc_computed, doesn't make it much smaller (loc_computed_addition is 
> > maybe 5% of the patch), so if size is a concern, sorry, but it's pretty 
> > much unsplittable beyond that (the biggest portion is the addition of the 
> > abstracted evaluator, but the dwarf2 implementation doesn't work without 
> > it).
> 
> It's not so much that a patch is too large or too small, it's the
> additional brain load from having to figure out which change each hunk
> belongs to.  I^HSome of us have limited resources to work with.
> 
> 
> The patch misses some cases:
> 
> - I think ch-exp.c:ch_lex and m2-exp.y:yylex need some additional
>   `case's for LOC_COMPUTED{,_ARG}.  I grant you, Chill is dead, but
>   they're easy tweaks.

Sure.

> - print_frame_args needs a case for LOC_COMPUTED_ARG.
> - print_block_frame_locals needs a case for LOC_COMPUTED.
> - print_frame_arg_vars needs a case for LOC_COMPUTED_ARG.

I could have sworn I had done these.
In fact, I know i did.

I must have had made some changes after posting it in april that were lost 
when i moved and a disk crashed (IE the gdb-patches posting was the only 
copy i had).

I'll go through it again and make sure nothing else got lost.
 > 
> > Index: symmisc.c
> > ===================================================================
> > RCS file: /cvs/src/src/gdb/symmisc.c,v
> > retrieving revision 1.9
> > diff -c -3 -p -w -B -b -r1.9 symmisc.c
> > *** symmisc.c	15 May 2002 21:19:21 -0000	1.9
> > --- symmisc.c	8 Jul 2002 16:52:40 -0000
> > *************** print_partial_symbols (struct partial_sy
> > *** 862,867 ****
> > --- 862,871 ----
> >   	case LOC_OPTIMIZED_OUT:
> >   	  fputs_filtered ("optimized out", outfile);
> >   	  break;
> > + 	case LOC_COMPUTED:
> > + 	case LOC_COMPUTED_ARG:
> > + 	  fputs_filtered ("computed at runtime", outfile);
> > + 	  break;
> >   	default:
> >   	  fputs_filtered ("<invalid location>", outfile);
> >   	  break;
> 
> I guess we can't do anything more detailed here because we only have
> partial symbol table information --- the address class, but not the
> function group or the baton.  That's a pity, but I don't see how to do
> better.

The only time it's printed is if you maint print psymbols, happily, and 
if you are doing that, you've got worse problems :).


> > + /* This is a struct location function, this one returns 1 if we need a
> > +    frame to read the value of the variable. */
> > + static int 
> > + locexpr_read_needs_frame (void *baton)
> > + {
> > +   return 1;
> > + }
> 
> This needs to be smarter, in light of this code in find_var_value:
We can't know if it's got a constant address without looking through the 
location codes.
But since the expression evaluator is opaque, *it* is what looks at the 
actual location expressions, we (the reader) treat them as magic blocks of 
data.
:)
The simple workaround is to have a function in the same file as the 
evaluator that tells us if a location contains a register reference, since 
it has access to the data in question.


> 
> > +      case LOC_COMPUTED:
> > +      case LOC_COMPUTED_ARG:
> > +        {
> > +  	struct location_funcs *funcs = SYMBOL_LOCATION_FUNCS (var);
> > +  	void *baton = SYMBOL_LOCATION_BATON (var);
> > +  	
> > +  	if (frame == 0 && (funcs->read_needs_frame)(baton))
> > +  	  return 0;
> > +  	return (funcs->read_variable) (baton, frame);
> > +  	
> > +        }
> > +        break;
> 
> So read_var_value will return zero whenever it's given a
> LOC_COMPUTED{,_ARG} symbol and zero for a frame.  But read_var_value
> gets zero for the frame whenever you try to print a global or static
> variable when the program isn't running, even though the lack of a
> frame is harmless.  (value_struct_elt_for_reference seems to pass a
> frame of zero, too, but I'm not sure in what contexts that occurs.)
Wheee.


> 
> > !     LOC_INDIRECT,
> > !     
> > !     /* The variable address is computed by a set of location
> > !        functions.  */
> > !     LOC_COMPUTED,
> > !     
> > !     /* The variable is an argument, and it's address is computed by a
> > !        set of location functions.  */
> > !     LOC_COMPUTED_ARG
> > ! 
> > !   };
> 
> These comments should explicitly say how one computes the address ---
> or at least refer to the comment above `struct location_funcs', just
> below.
OKeydokey.

> 
> > *************** struct symbol
> > *** 660,665 ****
> > --- 712,719 ----
> >         {
> >   	/* Used by LOC_BASEREG and LOC_BASEREG_ARG.  */
> >   	short basereg;
> > + 	/* Location baton to pass to location functions.  */
> > + 	void *locbaton;
> >         }
> >       aux_value;
> >   
> > *************** struct symbol
> > *** 671,676 ****
> > --- 725,734 ----
> >       /* List of ranges where this symbol is active.  This is only
> >          used by alias symbols at the current time.  */
> >       struct range_list *ranges;
> > +     
> > +     /* Location functions for LOC_COMPUTED and LOC_COMPUTED_ARGS.  */
> > +     struct location_funcs *locfuncs;
> > +     
> >     };
> 
> I think both locbaton and locfuncs should be in the union (as a
> struct, of course).

Okey.
> > > *************** read_func_scope (struct die_info *die, s > 
> *** 1886,1891 ****
> > --- 1912,1931 ----
> >   
> >     new = push_context (0, lowpc);
> >     new->name = new_symbol (die, die->type, objfile, cu_header);
> > + 
> > +   /* If it has a location expression for fbreg, record it.  
> > +      Since not all symbols use location expressions exclusively yet,
> > +      we still need to decode it above. */
> > +   if (attr)
> > +     {
> > +       baton = obstack_alloc (&objfile->symbol_obstack, sizeof (struct locexpr_baton));
> > +       baton->sym = new->name;
> > +       baton->obj = objfile;
> > +       memcpy (&baton->locexpr, DW_BLOCK (attr), sizeof (struct dwarf_block));
> > +       SYMBOL_LOCATION_FUNCS (new->name) = &locexpr_funcs;
> > +       SYMBOL_LOCATION_BATON (new->name) = baton;
> > +     }
> > + 
> >     list_in_scope = &local_symbols;
> >   
> >     if (die->has_children)
> 
> Whoa, this is problematic.
> 
> - Big issue: you're assigning a meaning to the aux_value field of a
>   function's symbol, where previously it had none.

So?

The aux value field is there because "Some symbols require an additional 
value to be recorded on a per-symbol basis. Stash those values here".

Function symbols require an additional value to be recorded on a per 
symbol basis.
So I stashed it there.

Seemed to make sense at the time.

If you've got a better idea, i'm happy to implement it.
>   This at the very
>   least needs to be documented --- I almost didn't notice the issue,
>   since it isn't mentioned in the comments for `struct symbol'.
Oversight on my part.

>   And
>   while we certainly need some way to find the fbreg expression, I'm
>   sure that taking over generic fields of function symbols for a
>   rather Dwarf-specific optimization is not the right way to do this.

It's actually neither a generic field, nor dwarf specific, 
nor an optimization.

The field is per-symbol specific, and has no specific purpose given, other 
than to be a place to stash values.

It's not dwarf specific. Any time we need some function level 
value that is computed at runtime, to compute something *else* at runtime, 
we are gonna have to store the baton somewhere.  It is, 
in the debug info, attached to the function symbols, so it seemed to make 
sense that it should stay attached to the function symbol in gdb.
But, as I said, if you've got a better idea, i'm happy to implement it.
It's no skin off my back, and i'm not religious about where the fbreg is 
stored.
:)



> 
>   I think it would be better if all the batons for a function's
>   locals, etc. had a pointer to a shared thingy for the fbreg
>   expression. 

If you like, but IMHO, it just makes the code more complex, and less 
understandable to debug. It makes less sense from a perspective of someone 
trying to 
understand what's going on, because the pointer would only be different in 
different functions anyway.  And if they looked at the debug info, they 
would see the expression associated only with the function, and only 
appear once.

>    That means the Dwarf reader has to build this new fbreg
>   structure when it enters the function, and just leave a pointer to
>   it lying around for all the new batons to use, which is a bit icky.
>   But that's basically what we do now, with the `frame_base_reg' and
>   `frame_base_offset' global variables.

If you like, i'll do this, but i still don't see it as a better solution.  
It just seems to make it more complex, all for the added benefit of 
every reader with these types of expressions having to redo this type of 
thing.

> - Small issue: DW_BLOCK points into the .dwarf_info buffer, right?

Yup.

>   Are there any other cases where the symbol table points into
>   .dwarf_info?  The names get copied for partial symbols and full
>   symbols.  I think we need to copy the whole location expression into
>   the objfile's obstack.  Similarly for the DW_TAG_variable case.

Okeydokey.

Some year, when bfd supports mmap, ...

> 
> > +   struct dwarf_expr_baton  *debaton = (struct dwarf_expr_baton *)baton;
> 
> (a `debaton' being the fundamental particle of argument...  :) )

dwarf-expr-baton
:)

> 
> > + /* Read a register for the dwarf2 expression evaluator.  */
> > + CORE_ADDR dwarf_expr_read_reg (void *baton, int regnum)
> > + {
> > +   CORE_ADDR result;
> > +   struct dwarf_expr_baton  *debaton = (struct dwarf_expr_baton *)baton;
> > +   char * buf = (char *) alloca (MAX_REGISTER_RAW_SIZE);
> > +   get_saved_register (buf, NULL, NULL, debaton->frame, regnum, NULL);
> > +   result = extract_address  (buf, REGISTER_RAW_SIZE (regnum));
> > +   return result;
> > + }
> 
> Would `read_register' work here?

No, because we read_register gives the register for the *current* 
frame, and the frame *we want* is associated with a given evaluator baton. 
(debaton->frame).
So the answer is "for reading variables, debaton->frame is, AFAIK, always 
going to be the same as the frame read_register would use, but from an 
abstraction perspective, we shouldn't use this fact since it's using magic knowledge 
of how we happened to initalize the  evaluator context below, which is 
bad." 
IIRC, the whole reason for this at all is that it's necessary to be able 
to reuse the evaluator in dwarf2cfi.

> 
> Your post doesn't include dwarf2expr.[ch].

Whoops.
I forgot I didn't readd it with cvs add, so diff didn't diff it.

> 
> If your evaluator is based on the one in dwarf2cfi.c, did you
> incorporate Mark's recent fix?
It's not based on dwarf2cfi.

The routine that reads memory (remember, it's abstracted, as you 
requested) for dwarf2read's usage of it (location expressions) is in 
dwarf_expr_read_mem, which is simply:


/* Read memory for the dwarf2 expression evaluator.  */
CORE_ADDR dwarf_expr_read_mem (void *baton, CORE_ADDR addr, size_t len)
{
  return (CORE_ADDR) read_memory_unsigned_integer (addr, len);
}
Follow-Ups:
- Re: [RFA]: LOC_COMPUTED + abstracted dwarf2 evaluator, again
  - From: Jim Blandy
References:
- Re: [RFA]: LOC_COMPUTED + abstracted dwarf2 evaluator, again
  - From: Jim Blandy
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]