This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 1/3 v2] Fast tracepoint for powerpc64le


Wei-cheng Wang wrote:

> >> +static void
> >> +ppc64_emit_reg (int reg)
> >> +{
> >> +  unsigned char buf[10 * 4];
> >> +  unsigned char *p = buf;
> >> +
> >> +  p += GEN_LD (p, 3, 31, bc_framesz - 32);
> >> +  p += GEN_LD (p, 3, 3, 48);	/* offsetof (fast_tracepoint_ctx, regs) */
> >
> > This seems a bit fragile, it would be better to determine the offset
> > automatically ...   (I don't quite understand why the x86 code works
> > either, as it is right now ...)
>
>Hi Predro,
>
>I checked the implementation of x86 emit_reg and it seems the implementation
>is wrong.  It assumes the first argument is ctx->regs, but it's actually 'ctx'
>
>if (tpoint->compiled_cond)
>   err = ((condfn) (uintptr_t) (tpoint->compiled_cond)) (ctx, &value);
>
>I think probably either we could pass ctx->regs to compiled_cond instead,
>or move the declarations of fast_tracepoint_ctx (and others) to tracepoint.h,
>so we can use "offsetof (fast_tracepoint_ctx, regs)" instead.
>Any suggestion?

FWIW, passing the regs buffer directly to the compiled routine seems
more straightforward to me ...

>The following are specific to PowerPC.
>
> >> +  if ((imm >> 8) == 0)
> >> +    {
> >> +      /* li	reg, imm[7:0] */
> >> +      p += GEN_LI (p, reg, imm);
> > Actually, you can load values up to 32767 with a single LI.
>
>How about this fix?  So we can load a small negative number with LI.
>
>if ((imm + 32768) < 65536)
>   {
>     /* li     reg, imm[7:0] */
>     p += GEN_LI (p, reg, imm);
>   }

That looks good to me.

> >> p += gen_atomic_xchg (p, lockaddr, 0, 1);
> >> /* Call to collector.  */
> >> p += gen_call (p, collector);
> >> p += gen_atomic_xchg (p, lockaddr, 1, 0);
> > This seems wrong.  Shouldn't *lockaddr be set to the address
> > of a collecting_t object, and not just "1"?
>
>AFAIK, lockaddr only matters to the two lines above,
>so simply put '1' for LOCKED should be fine.  Or am I missing anything?

Yes, the lockaddr is used from the gdbserver side.  See the uses of
ipa_sym_addrs.addr_collecting in tracepoint.c; it is expected that
this value is either NULL or else points to a collecting_t object
that can be read/written by gdbserver.

> >> +  /* Now, insert the original instruction to execute in the jump pad.  */
> >> +  *adjusted_insn_addr = buildaddr + (p - buf);
> >> +  *adjusted_insn_addr_end = *adjusted_insn_addr;
> >> +  relocate_instruction (adjusted_insn_addr_end, tpaddr);
> > Hmm.  This calls back to GDB to perform the relocation of the
> > original instruction.  On PowerPC, there are only a handful of
> > instructions that need to be relocated; I'm not sure it is really
> > necessary to call back to GDB.  Can't those just be relocated
> > directly here?   This might even make the code simpler overall.
>
>I just follow the design of Predo.
>I could move the code to gdbserver side if you suggest.

I think if there is no benefit in having this code on the GDB side,
it would be better to move it to gdbsever.  (Potential benefits could
be: we need information that isn't available in gdbserver, like symbol
data; or the code can be shared with other users if in GDB.  But on
PowerPC, I think none of this applies.)

> > Note that the stack frame layout as above only applies to ELFv1, but
> > you're actually only supporting ELFv2 at the moment.  For ELFv2, there
> > is no parameter save area (for this specific call), there is no compiler
> > or linker doubleword, and the TOC save area is at SP + 24.  (So this
> > location probably shouldn't be used to save something else ...)
> > This is also a bit bigger than required for ELFv2.  On the other hand,
> > having a larger buffer doesn't hurt.
>
>Oops, I have to admit I looked into the wrong ABI document.
>Hopefully we will support ppc64be soon, so I suggest still use 112-byte for
>minimual frame size for simplicity?

Yes, as I said, just having a larger frame is OK.  However, there are
other places where the ABI matters (like not overwriting the TOC save
area, or like respecting the (lack of) stack red zone ...).

> >> +/* Return true if ADDR is a valid address for tracepoint.  Set *ISZIE
> >> +   to the number of bytes the target should copy elsewhere for the
> >> +   tracepoint.  */
> >> +
> >> +static int
> >> +ppc_fast_tracepoint_valid_at (struct gdbarch *gdbarch,
> >> +			      CORE_ADDR addr, int *isize, char **msg)
> >> +{
> >> +  if (isize)
> >> +    *isize = gdbarch_max_insn_length (gdbarch);
> >> +  if (msg)
> >> +    *msg = NULL;
> >> +  return 1;
> >> +}
> >
> > Should/can we check here where the jump to the jump pad will be in
> > range?  Might be better to detect this early ...
>
>Client has no idea about where the jump pad will be installed.
>If it's out of range, gdbserver will report it right after user
>entered 'tstart' command

Well, but we know the logic the stub uses.  For example, we know that
we certainly cannot install a fast tracepoint in any shared library code,
since the jump pad will definitely be too far away.  We can check for
this condition here.  (We could also check for tracepoints in executables
that have a text section larger than 32 MB ...)

> > Well, we really should handle the other cases too; there's no reason
> > to simply fail if this happens to be a branch on count or such ...
>
>In the new patch, I will handle other cases as such,
>
>   bdnz   .Lgoto
>1:INSN1
>
>is transform to
>
>   bdz
>   b      .Lgoto
>1:INSN1
>
>   and
>
>   bdnzt  eq, .Lgoto
>
>is transfrom to
>
>   bdz    1f (+12)
>   bf     eq, 1f (+8)
>   b      .Lgoto
>1:INSN1
>
>Is it right?

That looks OK, thanks.

> >> +      frame.frameless = 0;
> >> +      frame.lr_offset = 16;
> >
> > This isn't correct for ppc32 ...
>
>frame.lr_offset = 4;
>Right?

Correct.

> > In any case, this code makes many assumptions that may not always be
> > true.  But then again, the same is true for the i386 case, so maybe this
> > is OK for now ...   In general, if we have DWARF CFI for the function,
> > it would be much preferable to refer to that in order to determine the
> > exact stack layout.
>
>This function will only be used if user want to collect return address
>(using 'collect $_ret' action command)
>
>I agree that we should use DWARF CFI, but I have no idea how can I
>get the information in this function.  Any suggestion where I can look into?

This is probably going to require major changes, so it looks like something
for a separate project.  In general, using DWARF CFI to unwind a frame is
currently done in dwarf2-frame.c, with the bulk of the work done in
dwarf2_frame_cache and dwarf2_frame_prev_register.  This conceptually
involves two stages: looking up the FDE from the current PC (which can
be done using just the executable file) to determine the "recipe" how
to unwind at this location, and then implement the recipe, which involves
using current register values and memory content.  Unfortunately, those
two stages are somewhat interwoven and not cleanly separated in the
current implemention, which makes it not directly usable for tracepoints.

What we should be doing is to actually implement a clean separation,
so that the work to determine the unwind "recipe" can be shared between
live unwinding and tracepoints, and then implement a tracepoint-specific
step that uses the recipe to generate a series of agent commands that
will implement "running" the recipe in the context of the agent.  This
has already been done for the use of DWARF that describes variable
locations (see locexpr_tracepoint_var_ref in dwarf2loc.c), but not yet
for the use of DWARF to unwind.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  Ulrich.Weigand@de.ibm.com


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]