This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Re: [MI] core awareness


On Monday 04 January 2010 15:11:04, Vladimir Prus wrote:
> On Thursday 31 December 2009 14:50:57 Pedro Alves wrote:

> > > > > +      /* Update the core associated with a thread when we process stop
> > > > > +        event in that thread.  */
> > > > > +      info = find_thread_ptid (ptid);
> > > > > +      if (info && info->private)
> > > > > +       info->private->core = stop_reply->core;
> > > > 
> > > > Consider all-stop, multi-threaded applications:
> > > > 
> > > >  What about the core of the other threads?
> > > >  Shouldn't they all be invalidated?
> > > >  Won't they then be stale?
> > > > 
> > > > (see comment below)
> > > 
> > > Yes, it will be stale.  However, it's an approximation only, and the only
> > > two places where the core number is used are the *stopped response (which
> > > uses the core of the stopped thread, updated above) and -list-thread-groups
> > > (which updates this information anyway).  It does not seem right to
> > > additionally update this when nothing will use it.
> > 
> > Not update, _invalidate_: tag that the core is unknown and needs fetching
> > from the target somehow (say, adding a info->private->core_p field?
> > Or setting it to info->private->core = -2?).   The invalidation could be
> > done on resume instead of on stop (e.g., linux-nat.c:resume_callback).  I
> > believe it is bad design practice to design an API and assume
> > what its callers do with it.  
> 
> I think it's perfectly valid to optimize the API for the use cases we expect.

I agree, but I'm not talking about optimization, I'm talking about
not handling valid input, or relying on undocumented conditions of
when the API can and should be used.  See below for the current
example of what I'm talking about.

> Otherwise, it's easy to introduce complexity and pessimization that is not
> necessary.

This is not a valid argument for not getting things done right in
the first place; that's only a guideline.  Certainly, considering all
valid inputs does not necessarily mean complexity and pessimization.
A bit further below, I'll try to explain what I think is incorrect
with either the new target method's remote implementation and its
assumptions, or with the documentation of the new target
hook.

> 
> > We can certainly come up with other
> > target_core_of_thread callers in the near future, and at that point,
> > we'll hit this issue.  But all that led me to the real question I have in
> > my mind: how will we update the core of the thread, when we implement what
> > I just described?  Will we make remote_core_of_thread update all
> > threads, or come up with a special packet to get at the core of
> > a thread?  Probably the former is okay (albeit, slow, although
> > done only at most once), but if not, I'd rather consider the
> > packet upfront, than leave a hole in the design --- if we
> > considered such packet, the case for qXfer:threads would get
> > smaller (it becomes only justifiable due to supposedly fewer
> > packet roundtrips to list all threads and their extra info).
> 
> For a start, I certainly find that a single packet that returns all information
> about threads -- which a frontend might be requesting on each step -- is a good
> idea regardless of "core awareness". As for invalidation, I think it's reasonable
> to fetch information about all threads, just because the time to get that information
> is probably less than communication latency for a single packet.

Right, agreed on that then.  Good that it's considered.

> 
> So, for avoidance of doubt, how about this scheme:
> - core information of each thread is invalidated when that thread is resumed
>   (where resumption is reported by the 'target_resumed' observer)
> - remote_core_of_thread checks if it's asked for the core of the most recently
>   stopped thread.  If so, it reports it.  Otherwise, it refreshes the thread
>   list, and then reports the core it has fetched
> - this all is for all-stop only. For non-stop, we get stop packet of each stopped
>   thread and therefore remote_core_of_thread never needs to fetch information
>   about all threads.
> 
> If this sounds OK, I'll implement such a scheme.

Sounds good, although I still don't see why you need to
special-case the last-reported thread.  One concern about non-stop
and running threads, though.  Looking at the documentation of the
new target interface makes it clearer what I'm talking about:

    /* Return the core that thread PTID is on.  For a stopped thread, should 
       return the core the thread was last running on.  For a running thread, 
                                                        ^^^^^^^^^^^^^^^^^^^^^
       should return one of the cores that the thread was running between
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       the call to this function and return -- and if it was running on
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       several cores, any other may be returned.  
       If the core cannot be determined -- either for the specified thread, or
       right now, or in this debug session, or for this target -- return -1.  */

   int (*to_core_of_thread) (struct target_ops *, ptid_t ptid);


Specifically, the underlined bit.  If we look at the current 
remote.c implementation of that method,

 static int
 remote_core_of_thread (struct target_ops *ops, ptid_t ptid)
 {
   struct thread_info *info = find_thread_ptid (ptid);
   if (info && info->private)
     return info->private->core;
   else if (ptid_equal (ptid, thread_of_last_core))
     return last_core;
   return -1;
 }

we see that there's nothing here that considers the fact
that `info->private->core' gets stale real quick, and does
not actually represent "one of the cores that the thread
was running between the call to this function and return".
This is relying on the callers of target_core_of_thread
having refreshed the thread list, but there's
no reason a native target would need to update the
thread list to query the core of a single thread
when it has direct access to that info.

So, we either update the documentation of
target_core_of_thread's assumptions to match reality,
which feels a bit hackish given that it is a remote.c
related assumption, or we change the remote
implementation to refetch the data if the thread is
running, which has the problem that with the
current API, for every running thread, you'd re-update
the whole thread list (and this is what made me
wonder before how you think the protocol
should handle updating the core data of a
single thread).

> Well, while I presumably could have split /proc/cpuinfo on empty lines,
> and somehow print that in 'info os cores', I am unsure if we can promise
> any fields to be present in the output, given that man:proc does not
> mention any field except for "processor" and "bogomips".  So, all that
> "info os cores" can meaningfully print is the numeric ids of cores,
> and this does not add very much to the information already available
> to the frontend.

E.g., if all threads in a system are pinned to / running on only one
of multiple cores, the frontend does not have any way to know
about all the cores on the target system.  No way to show the
user 40 threads on core 1, and 0 on core 2.  It simply has no
info about core 2.  That's fine, I was only wondering if you
had considered it; we can add this at any time.

> > The function inits the buffer, finishes the buffer and appends \0 to it, so
> > I'm confused by that comment.
> 
> The function does not init or finish BUFFER. It does init/finish a second buffer, but 
> that buffer is only used internally. Am I missing something?

No.  I was.

> > Could you please add an empty new line between function describing
> > comment and function definition (here and elsewhere)?  We're trying
> > to follow that style everywhere in gdb.
> 
> Is that for function definitions only? Or also for function declarations
> and variable declarations/definitions?

AFAIK, we've only been trying to standardize that on function
definitions.

> The primary problem is that when we process a stop reply
> for a new thread, the thread is not created and added to gdb
> thread table yet, so it's not possible to set its info->private.
> I don't see any way around this. 

The remote_notice_new_inferior calls always return
with the possibly new thread added to the list already.  From
the patched code:

remote_threads_info:
...
		      remote_notice_new_inferior (item->ptid, running);

		      info = find_thread_ptid (item->ptid);
		      if (info)
			{
			  if (!info->private) {
			    info->private = (struct private_thread_info *)
			      xmalloc (sizeof (struct private_thread_info));
			    info->private_dtor = free_private_thread_info;
			  }

			  info->private->extra = item->extra;
			  item->extra = 0;
			  info->private->core = item->core;
			}


process_stop_reply:
...
      /* Update the core associated with a thread when we process stop
	 event in that thread.  */
      info = find_thread_ptid (ptid);
      if (info && info->private)
	{
	  info->private->core = stop_reply->core;
	}
      else
	{
	  last_core = stop_reply->core;
	  thread_of_last_core = ptid;
	}

      remote_notice_new_inferior (ptid, 0);

You can, for example, pass the core to remote_notice_new_inferior,
and have remote_notice_new_inferior itself update the core field.
All the remote_notice_new_inferior calls other than the
one from within record_currthread have the new `core' handy,
and, I think the remote_notice_new_inferior call within
record_currthread is now always a no-op, since the only case that
mattered (remote_wait_as) already calls remote_notice_new_inferior
from within process_stop_reply when handling 'T', etc. packets,
so it could be removed.  Can you confirm this?
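In other words, something along these lines (a gdb-internal sketch,
not compilable standalone, and the exact signature is of course up to
you):

```c
/* Hypothetical variant: receive the core from the caller and record
   it once the thread is known to exist in the thread list.  */
static void
remote_notice_new_inferior (ptid_t currthread, int running, int core)
{
  /* ... existing logic that adds the thread if needed ...  */

  struct thread_info *info = find_thread_ptid (currthread);
  if (info != NULL && info->private != NULL)
    info->private->core = core;
}
```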

> We can, in theory, make 
> remote_core_of_thread request the list of all threads even when
> called for a thread that has just stopped, but that sounds wasteful.

That would indeed be wasteful.

> > I think info->private->extra is leaking when threads are deleted,
> > because gdb/threads.c simply xfree's ->private.  I guess you're the
> > lucky first to need a destructor for thread private data.
> > nto-procfs.c was close, but avoids it by using a poor man's flexible
> > array member (nto-tdep.h). 
> 
> Done.

Thanks.  Only a couple of small comments to the patch itself:

> -several threads in the list.
> +several threads in the list.  The @var{core} field reports the
> +processor core on which the stop   event has happened.  This field may be absent
> +if such information is not available.

s/stop   event/stop event/


Formatting here is wrong:

> +                         if (!info->private) {
> +                           info->private = (struct private_thread_info *)
> +                             xmalloc (sizeof (struct private_thread_info));
> +                           info->private_dtor = free_private_thread_info;
> +                         }


(If you write that as `info->private = xmalloc (sizeof (*info->private))',
it's a shorter line.)


Also:

$ quilt refresh
Warning: trailing whitespace in line 118 of gdb/thread.c
Warning: trailing whitespace in lines 602,603,606 of gdb/target.h
Refreshed patch core3.diff

-- 
Pedro Alves

