This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: [PATCH] Linux kernel thread runtime support


Hi Philipp,

On Thu, 12 Jan 2017, Philipp Rudo wrote:

> Hi Peter
> 
> I finally managed to send my patches. I would appreciate it if you
> could have a look at them. Any feedback is welcome.

OK great, I will take a look at them. Out of interest, can you post a
git tree link so I can easily pull your patches and have a play around?

> 
> 
> On Tue, 10 Jan 2017 15:55:56 +0000
> Peter Griffin <peter.griffin@linaro.org> wrote:
> 
> [...]
> 
> > > > void
> > > > lkthread_get_per_cpu_offsets (int numcores)
> > > > {
> > > >   enum bfd_endian byte_order = gdbarch_byte_order (target_gdbarch ());
> > > >   int length = TYPE_LENGTH (builtin_type (target_gdbarch ())->builtin_data_ptr);
> > > >   CORE_ADDR curr_addr = ADDR (__per_cpu_offset);
> > > >   int core;
> > > > 
> > > >   if (!HAS_ADDR (__per_cpu_offset))
> > > >     {
> > > >       if (debug_linuxkthread_threads)
> > > > 	fprintf_unfiltered (gdb_stdlog, "Assuming non-SMP kernel.\n");
> > > > 
> > > >       return;
> > > >     }
> > > > 
> > > >   for (core = 0; core < numcores; core++)  
> > > 
> > > This iteration does not work. As CPUs can be added/removed, the
> > > logical CPU ids needn't be contiguous, e.g. with two CPUs the ids
> > > can be 0 and 2. You have to walk cpu_online_mask to see which
> > > CPUs are in use at the moment. The same iteration appears several
> > > times in the code.  
> > 
> > Good point, that is something I had not considered! Will fix this
> > in V2, although the code makes this assumption in quite a few
> > places :(
> > 
> > To get this working nicely I think GDB will need a delete_thread()
> > API which also decrements the thread number. With physical CPUs
> > being added / removed in this debug usecase we will need to
> > add/remove a varying number of swapper threads, and we don't want
> > the GDB thread numbering to be 'ever increasing' in such a
> > situation throughout the debug session.
> 
> I don't see why it should be a problem to simply add additional
> swapper tasks. I don't expect too many CPUs to be added/removed, and
> the kernel tries to recycle the CPU ids when possible. Thus I'd
> prefer this method ...

The problem comes with preserving the thread numbering throughout the debug
session. You probably won't have hit this problem if you only enumerate
the thread list once, as in the core dump use case.

Essentially, at the moment linux-kthread adds all the swappers first, and
then the Linux threads, with the (incorrect) assumption that the number of
physical CPUs stays the same throughout the debug session.

This is done to preserve the thread numbering of the Linux threads when the
user issues 'info threads'.

For example:

(gdb) info threads
  Id   Target Id         Frame 
* 1    [swapper/0] (pid: 0 tgid: 0 <C0>) cpu_v7_do_idle ()
    at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/arch/arm/mm/proc-v7.S:75
  2    init (pid: 1 tgid: 1) context_switch (cookie=..., next=<optimized out>, prev=<optimized out>, 
    rq=<optimized out>) at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/kernel/sched/core.c:2902
  3    [kthreadd] (pid: 2 tgid: 2) context_switch (cookie=..., next=<optimized out>, prev=<optimized out>, 
    rq=<optimized out>) at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/kernel/sched/core.c:2902
<snip>
  92   getty (pid: 2243 tgid: 2243) context_switch (cookie=..., next=<optimized out>, prev=<optimized out>, 
    rq=<optimized out>) at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/kernel/sched/core.c:2902

Now if we continue the target and halt it again sometime later, when 2 CPUs
are online, we need 2 swapper threads, which if added first would alter the
thread numbering of all the Linux threads.

e.g.

(gdb) info threads
  Id   Target Id         Frame 
* 1    [swapper/0] (pid: 0 tgid: 0 <C0>) cpu_v7_do_idle ()
    at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/arch/arm/mm/proc-v7.S:75
  2    [swapper/1] (pid: 0 tgid: 0 <C1>) cpu_v7_do_idle ()
    at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/arch/arm/mm/proc-v7.S:75
  3    init (pid: 1 tgid: 1) context_switch (cookie=..., next=<optimized out>, prev=<optimized out>, 
    rq=<optimized out>) at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/kernel/sched/core.c:2902
  4    [kthreadd] (pid: 2 tgid: 2) context_switch (cookie=..., next=<optimized out>, prev=<optimized out>, 
    rq=<optimized out>) at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/kernel/sched/core.c:2902
<snip>
  93   getty (pid: 2243 tgid: 2243) context_switch (cookie=..., next=<optimized out>, prev=<optimized out>, 
    rq=<optimized out>) at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/kernel/sched/core.c:2902

Here 'getty' is now thread 93 instead of thread 92 as before. I think having
the thread numbering move around like that would be confusing for the user,
which is where the suggestion of adding all possible swappers came from.

Alternatively we could always add the swappers *after* adding all the Linux
threads, which would mean the Linux thread numbering would remain constant,
but the swapper thread numbers would jump around as CPUs come online /
offline and more Linux threads are created over time.

I think adding the swappers *after* the Linux threads would be preferable. Either
approach requires a change in core GDB thread numbering logic, which currently
only ever 'increases' (the thread number is not decremented when a thread is
deleted).
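
Roughly what I have in mind for V2 is something like the sketch below.
This is completely untested; it assumes lkthread_get_threadlist_helper()
no longer inserts the swappers itself, and lkthread_cpu_is_online() is a
hypothetical helper that would test the CPU's bit in cpu_online_mask read
from target memory:

static void
lkthread_rebuild_threadlist (linux_kthread_info_t **ps)
{
  int cpu;

  /* Add the Linux threads first so their GDB thread numbers stay
     stable across halts.  */
  lkthread_get_threadlist_helper (ps);

  /* Then append one swapper per online CPU; these thread numbers may
     change as CPUs come and go.  */
  for (cpu = 0; cpu < max_cores; cpu++)
    if (lkthread_cpu_is_online (cpu))
      {
	ps = &((*ps)->next);
	lkthread_get_task_info (lkthread_get_rq_idle (cpu), ps, cpu);
      }
}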

> > Alternatively (maybe better) I could create dummy swapper entries for
> > every 'possible CPU', of the target, if some CPU's are offline these
> > will remain dummy swapper threads. This has the benefit that thread
> > numbering of all the Linux threads which are added after the swappers
> > will keep the same thread numbering in GDB regardless of what CPU's
> > are brought online / offline.
> 
> ... over your alternative. At least for S390 the max number of
> possible CPUs is 256, which would add quite a lot of overhead ...

Eeek! OK, so that isn't really a feasible option then ;)
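
For determining the max number of possible CPUs itself, deriving it from
the size of the mask symbol, as you suggest below, seems workable. An
untested sketch; the symbol name varies between kernel versions, and the
non-SMP fallback is a guess:

static int
lkthread_get_max_possible_cpus (void)
{
  /* Recent kernels name the online bitmap '__cpu_online_mask', a
     struct cpumask sized NR_CPUS bits; older kernels use different
     names, so this would need per-version handling.  */
  struct block_symbol bsym
    = lookup_symbol ("__cpu_online_mask", NULL, VAR_DOMAIN, NULL);

  if (bsym.symbol == NULL)
    return 1;	/* Assume non-SMP if the symbol isn't there.  */

  /* One bit per possible CPU.  */
  return TYPE_LENGTH (SYMBOL_TYPE (bsym.symbol)) * 8;
}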

> 
> > > >     {
> > > >       if (!lkthread_h->per_cpu_offset[core])
> > > > 	lkthread_h->per_cpu_offset[core] =
> > > > 	  read_memory_unsigned_integer (curr_addr, length, byte_order);
> > > > 
> > > >       curr_addr += (CORE_ADDR) length;
> > > > 
> > > >       if (!lkthread_h->per_cpu_offset[core])
> > > > 	{
> > > > 	  warning ("Suspicious null per-cpu offsets,"
> > > > 		   " or wrong number of detected cores:\n"
> > > > 		   "ADDR (__per_cpu_offset) = %s\nmax_cores = %d",
> > > > 		   phex (ADDR (__per_cpu_offset), 4), max_cores);
> > > > 
> > > > 	  break;
> > > > 	}
> > > >     }
> > > > 
> > > >   if (debug_linuxkthread_threads)
> > > >     fprintf_unfiltered (gdb_stdlog, "SMP kernel. %d cores detected\n",
> > > > 			numcores);
> > > > }  
> > > 
> > > [...]
> > >   
> > > > static void
> > > > lkthread_init (void)
> > > > {
> > > >   struct thread_info *th = NULL;
> > > >   struct cleanup *cleanup;
> > > >   int size =
> > > >     TYPE_LENGTH (builtin_type (target_gdbarch ())->builtin_unsigned_long);
> > > > 
> > > >   /* Ensure thread list from beneath target is up to date.  */
> > > >   cleanup = make_cleanup_restore_integer (&print_thread_events);
> > > >   print_thread_events = 0;
> > > >   update_thread_list ();
> > > >   do_cleanups (cleanup);
> > > > 
> > > >   /* Count the h/w threads.  */
> > > >   max_cores = thread_count ();  
> > > 
> > > The number of CPUs does not have to be constant. Especially when you
> > > are running in a VM you can add/remove CPUs. Thus max_cores has to
> > > be determined dynamically or set to the max number of CPUs, e.g. by
> > > taking the size (in bits) of cpu_online_mask.  
> > 
> > Good point, will fix in V2, most likely by taking the max possible
> > CPUs from cpu_online_mask, and adding dummy swappers for each (as
> > suggested above).
> > 
> > Although I'm wondering what different GDB remote stubs do in this
> > situation when a CPU has been taken offline. In Qemu at least it is
> > still reported by the GDB remote stub, and we can still get a
> > backtrace from it even when offline.
> > 
> > I suspect that with an OpenOCD setup, when a CPU is powered off
> > OpenOCD will lose access to the debug unit and the SWD/JTAG
> > connection will drop (certainly my experience debugging with
> > OpenOCD in the past with cpuidle enabled in the kernel).
> 
> For me the Qemu behavior sounds like a bug. Qemu shouldn't report
> any offline CPU. The OpenOCD behavior sounds much more reasonable.
> Anyway, your code shouldn't need to access this data as it doesn't
> contain any information you need. However, I haven't worked with the
> remote target and gdbserver yet and don't know their exact behavior.
> 
> > > >   gdb_assert (max_cores);
> > > > 
> > > >   if (debug_linuxkthread_threads)
> > > >     {
> > > >       fprintf_unfiltered (gdb_stdlog, "lkthread_init() cores(%d) GDB"
> > > > 			  " HW threads\n", max_cores);
> > > >       iterate_over_threads (thread_print_info, NULL);
> > > >     }
> > > > 
> > > >   /* Allocate per cpu data.  */
> > > >   lkthread_alloc_percpu_data (max_cores);  
> > > 
> > > Together with the comment above, allocating the percpu_data to a
> > > fixed size can cause problems when the number of CPUs is
> > > increased.  
> > 
> > Will fix in v2.
> > 
> > >   
> > > >   lkthread_get_per_cpu_offsets (max_cores);
> > > > 
> > > >   if (!lkthread_get_runqueues_addr () && (max_cores > 1))
> > > >     fprintf_unfiltered (gdb_stdlog, "Could not find the address of CPU"
> > > > 			" runqueues: current context information may be "
> > > > 			"less precise.\n");
> > > > 
> > > >   /* Invalidate the linux-kthread cached list.  */
> > > >   lkthread_invalidate_threadlist ();
> > > > }  
> > > 
> > > [...]
> > >   
> > > > static linux_kthread_info_t **
> > > > lkthread_get_threadlist_helper (linux_kthread_info_t **ps)
> > > > {
> > > >   struct linux_kthread_arch_ops *arch_ops =
> > > >     gdbarch_linux_kthread_ops (target_gdbarch ());
> > > >   CORE_ADDR rq_idle_taskstruct;
> > > >   CORE_ADDR g, t, init_task_addr;
> > > >   int core = 0, i;
> > > > 
> > > >   if (debug_linuxkthread_threads)
> > > >     fprintf_unfiltered (gdb_stdlog, "lkthread_get_threadlist_helper\n");
> > > > 
> > > >   init_task_addr = ADDR (init_task);
> > > >   g = init_task_addr;
> > > > 
> > > >   do
> > > >     {
> > > >       t = g;
> > > >       do
> > > >         {
> > > > 	  if (!arch_ops->is_kernel_address (t))
> > > > 	    {
> > > > 	      warning ("parsing of task list stopped because of invalid address"
> > > > 		       " %s", phex (t, 4));
> > > >               break;
> > > > 	    }
> > > > 
> > > >           lkthread_get_task_info (t, ps, core /* zero-based */ );
> > > >           core = CORE_INVAL;
> > > > 
> > > >           if (ptid_get_tid (PTID_OF (*ps)) == 0)
> > > >             {
> > > >               /* This is init_task; insert the other cores' swappers now.  */
> > > >               for (i = 1; i < max_cores; i++)
> > > >                 {
> > > >                   ps = &((*ps)->next);
> > > >                   rq_idle_taskstruct = lkthread_get_rq_idle (i);
> > > >                   lkthread_get_task_info (rq_idle_taskstruct, ps, i);
> > > >                 }
> > > >             }  
> > > 
> > > How does the setting of the core work? To me it looks like core
> > > != CORE_INVAL only holds for init_task and the swapper tasks. Is
> > > that the intended behavior?  
> > 
> > Not sure if I have understood your question correctly or not but...
> > 
> > It does look like there is a bug there though, as these two lines
> > 
> > > >           lkthread_get_task_info (t, ps, core /*zero-based */ );
> > > >           core = CORE_INVAL;  
> > 
> > should be swapped: rather than adding all threads with core=0, they
> > should be added with core=CORE_INVAL.
> 
> Well, not all threads are added with core=0, just init_task is...
> The only other place where core is used is at the beginning of the
> function, where it is defined and initialized to zero. Thus after the
> first loop iteration (with init_task) core never changes to a value
> other than CORE_INVAL again. If you swap those two lines you can get
> rid of core altogether and simply pass CORE_INVAL to
> lkthread_get_task_info.
>  
> > The user is informed which thread is running on which physical CPU
> > core by the linux_kthread_extra_thread_info() callback. This calls
> > lkthread_is_curr_task() for each GDB thread, which calls
> > lkthread_get_running(), which updates 'current->core'.
> > 
> > If a thread is currently executing on a CPU then
> > linux_kthread_extra_thread_info() adds a 'C<num>' suffix to the
> > thread name, so it is clear to the user when issuing an
> > 'info threads' command.
> > 
> > e.g.
> > 
> > ^C[New getty]
> > 
> > Thread 1 received signal SIGINT, Interrupt.
> > cpu_v7_do_idle ()
> >     at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/arch/arm/mm/proc-v7.S:75
> > 75		ret	lr
> > (gdb) info threads
> >   Id   Target Id         Frame 
> > * 1    [swapper/0] (pid: 0 tgid: 0 <C0>) cpu_v7_do_idle ()
> >                                    ^^^^
> >     at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/arch/arm/mm/proc-v7.S:75
> >   2    init (pid: 1 tgid: 1) context_switch (cookie=..., next=<optimized out>, 
> >     prev=<optimized out>, rq=<optimized out>)
> >     at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/kernel/sched/core.c:2902
> >   3    [kthreadd] (pid: 2 tgid: 2) context_switch (cookie=..., next=<optimized out>, 
> >     prev=<optimized out>, rq=<optimized out>)
> >     at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/kernel/sched/core.c:2902
> > 
> > When the target halted, CPU0 was executing in the swapper.
> > 
> > Did that answer your question?
> 
> Thanks, that helped. Although I still don't fully understand how the
> initialization works. I find lkthread_get_task_info quite confusing.

I agree that function could do with being simplified / rewritten so that it
is easier to follow.
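
As a first step, per your suggestion above, the core variable could go away
entirely. Something like this (untested):

          /* All tasks walked here get no CPU binding up front; the
             running CPU is worked out later via lkthread_get_running().  */
          lkthread_get_task_info (t, ps, CORE_INVAL);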

regards,

Peter.

