
Re: [PATCH] Linux kernel thread runtime support


Hi Peter,

I finally managed to send my patches. I would appreciate it if you
could have a look at them. Any feedback is welcome.


On Tue, 10 Jan 2017 15:55:56 +0000
Peter Griffin <peter.griffin@linaro.org> wrote:

[...]

> > > void
> > > lkthread_get_per_cpu_offsets (int numcores)
> > > {
> > >   enum bfd_endian byte_order = gdbarch_byte_order (target_gdbarch ());
> > >   int length
> > >     = TYPE_LENGTH (builtin_type (target_gdbarch ())->builtin_data_ptr);
> > >   CORE_ADDR curr_addr = ADDR (__per_cpu_offset);
> > >   int core;
> > > 
> > > 
> > >   if (!HAS_ADDR (__per_cpu_offset))
> > >     {
> > >       if (debug_linuxkthread_threads)
> > >         fprintf_unfiltered (gdb_stdlog, "Assuming non-SMP kernel.\n");
> > > 
> > >       return;
> > >     }
> > > 
> > >   for (core = 0; core < numcores; core++)
> > 
> > This iteration does not work.  As CPUs can be added and removed,
> > the logical CPU ids needn't be contiguous, e.g. with two CPUs the
> > ids can be 0 and 2.  You have to walk the cpu_online_mask to see
> > which CPUs are in use at the moment.  The iteration appears several
> > times in the code.
> 
> Good point, that is something I had not considered! Will fix this
> in V2, although the code makes this assumption in quite a few
> places :(
> 
> To get this working nicely I think GDB will either need a
> delete_thread() API which also decrements the thread number, or
> something like the alternative below.  As physical CPUs get added and
> removed in this debug use case, we will need to add/remove a varying
> number of swapper threads, and we don't wish for the GDB thread
> numbering to be 'ever increasing' throughout the debug session.

I don't see why it should be a problem to simply add additional
swapper tasks.  I don't expect too many CPUs to be added or removed,
and the kernel tries to recycle the CPU ids when possible.  Thus I'd
prefer this method ...

> Alternatively (maybe better) I could create dummy swapper entries
> for every 'possible CPU' of the target; if some CPUs are offline,
> these will remain dummy swapper threads.  This has the benefit that
> the numbering of all the Linux threads which are added after the
> swappers stays the same in GDB regardless of which CPUs are brought
> online / offline.

... over your alternative.  At least for S390 the max number of
possible CPUs is 256, which would add quite an overhead...
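
To illustrate what I meant by walking cpu_online_mask, here is a
rough sketch (only a sketch: the helper name is mine, and the mask
symbol is called __cpu_online_mask since kernel v4.5 and
cpu_online_bits in older kernels, which I gloss over here):

  /* Call FUNC for each CPU whose bit is set in the kernel's online
     cpumask, read from target memory.  */
  static void
  lk_for_each_online_cpu (void (*func) (int cpu))
  {
    struct symbol *sym
      = lookup_symbol ("__cpu_online_mask", NULL, VAR_DOMAIN, NULL).symbol;
    int nbytes, bit;
    gdb_byte *mask;

    if (sym == NULL)
      {
        func (0);	/* Non-SMP kernel: only CPU 0 exists.  */
        return;
      }

    nbytes = TYPE_LENGTH (SYMBOL_TYPE (sym));
    mask = (gdb_byte *) alloca (nbytes);
    read_memory (SYMBOL_VALUE_ADDRESS (sym), mask, nbytes);

    /* Kernel cpumasks are arrays of unsigned long; the byte-wise view
       below is only correct for little-endian targets and ignores the
       word swizzling a big-endian target would need.  */
    for (bit = 0; bit < nbytes * 8; bit++)
      if (mask[bit / 8] & (1 << (bit % 8)))
        func (bit);
  }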

> > >     {
> > >       if (!lkthread_h->per_cpu_offset[core])
> > > 	lkthread_h->per_cpu_offset[core]
> > > 	  = read_memory_unsigned_integer (curr_addr, length, byte_order);
> > > 
> > >       curr_addr += (CORE_ADDR) length;
> > > 
> > >       if (!lkthread_h->per_cpu_offset[core])
> > > 	{
> > > 	  warning ("Suspicious null per-cpu offsets,"
> > > 		   " or wrong number of detected cores:\n"
> > > 		   "ADDR (__per_cpu_offset) = %s\nmax_cores = %d",
> > > 		   phex (ADDR (__per_cpu_offset), 4), max_cores);
> > > 
> > > 	  break;
> > > 	}
> > >     }
> > > 
> > >   if (debug_linuxkthread_threads)
> > >     fprintf_unfiltered (gdb_stdlog, "SMP kernel. %d cores detected\n",
> > > 			numcores);
> > > }
> > 
> > [...]
> >   
> > > static void
> > > lkthread_init (void)
> > > {
> > >   struct thread_info *th = NULL;
> > >   struct cleanup *cleanup;
> > >   int size
> > >     = TYPE_LENGTH (builtin_type (target_gdbarch ())->builtin_unsigned_long);
> > > 
> > >   /* Ensure thread list from beneath target is up to date.  */
> > >   cleanup = make_cleanup_restore_integer (&print_thread_events);
> > >   print_thread_events = 0;
> > >   update_thread_list ();
> > >   do_cleanups (cleanup);
> > > 
> > >   /* Count the h/w threads.  */
> > >   max_cores = thread_count ();  
> > 
> > The number of CPUs does not have to be constant. Especially when you
> > are running in a VM you can add/remove CPUs. Thus max_cores has to
> > be determined dynamically or set to the max number of CPUs, e.g. by
> > taking the size (in bits) of cpu_online_mask.  
> 
> Good point, will fix in V2, most likely by taking the max number of
> possible CPUs from cpu_online_mask and adding dummy swappers for
> each (as suggested above).
> 
> Although I'm wondering what different GDB remote stubs do in this
> situation when a CPU has been taken offline. In Qemu at least it is
> still reported by the GDB remote stub, and we can still get a
> backtrace from it even when offline.
> 
> I suspect that with an OpenOCD setup, when a CPU is powered off,
> OpenOCD will lose access to the debug unit and the SWD/JTAG
> connection will drop (certainly my experience debugging with OpenOCD
> in the past with cpuidle enabled in the kernel).

To me the Qemu behavior sounds like a bug; Qemu shouldn't report any
offline CPU.  The OpenOCD behavior sounds much more reasonable.
Anyway, your code shouldn't need to access this data as it doesn't
contain any information you need.  However, I haven't worked with the
remote target and gdbserver yet, so I don't know their exact
behavior.
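
For determining the bound dynamically, the size of the mask should be
all you need.  A sketch (again only a sketch; lk_max_possible_cpus is
my naming, not something from the patch):

  /* Upper bound for logical CPU ids, derived from the size of the
     kernel's online cpumask.  struct cpumask is a bitmap with one bit
     per possible CPU.  */
  static int
  lk_max_possible_cpus (void)
  {
    struct symbol *sym
      = lookup_symbol ("__cpu_online_mask", NULL, VAR_DOMAIN, NULL).symbol;

    if (sym == NULL)
      return 1;		/* Non-SMP kernel.  */

    return TYPE_LENGTH (SYMBOL_TYPE (sym)) * 8;
  }

This would also give you a safe size for the per_cpu_offset array.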

> > >   gdb_assert (max_cores);
> > > 
> > >   if (debug_linuxkthread_threads)
> > >     {
> > >       fprintf_unfiltered (gdb_stdlog,
> > > 			  "lkthread_init() cores(%d) GDB HW threads\n",
> > > 			  max_cores);
> > >       iterate_over_threads (thread_print_info, NULL);
> > >     }
> > > 
> > >   /* Allocate per cpu data.  */
> > >   lkthread_alloc_percpu_data(max_cores);  
> > 
> > Together with the comment above, allocating the percpu_data to a
> > fixed size can cause problems when the number of CPUs is
> > increased.  
> 
> Will fix in v2.
> 
> >   
> > >   lkthread_get_per_cpu_offsets(max_cores);
> > > 
> > >   if (!lkthread_get_runqueues_addr () && (max_cores > 1))
> > >     fprintf_unfiltered (gdb_stdlog, "Could not find the address of CPU"
> > > 			" runqueues: current context information may be"
> > > 			" less precise.\n");
> > > 
> > >   /* Invalidate the linux-kthread cached list.  */
> > >   lkthread_invalidate_threadlist ();
> > > }  
> > 
> > [...]
> >   
> > > static linux_kthread_info_t **
> > > lkthread_get_threadlist_helper (linux_kthread_info_t ** ps)
> > > {
> > >   struct linux_kthread_arch_ops *arch_ops =
> > >     gdbarch_linux_kthread_ops (target_gdbarch ());
> > >   CORE_ADDR rq_idle_taskstruct;
> > >   CORE_ADDR g, t, init_task_addr;
> > >   int core = 0, i;
> > > 
> > >   if (debug_linuxkthread_threads)
> > >     fprintf_unfiltered (gdb_stdlog, "lkthread_get_threadlist_helper\n");
> > > 
> > >   init_task_addr = ADDR (init_task);
> > >   g = init_task_addr;
> > > 
> > >   do
> > >     {
> > >       t = g;
> > >       do
> > >         {
> > > 	  if (!arch_ops->is_kernel_address (t))
> > > 	    {
> > > 	      warning ("parsing of task list stopped because of invalid"
> > > 		       " address %s", phex (t, 4));
> > > 	      break;
> > > 	    }
> > > 
> > >           lkthread_get_task_info (t, ps, core /*zero-based */ );
> > >           core = CORE_INVAL;
> > > 
> > >           if (ptid_get_tid (PTID_OF (*ps)) == 0)
> > >             {
> > >               /* This is init_task, let's insert the other cores'
> > >                  swappers now.  */
> > >               for (i = 1; i < max_cores; i++)
> > >                 {
> > >                   ps = &((*ps)->next);
> > >                   rq_idle_taskstruct = lkthread_get_rq_idle (i);
> > >                   lkthread_get_task_info (rq_idle_taskstruct, ps, i);
> > >                 }
> > >             }  
> > 
> > How does the setting of the core work?  It looks to me like core !=
> > CORE_INVAL only for init_task and the swapper tasks.  Is that the
> > intended behavior?
> 
> Not sure if I have understood your question correctly or not but...
> 
> It does look like a bug there, though, as these two lines
> 
> > >           lkthread_get_task_info (t, ps, core /*zero-based */ );
> > >           core = CORE_INVAL;  
> 
> should be swapped; rather than adding threads with core=0, they
> should be added with core=CORE_INVAL.

Well, not all threads are added with core=0, just init_task is...
The only other place where core is used is at the beginning of the
function, where it is defined and initialized to zero.  Thus after
the first iteration (with init_task) core will never change to a
value other than CORE_INVAL again.  If you swap those two lines you
can get rid of core altogether and simply pass CORE_INVAL to
lkthread_get_task_info.
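
For illustration, with the two lines swapped the body of the inner
loop would reduce to something like this (a sketch against the code
quoted above, not tested):

  lkthread_get_task_info (t, ps, CORE_INVAL);

  if (ptid_get_tid (PTID_OF (*ps)) == 0)
    {
      /* This is init_task; insert the other cores' swappers now,
         bound to their respective CPU ids.  */
      for (i = 1; i < max_cores; i++)
        {
          ps = &((*ps)->next);
          rq_idle_taskstruct = lkthread_get_rq_idle (i);
          lkthread_get_task_info (rq_idle_taskstruct, ps, i);
        }
    }

The per-CPU binding of the running tasks would then be established
later via lkthread_get_running(), as you describe below.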
 
> The user is informed which thread is running on which physical CPU
> core by the linux_kthread_extra_thread_info() callback.  This calls
> lkthread_is_curr_task() for each GDB thread, which calls
> lkthread_get_running(), which updates 'current->core'.
> 
> If a thread is currently executing on a CPU then
> linux_kthread_extra_thread_info() adds a 'C<num>' suffix to the
> thread name, so it is clear to the user when issuing an
> 'info threads' command.
> 
> e.g.
> 
> ^C[New getty]
> 
> Thread 1 received signal SIGINT, Interrupt.
> cpu_v7_do_idle ()
> at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/arch/arm/mm/proc-v7.S:75
> 75		ret	lr
> (gdb) info threads
>   Id   Target Id         Frame 
> * 1    [swapper/0] (pid: 0 tgid: 0 <C0>) cpu_v7_do_idle ()
>                                    ^^^^
> 
> at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/arch/arm/mm/proc-v7.S:75
>   2    init (pid: 1 tgid: 1) context_switch (cookie=...,
> next=<optimized out>, prev=<optimized out>, rq=<optimized out>)
> at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/kernel/sched/core.c:2902
>   3    [kthreadd] (pid: 2 tgid: 2) context_switch (cookie=...,
> next=<optimized out>, prev=<optimized out>, rq=<optimized out>)
> at /home/griffinp/Software/gdb/lkd/lkd/sources/linux/kernel/sched/core.c:2902
> 
> When the target halted, CPU0 was executing in swapper.
> 
> Did that answer your question?

Thanks, that helped.  Although I still don't fully understand how the
initialization works; I find lkthread_get_task_info quite confusing.
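
From your description I picture the callback roughly like the
following; please correct me if I got it wrong (the lookup helper and
the field names are guesses on my part, not the patch code):

  static char *
  linux_kthread_extra_thread_info (struct target_ops *target,
				   struct thread_info *thread)
  {
    static char msg[64];
    /* Guessed name: map the GDB thread back to its task struct.  */
    linux_kthread_info_t *ps = lkthread_get_by_ptid (thread->ptid);

    snprintf (msg, sizeof (msg), "pid: %li tgid: %i",
	      ptid_get_lwp (PTID_OF (ps)), ps->tgid);

    /* lkthread_is_curr_task() refreshes ps->core via
       lkthread_get_running(); the marker is appended only for tasks
       executing right now.  */
    if (lkthread_is_curr_task (ps))
      snprintf (msg + strlen (msg), sizeof (msg) - strlen (msg),
		" <C%i>", ps->core);

    return msg;
  }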

Thanks for your answer
Philipp

> > > 	  if (debug_linuxkthread_threads)
> > > 	    fprintf_unfiltered (gdb_stdlog, "Got task info for %s (%li)\n",
> > > 				(*ps)->comm, ptid_get_lwp (PTID_OF (*ps)));
> > > 
> > >           ps = &((*ps)->next);
> > > 
> > > 	  /* Mark the end of the chain and remove those threads that
> > > 	     disappeared from the thread_list, to avoid
> > > 	     any_thread_of_process() selecting a ghost.  */
> > >           if (*ps)
> > >             (*ps)->valid = 0;
> > > 
> > >           t = _next_thread (t);
> > >         } while (t && (t != g));
> > > 
> > >       g = _next_task (g);
> > >     } while (g && (g != init_task_addr));  
> > 
> > Although pretty unlikely, the do {} while can end up in an endless
> > loop when a memory corruption compromises one of the list_heads.
> > Thus there should be more sanity checks than just the one for NULL.
> > I use list_head->next->prev == list_head.
> 
> Ok, thanks for the suggestion, will fix in v2.
> 
> regards,
> 
> Peter.
> 
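
For reference, the check I use looks roughly like this (a sketch:
lk_read_addr and the offset constants stand in for whatever accessors
the patch ends up with):

  /* Return nonzero if HEAD looks like an intact list_head, i.e.
     HEAD->next->prev points back to HEAD.  This catches the common
     corruptions before the list walk can loop forever.  */
  static int
  lk_list_head_ok (CORE_ADDR head)
  {
    CORE_ADDR next = lk_read_addr (head + lk_list_head_next_offset);
    CORE_ADDR prev = lk_read_addr (next + lk_list_head_prev_offset);

    return prev == head;
  }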

