This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.



RE: collecting data from a coring process


From: Samuel Bronson [mailto:naesten@gmail.com] 

 
> On Mon, Sep 5, 2016 at 7:19 PM, Paul Marquess <Paul.Marquess@owmobility.com> wrote:
> > From: Samuel Bronson [mailto:naesten@gmail.com]
> >
> >> On Mon, Sep 5, 2016 at 7:09 AM, Paul Marquess <Paul.Marquess@owmobility.com> wrote:
> >> > From: Dmitry Samersoff [mailto:dms@samersoff.net]
> >> >
> >> >> Paul,
> >> >>
> >> >> >> 1) Why not dump the information that you are looking for into a 
> >> >> >> file in the process signal handler ?
> >> >> >
> >> >> > Would love to, but I have no idea what state the process is in 
> >> >> > once the SEGV has been triggered.
> >> [...]
> >> > I know we've had problems with signal handlers causing problems, thus my preference to find a way to have the signal handler code do as little as possible and get all the data collection handled at arm's length by gdb.
> >>
> >> You could just spawn (and wait for) your GDB-launching script from 
> >> the signal handler; then, the process & stack will still be around for GDB.  I think this is even legal!
> >
> > That's one of the approaches I'm thinking of. I need to check if the fork/exec & wait use malloc.
> 
> I think it should suffice for them to be "async-signal-safe"?  It looks like signal(7) documents which functions several 
> versions of POSIX require to be async-signal-safe, and it looks like there are two versions of exec*() on there as well 
> as fork() and wait().  Which is basically what I meant by "I think this is even legal!" :-).

I agree that "async-signal-safe" is something that needs to be considered, but it isn't the only thing. I've seen plenty of cores where corruption of a data structure inside malloc itself was the trigger for the SEGV. That's why I need to be sure that nothing executed in the signal handler touches state that may already be corrupt and blows up in turn.

I've had success with a toy setup that checks if the following scenario will work.

I have a Parent process that spawns a Child process. The child process contains a deliberate SEGV error.

In the Child process I get the signal handler to send USR1 to the parent process, then send SIGSTOP to itself. Once the SIGSTOP is released I get the process to exit.

The Parent process has a handler to catch the USR1 signal, which I use to trigger the execution of gdb. Once gdb is running it seems to work fine: the stack is still present and I can access data structures. Exiting gdb must send a CONT to the process, because the child process then exits normally.

Still early days, but I like this approach because it means I only need to add a small amount of code in the signal handler of the coring process.

Paul

