This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.


RE: collecting data from a coring process

From: Paul Marquess <Paul dot Marquess at owmobility dot com>
To: Samuel Bronson <naesten at gmail dot com>
Cc: Dmitry Samersoff <dms at samersoff dot net>, vijay nag <vijunag at gmail dot com>, "gdb at sourceware dot org" <gdb at sourceware dot org>
Date: Wed, 7 Sep 2016 22:10:28 +0000
Subject: RE: collecting data from a coring process
References: <CY1PR0501MB11783F479AF7D639A82FE02F95EC0@CY1PR0501MB1178.namprd05.prod.outlook.com> <CAKhyrx_9GnLTBDKkhW_y4QG+f3xV_SL-Vtg0WN+vU6UXnY-qLA@mail.gmail.com> <CY1PR0501MB1178A955FBE2AAAE65655EAB95EC0@CY1PR0501MB1178.namprd05.prod.outlook.com> <87b59611-f5d1-628d-fd41-85ce6c6eb50b@samersoff.net> <CY1PR0501MB117800AACB41115C303EB9D495E60@CY1PR0501MB1178.namprd05.prod.outlook.com> <CAJYzjmefda8F9zLbr0FNXChogLMLF2TMHZATFCK+tiAirG1ahg@mail.gmail.com> <CY1PR0501MB117886690FFE9F9EE007155095E60@CY1PR0501MB1178.namprd05.prod.outlook.com> <CAJYzjmf0a2Dd8XbOQaO3937Bcab1AW9gVp=r3mKSgUq_27G8ow@mail.gmail.com> <CY1PR0501MB117850CB88D6675A3162551395F90@CY1PR0501MB1178.namprd05.prod.outlook.com> <CAJYzjmfgpaE66XKMy0v1fLRqSkahKqwZ7-YTyHZfwkekXD3oCw@mail.gmail.com>

From: Samuel Bronson [mailto:naesten@gmail.com] 

> On Tue, Sep 6, 2016 at 12:40 PM, Paul Marquess <Paul.Marquess@owmobility.com> wrote:
> > From: Samuel Bronson [mailto:naesten@gmail.com]
> >
> >
> >> On Mon, Sep 5, 2016 at 7:19 PM, Paul Marquess <Paul.Marquess@owmobility.com> wrote:
> >> > From: Samuel Bronson [mailto:naesten@gmail.com]
> >> >
> >> >> On Mon, Sep 5, 2016 at 7:09 AM, Paul Marquess <Paul.Marquess@owmobility.com> wrote:
> >> >> > From: Dmitry Samersoff [mailto:dms@samersoff.net]
> >> >> >
> >> >> >> Paul,
> >> >> >>
> >> >> >> >> 1) Why not dump the information that you are looking for 
> >> >> >> >> into a file in the process signal handler ?
> >> >> >> >
> >> >> >> > Would love to, but I have no idea what state the process is 
> >> >> >> > in once the SEGV has been triggered.
> >> >> [...]
> >> >> > I know we've had trouble with signal handlers causing problems of their own, hence my preference for having the signal handler code do as little as possible and getting all the data collection handled at arm's length by gdb.
> >> >>
> >> >> You could just spawn (and wait for) your GDB-launching script from 
> >> >> the signal handler; then, the process & stack will still be around for GDB.  I think this is even legal!
> >> >
> >> > That's one of the approaches I'm thinking of. I need to check if the fork/exec & wait use malloc.
> >>
> >> I think it should suffice for them to be "async-signal-safe"?  It 
> >> looks like signal(7) documents which functions several versions of 
> >> POSIX require to be async-signal-safe, and it looks like there are two versions of exec*() on there as well as fork() and wait().  Which is basically what I meant by "I think this is even legal!" :-).
> >
> > I agree that "async-signal-safe" is something that needs to be considered, but it isn't the only thing. I've seen plenty of cores where corruption of a data structure inside malloc itself was the trigger for the SEGV. That's why I need to be sure that any code executed in the signal handler isn't going to blow up.
> 
> Hmm.  I had not really considered that it might technically be possible to have an async-signal-safe implementation of malloc(), and was 
> therefore operating under the assumption that it was impossible for an async-signal-safe function to rely on malloc().  So, that leaves a few 
> questions:
> 
>   1. Would it actually be a problem for an async-signal-safe implementation of malloc() to be called in this scenario?
> 
>   2. Is such an implementation even possible?
> 
>   3. Are you willing to take the chance that anyone would actually ship one AND dare to use it in any of POSIX's mandated async-signal-safe functions?

This feels like it is getting into uncharted waters, so, no, I wouldn't want to take the risk of shipping something like this unless it was already mature code that’s had all the issues sorted out.

> (Also, it has come to my attention that s*printf() are actually functions which are not on the list -- somehow, the nature of their task had gotten 
> them past my radar -- so it's presumably simplest to have the helper script get the parent PID on its own, rather than passing it on the command 
> line as I had initially imagined.)

Given that I've now got a working prototype where the signal handler for SEGV ultimately just sends a USR1 signal to a parent process, I don't think I'm prepared to take the risk of getting a process that is about to core to do a fork & exec. My current approach is very simple (which is always good) and means that all complexity (and risk) is moved to the parent process.

Just need to check that kill doesn't use malloc :-)

Paul
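
For readers following the thread, here is a minimal sketch of the approach described above: a SEGV handler that does nothing except signal the parent. It is not the prototype from this message, which was not posted. The names install_segv_notifier, on_segv, on_usr1 and demo_parent_sees_crash are hypothetical, and re-raising the signal after notifying the parent is an assumption; the thread does not say how the crashing process is kept around while the parent collects data. The handler calls only kill(), getppid() and raise(), all of which are on the POSIX async-signal-safe list.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set by the parent's SIGUSR1 handler when a child reports a crash. */
static volatile sig_atomic_t child_crashed = 0;

/* Child side: runs on SIGSEGV and calls only async-signal-safe functions. */
static void on_segv(int signo)
{
    kill(getppid(), SIGUSR1);  /* hand all further work to the parent */
    /* Assumption: SA_RESETHAND has already restored the default action,
       so re-raising lets the process terminate (and core) as usual. */
    raise(signo);
}

/* Hypothetical helper: install the minimal SIGSEGV handler. */
static int install_segv_notifier(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sa.sa_flags = SA_RESETHAND;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Parent side: just record that a child reported a crash. */
static void on_usr1(int signo)
{
    (void)signo;
    child_crashed = 1;
}

/* Hypothetical demo: fork a child that "crashes"; return 1 if the parent
   received SIGUSR1 and the child was terminated by SIGSEGV. */
static int demo_parent_sees_crash(void)
{
    struct sigaction sa;
    int status = 0;
    pid_t pid;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return 0;

    pid = fork();
    if (pid < 0)
        return 0;
    if (pid == 0) {
        if (install_segv_notifier() != 0)
            _exit(2);
        raise(SIGSEGV);  /* stand-in for a real fault */
        _exit(3);        /* not reached */
    }

    /* waitpid() may be interrupted by SIGUSR1, so retry on EINTR. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return 0;
    }
    return child_crashed && WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV;
}
```

In a real setup the parent would presumably launch gdb (or the data-collection script) on receipt of SIGUSR1 rather than just setting a flag, and the child would need some way to stay alive until that finishes.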
