This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: RFC: Deadlock in multithreaded application when doing IO andfork.


On 01/31/2013 07:05 PM, Roland McGrath wrote:
> I think it certainly makes sense for the malloc atfork handlers always to
> run last.  Otherwise a user atfork handler can produce deadlock with the
> user's own code in other threads in ways that don't really seem to be under
> the user's control.  For example:
> 
> T1					T2
> 					takes user lock L
> fork					
> 	malloc atfork takes malloc locks
> 	user atfork blocks locking L
> 					calls malloc
> 						blocks on malloc locks
> 
> In fact, it's pretty easy to set this up so it always deadlocks, not even
> needing a race (e.g. T2 creates T1 after locking L and then uses a
> pthread_barrier to wait for T2 to enter its atfork handler before T2 calls
> malloc).  That seems like a good test case to write, since I think you can
> write one like this and pretty easily see that it is POSIX-compliant.

Agreed. I wouldn't dream of submitting a fix for this without also having
a reliable test case.

> With user atfork handlers that call malloc themselves, it can probably get
> even weirder.  (Calling malloc in an atfork handler seems like a bad idea
> all around, but AFAICT it is kosher under POSIX.)

Agreed. We do support calling malloc from atfork handlers with some special
code in malloc, which we might be able remove if we just force malloc to 
run last and ensure we don't allocate anything before we unlock the arenas.

> It's less clear whether that's really sufficient for all kosher scenarios.
> If I understood you correctly, the scenario you cited is one that in the
> best case would lead to a crash.  That is, the user has provoked undefined
> behavior.  In that case, it's as kosher to deadlock as it is to crash
> coherently with malloc assertions, albeit much less useful.  So we don't
> have a hard mandate to avoid those deadlock cases.  Hence it's a tradeoff
> of difficulty, maintenance burden, and performance hits vs being extra nice
> in helping people diagnose their own bugs.  Similarly, setting malloc hooks
> is something that really requires knowing about subtleties and internal
> implementation issues already and probably always will, so putting the onus
> on people who write their own malloc hooks (and especially people who think
> that using malloc hooks in a multithreaded program is something they should
> be doing) is fine.

You are correct, that the fix I propose opens up a deadlock in what is
effectively undefined behaviour e.g. you're on your way to abort().

I would like the *default* implementation to try and avoid the deadlock,
and I would like the documentation of the malloc hooks to provide
sufficient explanation of the situation that you could write your own
malloc hook that operates correctly under multi-threaded fork.

I feel like providing a trivial example that works is our responsibility.

> I'm not all that clear on the details of the further mitigations you
> suggest after the atfork change.  I think the right things to do are
> (in this order):

Thanks, I'll break it up like this.

> 1. write the aforementioned test and verify it always deadlocks
> 2. fix that test by making the malloc atfork handlers always run last

The fix actually requires two things:
(a) Run all atfork prepare handlers.
(b) Lock list_all_lock to prevent more IO.
(c) After (b) run malloc atfork handler e.g. run handler last.

Note that (a) can be rolled into the start of malloc's atfork
handler. Though at that point it's not just a malloc atfork handler,
it's realy a libc generic atfork handler that is doing the work of
shutting down IO *and* memory allocations in order to ensure
consistency after the fork.

> 3. commit those
> 4. reconsider remaining undesireable scenarios in that context

Right, which is "Default malloc implementation must avoid IO in
undefined scenario or deadlock." Which is a "nice to users"
situation.

Cheers,
Carlos.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]