This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: RFC: mutexes acquired before fork() remain to be acquired by the parent process after fork()


On Thu, 2016-12-22 at 13:38 +0100, Florian Weimer wrote:
> On 12/20/2016 06:02 PM, Torvald Riegel wrote:
> > I'm looking for discussion and consensus about what happens when a
> > process is forked that has acquired mutexes.  I'm not aware of a
> > definite answer given by POSIX, and this affects robust mutexes in
> > particular.
> > I'm first looking for opinions and hopefully consensus within glibc, and
> > would then follow up with the Austin Group if necessary.
> >
> > I think the most practical choice would be one of these two requirements
> > (I prefer R1 for reasons I'll mention below):
> >
> > (R1) Any interaction of the child process with mutexes that are in an
> > acquired state when fork() was called in the parent is undefined.
> 
> Can you clarify if this refers to robust mutexes, or ordinary mutexes alone?

The most pressing need to do something about this is for robust mutexes.
For everything else, there is a status quo that is not obviously broken,
but still makes specific choices regarding ownership.

> Forking with acquired locks is quite common

Do you have examples?  Note that R1 and R2 do not forbid forking while
holding locks; even R1 just forbids touching them in the child after
forking.

> and we do it in glibc as 
> well, to support malloc after fork even if the parent process was 
> multi-threaded at the time of the fork.

What we do in glibc is an entirely different question.  This RFC is
about what we need to guarantee to the user.
I'm not aware of any uses of robust mutexes in glibc.

> > (R2) Any mutexes that are in an acquired state when fork() was called in
> > the parent remain to be locked by the parent process.
> >
> >
> > If a mutex is process-shared, it should not have two owners after fork()
> > because this is against the whole principle of exclusive-ownership
> > mutexes.
> 
> I think you refer to the effectively process-shared case here, and not 
> just mutexes with a process-shared attribute.

No, generally.  See this paragraph in my previous email:

    That leaves non-process-shared mutexes and mutexes that are of the
    process-shared kind but are not actually shared.  However, I think we
    should discard the latter distinction because it's too hard for
    implementations to efficiently track which mutexes are actually shared
    and which aren't; a process-shared kind mutex should just be assumed to
    be process-shared.  So, the remaining question is whether there is a
    need to treat process-private mutexes differently.

Do you think it's practical to figure out which regions of memory are
shared?  We'd really have to query the kernel to figure this out.

> With effectively 
> process-shared I mean that the mutex object resides on a shared mapping, 
> for which fork does not make a copy, and the mapping is shared between 
> parent and child process.
> 
> Theoretically, we could traverse the robust list, find those mutexes 
> which are on MAP_SHARED pages (using information from /proc) and build a 
> new list of robust mutexes containing only those mutexes which are not 
> effectively process-shared.  We would only have to update the list 
> pointers of mutexes which have been copied into the child, so this is 
> feasible conceptually.
> 
> But /proc might not be available at this point, or there might not be 
> enough file descriptors to open the files etc., so this approach would 
> be quite brittle.

Yes, I think it's way too much complexity to cater to what's arguably a
minor use case.
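
To make the "effectively process-shared" case concrete, here is a minimal
sketch, assuming Linux with glibc and a default (non-robust) mutex type.
The function name child_sees_parent_lock() is made up for illustration.
The mutex lives on a MAP_SHARED mapping, so fork() does not copy it and
there is only one mutex object for parent and child.  Note that the
child's trylock is exactly the kind of interaction R1 would make
undefined; the sketch only shows what current implementations tend to do.

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS */
#include <errno.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the child observed the mutex as
   busy, 0 if it did not, and -1 on a setup failure.  */
int child_sees_parent_lock(void)
{
    /* Shared anonymous mapping: fork() shares it instead of copying.  */
    pthread_mutex_t *m = mmap(NULL, sizeof *m, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;

    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    if (pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) != 0
        || pthread_mutex_init(m, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(m);         /* acquired by the parent before fork() */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: same mutex object, still held by the parent.  */
        int r = pthread_mutex_trylock(m);
        _exit(r == EBUSY ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    pthread_mutex_unlock(m);       /* the parent is still the only owner */
    pthread_mutex_destroy(m);
    munmap(m, sizeof *m);

    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

If the child were also treated as an owner, parent and child could both
unlock the single shared object, which is the two-owners situation the
RFC argues against.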

> > POSIX states in the rationale that fork() is only used to either create
> > (something like) a new thread or to call exec().  Both align well with
> > letting only the parent be the owner of a mutex.
> > It also states: "When a programmer is writing a multi-threaded program,
> > the first described use of fork(), creating new threads in the same
> > program, is provided by the pthread_create() function. The fork()
> > function is thus used only to run new programs, and the effects of
> > calling functions that require certain resources between the call to
> > fork() and the call to an exec function are undefined."  Even though
> > this ignores the possibility of acquiring mutexes in a single-threaded
> > program, it states that requiring resources (eg, attempting to lock a
> > mutex) between fork() and exec() is undefined -- which would align well
> > with R1.
> > It also explains that a forkall() idea was rejected that would have
> > "allow[ed] locks and the state to be preserved without explicit
> > pthread_atfork() code"; this is again an indication that R1 is the
> > intent or compatible with the intent.
> 
> I disagree with this conclusion.  pthread_atfork handlers typically 
> acquire locks to ensure that the parent process is in a specific state, 
> and release them in the parent and child after the fork.  With R1, this 
> common pattern is suddenly undefined, which is not what we want, I think.

So, this must be a multi-threaded parent or you'd know exactly which
state you're in.  POSIX states this in the fork() description:

    If a multi-threaded process calls fork(), the new process shall contain
    a replica of the calling thread and its entire address space, possibly
    including the states of mutexes and other resources. Consequently, to
    avoid errors, the child process may only execute async-signal-safe
    operations until such time as one of the exec functions is called.  Fork
    handlers may be established by means of the pthread_atfork() function in
    order to maintain application invariants across fork() calls.

This already says that, after a multi-threaded process forks, the child
can neither lock nor unlock a mutex: neither operation is
async-signal-safe.
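
As an illustration of the fork-then-exec use that the POSIX text has in
mind, here is a minimal sketch in which the child performs only
async-signal-safe calls (execl() and _exit()) before the new program
starts.  The function name run_child() and the use of /bin/sh with
"exit 7" are assumptions chosen for the example, not anything from the
standard.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks, execs a shell that exits with status 7,
   and returns the child's exit status, or -1 on failure.  */
int run_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: no locks are touched here, only execl() and _exit().  */
        execl("/bin/sh", "sh", "-c", "exit 7", (char *)NULL);
        _exit(127);                /* reached only if execl() failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A child written this way never interacts with mutexes that were acquired
when fork() was called, so it is unaffected by the choice between R1 and
R2.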

R1 does not require us to break programs that happen to work today.
But it makes it clear that we are allowed to break those that are
obviously not working (eg, the process-shared case).
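
For reference, the pthread_atfork() pattern from the quoted text above
can be sketched as follows.  This is only an illustration, assuming a
process-private PTHREAD_MUTEX_NORMAL mutex on glibc; all names
(state_lock, fork_with_handlers(), and so on) are made up.  The unlock in
the child handler is the step whose status R1 would leave undefined,
even though it happens to work today.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
/* Invariant protected by state_lock: balance_a + balance_b == 100.  */
static int balance_a = 50, balance_b = 50;

/* prepare: runs in the parent before fork(); freezes the state.  */
static void prepare(void)        { pthread_mutex_lock(&state_lock); }
/* parent: runs in the parent after fork(); releases the lock.  */
static void release_parent(void) { pthread_mutex_unlock(&state_lock); }
/* child: runs in the child after fork(); releases the copied lock.  */
static void release_child(void)  { pthread_mutex_unlock(&state_lock); }

static pthread_once_t handlers_once = PTHREAD_ONCE_INIT;
static void install_handlers(void)
{
    pthread_atfork(prepare, release_parent, release_child);
}

/* Hypothetical helper: returns 0 if the child could take the lock and
   saw the invariant intact, -1 otherwise.  */
int fork_with_handlers(void)
{
    pthread_once(&handlers_once, install_handlers);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the handler released the lock, so this does not block.  */
        pthread_mutex_lock(&state_lock);
        int ok = (balance_a + balance_b == 100);
        pthread_mutex_unlock(&state_lock);
        _exit(ok ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

With a robust or process-shared mutex in place of state_lock, the same
handlers would have the child unlock something that the parent owns,
which is the obviously-not-working case.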

> Maybe we could introduce a new mutex type specifically for this purpose,

Please no.  We have enough mutex types already.  We also have
semaphores, which do not tie ownership to a particular thread, and
with which you can do mutual exclusion too.
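
A minimal sketch of the semaphore alternative, assuming Linux and an
unnamed process-shared semaphore on a MAP_SHARED mapping; the function
name child_can_release() is made up.  Because a semaphore has no owner,
the child may post it even though the parent did the wait before
fork().

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS */
#include <semaphore.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the parent could re-acquire the
   semaphore after the child released it, 0 if not, -1 on failure.  */
int child_can_release(void)
{
    sem_t *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED)
        return -1;
    if (sem_init(s, 1, 1) != 0)    /* pshared = 1, initial value 1 */
        return -1;

    if (sem_wait(s) != 0)          /* "acquire" in the parent: value is 0 */
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: releasing is fine, no thread owns a semaphore.  */
        _exit(sem_post(s) == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    /* The child's post made the semaphore available again.  */
    int reacquired = (sem_trywait(s) == 0);

    sem_destroy(s);
    munmap(s, sizeof *s);
    return reacquired;
}
```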

> or declare that a PTHREAD_MUTEX_NORMAL, non-robust, non-pshared mutex is 
> an exception here.

POSIX could do that.

