This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [RFC] Toward Shareable POSIX Signals


On 03/09/2018 12:25 PM, Rich Felker wrote:
On Fri, Mar 09, 2018 at 02:30:51PM -0500, Zack Weinberg wrote:
"Just use glib" is of course fundamentally unacceptable. But the
obvious solution is "just use threads" and I don't see why that's not
acceptable. The cost of a thread is minuscule compared to the cost of
a child process, and threads performing synchronous waitpid can
convert the result into whatever type of notification (poll wakeup,
cond var, synchronous handling, etc.) you like.

The main problem I see with this idea is, a thread waiting for _any_
process can steal the event from a thread waiting for a specific
process; this makes it nonviable for any situation where you don't

I never proposed using a thread that calls wait or waitpid with a
negative argument, rather one thread per child.

Understood.

As long as there is no
rogue thread in the program doing wait-any, the thread-per-child
approach lets you emulate pdfork pretty well; programs written around
this model can use pdfork as a drop-in replacement and eliminate the
cost of the thread.
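
For readers who want the thread-per-child idea spelled out, here is a minimal C sketch, not anything posted in this thread. It assumes a POSIX system with pthreads; the names child_watch, child_watch_thread and spawn_watched are hypothetical. Each child gets one detached thread that blocks in waitpid on that specific pid and then writes the raw wait status to a pipe, so the caller can poll the read end much as it would a pdfork-style descriptor.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper type: one watcher per child process. */
struct child_watch {
    pid_t pid;      /* the one child this thread waits for */
    int notify_fd;  /* write end of a pipe; the caller polls the read end */
};

/* Waits for exactly one pid (never "any child"), then publishes the status. */
static void *child_watch_thread(void *arg)
{
    struct child_watch *w = arg;
    int status = 0;
    pid_t r;

    do {
        r = waitpid(w->pid, &status, 0);
    } while (r == -1 && errno == EINTR);
    if (r == -1)
        status = -1;  /* sentinel, e.g. ECHILD: another waiter reaped the child first */

    /* The reader side wakes up from poll()/read() when this write lands. */
    if (write(w->notify_fd, &status, sizeof status) == -1) {
        /* Nothing useful to do about it in this sketch. */
    }
    close(w->notify_fd);
    free(w);
    return NULL;
}

/*
 * Fork and exec `path`, returning a pollable fd that yields the child's raw
 * wait status (one int) once the child exits. Returns -1 on failure.
 */
int spawn_watched(const char *path, char *const argv[], pid_t *pid_out)
{
    int fds[2];
    pthread_t tid;
    struct child_watch *w = malloc(sizeof *w);

    if (w == NULL)
        return -1;
    if (pipe(fds) == -1) {
        free(w);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        free(w);
        return -1;
    }
    if (pid == 0) {
        /* Child: drop both pipe ends so only the watcher holds the writer. */
        close(fds[0]);
        close(fds[1]);
        execv(path, argv);
        _exit(127);  /* exec failed */
    }

    w->pid = pid;
    w->notify_fd = fds[1];
    if (pid_out != NULL)
        *pid_out = pid;

    if (pthread_create(&tid, NULL, child_watch_thread, w) != 0) {
        /* The child is still running; the caller must reap *pid_out itself. */
        close(fds[0]);
        close(fds[1]);
        free(w);
        return -1;
    }
    pthread_detach(tid);
    return fds[0];
}
```

The sketch only holds up under the condition stated above: if any other thread in the process calls wait or waitpid(-1, ...), the watcher can lose the event and will report the sentinel instead. It also spends one thread per live child.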

My contention is that a thread per child process is infeasible from a resource POV and that major subsystem authors will never adopt this approach.

Let's be realistic here: lots of systems behave badly due to the inadequacies of the wait API. If a better alternative doesn't appear, these systems are going to continue behaving badly.

On Fri, Mar 09, 2018 at 05:58:51PM +0100, Florian Weimer wrote:
But [threads] only works for asynchronous signals.  It's reasonable
for an application to want to catch synchronous signals (SIGBUS
when dealing with file mappings, SIGFPE for arithmetic), and there
is currently no thread-safe or library-safe way at all to do that.

Yes, as I noted each use case needs to be considered separately to
determine if there's some other better/more-portable/whatnot way it
could be done already. The above applies only to SIGCHLD.

FWIW I'm rather skeptical of many of the usage cases for synchronous
signals (most are dangerous papering-over of UB for dubious
performance reasons; never-taken "test reg,reg;jz" takes essentially 0
cycles on a modern uarch) but SIGBUS makes it hard to use mmap safely
to begin with. So there's still a lot of material to consider here.

If I remember correctly, GCJ tried to use signal handlers to generate
NullPointerExceptions not for speed reasons, but for code-size and
exception-precision reasons.  But it was never 100% reliable and it
might have been better to go with "test reg,reg;jz" + lean harder on
proving pointers couldn't be null.

This is my view. Null checks/proofs should be maximally hoisted and
explicitly emitted in the output rather than relying on traps.
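
As a rough C illustration of the explicit-check position (a sketch only; sum_values is a hypothetical function, not taken from GCJ or any runtime): the null test is emitted once, ahead of the loop, so the loop body needs neither a per-iteration check nor a trap handler.

```c
#include <stddef.h>

/* Sums `count` ints into *total_out; returns 0 on success, -1 on a null argument. */
int sum_values(const int *values, size_t count, long *total_out)
{
    /* Hoisted check: one test-and-branch before the loop, never taken on the hot path. */
    if (values == NULL || total_out == NULL)
        return -1;

    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];  /* no check needed here; non-null was established above */

    *total_out = total;
    return 0;
}
```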

Every major managed code runtime team disagrees with you.

It's not productive for low-level infrastructure maintainers to claim that a universal practice is somehow illegitimate. This attitude is not going to convince people doing the supposedly illegitimate thing to stop doing it, but it will block progress that leads to improvement of the system as a whole.

That's the only case I'm personally familiar with where a serious
application tried to _recover from_ synchronous signals.  I've also
dug into Breakpad a little, but that is a debugger at heart, and it
would be well-served by a mechanism where the kernel would
automatically spawn a ptrace supervisor instead of delivering a fatal
signal.  (This would also allow us to kick core dump generation out of
the kernel.)

This is a very bad idea. Introspective crash logging/reporting is a
huge gift to attackers. If an attacker has compromised a process in a
manner to cause it to segfault, they almost surely have enough control
over the process state to force the handler to perform code execution
for them. There have been real-world CVEs along these lines.

I've hacked on crash reporters for a while now. Reporting a crash in a damaged process environment is undesirable, but unavoidable in some cases. For example, on iOS, fork(2) doesn't work. At all. Consequently, Breakpad there needs to do its best with the state it has.

Calling fork(2) in a SIGSEGV handler and immediately execing a crash reporting process is generally safe enough. It's hard for things to go wrong enough that this mechanism doesn't work. That fresh crash reporting process can ptrace its parent and collect what it wants.

While some kernel help in spawning this process wouldn't hurt, I don't think it's particularly necessary. (And I think the existing Linux core_pipe approach is adequate.)

We _do_ need user-space dump collection though. The logic for deciding what information we include in a crash report is too complex to hoist to the kernel, where it'll seldom get updates. The kernel's job should be limited to hooking up a crashing process and a crash-reporting process; I'd get rid of kernel-written core dumps entirely if I had my way.

