This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH v3] getrandom system call wrapper [BZ #17252]


On 09/12/2016 11:40 AM, Torvald Riegel wrote:
On Mon, 2016-09-12 at 09:25 +0200, Florian Weimer wrote:
On 09/09/2016 05:23 PM, Torvald Riegel wrote:
On Fri, 2016-09-09 at 16:28 +0200, Florian Weimer wrote:
On 09/09/2016 04:21 PM, Torvald Riegel wrote:
Can't we just let cancellation rot in its corner?

No, we have many customers who use it (and this despite the fact that
the current implementation has a critical race condition).

Usage of it doesn't mean that it has to be the default.

It's not used by default.  Something has to call pthread_cancel.

I do mean the other side.  That is, in all the code that may see a
cancellation request.

Very little code supports that. Even in glibc, many parts contain cancellation points but do not install cancellation handlers to clean up resources.

Changing that probably needs some form of tool support. Doing this right is mostly a matter of looking at the call graph.

Have we made
other syscall wrappers cancellation points in the past (ie, syscalls
that don't already have a matching POSIX function that is specified to
be a cancellation point too)?

I found open_by_handle.

OK, though that's much like open(), which is a cancellation point, so
making the syscall a cancellation point too would make sense.
The pseudo-RNG functions are not cancellation points.

I see getrandom more like read, or ioctl.  It's a matter of perspective.

arc4random would be more like a PRNG: it can't fail, has no short reads, does not block (although this part may be difficult), is not a cancellation point, and delivers hundreds of megabytes per second of throughput on current machines. getentropy is somewhere in the middle.

I'm worried about people who just want to use the syscall but don't know
that much about POSIX cancellation.  They couldn't use the syscall
safely in a library without also being aware of POSIX cancellation, and
I'm concerned that they might just forget to disable cancellation around
the syscall, thus creating resource leaks, deadlocks (eg, cancellation
handler doesn't release locks), etc.  If this is primarily a Linux API
currently (ignoring the Solaris case for a while), then marrying it to
POSIX seems wrong.
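
To illustrate the concern above, here is a minimal sketch of what a library author would have to write if the wrapper were a cancellation point. It assumes a getrandom declaration in <sys/random.h> with the signature proposed in this thread; the helper name fill_random_nocancel is hypothetical.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/* Hypothetical helper: fill BUF with LEN random bytes while
   cancellation is disabled, so that a pending cancellation request
   cannot unwind the caller in the middle of the call.
   Returns 0 on success, -1 on failure with errno set.  */
static int
fill_random_nocancel (void *buf, size_t len)
{
  unsigned char *p = buf;
  int oldstate;
  int ignored;
  int result = 0;

  pthread_setcancelstate (PTHREAD_CANCEL_DISABLE, &oldstate);
  while (len > 0)
    {
      /* Assumed wrapper: getrandom (buffer, length, flags).  */
      ssize_t n = getrandom (p, len, 0);
      if (n < 0)
        {
          if (errno == EINTR)
            continue;           /* Interrupted by a signal; retry.  */
          result = -1;
          break;
        }
      p += n;                   /* Short read; ask for the rest.  */
      len -= (size_t) n;
    }
  /* Restore the caller's cancellation state and keep errno intact.  */
  int saved_errno = errno;
  pthread_setcancelstate (oldstate, &ignored);
  errno = saved_errno;
  return result;
}
```

A caller that forgets the two pthread_setcancelstate calls gets code that looks correct but can be unwound while holding locks or other resources.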

If we add getentropy, I suggest that it will not be a cancellation point
(even if it can still block indefinitely).

Can you elaborate on your reasoning?

getentropy is supposed to be the simple interface cryptographic libraries should use to obtain a seed for their own PRNG. Not making it a cancellation point is part of keeping the interface simple.
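
As a rough sketch of that intended use, assuming a getentropy with the OpenBSD-style signature declared in <unistd.h> (the struct and function names here are hypothetical):

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical PRNG state owned by a cryptographic library.  */
struct toy_prng
{
  unsigned char seed[32];
};

/* Obtain a fresh seed.  Returns 0 on success, -1 on failure.
   There is no loop for short reads and no cancellation handling,
   on the assumption that getentropy either fills the whole buffer
   or fails, and is not a cancellation point.  */
static int
toy_prng_seed (struct toy_prng *g)
{
  return getentropy (g->seed, sizeof g->seed) == 0 ? 0 : -1;
}
```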

I looked at quite a few getrandom emulations using /dev/urandom, and not
one of them was cancellation-aware (it leaked the file descriptor on
cancellation, for example).  Based on that, I really doubt getrandom
would introduce an unexpected cancellation point that causes actual
problems.

Interesting, thanks.  That might be one interpretation of the situation
(ie, that users know that they don't have to worry about cancellation
requests concurrent or pending while getting a random number).
However, it might also mean that what I worry about is actually
realistic (ie, that user code should be cancellation-aware but isn't).

It matters only if the code can run in a thread which is canceled.

I have to admit that references to pthread_cancel are more widespread than I assumed. Yet we see few bug reports related to cancellation, even though both glibc's implementation and library awareness of cancellation leave a lot to be desired.

Anyway, this is probably material for another thread. :)

Florian

