This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug nptl/14942] New: File corruption bug in AIO with close()
- From: "bugdal at aerifal dot cx" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sources dot redhat dot com
- Date: Mon, 10 Dec 2012 22:24:40 +0000
- Subject: [Bug nptl/14942] New: File corruption bug in AIO with close()
- Auto-submitted: auto-generated
http://sourceware.org/bugzilla/show_bug.cgi?id=14942
Bug #: 14942
Summary: File corruption bug in AIO with close()
Product: glibc
Version: unspecified
Status: NEW
Severity: normal
Priority: P2
Component: nptl
AssignedTo: unassigned@sourceware.org
ReportedBy: bugdal@aerifal.cx
CC: drepper.fsp@gmail.com
Classification: Unclassified
Created attachment 6778
--> http://sourceware.org/bugzilla/attachment.cgi?id=6778
demonstration of the bug
Per POSIX, close() is valid on a file descriptor with pending AIO operations:
"When there is an outstanding cancelable asynchronous I/O operation against
fildes when close() is called, that I/O operation may be canceled. An I/O
operation that is not canceled completes as if the close() operation had not
yet occurred. All operations that are not canceled shall complete as if the
close() blocked until the operations completed. The close() operation itself
need not block awaiting such I/O completion. Whether any I/O operation is
canceled, and which I/O operation may be canceled upon close(), is
implementation-defined."
My reading of this text is that you cannot assume anything about the integrity
of data pending for write on a given file descriptor if you close that file
descriptor, but that the behavior of calling close in this situation is not
undefined, and certainly is not permitted to corrupt other files.
However, as the attached test program shows, glibc's AIO implementation DOES
corrupt other files when close is called on a file descriptor with pending AIO
operations and the file descriptor number gets reused. I've used pipes to
control the timing in this example (and sometimes it still requires a few tries
to hit the bug), but it could happen just as well with regular files.
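The precondition for the corruption, reuse of the descriptor number, is easy to show on its own. The following is a minimal sketch and is not the attached test program; the helper name fd_number_is_reused is hypothetical, and /dev/null merely stands in for any unrelated file. It relies only on the POSIX rule that open() returns the lowest-numbered descriptor not currently open, and it assumes no other thread is opening descriptors at the same time.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper, not taken from the attached test program.
 * Returns 1 if the descriptor number released by close() is handed out
 * again by the next open(), 0 if not, and -1 on setup failure. */
int fd_number_is_reused(void)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;

    /* Stands in for the descriptor that has pending AIO operations. */
    int old_number = p[1];
    close(old_number);

    /* open() returns the lowest-numbered descriptor not in use, which is
     * now the number that was just released. */
    int unrelated = open("/dev/null", O_WRONLY);
    if (unrelated < 0) {
        close(p[0]);
        return -1;
    }

    int reused = (unrelated == old_number);

    close(unrelated);
    close(p[0]);
    return reused;
}
```

Any worker thread that remembered only the integer old_number would, from this point on, be operating on the unrelated file.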
As long as AIO is being implemented with threads on top of regular POSIX file
operations, rather than via direct kernel support, I believe one of the
following two solutions must be used:
1. Modify close() to attempt to cancel any pending AIO requests and block until
each of them has either completed or been cancelled. This is very difficult,
since close() is required to be async-signal-safe.
2. Have the AIO implementation duplicate any file descriptor it's going to work
with, using fcntl with F_DUPFD_CLOEXEC, and always use the duplicate. In this
case, close() must still be responsible for dissociating the file descriptor
number from its AIO work queue so that AIO requests on a new file descriptor
don't get appended to the old work queue but instead result in a new one. This
still sounds difficult to do in a way that's async-signal-safe, however.
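The descriptor-duplication half of option 2 can be sketched as follows. This is an illustration under stated assumptions, not glibc's code: the names aio_private_dup and dup_survives_fd_reuse are hypothetical, a plain write() stands in for the AIO worker thread, and the sketch makes no attempt at the work-queue dissociation that close() would still need.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical: what an AIO implementation might call when a request is
 * submitted, so that the worker never uses the caller's descriptor number. */
int aio_private_dup(int fd)
{
    return fcntl(fd, F_DUPFD_CLOEXEC, 0);
}

/* Returns 1 if a write through the private duplicate still reaches the
 * original pipe after the caller's number has been closed and reused,
 * 0 if not, and -1 on setup failure. */
int dup_survives_fd_reuse(void)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;

    int caller_fd = p[1];
    int priv = aio_private_dup(caller_fd);
    if (priv < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    /* The caller closes its descriptor; an unrelated open() then takes
     * over the same number. */
    close(caller_fd);
    int unrelated = open("/dev/null", O_WRONLY);

    /* Stand-in for the worker thread: write via the duplicate only. */
    char in = 'x', out = 0;
    int ok = unrelated == caller_fd
          && write(priv, &in, 1) == 1
          && read(p[0], &out, 1) == 1
          && out == 'x';

    if (unrelated >= 0)
        close(unrelated);
    close(priv);
    close(p[0]);
    return ok;
}
```

Because the duplicate refers to the same open file description as the original, the data lands in the pipe even though the caller's number now names a different file.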