This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH] Fix readdir_r with long file names
- From: Florian Weimer <fweimer at redhat dot com>
- To: Andreas Jaeger <aj at suse dot com>
- Cc: Siddhesh Poyarekar <siddhesh at redhat dot com>, libc-alpha <libc-alpha at sourceware dot org>
- Date: Thu, 16 May 2013 14:30:15 +0200
- Subject: Re: [PATCH] Fix readdir_r with long file names
- References: <519220C7 dot 6050705 at redhat dot com> <20130516110136 dot GB11420 at spoyarek dot pnq dot redhat dot com> <5194CDEE dot 4020708 at redhat dot com> <5194CE86 dot 7080000 at suse dot com>
On 05/16/2013 02:18 PM, Andreas Jaeger wrote:
> On 05/16/2013 02:15 PM, Florian Weimer wrote:
>> On 05/16/2013 01:01 PM, Siddhesh Poyarekar wrote:
>>> On Tue, May 14, 2013 at 01:32:23PM +0200, Florian Weimer wrote:
>>>> This patch changes readdir_r to return ENAMETOOLONG if the kernel
>>>> returns a file name longer than NAME_MAX characters, after the end of
>>>> the directory has been reached (so that the directory contents is not
>>>> truncated). It also makes the padding compensation code
>>>> architecture-agnostic and enables it everywhere.
>>>
>>> The specification for readdir/readdir_r does not mention ENAMETOOLONG
>>> as a possible error return[1], so this should at least be mentioned in
>>> the glibc manual, if not also in the man page. So a manual patch is
>>> needed on top of this.
>>
>> Thanks for your comments.
>>
>> We could use EOVERFLOW instead. But ENAMETOOLONG is more informative.
>
> I think the description for EOVERFLOW:
> "One of the values in the structure to be returned cannot be represented
> correctly." fits this case - so, let's use that one to follow POSIX,

Okay, here's the most recent version.
--
Florian Weimer / Red Hat Product Security Team
2013-05-16 Florian Weimer <fweimer@redhat.com>
[BZ #14699]
* sysdeps/posix/dirstream.h (struct __dirstream): Add errcode
member.
* sysdeps/posix/opendir.c (__alloc_dir): Initialize errcode
member.
* sysdeps/posix/rewinddir.c (rewinddir): Reset errcode member.
* sysdeps/posix/readdir_r.c (__READDIR_R): Enforce NAME_MAX limit.
Return delayed error code. Remove GETDENTS_64BIT_ALIGNED
conditional.
* sysdeps/unix/sysv/linux/wordsize-64/readdir_r.c: Do not define
GETDENTS_64BIT_ALIGNED.
* sysdeps/unix/sysv/linux/i386/readdir64_r.c: Likewise.
* manual/filesys.texi (Reading/Closing Directory): Document the
EOVERFLOW return value of readdir_r. Recommend readdir more
strongly.
diff --git a/manual/filesys.texi b/manual/filesys.texi
index 1df9cf2..b227030 100644
--- a/manual/filesys.texi
+++ b/manual/filesys.texi
@@ -444,9 +444,9 @@ symbols are declared in the header file @file{dirent.h}.
@comment POSIX.1
@deftypefun {struct dirent *} readdir (DIR *@var{dirstream})
This function reads the next entry from the directory. It normally
-returns a pointer to a structure containing information about the file.
-This structure is statically allocated and can be rewritten by a
-subsequent call.
+returns a pointer to a structure containing information about the
+file. This structure is associated with the @var{dirstream} handle
+and can be rewritten by a subsequent call.
@strong{Portability Note:} On some systems @code{readdir} may not
return entries for @file{.} and @file{..}, even though these are always
@@ -461,19 +461,26 @@ conditions are defined for this function:
The @var{dirstream} argument is not valid.
@end table
-@code{readdir} is not thread safe. Multiple threads using
-@code{readdir} on the same @var{dirstream} may overwrite the return
-value. Use @code{readdir_r} when this is critical.
+To tell the regular end-of-directory condition and errors apart, you
+need to set @code{errno} to zero directly before calling
+@code{readdir}. To avoid entering an infinite loop, you should stop
+reading from the directory on the first error.
+
+@code{readdir} is thread safe as long as only a single thread accesses
+the @var{dirstream} handle without synchronization. The alternative
+@code{readdir_r} function has significant portability issues.
+Therefore, you should always use @code{readdir} and external locking.
@end deftypefun
@comment dirent.h
@comment GNU
@deftypefun int readdir_r (DIR *@var{dirstream}, struct dirent *@var{entry}, struct dirent **@var{result})
-This function is the reentrant version of @code{readdir}. Like
-@code{readdir} it returns the next entry from the directory. But to
-prevent conflicts between simultaneously running threads the result is
-not stored in statically allocated memory. Instead the argument
-@var{entry} points to a place to store the result.
+This function is a version of @code{readdir} which performs internal
+locking. Like @code{readdir} it returns the next entry from the
+directory. But to prevent conflicts between simultaneously running
+threads, the result is not stored inside the @var{dirstream} handle.
+Instead the argument @var{entry} points to a place to store the
+result.
Normally @code{readdir_r} returns zero and sets @code{*@var{result}}
to @var{entry}. If there are no more entries in the directory or an
@@ -481,14 +488,14 @@ error is detected, @code{readdir_r} sets @code{*@var{result}} to a
null pointer and returns a nonzero error code, also stored in
@code{errno}, as described for @code{readdir}.
-@strong{Portability Note:} On some systems @code{readdir_r} may not
-return a NUL terminated string for the file name, even when there is no
-@code{d_reclen} field in @code{struct dirent} and the file
-name is the maximum allowed size. Modern systems all have the
-@code{d_reclen} field, and on old systems multi-threading is not
-critical. In any case there is no such problem with the @code{readdir}
-function, so that even on systems without the @code{d_reclen} member one
-could use multiple threads by using external locking.
+@strong{Portability Note:} On some systems, @code{readdir_r} cannot
+read directory entries with very long names. If such a name is
+encountered, @code{readdir_r} returns with an error code of
+@code{EOVERFLOW} after the final directory entry has been read. On
+other systems, @code{readdir_r} can return successfully, but the
+@code{d_name} member is not NUL-terminated or is otherwise truncated.
+Therefore, you should always prefer @code{readdir} (with external
+locking if necessary) over @code{readdir_r}.
It is also important to look at the definition of the @code{struct
dirent} type. Simply passing a pointer to an object of this type for
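
For illustration only (this is not part of the patch): the manual text above recommends clearing errno directly before each readdir call and stopping on the first error. A minimal sketch of that loop could look as follows. The helper name count_entries is hypothetical and exists only for this example.

```c
#include <dirent.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper, not part of glibc: count the entries in PATH
   using readdir.  Returns the number of entries, or -1 with errno
   set if the directory cannot be opened or read.  */
static long
count_entries (const char *path)
{
  DIR *dir = opendir (path);
  if (dir == NULL)
    return -1;

  long count = 0;
  for (;;)
    {
      /* Clear errno so that a null return with errno still zero
         means end-of-directory rather than an error.  */
      errno = 0;
      struct dirent *entry = readdir (dir);
      if (entry == NULL)
        break;                  /* End of directory or first error.  */
      ++count;
    }

  /* Save errno before closedir, which may overwrite it.  */
  int err = errno;
  closedir (dir);
  if (err != 0)
    {
      errno = err;
      return -1;
    }
  return count;
}
```

A caller that shares the DIR handle between threads would additionally wrap the readdir call, and any use of the returned entry, in its own lock, as the manual text suggests.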
diff --git a/sysdeps/posix/dirstream.h b/sysdeps/posix/dirstream.h
index a7a074d..8e8570d 100644
--- a/sysdeps/posix/dirstream.h
+++ b/sysdeps/posix/dirstream.h
@@ -39,6 +39,8 @@ struct __dirstream
off_t filepos; /* Position of next entry to read. */
+ int errcode; /* Delayed error code. */
+
/* Directory block. */
char data[0] __attribute__ ((aligned (__alignof__ (void*))));
};
diff --git a/sysdeps/posix/opendir.c b/sysdeps/posix/opendir.c
index ddfc3a7..fc05b0f 100644
--- a/sysdeps/posix/opendir.c
+++ b/sysdeps/posix/opendir.c
@@ -231,6 +231,7 @@ __alloc_dir (int fd, bool close_fd, int flags, const struct stat64 *statp)
dirp->size = 0;
dirp->offset = 0;
dirp->filepos = 0;
+ dirp->errcode = 0;
return dirp;
}
diff --git a/sysdeps/posix/readdir_r.c b/sysdeps/posix/readdir_r.c
index b5a8e2e..2fb73cb 100644
--- a/sysdeps/posix/readdir_r.c
+++ b/sysdeps/posix/readdir_r.c
@@ -40,6 +40,7 @@ __READDIR_R (DIR *dirp, DIRENT_TYPE *entry, DIRENT_TYPE **result)
DIRENT_TYPE *dp;
size_t reclen;
const int saved_errno = errno;
+ int ret;
__libc_lock_lock (dirp->lock);
@@ -70,10 +71,10 @@ __READDIR_R (DIR *dirp, DIRENT_TYPE *entry, DIRENT_TYPE **result)
bytes = 0;
__set_errno (saved_errno);
}
+ if (bytes < 0)
+ dirp->errcode = errno;
dp = NULL;
- /* Reclen != 0 signals that an error occurred. */
- reclen = bytes != 0;
break;
}
dirp->size = (size_t) bytes;
@@ -106,29 +107,46 @@ __READDIR_R (DIR *dirp, DIRENT_TYPE *entry, DIRENT_TYPE **result)
dirp->filepos += reclen;
#endif
- /* Skip deleted files. */
+#ifdef NAME_MAX
+ if (reclen > offsetof (DIRENT_TYPE, d_name) + NAME_MAX + 1)
+ {
+ /* The record is very long. It could still fit into the
+ caller-supplied buffer if we can skip padding at the
+ end. */
+ size_t namelen = strlen (dp->d_name);
+ if (namelen <= NAME_MAX)
+ reclen = offsetof (DIRENT_TYPE, d_name) + namelen + 1;
+ else
+ {
+ /* The name is too long. Ignore this file. */
+ dirp->errcode = EOVERFLOW;
+ dp->d_ino = 0;
+ continue;
+ }
+ }
+#endif
+
+ /* Skip deleted and ignored files. */
}
while (dp->d_ino == 0);
if (dp != NULL)
{
-#ifdef GETDENTS_64BIT_ALIGNED
- /* The d_reclen value might include padding which is not part of
- the DIRENT_TYPE data structure. */
- reclen = MIN (reclen,
- offsetof (DIRENT_TYPE, d_name) + sizeof (dp->d_name));
-#endif
*result = memcpy (entry, dp, reclen);
-#ifdef GETDENTS_64BIT_ALIGNED
+#ifdef _DIRENT_HAVE_D_RECLEN
entry->d_reclen = reclen;
#endif
+ ret = 0;
}
else
- *result = NULL;
+ {
+ *result = NULL;
+ ret = dirp->errcode;
+ }
__libc_lock_unlock (dirp->lock);
- return dp != NULL ? 0 : reclen ? errno : 0;
+ return ret;
}
#ifdef __READDIR_R_ALIAS
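
For illustration only (not part of the patch): the length check added to readdir_r.c above can be read as a small pure function. The sketch below restates it under stated assumptions: struct sketch_dirent is a hypothetical stand-in for DIRENT_TYPE, sketch_copy_length is a made-up name, and NAME_MAX falls back to 255 if the platform does not define it.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#ifndef NAME_MAX
# define NAME_MAX 255           /* Assumption for this sketch only.  */
#endif

/* Hypothetical stand-in for DIRENT_TYPE; the real layout differs
   between architectures.  */
struct sketch_dirent
{
  unsigned long d_ino;
  unsigned short d_reclen;
  char d_name[NAME_MAX + 1];
};

/* Decide how many bytes of a kernel record may be copied into a
   caller-supplied struct sketch_dirent.  RECLEN is the record length
   reported by the kernel and NAME the NUL-terminated file name in
   that record.  Returns 0 and stores the length in *COPY_LEN, or
   returns EOVERFLOW if the name itself exceeds NAME_MAX.  */
static int
sketch_copy_length (size_t reclen, const char *name, size_t *copy_len)
{
  const size_t header = offsetof (struct sketch_dirent, d_name);

  if (reclen > header + NAME_MAX + 1)
    {
      /* The record is longer than the structure.  It may still fit
         if the excess is only trailing padding.  */
      size_t namelen = strlen (name);
      if (namelen > NAME_MAX)
        return EOVERFLOW;
      reclen = header + namelen + 1;
    }

  *copy_len = reclen;
  return 0;
}
```

In the patch itself the EOVERFLOW case does not return immediately; the entry is skipped and the error is reported once the end of the directory is reached.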
diff --git a/sysdeps/posix/rewinddir.c b/sysdeps/posix/rewinddir.c
index 2935a8e..d4991ad 100644
--- a/sysdeps/posix/rewinddir.c
+++ b/sysdeps/posix/rewinddir.c
@@ -33,6 +33,7 @@ rewinddir (dirp)
dirp->filepos = 0;
dirp->offset = 0;
dirp->size = 0;
+ dirp->errcode = 0;
#ifndef NOT_IN_libc
__libc_lock_unlock (dirp->lock);
#endif
diff --git a/sysdeps/unix/sysv/linux/i386/readdir64_r.c b/sysdeps/unix/sysv/linux/i386/readdir64_r.c
index 8ebbcfd..a7d114e 100644
--- a/sysdeps/unix/sysv/linux/i386/readdir64_r.c
+++ b/sysdeps/unix/sysv/linux/i386/readdir64_r.c
@@ -18,7 +18,6 @@
#define __READDIR_R __readdir64_r
#define __GETDENTS __getdents64
#define DIRENT_TYPE struct dirent64
-#define GETDENTS_64BIT_ALIGNED 1
#include <sysdeps/posix/readdir_r.c>
diff --git a/sysdeps/unix/sysv/linux/wordsize-64/readdir_r.c b/sysdeps/unix/sysv/linux/wordsize-64/readdir_r.c
index 5ed8e95..290f2c8 100644
--- a/sysdeps/unix/sysv/linux/wordsize-64/readdir_r.c
+++ b/sysdeps/unix/sysv/linux/wordsize-64/readdir_r.c
@@ -1,5 +1,4 @@
#define readdir64_r __no_readdir64_r_decl
-#define GETDENTS_64BIT_ALIGNED 1
#include <sysdeps/posix/readdir_r.c>
#undef readdir64_r
weak_alias (__readdir_r, readdir64_r)
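
For illustration only (not part of the patch): taken together, the errcode changes in dirstream.h, opendir.c, rewinddir.c and readdir_r.c amount to a delayed-error pattern, where over-long entries are skipped and the error is reported only at the end of the directory. The sketch below models this on an in-memory list of names. struct sketch_stream and sketch_next are hypothetical names, and the name_max member stands in for NAME_MAX.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory model of a directory stream.  */
struct sketch_stream
{
  const char *const *names;     /* Entries "returned by the kernel".  */
  size_t count;                 /* Number of entries in NAMES.  */
  size_t pos;                   /* Index of the next entry to read.  */
  size_t name_max;              /* Stand-in for NAME_MAX.  */
  int errcode;                  /* Delayed error code, initially 0.  */
};

/* Follows the readdir_r return convention: on success, returns 0 and
   stores the next acceptable name in *RESULT.  At the end of the
   list, stores a null pointer in *RESULT and returns the delayed
   error code, which is 0 if no entry had to be skipped.  */
static int
sketch_next (struct sketch_stream *s, const char **result)
{
  while (s->pos < s->count)
    {
      const char *name = s->names[s->pos++];
      if (strlen (name) > s->name_max)
        {
          /* Remember the error, but keep going so that the rest of
             the listing is not truncated.  */
          s->errcode = EOVERFLOW;
          continue;
        }
      *result = name;
      return 0;
    }

  *result = NULL;
  return s->errcode;
}
```

A rewind operation on this model would reset both pos and errcode to zero, mirroring the rewinddir.c hunk.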