This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Fix readdir_r with long file names
- From: "Michael Kerrisk (man-pages)" <mtk dot manpages at gmail dot com>
- To: Florian Weimer <fweimer at redhat dot com>, Siddhesh Poyarekar <siddhesh at redhat dot com>
- Cc: mtk dot manpages at gmail dot com, Rich Felker <dalias at aerifal dot cx>, Carlos O'Donell <carlos at redhat dot com>, KOSAKI Motohiro <kosaki dot motohiro at gmail dot com>, libc-alpha <libc-alpha at sourceware dot org>, Roland McGrath <roland at hack dot frob dot com>, linux-man <linux-man at vger dot kernel dot org>
- Date: Tue, 1 Mar 2016 21:14:42 +0100
- Subject: Re: [PATCH] Fix readdir_r with long file names
- References: <51B0B39F dot 4060202 at redhat dot com> <51B0BD36 dot 3030202 at redhat dot com> <CAHGf_=r9Rz63pho+84ORk0a_oDyJSj-MCnZ56uPrT3L6sVEfeQ at mail dot gmail dot com> <20130607013024 dot GO29800 at brightrain dot aerifal dot cx> <51B19203 dot 3070307 at redhat dot com> <20130607144143 dot GQ29800 at brightrain dot aerifal dot cx> <51B57E35 dot 4080403 at redhat dot com> <51B65EA7 dot 2020402 at redhat dot com> <20130611011324 dot GT29800 at brightrain dot aerifal dot cx> <51B8702D dot 2060505 at redhat dot com> <20130813040038 dot GE21795 at spoyarek dot pnq dot redhat dot com> <520C88A6 dot 9070501 at redhat dot com> <56D54DAD dot 1040306 at gmail dot com> <56D5CA79 dot 9030204 at redhat dot com>
Hi Florian,
On 03/01/2016 05:59 PM, Florian Weimer wrote:
> On 03/01/2016 09:07 AM, Michael Kerrisk (man-pages) wrote:
>
>> I see that glibc 2.23 deprecates readdir_r(), which prompted me to catch
>> up on this thread. I'd like to see the points you make documented in the
>> readdir_r(3) man page also. Would you be willing to allow that text to
>> be reused / reworked for the page, under that page's existing "verbatim"
>> license (https://www.kernel.org/doc/man-pages/licenses.html#verbatim)?
>
> Hi Michael,
>
> thanks for keeping an eye on deprecations. The deprecation happened for
> glibc 2.24 (unreleased).
Ah yes, I was getting ahead of myself. Fixed that in the page text below.
> I'm happy to report that I may grant your request.
Thanks!
>> The text I'd propose to add to the man page would be (new material
>> starting at ===>):
>
> It may make sense to move this documentation to a separate manual page,
> specific to readdir_r. This will keep the readdir documentation nice
> and crisp. Most programmers will never have to consult all these details.
Yes, seems reasonable. Done.
> You should remove the example using pathconf because it is not correct.
Done.
> The kernel does not return valid values for _PC_NAME_MAX on some file
> systems (such as CIFS, and CD-ROMs with Joliet extensions once a kernel
> bug is fixed). The CIFS limit is somewhere around 765, and not 255 as
> reported by the kernel. If I recall correctly, Windows SMB servers can
> actually exceed the 255 byte limit. The reason is that Windows NTFS has
> a limit based on 16-bit UCS-2 characters, and after UTF-8 conversion,
> the maximum length is more than 255 bytes.
What happens with readdir() when it gets a filename that is longer
than 255 characters?
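
For readers following the arithmetic behind the "around 765" figure, here is a minimal sketch. It assumes a limit of 255 UCS-2 code units per name (an assumption chosen because it is consistent with the 765 figure quoted above) and relies on the fact that a single 16-bit UCS-2 code unit encodes to at most 3 bytes of UTF-8. The function name is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: upper bound on the UTF-8 length, in bytes, of a
   name made of `ucs2_units` 16-bit UCS-2 code units. Each code unit in
   the range U+0000..U+FFFF needs at most 3 bytes in UTF-8. */
size_t utf8_worst_case_bytes(size_t ucs2_units)
{
    return ucs2_units * 3;
}
```

With an assumed limit of 255 code units, utf8_worst_case_bytes(255) gives 765, well above a 255-byte NAME_MAX.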
>
>> ===> However, the above approach has problems, and it is recommended
>> that applications use readdir() instead of readdir_r(). Furthermore,
>> since version 2.23, glibc deprecates readdir_r().
s/23/24/
>> The reasons are as follows:
>>
>> * On systems where NAME_MAX is undefined, calling readdir_r()
>> may be unsafe because the interface does not allow the caller
>> to specify the length of the buffer used for the returned
>> directory entry.
>>
>> * On some systems, readdir_r() can't read directory entries
>> with very long names. When the glibc implementation encounters
>> such a name, readdir_r() fails with the error ENAMETOOLONG
>> after the final directory entry has been read. On some
>> other systems, readdir_r() may return a success status, but
>> the returned d_name field may not be null terminated or may
>> be truncated.
>>
>> * In the current POSIX.1 specification (POSIX.1-2008), readdir_r()
>> is not required to be thread-safe. However, in modern
>> implementations (including the glibc implementation),
>> concurrent calls to readdir_r() that specify different
>> directory streams are thread-safe. Therefore, the use of
>
> These two references to readdir_r should be to readdir instead.
Fixed.
>
> I believe there was a historic implementation which implemented
> fdopendir (fd) as (DIR *) fd, and used a global static buffer for
> readdir. This is about the only way readdir can be non-thread-safe.
>
>> readdir_r() is generally unnecessary in multithreaded programs.
>> In cases where multiple threads must read from the
>> same directory stream, using readdir() with external
>> synchronization is still preferable to the use of readdir_r(),
>> for the reasons given in the points above.
>>
>> * It is expected that a future version of POSIX.1 will make
>> readdir_r() obsolete, and require that readdir() be thread-
>> safe when concurrently employed on different directory
>> streams.
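
As an illustration of the "readdir() with external synchronization" approach described in the quoted text, here is a minimal sketch, not taken from the man page or from glibc. The helper name, its return convention, and the use of ERANGE for a too-small caller buffer are all assumptions made for the example. The name is copied out while the lock is held, because the struct dirent returned by readdir() may be overwritten by a later readdir() call on the same stream.

```c
#include <dirent.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a libc function): fetch the next entry name
   from a directory stream shared by several threads.
   Returns 1 if a name was copied into buf, 0 at end of directory, and
   -1 on error with errno set (ERANGE if buf is too small; that choice
   of error code is an assumption of this sketch). */
int next_entry_name(DIR *dir, pthread_mutex_t *lock, char *buf, size_t buflen)
{
    int result;
    int saved_errno = 0;

    pthread_mutex_lock(lock);
    errno = 0;                      /* lets us tell end-of-directory from an error */
    struct dirent *entry = readdir(dir);
    if (entry == NULL) {
        saved_errno = errno;
        result = (saved_errno == 0) ? 0 : -1;
    } else {
        size_t len = strlen(entry->d_name);
        if (len < buflen) {
            /* Copy under the lock, including the terminating null byte. */
            memcpy(buf, entry->d_name, len + 1);
            result = 1;
        } else {
            saved_errno = ERANGE;
            result = -1;
        }
    }
    pthread_mutex_unlock(lock);

    if (result == -1)
        errno = saved_errno;
    return result;
}
```

Unlike readdir_r(), the caller chooses the buffer size here, so a long name is reported as an error instead of overflowing or being silently truncated. Note that in this sketch an entry whose name does not fit is still consumed from the stream.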
Thanks for all of the feedback, Florian! The current versions of the
readdir(3) and readdir_r(3) pages have been pushed to the repo.
Cheers,
Michael
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/