This is the mail archive of the
libc-help@sourceware.org
mailing list for the glibc project.
Re: [RFC]setlocale() race condition
- From: "Carlos O'Donell" <carlos at systemhalted dot org>
- To: "Sharyathi Nagesh" <sharyath at in dot ibm dot com>
- Cc: "Mark Brown" <bmark at us dot ibm dot com>, libc-help at sourceware dot org, sripathik at in dot ibm dot com, suzuki at in dot ibm dot com, tim_preece at uk dot ibm dot com
- Date: Wed, 18 Jun 2008 12:22:06 -0400
- Subject: Re: [RFC]setlocale() race condition
On Wed, Jun 18, 2008 at 5:45 AM, Sharyathi Nagesh <sharyath@in.ibm.com> wrote:
> So does it look better?
Yes and no.
> This problem was noticed with the glibc shipped with the distro, an older
> version (glibc 2.5-12): after about 8 hours of testing the application
> crashed with SIGSEGV. My efforts to reproduce the problem with mainline
> glibc were not successful, but I still believe the problem exists in
> mainline glibc and would like to know your thoughts on this issue.
> -----------------------------------------------------------------
> Explanation:
> This problem was noticed during PHP engine development. The current
> implementation calls setlocale() every time a page is requested, and the
> problem appears during stress testing of this PHP engine, where
> setlocale() is called on multiple threads. The exact API calls are
> as follows:
> ....
> setlocale(LC_ALL,'C');
> ....
> setlocale(LC_CTYPE,'');
> ....
> setlocale(LC_CTYPE,'C');
> ....
> It was observed that after ~8 hours of testing, the application crashed in
> the strcmp() call made from setlocale(). Analysis of the dump showed that
> the _nl_global_locale.__names[category] pointer was corrupted.
> Code analysis showed a window for a race: one thread calls
> strcmp() (within setlocale()) with the current value of
> _nl_global_locale.__names[category] passed as an argument while another
> thread goes ahead and frees the string pointed to by
> _nl_global_locale.__names[category].
>
> _nl_global_locale.__names[category] is protected by the lock
> libc_setlocale_lock, but the lock is taken only while writing to the data
> and not while reading from the global variable.
> This lock is taken in the freelocale() and setlocale() functions.
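The window described in the quoted text can be illustrated with a minimal sketch. This is not glibc code: `current_name` and `name_lock` are hypothetical stand-ins for `_nl_global_locale.__names[category]` and `libc_setlocale_lock`, and a POSIX `pthread_rwlock_t` is assumed in place of glibc's internal lock macros. The reader takes the read lock around `strcmp()`, so the writer cannot free the string while it is being compared.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the shared locale name and its lock;
   these are assumptions for illustration, not glibc internals. */
static char *current_name;
static pthread_rwlock_t name_lock = PTHREAD_RWLOCK_INITIALIZER;

enum { READERS = 4, ITERATIONS = 100000 };

/* Writer: free the old string and install a new copy under the write lock. */
static int set_name(const char *new_name)
{
    char *copy = strdup(new_name);
    if (copy == NULL)
        return -1;
    pthread_rwlock_wrlock(&name_lock);
    free(current_name);
    current_name = copy;
    pthread_rwlock_unlock(&name_lock);
    return 0;
}

/* Reader: hold the read lock for the whole strcmp(), so the string
   cannot be freed mid-comparison. Dropping this lock pair would
   reopen the use-after-free window. */
static int name_equals(const char *candidate)
{
    pthread_rwlock_rdlock(&name_lock);
    int equal = current_name != NULL && strcmp(current_name, candidate) == 0;
    pthread_rwlock_unlock(&name_lock);
    return equal;
}

static void *writer(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++)
        if (set_name(i % 2 ? "en_US.UTF-8" : "C") != 0)
            abort();
    return NULL;
}

static void *reader(void *arg)
{
    long *matches = arg;
    for (int i = 0; i < ITERATIONS; i++)
        *matches += name_equals("C");
    return NULL;
}

/* Runs one writer against several readers; returns 0 on success. */
int run_demo(void)
{
    pthread_t w, r[READERS];
    long matches[READERS] = {0};
    int rc;

    if (set_name("C") != 0)
        return -1;
    rc = pthread_create(&w, NULL, writer, NULL);
    assert(rc == 0);
    for (int i = 0; i < READERS; i++) {
        rc = pthread_create(&r[i], NULL, reader, &matches[i]);
        assert(rc == 0);
    }
    pthread_join(w, NULL);
    for (int i = 0; i < READERS; i++)
        pthread_join(r[i], NULL);
    (void)rc;
    /* The writer's last iteration (odd index) installs "en_US.UTF-8". */
    return name_equals("en_US.UTF-8") ? 0 : -1;
}
```

Built with `-pthread` and called from `main`, `run_demo()` should return 0. A plain mutex taken on both the read and write paths would close the same window; the read/write lock only lets concurrent readers proceed in parallel.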
Simplify your explanation, and if you don't have a testcase then
include an analysis of the cvs head code.
e.g.
~~~
During PHP engine development it was observed that a thread calling
strcmp() would crash if another thread was also calling setlocale().
The problem is difficult to reproduce, but the following is a
line-by-line analysis of where the race condition exists.
... Point out line-by-line race condition in cvs head code ...
~~~
> Though setlocale() is not on the POSIX.1 list of async-signal-safe
> functions in section 2.4.3
> http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_04.html#tag_02_04
> it still needs to be thread-safe according to section 2.9.1 in
> http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_09.html
OK.
> -----------------------------------------------------------------
> Testing:
> Didn't notice any regression with testing. Testing was done on an x86-64
> box where the application was built as 32-bit.
> -----------------------------------------------------------------
What do you mean by "didn't notice"? Either there was a regression or
there was not.
The target x86-64 is not a GNU triplet; please use the canonical
triplet name, e.g. x86_64-linux-gnu.
> Fix:
> A similar fix resolved the problem with the distro glibc, where
> __libc_lock_lock() is used instead of __libc_rwlock_rdlock().
We don't care about your distro glibc. We care about how your patch
fixes the race condition.
Please provide a ChangeLog entry, and the patch. Lastly, you should
mention the status of your FSF copyright assignment.
Cheers,
Carlos.