This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.



RE: setenv problems


     While the general topic of problems with setenv() and getenv() is
being discussed, I think that I might as well mention another related
problem that I noticed a few months ago.  That is that putenv() does not
comply with the POSIX definition for it.  Refer to
http://www.opengroup.org/onlinepubs/009695399/functions/putenv.html
     Specifically, putenv() makes a copy of the string given to it to
add to the environment rather than entering the given pointer into the
environment vector as required.  (The definition does not use those
exact words, but it is what is needed:  "the string pointed to by string
shall become part of the environment, so altering the string shall
change the environment.")  The copy is done indirectly because putenv()
ends up calling _setenv_r(), which is defined to put copies of the
strings into the environment.
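
     To make the difference concrete, here is a minimal sketch of a check
that tells the two behaviors apart.  The function name and the GR_DEMO
variable are made up for illustration; only putenv() and getenv() are
real library calls.  On an implementation that follows the POSIX wording
I would expect it to return 1, and on one that copies the string, 0.

```c
#define _XOPEN_SOURCE 700   /* ask for the putenv() declaration on strict compilers */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical probe: returns 1 if putenv() stored the caller's pointer
   (the POSIX behavior), 0 if it stored a private copy, -1 on failure. */
static int putenv_aliases_caller_buffer(void)
{
    /* static: the buffer must outlive the environment entry */
    static char entry[] = "GR_DEMO=old";
    const size_t value_offset = 8;          /* strlen("GR_DEMO=") */
    const char *seen;

    if (putenv(entry) != 0)
        return -1;

    /* Overwrite the value in place; the new value has the same length. */
    assert(strlen(entry) == value_offset + 3);
    memcpy(entry + value_offset, "new", 3);

    seen = getenv("GR_DEMO");
    if (seen == NULL)
        return -1;
    /* If the environment holds our pointer, the edit is visible here. */
    return strcmp(seen, "new") == 0;
}
```

     A program counting on the copying behavior would typically pass a
stack buffer to putenv() and then reuse it, which is exactly the case
that would break if the behavior were changed.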
     According to the change log, this incorrect behavior has been there
since the beginning (as is also implied by the copyright notices without
added comments as to the changes).  A question arises as to whether the
behavior ought to be changed to match POSIX or if it should be left as
it is in case people are counting on what it has been doing for years,
in which case a note could be added to the man information pointing out
the discrepancy.
     There is also one more item.  I noticed when looking over the code
related to the problems reported by Pawel Veselov that started this chain
that the implementations have a memory leak problem, too.  _unsetenv_r()
deletes pointers from the environment, but it does not free the memory
associated with the variable being deleted.  At the moment this is the
right behavior as there is nothing to track whether each entry has been
malloced or not.  If one were to assume that only setenv() were used to
create the environment, then free() could be used--given the present
incorrect implementation of putenv().  But if putenv() were to be fixed,
then the present not-free approach is the only valid one unless
additional book-keeping were added.
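
     As a rough idea of what that book-keeping might look like, here is
a toy sketch.  It is not newlib code: the demo_env structure, its fixed
size, and all of the demo_* functions are hypothetical.  Each entry
carries a flag saying whether the table malloced it, so the unset path
frees only what the setenv-style path allocated and leaves putenv-style
strings alone.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_ENV_MAX 16

/* Hypothetical toy environment: entry[i] is "name=value" and owned[i]
   records whether this table malloced it and may therefore free it. */
struct demo_env {
    char *entry[DEMO_ENV_MAX];
    unsigned char owned[DEMO_ENV_MAX];
    size_t count;
};

/* Index of the entry whose name is exactly name[0..len), or -1. */
static int demo_find(const struct demo_env *e, const char *name, size_t len)
{
    for (size_t i = 0; i < e->count; i++)
        if (strncmp(e->entry[i], name, len) == 0 && e->entry[i][len] == '=')
            return (int)i;
    return -1;
}

/* Insert or replace an entry, remembering who owns the storage. */
static int demo_store(struct demo_env *e, char *str, size_t name_len, int owned)
{
    int i = demo_find(e, str, name_len);
    if (i >= 0) {
        if (e->owned[i] && e->entry[i] != str)
            free(e->entry[i]);              /* replaced a malloced entry */
    } else {
        if (e->count == DEMO_ENV_MAX)
            return -1;
        i = (int)e->count++;
    }
    e->entry[i] = str;
    e->owned[i] = (unsigned char)owned;
    return 0;
}

/* setenv-style: the table makes and owns a copy. */
static int demo_set(struct demo_env *e, const char *name, const char *value)
{
    size_t n = strlen(name), v = strlen(value);
    char *s = malloc(n + v + 2);
    if (s == NULL)
        return -1;
    memcpy(s, name, n);
    s[n] = '=';
    memcpy(s + n + 1, value, v + 1);
    if (demo_store(e, s, n, 1) != 0) {
        free(s);
        return -1;
    }
    return 0;
}

/* putenv-style: the caller's string itself becomes the entry. */
static int demo_put(struct demo_env *e, char *string)
{
    const char *eq = strchr(string, '=');
    if (eq == NULL || eq == string)
        return -1;
    return demo_store(e, string, (size_t)(eq - string), 0);
}

/* unsetenv-style: free only what demo_set() allocated. */
static void demo_unset(struct demo_env *e, const char *name)
{
    int i = demo_find(e, name, strlen(name));
    if (i < 0)
        return;
    assert((size_t)i < e->count);
    if (e->owned[i])
        free(e->entry[i]);
    e->count--;
    e->entry[i] = e->entry[e->count];       /* fill the hole with the last entry */
    e->owned[i] = e->owned[e->count];
}

static const char *demo_get(const struct demo_env *e, const char *name)
{
    size_t len = strlen(name);
    int i = demo_find(e, name, len);
    return i < 0 ? NULL : e->entry[i] + len + 1;
}
```

     The cost is one extra flag per entry.  The real environ vector has
no room for such a flag, so an actual implementation would presumably
need a parallel array or similar, which is part of why I call this
trickier.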
     It does seem like the problems that Pawel points out ought to be
fixed.  The ones that I'm pointing out are trickier, making me wonder if
the sleeping dog should be left to lie or not.  (As I pointed out above,
there's a question of the paper definition versus the definition of how
it has been working.  I have not done any tests to see what actually is
done by some of the systems that I have access to.)  What approach ought
to be taken?  (I could supply patches for as many of the problems in
this chain as needed, if need be.)
				Craig Howland

-----Original Message-----
From: newlib-owner@sourceware.org [mailto:newlib-owner@sourceware.org]
On Behalf Of Jeff Johnston
Sent: Monday, September 22, 2008 3:13 PM
To: Pawel Veselov
Cc: Joel Sherrill; newlib@sourceware.org
Subject: Re: setenv problems

Pawel Veselov wrote:
> On Mon, Sep 22, 2008 at 10:18 AM, Joel Sherrill
> <joel.sherrill@oarcorp.com> wrote:
>   
>> Hi,
>>
>> It is not directly stated on the getenv() page at opengroup.org but is
>> in the section on environment variables that '=' is not to appear in
>> an environment variable name.
>>
>> http://www.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap08.html
>>     
>>> These strings have the form /name/=/value/; /name/s shall not contain
>>> the character '='. For values to be portable across systems conforming
>>> to IEEE Std 1003.1-2001, the value shall be composed of characters from
>>> the portable character set (except NUL and as indicated below). There
>>> is no meaning associated with the order of strings in the environment.
>>> If more than one string in a process' environment has the same /name/,
>>> the consequences are undefined.
>>>       
>
> http://www.opengroup.org/onlinepubs/000095399/functions/setenv.html
> says that setenv() should fail with EINVAL if name contains an equal
> sign.
>
>   
Ok, also considering the linux man pages state the same and the function
isn't specified by ANSI.  Would you like to try your hand at a patch?
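
For reference, the checks being discussed could be sketched roughly as
below.  This is only an illustration under my own assumptions, not the
newlib source and not a proposed patch: the demo_* names are made up,
and the lookup walks a plain NULL-terminated array rather than the real
environment.  The name test follows the setenv() page (EINVAL when the
name is NULL, empty, or contains '='), and the lookup matches whole
names only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate: nonzero if name is usable as a variable name. */
static int demo_name_ok(const char *name)
{
    return name != NULL && *name != '\0' && strchr(name, '=') == NULL;
}

/* setenv-style front check: fail with EINVAL on a bad name. */
static int demo_check_setenv_name(const char *name)
{
    if (!demo_name_ok(name)) {
        errno = EINVAL;
        return -1;
    }
    return 0;
}

/* getenv-style lookup over a NULL-terminated "name=value" array.
   A name containing '=' can never match, so it is simply not found. */
static const char *demo_lookup(char *const envp[], const char *name)
{
    size_t len;

    assert(envp != NULL);
    if (!demo_name_ok(name))
        return NULL;
    len = strlen(name);
    for (; *envp != NULL; envp++)
        if (strncmp(*envp, name, len) == 0 && (*envp)[len] == '=')
            return *envp + len + 1;     /* points just past the '=' */
    return NULL;
}
```

With an array holding only "foo=bar", looking up "foo=grape" in this
sketch finds nothing, and looking up "foo" yields "bar".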

-- Jeff J.
>> I don't see any requirement on what getenv() should
>> do if the name string contains an equal. The case where
>> the first character is '=' could just as easily be interpreted
>> as an empty name string and thus an error.
>>     
>
> getenv() with the name that contains an equal sign would inadvertently
> fail by returning an empty string because no name can contain an equal
> sign, as there is no specification of an alternative behavior in such
> case, the string can be interpreted literally in all cases (at least
> it's allowed to). However, the only problem with getenv() that I found
> that it would succeed, in case it both contains an equal sign, and the
> character sequence before the equal sign contains a string that exists
> in the environment, which I believe is improper.
>
> Regarding the "first character being an equal sign" issue, the only
> problem with that is that for some reason setenv won't accept values
> with such characters (well, it will accept them, but will eat up first
> and only first equal sign).
>
>   
>> As far as I can tell the behavior is undefined.
>>
>> Jeff?
>>
>> --joel
>>
>> Pawel Veselov wrote:
>>     
>>> Hi,
>>>
>>> while looking through the cegcc project, I discovered a few issues
>>> with the setenv() (_setenv_r) and getenv() (_getenv_r) functions:
>>>
>>> 1. In the beginning of the function, the pointer to the value string
>>> is shifted if the string starts with '='. The comment says that is to
>>> prevent values to start from '='. I couldn't find anywhere in the
>>> definition of setenv() that a value may not start with equal
>>> character. Any reason for eating first equal character up?
>>>
>>> 2. The setenv() man pages seem to ask for returning EINVAL in case
>>> there is an equal character inside the name of the variable. It also
>>> says that glibc versions do allow to have environment variable names
>>> with equal signs in them, however, the current implementation of
>>> environment variables in newlib doesn't seem to be able to distinguish
>>> between those. (So, if you set variable named (a=b) with value (c),
>>> the getenv for (a) will return (b=c)). So I believe newlibs setenv
>>> should return EINVAL.
>>>
>>> 3. The getenv() implementation searches for the variables that have
>>> the name as passed, unless the name contains an equal sign in it, in
>>> which case, the only part that is searched for is the characters
>>> before the equal sign. So if you call setenv("foo", "bar", 1) and then
>>> call getenv("foo=grape"), you will get "foo=bar" as the result of that
>>> getenv call.
>>>
>>> Since cegcc project uses newlib for base library support, I'm
>>> cross-posting to both lists. Would appreciate any comments on the
>>> above. I can produce a patch that restricts both functions to POSIX
>>> behavior.
>>>
>>> Thanks,
>>>  Pawel.
>>>
>>>
>>> --
>>> With best of best regards
>>> Pawel S. Veselov
>>>
>>>       
>> --
>> Joel Sherrill, Ph.D.             Director of Research & Development
>> joel.sherrill@OARcorp.com        On-Line Applications Research
>> Ask me about RTEMS: a free RTOS  Huntsville AL 35805
>>  Support Available             (256) 722-9985
>>
>>
>>
>>     
>
> Thanks,
>   Pawel.
>   

