This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH v3] C++ify gdb/common/environ.c


On 04/18/2017 04:03 AM, Sergio Durigan Junior wrote:

> -  if (e->allocated < i)
> +const char *
> +gdb_environ::get (const std::string &var) const
> +{
> +  try
>      {
> -      e->allocated = std::max (i, e->allocated + 10);
> -      e->vector = (char **) xrealloc ((char *) e->vector,
> -				      (e->allocated + 1) * sizeof (char *));
> +      return (char *) this->m_environ_map.at (var).c_str ();
>      }
> -
> -  memcpy (e->vector, environ, (i + 1) * sizeof (char *));
> -
> -  while (--i >= 0)
> +  catch (const std::out_of_range &ex)

Please use unordered_map::find instead.  In general,
use "at" with exceptions when you're "sure" the element
exists.  It'll be simpler, more efficient, and less
roundabout, since at() essentially does a find() and then
throws if not found.

>      {
> -      int len = strlen (e->vector[i]);
> -      char *newobj = (char *) xmalloc (len + 1);
> -
> -      memcpy (newobj, e->vector[i], len + 1);
> -      e->vector[i] = newobj;
> +      return NULL;
>      }
>  }
>  

>  void
> -set_in_environ (struct gdb_environ *e, const char *var, const char *value)
> +gdb_environ::set (const std::string &var, const std::string &value)
>  {
> -  int i;
> -  int len = strlen (var);
> -  char **vector = e->vector;
> -  char *s;
> +  std::unordered_map<std::string, std::string>::iterator el;
>  
> -  for (i = 0; (s = vector[i]) != NULL; i++)
> -    if (strncmp (s, var, len) == 0 && s[len] == '=')
> -      break;
> +  el = this->m_environ_map.find (var);

In C++, it's a good practice to avoid first initializing,
and then assigning separately, because the default constructor
will then do useless work.  (And in some cases, there may not even
be a default constructor.)

I'm not a fan or "auto" everywhere, but with iterators,
I find it OK:

   auto el = this->m_environ_map.find (var);

>  
> -  if (s == 0)
> +  if (el != this->m_environ_map.end ())
>      {
> -      if (i == e->allocated)
> +      if (el->second == value)
> +	return;
> +      else
>  	{
> -	  e->allocated += 10;
> -	  vector = (char **) xrealloc ((char *) vector,
> -				       (e->allocated + 1) * sizeof (char *));
> -	  e->vector = vector;
> +	  /* If we found this item, it means that we have to update
> +	     its value on the map.  */
> +	  el->second = value;
> +	  /* And we also have to update its value on the
> +	     environ_vector.  For that, we just delete the item here
> +	     and recreate it (with the new value) later.  */
> +	  unset_in_vector (this->m_environ_vector, var);
>  	}
> -      vector[i + 1] = 0;
>      }
>    else
> -    xfree (s);
> -
> -  s = (char *) xmalloc (len + strlen (value) + 2);
> -  strcpy (s, var);
> -  strcat (s, "=");
> -  strcat (s, value);
> -  vector[i] = s;
> -
> -  /* This used to handle setting the PATH and GNUTARGET variables
> -     specially.  The latter has been replaced by "set gnutarget"
> -     (which has worked since GDB 4.11).  The former affects searching
> -     the PATH to find SHELL, and searching the PATH to find the
> -     argument of "symbol-file" or "exec-file".  Maybe we should have
> -     some kind of "set exec-path" for that.  But in any event, having
> -     "set env" affect anything besides the inferior is a bad idea.
> -     What if we want to change the environment we pass to the program
> -     without afecting GDB's behavior?  */
> -
> -  return;
> +    this->m_environ_map.insert (std::make_pair (var, value));
> +
> +  this->m_environ_vector.insert (this->m_environ_vector.end () - 1,
> +				 xstrdup (std::string (var
> +						       + '='
> +						       + value).c_str ()));

I don't really understand the "m_environ_vector.end () - 1" here.

Also, that's a lot of unnecessary copying/duping in that last
statement.  Using concat instead of xstrdup + operator+ would
easily avoid it.

And you could easily avoid having to dup the string into both the
map and the vector by making the map's value be a const char *
instead of a std::string:

  std::unordered_map<std::string, const char *>

and then make a map's value point directly into the value
part of the string that is owned by the vector (as in,
the substring that starts at VALUE in "VAR=VALUE").

>  }
>  
> -/* Remove the setting for variable VAR from environment E.  */
> +/* See common/environ.h.  */
>  
>  void
> -unset_in_environ (struct gdb_environ *e, const char *var)
> +gdb_environ::unset (const std::string &var)
>  {
> -  int len = strlen (var);
> -  char **vector = e->vector;
> -  char *s;
> +  this->m_environ_map.erase (var);
> +  unset_in_vector (this->m_environ_vector, var);
> +}


> -extern struct gdb_environ *make_environ (void);
> +class gdb_environ
> +{
> +public:
> +  /* Regular constructor and destructor.  */
> +  gdb_environ ();
> +  ~gdb_environ ();
>  
> -extern void free_environ (struct gdb_environ *);
> +  /* Reinitialize the environment stored.  This is used when the user
> +     wants to delete all environment variables added by him/her.  */
> +  void reinit ();

This should mention that the initial state is copied from the host's
environ.   I think the default ctor could use a similar comment.
I was going to suggest to call this clear() instead, to go
with the standard containers' terminology, until I realized what
"init" really does.  Or maybe even find a more explicit name,
reset_from_host_environ or some such.

Perhaps an even clearer approach would be to make the default ctor
not do a deep copy of the host's "environ", but instead add a
static factor method that returned such a new gdb_environ, like:

static gdb_environ
gdb_environ::from_host_environ ()
{
   // build/return a gdb_environ that wraps the host's environ global.
}

Not sure.  Only experimenting would tell.

Speaking of copying the host environ:

>  _initialize_mi_cmd_env (void)
>  {
> -  struct gdb_environ *environment;
> +  gdb_environ environment;
>    const char *env;
>  
>    /* We want original execution path to reset to, if desired later.
> @@ -278,13 +278,10 @@ _initialize_mi_cmd_env (void)
>       current_inferior ()->environment.  Also, there's no obvious
>       place where this code can be moved such that it surely run
>       before any code possibly mangles original PATH.  */
> -  environment = make_environ ();
> -  init_environ (environment);
> -  env = get_in_environ (environment, path_var_name);
> +  env = environment.get (path_var_name);
>  
>    /* Can be null if path is not set.  */
>    if (!env)
>      env = "";
>    orig_path = xstrdup (env);
> -  free_environ (environment);
>  }

This usage of gdb_environ looks like pointless
wrapping / dupping / freeing to me -- I don't see why we
need to dup the whole host environment just to get at some
env variable.  Using good old getenv(3) directly should do,
and end up trimming off a bit of work from gdb's startup.

I could see perhaps wanting to avoid / optimize the linear
walks that getenv must do, if we have many getenv calls in
gdb, but then that suggests keeping a gdb_environ global that
is initialized early on and is reused.  But that would miss
any setenv call that changes the environment since that
gdb_environ is created, so I'd prefer not do that unless
we find a real need.

> +
> +private:
> +  /* Helper function that initializes our data structures with the
> +     environment variables.  */
> +  void init ();
> +
> +  /* Helper function that clears our data structures.  */
> +  void clear ();

s/function/method/g throughout (ChangeLog too).

> +
> +  /* A map representing the environment variables and their values.
> +     It is easier to store them using a map because of the faster
> +     lookups.  */

It's not that it's easier to store them.  Not having a map
would be even easier.  Just do a linear walk on the vector,
just like getenv does.  You want to instead say that we're
optimizing for get operations.  (I'm not sure this is the
right trade off for this class, but I'll go along with it.)

> +  std::unordered_map<std::string, std::string> m_environ_map;
> +
> +  /* A vector containing the environment variables.  This is useful
> +     for when we need to obtain a 'char **' with all the existing
> +     variables.  */

This should say that the entries are in VAR=VALUE form, and what
is the typical user that needs that format.

> +  std::vector<char *> m_environ_vector;
> +};
>  
>  #endif /* defined (ENVIRON_H) */

>    else
>      {
> -      char **vector = environ_vector (current_inferior ()->environment);
> +      const std::vector<char *> &vector =
> +	current_inferior ()->environment.get_vector ();

= goes on the next line.

>  
> -      while (*vector)
> +      for (char *elem : vector)
>  	{
> -	  puts_filtered (*vector++);
> +	  puts_filtered (elem);
>  	  puts_filtered ("\n");
>  	}
>      }

I'm not sure it's worth it to expose the std::vector implementation
detail just for this loop.  I don't really understand how this
can work though, given that the vector is NULL terminated.

I agree with Simon -- this is begging for some unit tests.  :-)

Thanks,
Pedro Alves


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]