9.2 A Simple Shell Builders Library

An application which most developers try their hand at sooner or later is a Unix shell. There is a lot of functionality common to all traditional command line shells, which I thought I would push into a portable library to get you over the first hurdle when that moment is upon you. Before elaborating on any of this I need to name the project. I’ve called it sic, from the Latin for ‘so it is’, because like all good project names it is somewhat pretentious and it lends itself to the recursive acronym ‘sic is cumulative’.

The gory detail of the minutiae of the source is beyond the scope of this book, but to convey a feel for the need for Sic, some of the goals which influenced the design follow:

9.2.1 Portability Infrastructure

As I explained in Project Directory Structure, I’ll first create the project directories: a top-level directory and a subdirectory to put the library sources into. I want to install the library header files to ‘/usr/local/include/sic’, so the library subdirectory must be named appropriately. See section C Header Files.

$ mkdir sic
$ mkdir sic/sic
$ cd sic/sic

I will describe the files I add in this section in more detail than the project specific sources, because they comprise an infrastructure that I use relatively unchanged for all of my GNU Autotools projects. You could keep an archive of these files, and use them as a starting point each time you begin a new project of your own.

Error Management

A good place to start with any project design is the error management facility. In Sic I will use a simple group of functions to display simple error messages. Here is ‘sic/error.h’:

#ifndef SIC_ERROR_H
#define SIC_ERROR_H 1

#include <sic/common.h>


extern const char *program_name;
extern void set_program_name (const char *argv0);

extern void sic_warning      (const char *message);
extern void sic_error        (const char *message);
extern void sic_fatal        (const char *message);


#endif /* !SIC_ERROR_H */

This header file follows the principles set out in C Header Files.

I am storing the program_name variable in the library that uses it, so that I can be sure that the library will build on architectures that don’t allow undefined symbols in libraries.

Keeping the preprocessor macro definitions designed to aid code portability together in a single file is a good way to maintain the readability of the rest of the code. For this project I will put that code in ‘common.h’:

#ifndef SIC_COMMON_H
#define SIC_COMMON_H 1

#if HAVE_CONFIG_H
#  include <config.h>
#endif

#include <stdio.h>
#include <sys/types.h>

#if STDC_HEADERS
#  include <stdlib.h>
#  include <string.h>
#elif HAVE_STRINGS_H
#  include <strings.h>
#endif /*STDC_HEADERS*/

#if HAVE_UNISTD_H
#  include <unistd.h>
#endif

#if HAVE_ERRNO_H
#  include <errno.h>
#endif /*HAVE_ERRNO_H*/
#ifndef errno
/* Some systems #define this! */
extern int errno;
#endif

#endif /* !SIC_COMMON_H */

You may recognise some snippets of code from the Autoconf manual here, in particular the inclusion of the project ‘config.h’, which will be generated shortly. Notice that I have been careful to conditionally include any headers which are not guaranteed to exist on every architecture. The rule of thumb here is that only ‘stdio.h’ is ubiquitous (though I have never heard of a machine that has no ‘sys/types.h’). You can find more details of some of these in the ‘Existing Tests’ section of The GNU Autoconf Manual.

Here is a little more code from ‘common.h’:

#ifndef EXIT_SUCCESS
#  define EXIT_SUCCESS  0
#  define EXIT_FAILURE  1
#endif

The implementation of the error handling functions goes in ‘error.c’ and is very straightforward:

#if HAVE_CONFIG_H
#  include <config.h>
#endif

#include "common.h"
#include "error.h"

static void error (int exit_status, const char *mode,
                   const char *message);

/* Print MESSAGE, and exit only if EXIT_STATUS is not negative.  */
static void
error (int exit_status, const char *mode, const char *message)
{
  fprintf (stderr, "%s: %s: %s.\n", program_name, mode, message);

  if (exit_status >= 0)
    exit (exit_status);
}

void
sic_warning (const char *message)
{
  error (-1, "warning", message);
}

void
sic_error (const char *message)
{
  error (-1, "ERROR", message);
}

void
sic_fatal (const char *message)
{
  error (EXIT_FAILURE, "FATAL", message);
}

I also need a definition of program_name; set_program_name copies the filename component of path into the exported data, program_name. The xstrdup function just calls strdup, but aborts if there is not enough memory to make the copy:

const char *program_name = NULL;

void
set_program_name (const char *path)
{
  if (!program_name)
    program_name = xstrdup (basename (path));
}
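
The basename call above is taken from the host C library (POSIX declares it in ‘libgen.h’, which the ‘common.h’ shown earlier does not include). On a host that lacks it, the filename component can be located with strrchr instead. The following is only a sketch under that assumption; path_tail_sketch is a hypothetical name and is not part of Sic:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of Sic: return a pointer to the
   filename component of PATH, that is, the text after the last '/'.
   Nothing is copied; the result points into PATH itself.  */
const char *
path_tail_sketch (const char *path)
{
  const char *slash;

  assert (path != NULL);

  slash = strrchr (path, '/');
  return slash ? slash + 1 : path;
}
```

Because the result aliases its argument, a caller such as set_program_name would still want to copy it (with xstrdup, for instance) before storing it.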

Memory Management

A useful idiom common to many GNU projects is to wrap the memory management functions to localise out of memory handling, naming them with an ‘x’ prefix. By doing this, the rest of the project is relieved of having to remember to check for ‘NULL’ returns from the various memory functions. These wrappers use the error API to report memory exhaustion and abort the program. I have placed the implementation code in ‘xmalloc.c’:

#if HAVE_CONFIG_H
#  include <config.h>
#endif

#include "common.h"
#include "error.h"

void *
xmalloc (size_t num)
{
  void *new = malloc (num);
  if (!new)
    sic_fatal ("Memory exhausted");
  return new;
}

void *
xrealloc (void *p, size_t num)
{
  void *new;

  if (!p)
    return xmalloc (num);

  new = realloc (p, num);
  if (!new)
    sic_fatal ("Memory exhausted");

  return new;
}

void *
xcalloc (size_t num, size_t size)
{
  void *new = xmalloc (num * size);
  bzero (new, num * size);
  return new;
}

Notice in the code above that xcalloc is implemented in terms of xmalloc, since calloc itself is not available in some older C libraries. Also, the bzero function is actually deprecated in favour of memset in modern C libraries; I’ll explain how to take this into account later in Beginnings of a ‘configure.in’.

Rather than create a separate ‘xmalloc.h’ file, which would need to be #included from almost everywhere else, the logical place to declare these functions is in ‘common.h’, since the wrappers will be called from most everywhere else in the code:

#ifdef __cplusplus
#  define BEGIN_C_DECLS         extern "C" {
#  define END_C_DECLS           }
#else
#  define BEGIN_C_DECLS
#  define END_C_DECLS
#endif

#define XCALLOC(type, num)                                  \
        ((type *) xcalloc ((num), sizeof(type)))
#define XMALLOC(type, num)                                  \
        ((type *) xmalloc ((num) * sizeof(type)))
#define XREALLOC(type, p, num)                              \
        ((type *) xrealloc ((p), (num) * sizeof(type)))
#define XFREE(stale)                            do {        \
        if (stale) { free (stale);  stale = 0; }            \
                                                } while (0)

BEGIN_C_DECLS

extern void *xcalloc    (size_t num, size_t size);
extern void *xmalloc    (size_t num);
extern void *xrealloc   (void *p, size_t num);
extern char *xstrdup    (const char *string);
extern char *xstrerror  (int errnum);

END_C_DECLS

By using the macros defined here, allocating and freeing heap memory is reduced from:

char **argv = (char **) xmalloc (sizeof (char *) * 3);
do_stuff (argv);
if (argv)
  free (argv);

to the simpler and more readable:

char **argv = XMALLOC (char *, 3);
do_stuff (argv);
XFREE (argv);
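
One property of XFREE that is easy to miss is that it resets the pointer it is given, so releasing the same pointer twice is harmless. The sketch below tries to show that in isolation. It carries its own stand-in for xmalloc so that it compiles by itself, and argv_new_sketch and argv_free_sketch are hypothetical helpers, not part of Sic:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the library's xmalloc so that the sketch is
   self-contained: report exhaustion and exit rather than return NULL.  */
void *
xmalloc (size_t num)
{
  void *new = malloc (num);
  if (!new)
    {
      fprintf (stderr, "Memory exhausted.\n");
      exit (EXIT_FAILURE);
    }
  return new;
}

#define XMALLOC(type, num)  ((type *) xmalloc ((num) * sizeof (type)))
#define XFREE(stale)        do {                            \
        if (stale) { free (stale);  stale = 0; }            \
                            } while (0)

/* Hypothetical helper: allocate a vector of COUNT string slots plus a
   terminating NULL, with every slot initialised to NULL.  */
char **
argv_new_sketch (size_t count)
{
  char **argv = XMALLOC (char *, count + 1);
  size_t i;

  for (i = 0; i <= count; ++i)
    argv[i] = NULL;
  return argv;
}

/* Hypothetical helper: release the vector through the caller's own
   pointer.  The second XFREE does nothing, because the first one
   already reset the pointer to NULL.  */
void
argv_free_sketch (char ***pargv)
{
  assert (pargv != NULL);
  XFREE (*pargv);
  XFREE (*pargv);
}
```

Note that XFREE evaluates its argument more than once, so it should only be handed a plain lvalue, never an expression with side effects.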

In the same spirit, I have borrowed ‘xstrdup.c’ and ‘xstrerror.c’ from the GNU project’s libiberty. See section Fallback Function Implementations.

Generalised List Data Type

In many C programs you will see various implementations and re-implementations of lists and stacks, each tied to its own particular project. It is surprisingly simple to write a catch-all implementation, as I have done here with a generalised list operation API in ‘list.h’:

#ifndef SIC_LIST_H
#define SIC_LIST_H 1

#include <sic/common.h>


typedef struct list {
  struct list *next;    /* chain forward pointer */
  void *userdata;       /* in case you want to use raw Lists */
} List;

extern List *list_new       (void *userdata);
extern List *list_cons      (List *head, List *tail);
extern List *list_tail      (List *head);
extern size_t list_length   (List *head);


#endif /* !SIC_LIST_H */

The trick is to ensure that any structures you want to chain together have their forward pointer in the first field. Having done that, the generic functions declared above can be used to manipulate any such chain by casting it to List * and back again as necessary.

For example:

struct foo {
  struct foo *next;

  char *bar;
  struct baz *qux;
};

  struct foo *foo_list = NULL;

  foo_list = (struct foo *) list_cons ((List *) new_foo (),
                                       (List *) foo_list);

The implementation of the list manipulation functions is in ‘list.c’:

#include "list.h"

List *
list_new (void *userdata)
{
  List *new = XMALLOC (List, 1);

  new->next = NULL;
  new->userdata = userdata;

  return new;
}

List *
list_cons (List *head, List *tail)
{
  head->next = tail;
  return head;
}

List *
list_tail (List *head)
{
  return head->next;
}

size_t
list_length (List *head)
{
  size_t n;

  for (n = 0; head; ++n)
    head = head->next;

  return n;
}
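
The userdata field also allows ‘raw’ Lists to be used without casting any structures at all. As a rough illustration, the sketch below uses them as a simple stack of strings. The list functions are repeated (with plain malloc in place of XMALLOC) only so that the example compiles by itself; push_words_sketch is a hypothetical helper and not part of Sic:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct list {
  struct list *next;    /* chain forward pointer */
  void *userdata;       /* payload of a raw List */
} List;

/* Plain malloc stands in for XMALLOC to keep the sketch self-contained.  */
List *
list_new (void *userdata)
{
  List *new = malloc (sizeof *new);

  if (!new)
    abort ();
  new->next = NULL;
  new->userdata = userdata;
  return new;
}

List *
list_cons (List *head, List *tail)
{
  assert (head != NULL);
  head->next = tail;
  return head;
}

size_t
list_length (List *head)
{
  size_t n;

  for (n = 0; head; ++n)
    head = head->next;
  return n;
}

/* Hypothetical helper: push each of the COUNT strings in WORDS onto
   the front of a fresh list, so the last word pushed ends up first.
   The strings themselves are not copied.  */
List *
push_words_sketch (const char *const words[], size_t count)
{
  List *stack = NULL;
  size_t i;

  for (i = 0; i < count; ++i)
    stack = list_cons (list_new ((void *) words[i]), stack);
  return stack;
}
```

Since each push conses onto the head, walking the resulting chain visits the words in reverse order of insertion, which is the behaviour wanted from a stack.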

9.2.2 Library Implementation

In order to set the stage for later chapters which expand upon this example, in this subsection I will describe the purpose of the sources that combine to implement the shell library. I will not dissect the code introduced here; you can download the sources from the book’s webpages at http://sources.redhat.com/autobook/.

The remaining sources for the library, beyond the support files described in the previous subsection, are divided into four pairs of files:

‘sic.c’ & ‘sic.h’

Here are the functions for creating and managing sic parsers.

#ifndef SIC_SIC_H
#define SIC_SIC_H 1

#include <sic/common.h>
#include <sic/error.h>
#include <sic/list.h>
#include <sic/syntax.h>

typedef struct sic {
  char *result;                 /* result string */
  size_t len;                   /* bytes used by result field */
  size_t lim;                   /* bytes allocated to result field */
  struct builtintab *builtins;  /* tables of builtin functions */
  SyntaxTable **syntax;         /* dispatch table for syntax of input */
  List *syntax_init;            /* stack of syntax state initialisers */
  List *syntax_finish;          /* stack of syntax state finalizers */
  SicState *state;              /* state data from syntax extensions */
} Sic;

#endif /* !SIC_SIC_H */

This structure has fields to store registered command (builtins) and syntax (syntax) handlers, along with other state information (state) that can be used to share information between various handlers, and some room to build a result or error string (result).

‘builtin.c’ & ‘builtin.h’

Here are the functions for managing tables of builtin commands in each Sic structure:

typedef int (*builtin_handler) (Sic *sic,
                                int argc, char *const argv[]);

typedef struct {
  const char *name;
  builtin_handler func;
  int min, max;
} Builtin;

typedef struct builtintab BuiltinTab;

extern Builtin *builtin_find (Sic *sic, const char *name);
extern int builtin_install   (Sic *sic, Builtin *table);
extern int builtin_remove    (Sic *sic, Builtin *table);
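
To give a feel for how such a table might be consulted, here is a much simplified sketch. It is not the library’s implementation: the Sic argument is dropped, every name ending in _sketch is hypothetical, and it assumes that min and max bound the number of arguments after the command name (with a negative max meaning no upper limit) and that a table ends with an entry whose name is NULL:

```c
#include <assert.h>
#include <string.h>

typedef int (*handler_sketch) (int argc, char *const argv[]);

typedef struct {
  const char *name;
  handler_sketch func;
  int min, max;         /* assumed: argument count bounds; max < 0 is unlimited */
} BuiltinSketch;

/* Example handler: report how many arguments followed the command name.  */
int
count_args_sketch (int argc, char *const argv[])
{
  (void) argv;
  return argc - 1;
}

/* Assumed convention: the table ends with an entry whose name is NULL.  */
const BuiltinSketch builtin_table_sketch[] = {
  { "count", count_args_sketch, 0, -1 },
  { "one",   count_args_sketch, 1,  1 },
  { NULL,    NULL,              0,  0 }
};

/* Linear search of TABLE for an entry called NAME.  */
const BuiltinSketch *
find_sketch (const BuiltinSketch *table, const char *name)
{
  assert (table != NULL && name != NULL);

  for (; table->name; ++table)
    if (strcmp (table->name, name) == 0)
      return table;
  return NULL;
}

/* Run the handler named by argv[0].  Returns the handler's result, or
   -1 if the command is unknown or was given too few or too many
   arguments.  */
int
dispatch_sketch (const BuiltinSketch *table, int argc, char *const argv[])
{
  const BuiltinSketch *builtin;
  int nargs = argc - 1;

  if (argc < 1)
    return -1;

  builtin = find_sketch (table, argv[0]);
  if (!builtin)
    return -1;
  if (nargs < builtin->min || (builtin->max >= 0 && nargs > builtin->max))
    return -1;

  return builtin->func (argc, argv);
}
```

Checking the argument counts in one place like this means that individual handlers need not repeat the validation.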

‘eval.c’ & ‘eval.h’

Having created a Sic parser, and populated it with some Builtin handlers, a user of this library must tokenize and evaluate its input stream. These files define a structure for storing tokenized strings (Tokens), and functions for converting char * strings both to and from this structure type:

#ifndef SIC_EVAL_H
#define SIC_EVAL_H 1

#include <sic/common.h>
#include <sic/sic.h>


typedef struct {
  int  argc;            /* number of elements in ARGV */
  char **argv;          /* array of pointers to elements */
  size_t lim;           /* number of bytes allocated */
} Tokens;

extern int eval       (Sic *sic, Tokens *tokens);
extern int untokenize (Sic *sic, char **pcommand, Tokens *tokens);
extern int tokenize   (Sic *sic, Tokens **ptokens, char **pcommand);


#endif /* !SIC_EVAL_H */

These files also define the eval function, which examines a Tokens structure in the context of the given Sic parser, dispatching the argv array to a relevant Builtin handler, also written by the library user.
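
The conversion from a char * string to an argc and argv pair can be pictured with the following sketch. It is deliberately not the library’s tokenize: it ignores the Sic parser, splits only on whitespace, writes into a vector supplied by the caller, and split_words_sketch is a hypothetical name:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not the library's tokenize: split LINE in place
   into whitespace-delimited words.  Pointers to at most MAX_ARGS words
   are stored in ARGV, followed by a terminating NULL, so ARGV must
   have room for MAX_ARGS + 1 entries.  Returns the number of words
   stored.  */
int
split_words_sketch (char *line, char **argv, int max_args)
{
  int argc = 0;
  char *p = line;

  assert (line != NULL && argv != NULL && max_args >= 0);

  while (*p && argc < max_args)
    {
      while (*p && isspace ((unsigned char) *p))
        ++p;                    /* skip whitespace before a word */
      if (!*p)
        break;

      argv[argc++] = p;         /* remember where the word starts */

      while (*p && !isspace ((unsigned char) *p))
        ++p;                    /* scan to the end of the word */
      if (*p)
        *p++ = '\0';            /* terminate the word in place */
    }

  argv[argc] = NULL;
  return argc;
}
```

Splitting in place avoids any allocation, at the cost of modifying the caller’s string; a structure such as Tokens, which records how many bytes it has allocated, suggests the real code manages its own storage instead.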

‘syntax.c’ & ‘syntax.h’

When tokenize splits a char * string into parts, by default it breaks the string into words delimited by whitespace. These files define the interface for changing this default behaviour, by registering callback functions which the parser will run when it meets an ‘interesting’ symbol in the input stream. Here are the declarations from ‘syntax.h’:


typedef int SyntaxHandler (struct sic *sic, BufferIn *in,
                           BufferOut *out);

typedef struct syntax {
  SyntaxHandler *handler;
  char *ch;
} Syntax;

extern int syntax_install (struct sic *sic, Syntax *table);
extern SyntaxHandler *syntax_handler (struct sic *sic, int ch);


A SyntaxHandler is a function called by tokenize as it consumes its input to create a Tokens structure; the two functions associate a table of such handlers with a given Sic parser, and find the particular handler for a given character in that Sic parser, respectively.
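
One plausible way to find the handler for a character is a dispatch table indexed by character code. The sketch below illustrates that idea only. It does not use the real BufferIn and BufferOut types, all names ending in _sketch are hypothetical, and it assumes that the ch field lists every character that should trigger the handler and that a table ends with an entry whose ch is NULL:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Simplified handler type: IN is the input text and *POS the offset of
   the interesting character.  A handler advances *POS past whatever it
   consumes and returns 0 on success or -1 on a syntax error.  */
typedef int SyntaxHandlerSketch (const char *in, size_t *pos);

typedef struct {
  SyntaxHandlerSketch *handler;
  const char *ch;       /* assumed: each character here triggers HANDLER */
} SyntaxSketch;

/* One slot per possible character; NULL means "not interesting".  */
SyntaxHandlerSketch *dispatch_table_sketch[UCHAR_MAX + 1];

/* Skip from a comment character to the end of the line.  */
int
comment_sketch (const char *in, size_t *pos)
{
  while (in[*pos] && in[*pos] != '\n')
    ++*pos;
  return 0;
}

/* Skip a quoted string, including both quote characters.  */
int
quote_sketch (const char *in, size_t *pos)
{
  char quote = in[*pos];

  if (!quote)
    return -1;
  ++*pos;                       /* opening quote */
  while (in[*pos] && in[*pos] != quote)
    ++*pos;
  if (!in[*pos])
    return -1;                  /* unterminated string */
  ++*pos;                       /* closing quote */
  return 0;
}

/* Record each handler in TABLE against all of its characters.  */
int
syntax_install_sketch (const SyntaxSketch *table)
{
  assert (table != NULL);

  for (; table->ch; ++table)
    {
      const char *p;
      for (p = table->ch; *p; ++p)
        dispatch_table_sketch[(unsigned char) *p] = table->handler;
    }
  return 0;
}

/* Look up the handler for CH, or NULL if there is none.  */
SyntaxHandlerSketch *
syntax_handler_sketch (int ch)
{
  return dispatch_table_sketch[(unsigned char) ch];
}
```

A table built this way makes the lookup for every input character a single array access, which matters because the tokenizer performs it once per character consumed.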

9.2.3 Beginnings of a ‘configure.in’

Now that I have some code, I can run autoscan to generate a preliminary ‘configure.in’. autoscan will examine all of the sources in the current directory tree looking for common points of non-portability, adding macros suitable for detecting the discovered problems. autoscan generates the following in ‘configure.scan’:

# Process this file with autoconf to produce a configure script.

# Checks for programs.

# Checks for libraries.

# Checks for header files.
AC_HEADER_STDC
AC_CHECK_HEADERS(strings.h unistd.h)

# Checks for typedefs, structures, and compiler characteristics.
AC_C_CONST
AC_TYPE_SIZE_T

# Checks for library functions.
AC_CHECK_FUNCS(strerror)


Since the generated ‘configure.scan’ does not overwrite your project’s ‘configure.in’, it is a good idea to run autoscan periodically even in established project source trees, and compare the two files. Sometimes autoscan will find some portability issue you have overlooked, or weren’t aware of.

Looking through the documentation for the macros in this ‘configure.scan’, AC_C_CONST and AC_TYPE_SIZE_T will take care of themselves (provided I ensure that ‘config.h’ is included into every source file), and AC_HEADER_STDC and AC_CHECK_HEADERS(unistd.h) are already taken care of in ‘common.h’.

autoscan is no silver bullet! Even here in this simple example, I need to manually add macros to check for the presence of ‘errno.h’:

AC_CHECK_HEADERS(errno.h strings.h unistd.h)

I also need to manually add the Autoconf macro for generating ‘config.h’; a macro to initialise automake support; and a macro to check for the presence of ranlib. These should go close to the start of ‘configure.in’:



Recall that the use of bzero in Memory Management is not entirely portable. The trick is to provide a bzero work-alike, depending on which functions Autoconf detects, by adding the following towards the end of ‘configure.in’:

AC_CHECK_FUNCS(bzero memset, break)

With the addition of this small snippet of code to ‘common.h’, I can now make use of bzero even when linking with a C library that has no implementation of its own:

#if !HAVE_BZERO && HAVE_MEMSET
# define bzero(buf, bytes)      ((void) memset (buf, 0, bytes))
#endif

An interesting macro suggested by autoscan is AC_CHECK_FUNCS(strerror). This tells me that I need to provide a replacement implementation of strerror for the benefit of architectures which don’t have it in their system libraries. This is resolved by providing a file with a fallback implementation for the named function, and creating a library from it and any others that ‘configure’ discovers to be lacking from the system library on the target host.

You will recall that ‘configure’ is the shell script the end user of this package will run on their machine to test that it has all the features the package wants to use. The library that is created will allow the rest of the project to be written in the knowledge that any functions required by the project but missing from the installer’s system libraries will be available nonetheless. GNU ‘libiberty’ comes to the rescue again: it already has an implementation of ‘strerror.c’ that I was able to use with a little modification.

Being able to supply a simple implementation of strerror, as the ‘strerror.c’ file from ‘libiberty’ does, relies on there being a well defined sys_errlist variable. It is a fair bet, however, that if the target host has no strerror implementation, the system sys_errlist will be broken or missing too. I need to write a configure macro to check whether the system defines sys_errlist, and tailor the code in ‘strerror.c’ to use this knowledge.
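
The kind of tailoring meant here can be sketched as follows. This is not the ‘libiberty’ code: strerror_sketch is a hypothetical name, HAVE_SYS_ERRLIST is assumed to be the symbol that the configure check defines, and the declarations of sys_errlist and sys_nerr are a guess that differs between systems which provide them:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#if HAVE_SYS_ERRLIST
/* Assumed declarations; real systems differ in the exact types.  */
extern int sys_nerr;
extern char *sys_errlist[];
#endif

/* Hypothetical fallback: describe ERRNUM using the system table when
   configure found one, and a generic numbered message otherwise.  The
   generic message lives in a static buffer that is overwritten by the
   next call.  */
const char *
strerror_sketch (int errnum)
{
  /* Room for "Error ", a sign, the digits of any int, and a NUL.  */
  static char buf[sizeof "Error " + 3 * sizeof (int) + 1];

#if HAVE_SYS_ERRLIST
  if (errnum >= 0 && errnum < sys_nerr)
    return sys_errlist[errnum];
#endif

  sprintf (buf, "Error %d", errnum);
  assert (strlen (buf) < sizeof buf);
  return buf;
}
```

When HAVE_SYS_ERRLIST is not defined the table is never referenced, so the file still links on hosts where the variable is missing.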

To avoid clutter in the top-level directory, I am a great believer in keeping as many of the configuration files as possible in their own sub-directory. First of all, I will create a new directory called ‘config’ inside the top-level directory, and put ‘sys_errlist.m4’ inside it:

AC_DEFUN(SIC_VAR_SYS_ERRLIST,
[AC_CACHE_CHECK([for sys_errlist],
sic_cv_var_sys_errlist,
[AC_TRY_LINK([int *p;], [extern int sys_errlist; p = &sys_errlist;],
            sic_cv_var_sys_errlist=yes, sic_cv_var_sys_errlist=no)])
if test x"$sic_cv_var_sys_errlist" = xyes; then
  AC_DEFINE(HAVE_SYS_ERRLIST, 1,
    [Define if your system libraries have a sys_errlist variable.])
fi])

I must then add a call to this new macro in the ‘configure.in’ file being careful to put it in the right place – somewhere between typedefs and structures and library functions according to the comments in ‘configure.scan’:


GNU Autotools can also be set to store most of their files in a subdirectory, by calling the AC_CONFIG_AUX_DIR macro near the top of ‘configure.in’, preferably right after AC_INIT:

AC_CONFIG_AUX_DIR(config)

Having made this change, many of the files added by running autoconf and automake --add-missing will be put in the aux_dir.

The source tree now looks like this:

sic/
  +-- configure.scan
  +-- config/
  |     +-- sys_errlist.m4
  +-- replace/
  |     +-- strerror.c
  +-- sic/
        +-- builtin.c
        +-- builtin.h
        +-- common.h
        +-- error.c
        +-- error.h
        +-- eval.c
        +-- eval.h
        +-- list.c
        +-- list.h
        +-- sic.c
        +-- sic.h
        +-- syntax.c
        +-- syntax.h
        +-- xmalloc.c
        +-- xstrdup.c
        +-- xstrerror.c

In order to correctly utilise the fallback implementation, AC_CHECK_FUNCS(strerror) needs to be removed and strerror added to AC_REPLACE_FUNCS:

# Checks for library functions.
AC_REPLACE_FUNCS(strerror)

This will be clearer if you look at the ‘Makefile.am’ for the ‘replace’ subdirectory:

## Makefile.am -- Process this file with automake to produce Makefile.in

INCLUDES                =  -I$(top_builddir) -I$(top_srcdir)

noinst_LIBRARIES        = libreplace.a
libreplace_a_SOURCES        = dummy.c
libreplace_a_LIBADD        = @LIBOBJS@

The code tells automake that I want to build a library for use within the build tree (i.e. not installed – ‘noinst’), and that has no source files by default. The clever part here is that when someone comes to install Sic, they will run configure which will test for strerror, and add ‘strerror.o’ to LIBOBJS if the target host environment is missing its own implementation. Now, when ‘configure’ creates ‘replace/Makefile’ (as I asked it to with AC_OUTPUT), ‘@LIBOBJS@’ is replaced by the list of objects required on the installer’s machine.

Having done all this at configure time, when my user runs make, the files required to replace functions missing from their target machine will be added to ‘libreplace.a’.

Unfortunately this is not quite enough to start building the project. First I need to add a top-level ‘Makefile.am’ from which to ultimately create a top-level ‘Makefile’ that will descend into the various subdirectories of the project:

## Makefile.am -- Process this file with automake to produce Makefile.in

SUBDIRS = replace sic

And ‘configure.in’ must be told where it can find instances of Makefile.in:

AC_OUTPUT(Makefile replace/Makefile sic/Makefile)

I have written a bootstrap script for Sic, for details see Bootstrapping:

#! /bin/sh

autoreconf -fvi

The ‘--foreign’ option to automake tells it to relax the GNU standards for various files that should be present in a GNU distribution. Using this option saves me from having to create empty files as we did in A Minimal GNU Autotools Project.

Right. Let’s build the library! First, I’ll run bootstrap:

$ ./bootstrap
+ aclocal -I config
+ autoheader
+ automake --foreign --add-missing --copy
automake: configure.in: installing config/install-sh
automake: configure.in: installing config/mkinstalldirs
automake: configure.in: installing config/missing
+ autoconf

The project is now in the same state that an end-user would see, having unpacked a distribution tarball. What follows is what an end user might expect to see when building from that tarball:

$ ./configure
creating cache ./config.cache
checking for a BSD compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking whether make sets ${MAKE}... yes
checking for working aclocal... found
checking for working autoconf... found
checking for working automake... found
checking for working autoheader... found
checking for working makeinfo... found
checking for gcc... gcc
checking whether the C compiler (gcc  ) works... yes
checking whether the C compiler (gcc  ) is a cross-compiler... no
checking whether we are using GNU C... yes
checking whether gcc accepts -g... yes
checking for ranlib... ranlib
checking how to run the C preprocessor... gcc -E
checking for ANSI C header files... yes
checking for unistd.h... yes
checking for errno.h... yes
checking for string.h... yes
checking for working const... yes
checking for size_t... yes
checking for strerror... yes
updating cache ./config.cache
creating ./config.status
creating Makefile
creating replace/Makefile
creating sic/Makefile
creating config.h

Compare this output with the contents of ‘configure.in’, and notice how each macro is ultimately responsible for one or more consecutive tests (via the Bourne shell code generated in ‘configure’). Now that the ‘Makefile’s have been successfully created, it is safe to call make to perform the actual compilation:

$ make
make  all-recursive
make[1]: Entering directory `/tmp/sic'
Making all in replace
make[2]: Entering directory `/tmp/sic/replace'
rm -f libreplace.a
ar cru libreplace.a
ranlib libreplace.a
make[2]: Leaving directory `/tmp/sic/replace'
Making all in sic
make[2]: Entering directory `/tmp/sic/sic'
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I..    -g -O2 -c builtin.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I..    -g -O2 -c error.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I..    -g -O2 -c eval.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I..    -g -O2 -c list.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I..    -g -O2 -c sic.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I..    -g -O2 -c syntax.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I..    -g -O2 -c xmalloc.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I..    -g -O2 -c xstrdup.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I..    -g -O2 -c xstrerror.c
rm -f libsic.a
ar cru libsic.a builtin.o error.o eval.o list.o sic.o syntax.o xmalloc.o
xstrdup.o xstrerror.o
ranlib libsic.a
make[2]: Leaving directory `/tmp/sic/sic'
make[1]: Leaving directory `/tmp/sic'

On this machine, as you can see from the output of configure above, I have no need of the fallback implementation of strerror, so ‘libreplace.a’ is empty. On another machine this might not be the case. In any event, I now have a compiled ‘libsic.a’ – so far, so good.

This document was generated by Ben Elliston on July 10, 2015 using texi2html 1.82.