This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.



Re: [PATCH] add memrchr(3)


On Wed, 2012-05-09 at 08:23 -0600, Eric Blake wrote: 
> sizeof(unsigned char) is defined by C to be exactly 1; I always
> question code that spells it out longhand instead of using 1.

But, but, magic numbers! :-) Not that I care either way here.

> >>       if (src > src_end)
> >> 	break;
> 
> Also, src will never be > src_end - memchr returns NULL rather than
> reading beyond the bounds of length.

Actually, it *was* returning values beyond src_end and the effects
weren't pretty, hence the check.

> Additionally, I think that searching forwards through the array via one
> function call per occurrence of the byte in question is wasteful -
> since we already know the array bounds, we might as well search in
> reverse by doing a single C loop that iterates backwards over a word at
> a time. strrchr must search forwards, because it is also searching for
> the terminating NUL and doesn't know the length in advance, but memrchr
> should be faster.

Fair enough, so code duplication it is then.  Revised source file
attached.
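
For readers following the thread, here is a minimal sketch of the forward strategy that the quoted review calls wasteful: one memchr call per occurrence, remembering the last hit. This is not the code from the earlier patch; the name memrchr_forward is hypothetical and chosen only for illustration. Tracking the remaining length explicitly keeps every memchr call inside the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name: forward strategy, one memchr call per occurrence
   of c.  Every occurrence is visited, even though only the last one
   is wanted.  */
static void *
memrchr_forward (const void *src_void, int c, size_t length)
{
  const unsigned char *src = src_void;
  const unsigned char *last = NULL;
  const unsigned char *hit;

  while (length > 0 && (hit = memchr (src, c, length)) != NULL)
    {
      last = hit;
      /* Resume just past the hit and shrink the remaining length by
         the number of bytes consumed, so the scan stays in bounds.  */
      length -= (size_t) (hit - src) + 1;
      src = hit + 1;
    }

  return (void *) last;
}
```

The cost grows with the number of occurrences and the whole buffer is always read, whereas a backward scan can stop at the first match it meets.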


Yaakov
Cygwin/X
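
The attached source searches a word at a time using a common bit trick: XORing a word with a mask made of the search byte repeated in every position turns "contains the search byte" into "contains a zero byte", which can then be tested with a subtract-and-mask expression. The sketch below shows the idea on a fixed-width uint32_t rather than long, so that it behaves the same on every platform. The helper names has_zero_byte, fill_bytes, and has_byte are hypothetical and do not appear in the patch.

```c
#include <stdint.h>

/* Hypothetical helper: nonzero if any of the four bytes of x is zero.
   Subtracting 1 from a zero byte borrows into its high bit, and ~x
   keeps only high bits that were clear in the original byte.  */
static uint32_t
has_zero_byte (uint32_t x)
{
  return (x - 0x01010101u) & ~x & 0x80808080u;
}

/* Hypothetical helper: replicate byte d into all four byte positions.  */
static uint32_t
fill_bytes (unsigned char d)
{
  uint32_t mask = d;
  mask |= mask << 8;
  mask |= mask << 16;
  return mask;
}

/* Hypothetical helper: nonzero if any byte of x equals d.  A byte
   equal to d becomes zero after the XOR.  */
static uint32_t
has_byte (uint32_t x, unsigned char d)
{
  return has_zero_byte (x ^ fill_bytes (d));
}
```

The test only reports whether a matching byte exists somewhere in the word, not where, which is why the attached code drops back to a bytewise loop once a word tests positive.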

/*
FUNCTION
	<<memrchr>>---reverse search for character in memory

INDEX
	memrchr

ANSI_SYNOPSIS
	#include <string.h>
	void *memrchr(const void *<[src]>, int <[c]>, size_t <[length]>);

TRAD_SYNOPSIS
	#include <string.h>
	void *memrchr(<[src]>, <[c]>, <[length]>)
	void *<[src]>;
	int <[c]>;
	size_t <[length]>;

DESCRIPTION
	This function searches backwards through the <[length]> bytes
	of memory beginning at <<*<[src]>>>, starting from the last
	byte, for the character <[c]>. The search ends at the first
	occurrence of <[c]> encountered or when the start of the
	region is reached; in particular, <<NUL>> does not terminate
	the search.

RETURNS
	If the character <[c]> is found within <[length]> characters
	of <<*<[src]>>>, a pointer to its last occurrence is returned.
	If <[c]> is not found, then <<NULL>> is returned.

PORTABILITY
<<memrchr>> is a GNU extension.

<<memrchr>> requires no supporting OS subroutines.

QUICKREF
	memrchr
*/

#include <_ansi.h>
#include <string.h>
#include <limits.h>

/* Nonzero if the byte after X is not aligned on a "long" boundary,
   i.e. X is not the last byte of an aligned word.  */
#define UNALIGNED(X) ((long)((X) + 1) & (sizeof (long) - 1))

/* How many bytes are loaded each iteration of the word search loop.  */
#define LBLOCKSIZE (sizeof (long))

/* Threshold for punting to the bytewise iterator.  */
#define TOO_SMALL(LEN)  ((LEN) < LBLOCKSIZE)

#if LONG_MAX == 2147483647L
#define DETECTNULL(X) (((X) - 0x01010101) & ~(X) & 0x80808080)
#else
#if LONG_MAX == 9223372036854775807L
/* Nonzero if X (a long int) contains a NULL byte. */
#define DETECTNULL(X) (((X) - 0x0101010101010101) & ~(X) & 0x8080808080808080)
#else
#error long int is not a 32bit or 64bit type.
#endif
#endif

#ifndef DETECTNULL
#error long int is not a 32bit or 64bit type.
#endif

/* DETECTCHAR returns nonzero if (long)X contains the byte used
   to fill (long)MASK. */
#define DETECTCHAR(X,MASK) (DETECTNULL((X) ^ (MASK)))

_PTR
_DEFUN (memrchr, (src_void, c, length),
	_CONST _PTR src_void _AND
	int c _AND
	size_t length)
{
  _CONST unsigned char *src = (_CONST unsigned char *) src_void + length - 1;
  unsigned char d = c;

#if !defined(PREFER_SIZE_OVER_SPEED) && !defined(__OPTIMIZE_SIZE__)
  unsigned long *asrc;
  unsigned long  mask;
  unsigned int i;

  while (UNALIGNED (src))
    {
      if (!length--)
        return NULL;
      if (*src == d)
        return (void *) src;
      src--;
    }

  if (!TOO_SMALL (length))
    {
      /* If we get this far, we know that length is large and src is
         word-aligned. */
      /* The fast code reads the source one word at a time and only
         performs the bytewise search on word-sized segments if they
         contain the search character, which is detected by XORing
         the word-sized segment with a word-sized block of the search
         character and then detecting for the presence of NUL in the
         result.  */
      asrc = (unsigned long *) (src - LBLOCKSIZE + 1);
      mask = d << 8 | d;
      mask = mask << 16 | mask;
      for (i = 32; i < LBLOCKSIZE * 8; i <<= 1)
        mask = (mask << i) | mask;

      while (length >= LBLOCKSIZE)
        {
          if (DETECTCHAR (*asrc, mask))
            break;
          length -= LBLOCKSIZE;
          asrc--;
        }

      /* If there are fewer than LBLOCKSIZE characters left,
         then we resort to the bytewise loop.  */

      src = (unsigned char *) asrc + LBLOCKSIZE - 1;
    }

#endif /* not PREFER_SIZE_OVER_SPEED */

  while (length--)
    {
      if (*src == d)
        return (void *) src;
      src--;
    }

  return NULL;
}
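
As a cross-check on the semantics documented in the header comment, a bytewise reference version is easy to state. This is a sketch under the hypothetical name ref_memrchr, not part of the patch; it ignores alignment and word-sized reads entirely, and could serve as an oracle when testing the optimized routine above.

```c
#include <stddef.h>

/* Hypothetical reference implementation: examine src[length - 1]
   down to src[0], one byte at a time, and return the first match.  */
static void *
ref_memrchr (const void *src_void, int c, size_t length)
{
  const unsigned char *src = src_void;
  unsigned char d = (unsigned char) c;

  while (length--)
    if (src[length] == d)
      return (void *) (src + length);

  return NULL;
}
```

For example, given the 6-byte buffer "a/b\0/c", ref_memrchr (buf, '/', 6) returns buf + 4 because the embedded NUL does not stop the search, while ref_memrchr (buf, '/', 4) returns buf + 1 because only the first four bytes are considered.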
