This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.
Re: [PATCH] add memrchr(3)
On Wed, 2012-05-09 at 08:23 -0600, Eric Blake wrote:
> sizeof(unsigned char) is defined by C to be exactly 1; I always
> question code that spells it out longhand instead of using 1.
But, but, magic numbers! :-) Not that I care either way here.
> >> if (src > src_end)
> >>   break;
>
> Also, src will never be > src_end - memchr returns NULL rather than
> reading beyond the bounds of length.
Actually, it *was* returning values beyond src_end and the effects
weren't pretty, hence the check.
> Additionally, I think that searching forwards through the array via one
> function call per occurrence of the byte in question is wasteful -
> since we already know the array bounds, we might as well search in
> reverse by doing a single C loop that iterates backwards over a word at
> a time. strrchr must search forwards, because it is also searching for
> the terminating NUL and doesn't know the length in advance, but memrchr
> should be faster.
Fair enough, so code duplication it is then. Revised source file
attached.
Yaakov
Cygwin/X
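For reference, the behaviour the attached file is meant to have can be pinned down with a plain bytewise loop. This is only a sketch and not part of the patch: the names memrchr_bytewise and memrchr_bytewise_selfcheck are made up for illustration, and the word-at-a-time optimization is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference version: scan backwards one byte at a time. */
void *
memrchr_bytewise (const void *src_void, int c, size_t length)
{
  /* Start one past the last byte so no pointer before the buffer is formed. */
  const unsigned char *src = (const unsigned char *) src_void + length;
  unsigned char d = (unsigned char) c;

  while (length--)
    {
      if (*--src == d)
        return (void *) src;
    }
  return NULL;
}

/* Small self-check of the expected semantics (illustrative only). */
void
memrchr_bytewise_selfcheck (void)
{
  const char path[] = "usr/lib/libc.a";
  const char nuls[] = "a\0b\0c";
  size_t len = sizeof path - 1;	/* exclude the terminating NUL */

  /* The last '/' in the whole string is at index 7. */
  assert (memrchr_bytewise (path, '/', len) == path + 7);
  /* Limiting the region to the first 7 bytes finds the '/' at index 3. */
  assert (memrchr_bytewise (path, '/', 7) == path + 3);
  /* Absent byte and empty region both yield NULL. */
  assert (memrchr_bytewise (path, 'z', len) == NULL);
  assert (memrchr_bytewise (path, '/', 0) == NULL);
  /* An embedded NUL does not stop the search. */
  assert (memrchr_bytewise (nuls, 'a', 5) == nuls);
  assert (memrchr_bytewise (nuls, 0, 5) == nuls + 3);
}
```

A loop like this is also a convenient oracle for comparing against the optimized version on random buffers and alignments.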
/*
FUNCTION
<<memrchr>>---reverse search for character in memory
INDEX
memrchr
ANSI_SYNOPSIS
#include <string.h>
void *memrchr(const void *<[src]>, int <[c]>, size_t <[length]>);
TRAD_SYNOPSIS
#include <string.h>
void *memrchr(<[src]>, <[c]>, <[length]>)
void *<[src]>;
int <[c]>;
size_t <[length]>;
DESCRIPTION
This function searches the <[length]> bytes of memory starting at
<<*<[src]>>> for the character <[c]>, scanning backwards from the
last byte. The search ends at the first occurrence of <[c]> found
in that direction, that is, the last occurrence in the region; in
particular, <<NUL>> does not terminate the search.
RETURNS
If the character <[c]> is found within <[length]> characters
of <<*<[src]>>>, a pointer to the character is returned. If
<[c]> is not found, then <<NULL>> is returned.
PORTABILITY
<<memrchr>> is a GNU extension.
<<memrchr>> requires no supporting OS subroutines.
QUICKREF
memrchr
*/
#include <_ansi.h>
#include <string.h>
#include <limits.h>
/* Nonzero if X + 1 is not aligned on a "long" boundary. X points at
the last byte of a candidate word, so the word ending at X is
aligned exactly when X + 1 is. */
#define UNALIGNED(X) ((long)((X) + 1) & (sizeof (long) - 1))
/* How many bytes are loaded each iteration of the word search loop. */
#define LBLOCKSIZE (sizeof (long))
/* Threshold for punting to the bytewise iterator. */
#define TOO_SMALL(LEN) ((LEN) < LBLOCKSIZE)
/* Nonzero if X (a long int) contains a NUL byte. */
#if LONG_MAX == 2147483647L
#define DETECTNULL(X) (((X) - 0x01010101) & ~(X) & 0x80808080)
#else
#if LONG_MAX == 9223372036854775807L
#define DETECTNULL(X) (((X) - 0x0101010101010101) & ~(X) & 0x8080808080808080)
#else
#error long int is not a 32bit or 64bit type.
#endif
#endif
#ifndef DETECTNULL
#error long int is not a 32bit or 64bit type.
#endif
/* DETECTCHAR returns nonzero if (long)X contains the byte used
to fill (long)MASK. */
#define DETECTCHAR(X,MASK) (DETECTNULL((X) ^ (MASK)))
_PTR
_DEFUN (memrchr, (src_void, c, length),
        _CONST _PTR src_void _AND
        int c _AND
        size_t length)
{
  _CONST unsigned char *src = (_CONST unsigned char *) src_void + length - 1;
  unsigned char d = c;

#if !defined(PREFER_SIZE_OVER_SPEED) && !defined(__OPTIMIZE_SIZE__)
  unsigned long *asrc;
  unsigned long mask;
  int i;

  while (UNALIGNED (src))
    {
      if (!length--)
        return NULL;
      if (*src == d)
        return (void *) src;
      src--;
    }

  if (!TOO_SMALL (length))
    {
      /* If we get this far, we know that length is large and src is
         word-aligned. */

      /* The fast code reads the source one word at a time and only
         performs the bytewise search on word-sized segments if they
         contain the search character, which is detected by XORing
         the word-sized segment with a word-sized block of the search
         character and then detecting for the presence of NUL in the
         result. */
      asrc = (unsigned long *) (src - LBLOCKSIZE + 1);
      mask = d << 8 | d;
      mask = mask << 16 | mask;
      for (i = 32; i < LBLOCKSIZE * 8; i <<= 1)
        mask = (mask << i) | mask;

      while (length >= LBLOCKSIZE)
        {
          if (DETECTCHAR (*asrc, mask))
            break;
          length -= LBLOCKSIZE;
          asrc--;
        }

      /* If there are fewer than LBLOCKSIZE characters left,
         then we resort to the bytewise loop. */

      src = (unsigned char *) asrc + LBLOCKSIZE - 1;
    }

#endif /* not PREFER_SIZE_OVER_SPEED */

  while (length--)
    {
      if (*src == d)
        return (void *) src;
      src--;
    }

  return NULL;
}
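
The word test behind DETECTNULL and DETECTCHAR may be easier to follow in isolation. The sketch below assumes a fixed 32-bit word (uint32_t) instead of long, and the helper names has_zero_byte, broadcast_byte, has_byte, and word_trick_selfcheck are hypothetical; none of them appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if the 32-bit word x contains at least one zero byte.
   Same expression as the 32-bit DETECTNULL above. */
uint32_t
has_zero_byte (uint32_t x)
{
  return (x - UINT32_C (0x01010101)) & ~x & UINT32_C (0x80808080);
}

/* Replicate the byte d into all four bytes of a word, the way the
   patch builds its mask with shifts and ORs. */
uint32_t
broadcast_byte (unsigned char d)
{
  uint32_t mask = d;
  mask |= mask << 8;
  mask |= mask << 16;
  return mask;
}

/* Nonzero if x contains the byte d: XOR turns every matching byte
   into zero, then the zero-byte test finds it. */
uint32_t
has_byte (uint32_t x, unsigned char d)
{
  return has_zero_byte (x ^ broadcast_byte (d));
}

/* Illustrative checks only. */
void
word_trick_selfcheck (void)
{
  assert (broadcast_byte (0x2f) == UINT32_C (0x2f2f2f2f));
  assert (has_zero_byte (UINT32_C (0x11003344)) != 0);
  assert (has_zero_byte (UINT32_C (0x01010101)) == 0);
  assert (has_byte (UINT32_C (0x11223344), 0x33) != 0);
  assert (has_byte (UINT32_C (0x11223344), 0x55) == 0);
}
```

The result is only meaningful as a yes/no answer for the whole word. That is why the function above breaks out of the word loop on a hit and lets the bytewise loop locate the exact byte.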