This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Do not multiply by zero in bzero.


On 03/21/2013 12:20 PM, OndÅej BÃlka wrote:
> Hello,
> 
> I noticed that bzero jumps directly at memset. Memset then computes mask
> by 64bit multiplication. This is suboptimal in bzero case as mask iz
> zero.
> 
> I emulating first few instructions of memset and simplified it. 
> 
> Related question is that we first check if n==1. In my profile this
> happens only with 2% of cases. Is there special reason to keep it?

I agree that it seems odd to have a special case for n==1 with
no comment or indication why that might be better.

Please post as distinct patch with that change to remove the
n==1 check and provide some performance numbers if it actually
makes a difference. If it doesn't then at least it's a code size
win, which is usually a performance win by reducing cache size.

> Passes test on x86-64. OK to commit?
> 
> 	* sysdeps/x86_64/memset.S (memset): Move entry from bzero.
> 	(bzero): Skip multiplication by zero in memset.
> 
> ---
>  sysdeps/x86_64/memset.S |   14 ++++++++++----
>  1 files changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/sysdeps/x86_64/memset.S b/sysdeps/x86_64/memset.S
> index b393efe..375c284 100644
> --- a/sysdeps/x86_64/memset.S
> +++ b/sysdeps/x86_64/memset.S
> @@ -25,9 +25,15 @@
>  	.text
>  #if !defined NOT_IN_libc
>  ENTRY(__bzero)
> -	mov	%rsi,%rdx	/* Adjust parameter.  */
> -	xorl	%esi,%esi	/* Fill with 0s.  */
> -	jmp	L(memset_entry)
> +	mov    %rsi,%r8		/* Adjust parameter.  */
> +	xor    %rdx,%rdx	/* Fill with 0s.  */
> +
> +	cmp    $0x1,%r8
> +	mov    %rdi,%rax  /* memset returns the dest address.  */
> +	jne    L(entry_from_bzero)
> +	mov    %dl,(%rdi)
> +	retq
> +
>  END(__bzero)
>  weak_alias (__bzero, bzero)
>  #endif
> @@ -39,7 +45,6 @@ ENTRY_CHK (__memset_chk)
>  END_CHK (__memset_chk)
>  #endif
>  ENTRY (memset)
> -L(memset_entry):
>  	cmp    $0x1,%rdx
>  	mov    %rdi,%rax	/* memset returns the dest address.  */
>  	jne    L(ck2)
> @@ -50,6 +55,7 @@ L(ck2):
>  	mov    %rdx,%r8
>  	movzbq %sil,%rdx
>  	imul   %r9,%rdx
> +L(entry_from_bzero):
>  L(now_dw_aligned):
>  	cmp    $0x90,%r8
>  	ja     L(ck_mem_ops_method)
> 

Richard has already commented on the assembly.

Post a follow-up version.

Cheers,
Carlos.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]