This is the mail archive of the
libc-ports@sources.redhat.com
mailing list for the libc-ports project.
[PATCH v2] ARM: Improve armv7 memcpy performance.
- From: Will Newton <will dot newton at linaro dot org>
- To: libc-ports at sourceware dot org
- Cc: patches at linaro dot org
- Date: Fri, 30 Aug 2013 16:09:21 +0100
- Subject: [PATCH v2] ARM: Improve armv7 memcpy performance.
Only enter the aligned copy loop when the source and destination
buffers share the same offset modulo 8, so that both can be brought
to 8-byte alignment. This slightly improves performance on Cortex-A9
and Cortex-A15 cores for large copies where the buffers are 4-byte
aligned but cannot both be 8-byte aligned.
ports/ChangeLog.arm:

2013-08-30  Will Newton  <will.newton@linaro.org>

	* sysdeps/arm/armv7/multiarch/memcpy_impl.S: Tighten check
	on entry to aligned copy loop to improve performance.
---
ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Changes in v2:
- Improved description
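
For illustration only, the entry check can be sketched in C. The
helper name same_offset and the sample addresses in the note below
are hypothetical and not part of the patch; the mask argument stands
for the immediate in the two "and" instructions (#3 before this
change, #7 after).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of glibc.  It mirrors
     and tmp2, src, #mask
     and tmp1, dst, #mask
     cmp tmp1, tmp2
   and returns 1 when src and dst have the same low bits under mask,
   i.e. when the branch to .Lcpy_notaligned would not be taken.  */
static int
same_offset (uintptr_t src, uintptr_t dst, uintptr_t mask)
{
  /* The masks used in the patch have the form 2^n - 1.  */
  assert ((mask & (mask + 1)) == 0);
  return (src & mask) == (dst & mask);
}
```

With the made-up addresses src = 0x1004 and dst = 0x2000,
same_offset (src, dst, 3) is 1 but same_offset (src, dst, 7) is 0:
the old check would enter the aligned loop even though no common
leading copy can make both pointers 8-byte aligned. With
src = 0x1004 and dst = 0x2004 both calls return 1.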
diff --git a/ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S b/ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S
index 3decad6..6e84173 100644
--- a/ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S
+++ b/ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S
@@ -369,8 +369,8 @@ ENTRY(memcpy)
 	cfi_adjust_cfa_offset (FRAME_SIZE)
 	cfi_rel_offset (tmp2, 0)
 	cfi_remember_state
-	and tmp2, src, #3
-	and tmp1, dst, #3
+	and tmp2, src, #7
+	and tmp1, dst, #7
 	cmp tmp1, tmp2
 	bne .Lcpy_notaligned
--
1.8.1.4