This is the mail archive of the glibc-cvs@sourceware.org mailing list for the glibc project.



GNU C Library master sources branch master updated. glibc-2.21-462-g4b9c2b7


This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  4b9c2b707b1383b4e3b3c50e445afd0af8922788 (commit)
      from  0724d898bb1c15872b1b59c01a9e9d9d74bb4f56 (commit)

The revisions listed above that are new to this repository have not
appeared in any other notification email, so we list them in full
below.

- Log -----------------------------------------------------------------
http://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commitdiff;h=4b9c2b707b1383b4e3b3c50e445afd0af8922788

commit 4b9c2b707b1383b4e3b3c50e445afd0af8922788
Author: Andrew Senkevich <andrew.senkevich@intel.com>
Date:   Thu Jun 11 17:12:38 2015 +0300

    Vector sin for x86_64 and tests.
    
    Here is an implementation of vectorized sin containing SSE, AVX,
    AVX2 and AVX512 versions, in accordance with the Vector ABI
    <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.
    
        * bits/libm-simd-decl-stubs.h: Added stubs for sin.
        * math/bits/mathcalls.h: Added sin declaration with __MATHCALL_VEC.
        * sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added.
        * sysdeps/x86/fpu/bits/math-vector.h: SIMD declaration for sin.
        * sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
        * sysdeps/x86_64/fpu/Versions: New versions added.
        * sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
        * sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
        build of SSE, AVX2 and AVX512 IFUNC versions.
        * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: New file.
        * sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S: New file.
        * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: New file.
        * sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S: New file.
        * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: New file.
        * sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: New file.
        * sysdeps/x86_64/fpu/svml_d_sin2_core.S: New file.
        * sysdeps/x86_64/fpu/svml_d_sin4_core.S: New file.
        * sysdeps/x86_64/fpu/svml_d_sin4_core_avx.S: New file.
        * sysdeps/x86_64/fpu/svml_d_sin8_core.S: New file.
        * sysdeps/x86_64/fpu/svml_d_sin_data.S: New file.
        * sysdeps/x86_64/fpu/svml_d_sin_data.h: New file.
        * sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector sin test.
        * sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise.
        * sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise.
        * sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise.
        * sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise.
        * sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
        * sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise.
        * sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
        * NEWS: Mention addition of x86_64 vector sin.
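
A minimal usage sketch (not part of the commit; the file and function
names below are illustrative).  Per the NEWS entry in the diff, with
GCC >= 4.9.0, building at -O1 or higher with -fopenmp -ffast-math lets
the compiler turn the scalar sin calls into calls to the new
_ZGV*_sin entry points, and libmvec is linked in as needed via -lm:

    /* apply_sin.c -- build: gcc -O2 -fopenmp -ffast-math -c apply_sin.c
       The loop vectorizes into calls such as _ZGVbN2v_sin or
       _ZGVdN4v_sin, depending on the selected ISA.  */
    #include <math.h>

    void
    apply_sin (double *a, int n)
    {
      for (int i = 0; i < n; i++)
        a[i] = sin (a[i]);
    }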

diff --git a/ChangeLog b/ChangeLog
index 72db3d8..1e0ec7e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -3,6 +3,37 @@
 	* configure.ac: More strict check for AVX512 assembler support.
 	* configure: Regenerated.
 
+	* bits/libm-simd-decl-stubs.h: Added stubs for sin.
+	* math/bits/mathcalls.h: Added sin declaration with __MATHCALL_VEC.
+	* sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added.
+	* sysdeps/x86/fpu/bits/math-vector.h: SIMD declaration for sin.
+	* sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
+	* sysdeps/x86_64/fpu/Versions: New versions added.
+	* sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
+	* sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
+	build of SSE, AVX2 and AVX512 IFUNC versions.
+	* sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: New file.
+	* sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S: New file.
+	* sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: New file.
+	* sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S: New file.
+	* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: New file.
+	* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: New file.
+	* sysdeps/x86_64/fpu/svml_d_sin2_core.S: New file.
+	* sysdeps/x86_64/fpu/svml_d_sin4_core.S: New file.
+	* sysdeps/x86_64/fpu/svml_d_sin4_core_avx.S: New file.
+	* sysdeps/x86_64/fpu/svml_d_sin8_core.S: New file.
+	* sysdeps/x86_64/fpu/svml_d_sin_data.S: New file.
+	* sysdeps/x86_64/fpu/svml_d_sin_data.h: New file.
+	* sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector sin test.
+	* sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise.
+	* sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise.
+	* sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise.
+	* sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise.
+	* sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
+	* sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise.
+	* sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
+	* NEWS: Mention addition of x86_64 vector sin.
+
 2015-06-11  Florian Weimer  <fweimer@redhat.com>
 
 	* nptl/pthread_key_create.c (__pthread_key_create): Fix typo in
diff --git a/NEWS b/NEWS
index 218bf3a..423d315 100644
--- a/NEWS
+++ b/NEWS
@@ -52,7 +52,7 @@ Version 2.22
   condition in some applications.
 
 * Added vector math library named libmvec with the following vectorized x86_64
-  implementations: cos, cosf.
+  implementations: cos, cosf, sin.
   The library can be disabled with --disable-mathvec. Use of the functions is
   enabled with -fopenmp -ffast-math starting from -O1 for GCC version >= 4.9.0.
   The library is linked in as needed when using -lm (no need to specify -lmvec
diff --git a/bits/libm-simd-decl-stubs.h b/bits/libm-simd-decl-stubs.h
index b1ba994..50310d6 100644
--- a/bits/libm-simd-decl-stubs.h
+++ b/bits/libm-simd-decl-stubs.h
@@ -37,4 +37,8 @@
 #define __DECL_SIMD_cosf
 #define __DECL_SIMD_cosl
 
+#define __DECL_SIMD_sin
+#define __DECL_SIMD_sinf
+#define __DECL_SIMD_sinl
+
 #endif
diff --git a/math/bits/mathcalls.h b/math/bits/mathcalls.h
index 85a6a95..fbe7a3a 100644
--- a/math/bits/mathcalls.h
+++ b/math/bits/mathcalls.h
@@ -62,7 +62,7 @@ __MATHCALL (atan2,, (_Mdouble_ __y, _Mdouble_ __x));
 /* Cosine of X.  */
 __MATHCALL_VEC (cos,, (_Mdouble_ __x));
 /* Sine of X.  */
-__MATHCALL (sin,, (_Mdouble_ __x));
+__MATHCALL_VEC (sin,, (_Mdouble_ __x));
 /* Tangent of X.  */
 __MATHCALL (tan,, (_Mdouble_ __x));
 
diff --git a/sysdeps/unix/sysv/linux/x86_64/libmvec.abilist b/sysdeps/unix/sysv/linux/x86_64/libmvec.abilist
index acabb8a..1dddacd 100644
--- a/sysdeps/unix/sysv/linux/x86_64/libmvec.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/libmvec.abilist
@@ -1,10 +1,14 @@
 GLIBC_2.22
  GLIBC_2.22 A
  _ZGVbN2v_cos F
+ _ZGVbN2v_sin F
  _ZGVbN4v_cosf F
  _ZGVcN4v_cos F
+ _ZGVcN4v_sin F
  _ZGVcN8v_cosf F
  _ZGVdN4v_cos F
+ _ZGVdN4v_sin F
  _ZGVdN8v_cosf F
  _ZGVeN16v_cosf F
  _ZGVeN8v_cos F
+ _ZGVeN8v_sin F
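
These symbol names follow the x86_64 vector function ABI mangling,
roughly _ZGV<isa><mask><len><parm>_<base>: 'b' selects the SSE (xmm)
variant, 'c' AVX (ymm), 'd' AVX2 (ymm) and 'e' AVX512 (zmm); 'N'
means an unmasked ("notinbranch") version, the digit is the lane
count, and 'v' marks a plain vector argument.  The entry points can
also be declared and called directly, for example (a sketch, not
code from the commit):

    #include <immintrin.h>

    /* Two doubles per call via the SSE variant; libmvec supplies
       the symbol at link time.  */
    extern __m128d _ZGVbN2v_sin (__m128d);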
diff --git a/sysdeps/x86/fpu/bits/math-vector.h b/sysdeps/x86/fpu/bits/math-vector.h
index b3ef833..82b7c67 100644
--- a/sysdeps/x86/fpu/bits/math-vector.h
+++ b/sysdeps/x86/fpu/bits/math-vector.h
@@ -32,5 +32,7 @@
 #  define __DECL_SIMD_cos __DECL_SIMD_x86_64
 #  undef __DECL_SIMD_cosf
 #  define __DECL_SIMD_cosf __DECL_SIMD_x86_64
+#  undef __DECL_SIMD_sin
+#  define __DECL_SIMD_sin __DECL_SIMD_x86_64
 # endif
 #endif
diff --git a/sysdeps/x86_64/fpu/Makefile b/sysdeps/x86_64/fpu/Makefile
index 454cfba..25f8e33 100644
--- a/sysdeps/x86_64/fpu/Makefile
+++ b/sysdeps/x86_64/fpu/Makefile
@@ -1,7 +1,9 @@
 ifeq ($(subdir),mathvec)
 libmvec-support += svml_d_cos2_core svml_d_cos4_core_avx \
 		   svml_d_cos4_core svml_d_cos8_core \
-		   svml_d_cos_data svml_s_cosf4_core svml_s_cosf8_core_avx \
+		   svml_d_cos_data svml_d_sin2_core svml_d_sin4_core_avx \
+		   svml_d_sin4_core svml_d_sin8_core svml_d_sin_data \
+		   svml_s_cosf4_core svml_s_cosf8_core_avx \
 		   svml_s_cosf8_core svml_s_cosf16_core svml_s_cosf_data \
 		   init-arch
 endif
diff --git a/sysdeps/x86_64/fpu/Versions b/sysdeps/x86_64/fpu/Versions
index f85c28b..af1769c 100644
--- a/sysdeps/x86_64/fpu/Versions
+++ b/sysdeps/x86_64/fpu/Versions
@@ -1,6 +1,7 @@
 libmvec {
   GLIBC_2.22 {
     _ZGVbN2v_cos; _ZGVcN4v_cos; _ZGVdN4v_cos; _ZGVeN8v_cos;
+    _ZGVbN2v_sin; _ZGVcN4v_sin; _ZGVdN4v_sin; _ZGVeN8v_sin;
     _ZGVbN4v_cosf; _ZGVcN8v_cosf; _ZGVdN8v_cosf; _ZGVeN16v_cosf;
   }
 }
diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps
index ed152d8..d7184d8 100644
--- a/sysdeps/x86_64/fpu/libm-test-ulps
+++ b/sysdeps/x86_64/fpu/libm-test-ulps
@@ -1929,6 +1929,18 @@ idouble: 1
 ildouble: 3
 ldouble: 3
 
+Function: "sin_vlen2":
+double: 2
+
+Function: "sin_vlen4":
+double: 2
+
+Function: "sin_vlen4_avx2":
+double: 2
+
+Function: "sin_vlen8":
+double: 2
+
 Function: "sincos":
 ildouble: 1
 ldouble: 1
diff --git a/sysdeps/x86_64/fpu/multiarch/Makefile b/sysdeps/x86_64/fpu/multiarch/Makefile
index 6b50475..74da4cd 100644
--- a/sysdeps/x86_64/fpu/multiarch/Makefile
+++ b/sysdeps/x86_64/fpu/multiarch/Makefile
@@ -54,6 +54,8 @@ endif
 
 ifeq ($(subdir),mathvec)
 libmvec-sysdep_routines += svml_d_cos2_core_sse4 svml_d_cos4_core_avx2 \
-			   svml_d_cos8_core_avx512 svml_s_cosf4_core_sse4 \
-			   svml_s_cosf8_core_avx2 svml_s_cosf16_core_avx512
+			   svml_d_cos8_core_avx512 svml_d_sin2_core_sse4 \
+			   svml_d_sin4_core_avx2 svml_d_sin8_core_avx512 \
+			   svml_s_cosf4_core_sse4 svml_s_cosf8_core_avx2 \
+			   svml_s_cosf16_core_avx512
 endif
diff --git a/sysdeps/x86/fpu/bits/math-vector.h b/sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S
similarity index 56%
copy from sysdeps/x86/fpu/bits/math-vector.h
copy to sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S
index b3ef833..29bd0a7 100644
--- a/sysdeps/x86/fpu/bits/math-vector.h
+++ b/sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S
@@ -1,4 +1,4 @@
-/* Platform-specific SIMD declarations of math functions.
+/* Multiple versions of vectorized sin.
    Copyright (C) 2014-2015 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
@@ -16,21 +16,23 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#ifndef _MATH_H
-# error "Never include <bits/math-vector.h> directly;\
- include <math.h> instead."
-#endif
+#include <sysdep.h>
+#include <init-arch.h>
 
-/* Get default empty definitions for simd declarations.  */
-#include <bits/libm-simd-decl-stubs.h>
+	.text
+ENTRY (_ZGVbN2v_sin)
+        .type   _ZGVbN2v_sin, @gnu_indirect_function
+        cmpl    $0, KIND_OFFSET+__cpu_features(%rip)
+        jne     1f
+        call    __init_cpu_features
+1:      leaq    _ZGVbN2v_sin_sse4(%rip), %rax
+        testl   $bit_SSE4_1, __cpu_features+CPUID_OFFSET+index_SSE4_1(%rip)
+        jz      2f
+        ret
+2:      leaq    _ZGVbN2v_sin_sse2(%rip), %rax
+        ret
+END (_ZGVbN2v_sin)
+libmvec_hidden_def (_ZGVbN2v_sin)
 
-#if defined __x86_64__ && defined __FAST_MATH__
-# if defined _OPENMP && _OPENMP >= 201307
-/* OpenMP case.  */
-#  define __DECL_SIMD_x86_64 _Pragma ("omp declare simd notinbranch")
-#  undef __DECL_SIMD_cos
-#  define __DECL_SIMD_cos __DECL_SIMD_x86_64
-#  undef __DECL_SIMD_cosf
-#  define __DECL_SIMD_cosf __DECL_SIMD_x86_64
-# endif
-#endif
+#define _ZGVbN2v_sin _ZGVbN2v_sin_sse2
+#include "../svml_d_sin2_core.S"
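
The ENTRY above is an STT_GNU_IFUNC resolver: it runs once at
symbol-binding time and returns the address of the implementation to
use from then on.  A rough C equivalent of the same selection (a
sketch only; glibc's resolver reads its internal __cpu_features block
rather than the GCC builtins used here):

    #include <immintrin.h>

    extern __m128d _ZGVbN2v_sin_sse2 (__m128d);
    extern __m128d _ZGVbN2v_sin_sse4 (__m128d);

    typedef __m128d sin2_fn (__m128d);

    /* Resolver: pick the SSE4.1 kernel when available, otherwise
       the SSE2 wrapper.  */
    static sin2_fn *
    sin2_resolver (void)
    {
      __builtin_cpu_init ();
      if (__builtin_cpu_supports ("sse4.1"))
        return _ZGVbN2v_sin_sse4;
      return _ZGVbN2v_sin_sse2;
    }

    __m128d _ZGVbN2v_sin (__m128d)
      __attribute__ ((ifunc ("sin2_resolver")));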
diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S b/sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S
new file mode 100644
index 0000000..4b4d8be
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S
@@ -0,0 +1,229 @@
+/* Function sin vectorized with SSE4.
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include "svml_d_sin_data.h"
+
+	.text
+ENTRY (_ZGVbN2v_sin_sse4)
+/* ALGORITHM DESCRIPTION:
+
+      ( low accuracy ( < 4ulp ) or enhanced performance
+       ( half of correct mantissa ) implementation )
+
+      Argument representation:
+      arg = N*Pi + R
+
+      Result calculation:
+      sin(arg) = sin(N*Pi + R) = (-1)^N * sin(R)
+      sin(R) is approximated by corresponding polynomial
+ */
+        pushq     %rbp
+        cfi_adjust_cfa_offset (8)
+        cfi_rel_offset (%rbp, 0)
+        movq      %rsp, %rbp
+        cfi_def_cfa_register (%rbp)
+        andq      $-64, %rsp
+        subq      $320, %rsp
+        movaps    %xmm0, %xmm5
+        movq      __svml_dsin_data@GOTPCREL(%rip), %rax
+        movups __dAbsMask(%rax), %xmm3
+/*
+ * ARGUMENT RANGE REDUCTION:
+ * X' = |X|
+ */
+        movaps    %xmm3, %xmm4
+
+/* SignX - sign bit of X */
+        andnps    %xmm5, %xmm3
+        movups __dInvPI(%rax), %xmm2
+        andps     %xmm5, %xmm4
+
+/* Y = X'*InvPi + RS : right shifter add */
+        mulpd     %xmm4, %xmm2
+        movups __dRShifter(%rax), %xmm6
+
+/* R = X' - N*Pi1 */
+        movaps    %xmm4, %xmm0
+        addpd     %xmm6, %xmm2
+        cmpnlepd __dRangeVal(%rax), %xmm4
+
+/* N = Y - RS : right shifter sub */
+        movaps    %xmm2, %xmm1
+
+/* SignRes = Y<<63 : shift LSB to MSB place for result sign */
+        psllq     $63, %xmm2
+        subpd     %xmm6, %xmm1
+        movmskpd  %xmm4, %ecx
+        movups __dPI1(%rax), %xmm7
+        mulpd     %xmm1, %xmm7
+        movups __dPI2(%rax), %xmm6
+
+/* R = R - N*Pi2 */
+        mulpd     %xmm1, %xmm6
+        subpd     %xmm7, %xmm0
+        movups __dPI3(%rax), %xmm7
+
+/* R = R - N*Pi3 */
+        mulpd     %xmm1, %xmm7
+        subpd     %xmm6, %xmm0
+        movups __dPI4(%rax), %xmm6
+
+/* R = R - N*Pi4 */
+        mulpd     %xmm6, %xmm1
+        subpd     %xmm7, %xmm0
+        subpd     %xmm1, %xmm0
+
+/*
+ * POLYNOMIAL APPROXIMATION:
+ * R2 = R*R
+ */
+        movaps    %xmm0, %xmm1
+        mulpd     %xmm0, %xmm1
+
+/* R = R^SignRes : update sign of reduced argument */
+        xorps     %xmm2, %xmm0
+        movups __dC7(%rax), %xmm2
+        mulpd     %xmm1, %xmm2
+        addpd __dC6(%rax), %xmm2
+        mulpd     %xmm1, %xmm2
+        addpd __dC5(%rax), %xmm2
+        mulpd     %xmm1, %xmm2
+        addpd __dC4(%rax), %xmm2
+
+/* Poly = C3+R2*(C4+R2*(C5+R2*(C6+R2*C7))) */
+        mulpd     %xmm1, %xmm2
+        addpd __dC3(%rax), %xmm2
+
+/* Poly = R2*(C1+R2*(C2+R2*Poly)) */
+        mulpd     %xmm1, %xmm2
+        addpd __dC2(%rax), %xmm2
+        mulpd     %xmm1, %xmm2
+        addpd __dC1(%rax), %xmm2
+        mulpd     %xmm2, %xmm1
+
+/* Poly = Poly*R + R */
+        mulpd     %xmm0, %xmm1
+        addpd     %xmm1, %xmm0
+
+/*
+ * RECONSTRUCTION:
+ * Final sign setting: Res = Poly^SignX
+ */
+        xorps     %xmm3, %xmm0
+        testl     %ecx, %ecx
+        jne       .LBL_1_3
+
+.LBL_1_2:
+        cfi_remember_state
+        movq      %rbp, %rsp
+        cfi_def_cfa_register (%rsp)
+        popq      %rbp
+        cfi_adjust_cfa_offset (-8)
+        cfi_restore (%rbp)
+        ret
+
+.LBL_1_3:
+        cfi_restore_state
+        movups    %xmm5, 192(%rsp)
+        movups    %xmm0, 256(%rsp)
+        je        .LBL_1_2
+
+        xorb      %dl, %dl
+        xorl      %eax, %eax
+        movups    %xmm8, 112(%rsp)
+        movups    %xmm9, 96(%rsp)
+        movups    %xmm10, 80(%rsp)
+        movups    %xmm11, 64(%rsp)
+        movups    %xmm12, 48(%rsp)
+        movups    %xmm13, 32(%rsp)
+        movups    %xmm14, 16(%rsp)
+        movups    %xmm15, (%rsp)
+        movq      %rsi, 136(%rsp)
+        movq      %rdi, 128(%rsp)
+        movq      %r12, 168(%rsp)
+        cfi_offset_rel_rsp (12, 168)
+        movb      %dl, %r12b
+        movq      %r13, 160(%rsp)
+        cfi_offset_rel_rsp (13, 160)
+        movl      %ecx, %r13d
+        movq      %r14, 152(%rsp)
+        cfi_offset_rel_rsp (14, 152)
+        movl      %eax, %r14d
+        movq      %r15, 144(%rsp)
+        cfi_offset_rel_rsp (15, 144)
+        cfi_remember_state
+
+.LBL_1_6:
+        btl       %r14d, %r13d
+        jc        .LBL_1_12
+
+.LBL_1_7:
+        lea       1(%r14), %esi
+        btl       %esi, %r13d
+        jc        .LBL_1_10
+
+.LBL_1_8:
+        incb      %r12b
+        addl      $2, %r14d
+        cmpb      $16, %r12b
+        jb        .LBL_1_6
+
+        movups    112(%rsp), %xmm8
+        movups    96(%rsp), %xmm9
+        movups    80(%rsp), %xmm10
+        movups    64(%rsp), %xmm11
+        movups    48(%rsp), %xmm12
+        movups    32(%rsp), %xmm13
+        movups    16(%rsp), %xmm14
+        movups    (%rsp), %xmm15
+        movq      136(%rsp), %rsi
+        movq      128(%rsp), %rdi
+        movq      168(%rsp), %r12
+        cfi_restore (%r12)
+        movq      160(%rsp), %r13
+        cfi_restore (%r13)
+        movq      152(%rsp), %r14
+        cfi_restore (%r14)
+        movq      144(%rsp), %r15
+        cfi_restore (%r15)
+        movups    256(%rsp), %xmm0
+        jmp       .LBL_1_2
+
+.LBL_1_10:
+        cfi_restore_state
+        movzbl    %r12b, %r15d
+        shlq      $4, %r15
+        movsd     200(%rsp,%r15), %xmm0
+
+        call      sin@PLT
+
+        movsd     %xmm0, 264(%rsp,%r15)
+        jmp       .LBL_1_8
+
+.LBL_1_12:
+        movzbl    %r12b, %r15d
+        shlq      $4, %r15
+        movsd     192(%rsp,%r15), %xmm0
+
+        call      sin@PLT
+
+        movsd     %xmm0, 256(%rsp,%r15)
+        jmp       .LBL_1_7
+
+END (_ZGVbN2v_sin_sse4)
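
A scalar C rendering of the scheme in the kernel's header comment may
help; this is a sketch only (the constants are the non-FMA table
values from svml_d_sin_data.S below, converted to hex floats by hand,
and the special-case path for |x| > __dRangeVal, which the assembly
handles by falling back to scalar sin, is omitted):

    #include <math.h>

    static double
    sin_sketch (double x)       /* illustrative helper, not in glibc */
    {
      const double inv_pi = 0x1.45f306dc9c883p-2;   /* __dInvPI */
      const double rs     = 0x1.8p52;               /* __dRShifter */
      /* Pi split into four short parts so each n*pi_k is exact.  */
      const double pi1 = 0x1.921fb4p+1;             /* __dPI1 */
      const double pi2 = 0x1.4442dp-23;             /* __dPI2 */
      const double pi3 = 0x1.846988p-47;            /* __dPI3 */
      const double pi4 = 0x1.8cc51701b839ap-71;     /* __dPI4 */

      double ax = fabs (x);
      /* Right-shifter trick: adding 1.5*2^52 forces rounding, so
         n = y - rs is ax/Pi rounded to an integer, and the low
         mantissa bit of y (moved to the sign position by the
         psllq $63 above) is n & 1.  */
      double y = ax * inv_pi + rs;
      double n = y - rs;
      double r = ax - n * pi1;    /* R = |x| - N*Pi, step by step */
      r -= n * pi2;
      r -= n * pi3;
      r -= n * pi4;
      /* Odd polynomial: sin(R) ~= R + R*R2*(C1 + R2*(... + R2*C7)).  */
      double r2 = r * r;
      double p = -0x1.9f1517e9f65fp-41;             /* __dC7 */
      p = p * r2 + 0x1.60e6bee01d83ep-33;           /* __dC6 */
      p = p * r2 - 0x1.ae6355aaa4a53p-26;           /* __dC5 */
      p = p * r2 + 0x1.71de3806add1ap-19;           /* __dC4 */
      p = p * r2 - 0x1.a01a019a659ddp-13;           /* __dC3 */
      p = p * r2 + 0x1.111111110a573p-7;            /* __dC2 */
      p = p * r2 - 0x1.55555555554a8p-3;            /* __dC1 */
      double res = r + r * r2 * p;
      /* sin(N*Pi + R) = (-1)^N * sin(R); then restore the sign of x.  */
      if ((long long) n & 1)
        res = -res;
      return x < 0.0 ? -res : res;
    }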
diff --git a/sysdeps/x86/fpu/bits/math-vector.h b/sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S
similarity index 54%
copy from sysdeps/x86/fpu/bits/math-vector.h
copy to sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S
index b3ef833..c3a453a 100644
--- a/sysdeps/x86/fpu/bits/math-vector.h
+++ b/sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S
@@ -1,4 +1,4 @@
-/* Platform-specific SIMD declarations of math functions.
+/* Multiple versions of vectorized sin, vector length is 4.
    Copyright (C) 2014-2015 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
@@ -16,21 +16,23 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#ifndef _MATH_H
-# error "Never include <bits/math-vector.h> directly;\
- include <math.h> instead."
-#endif
+#include <sysdep.h>
+#include <init-arch.h>
 
-/* Get default empty definitions for simd declarations.  */
-#include <bits/libm-simd-decl-stubs.h>
+	.text
+ENTRY (_ZGVdN4v_sin)
+        .type   _ZGVdN4v_sin, @gnu_indirect_function
+        cmpl    $0, KIND_OFFSET+__cpu_features(%rip)
+        jne     1f
+        call    __init_cpu_features
+1:      leaq    _ZGVdN4v_sin_avx2(%rip), %rax
+        testl   $bit_AVX2_Usable, __cpu_features+FEATURE_OFFSET+index_AVX2_Usable(%rip)
+        jz      2f
+        ret
+2:      leaq    _ZGVdN4v_sin_sse_wrapper(%rip), %rax
+        ret
+END (_ZGVdN4v_sin)
+libmvec_hidden_def (_ZGVdN4v_sin)
 
-#if defined __x86_64__ && defined __FAST_MATH__
-# if defined _OPENMP && _OPENMP >= 201307
-/* OpenMP case.  */
-#  define __DECL_SIMD_x86_64 _Pragma ("omp declare simd notinbranch")
-#  undef __DECL_SIMD_cos
-#  define __DECL_SIMD_cos __DECL_SIMD_x86_64
-#  undef __DECL_SIMD_cosf
-#  define __DECL_SIMD_cosf __DECL_SIMD_x86_64
-# endif
-#endif
+#define _ZGVdN4v_sin _ZGVdN4v_sin_sse_wrapper
+#include "../svml_d_sin4_core.S"
diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S b/sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S
new file mode 100644
index 0000000..e7e60d4
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S
@@ -0,0 +1,210 @@
+/* Function sin vectorized with AVX2.
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include "svml_d_sin_data.h"
+
+	.text
+ENTRY (_ZGVdN4v_sin_avx2)
+/* ALGORITHM DESCRIPTION:
+
+      ( low accuracy ( < 4ulp ) or enhanced performance
+      ( half of correct mantissa ) implementation )
+
+     Argument representation:
+     arg = N*Pi + R
+
+     Result calculation:
+     sin(arg) = sin(N*Pi + R) = (-1)^N * sin(R)
+     sin(R) is approximated by corresponding polynomial
+ */
+        pushq     %rbp
+        cfi_adjust_cfa_offset (8)
+        cfi_rel_offset (%rbp, 0)
+        movq      %rsp, %rbp
+        cfi_def_cfa_register (%rbp)
+        andq      $-64, %rsp
+        subq      $448, %rsp
+        movq      __svml_dsin_data@GOTPCREL(%rip), %rax
+        vmovdqa   %ymm0, %ymm4
+        vmovupd __dAbsMask(%rax), %ymm2
+        vmovupd __dInvPI(%rax), %ymm6
+        vmovupd __dRShifter(%rax), %ymm5
+        vmovupd __dPI1_FMA(%rax), %ymm7
+/*
+  ARGUMENT RANGE REDUCTION:
+  X' = |X|
+ */
+        vandpd    %ymm2, %ymm4, %ymm3
+
+/* Y = X'*InvPi + RS : right shifter add */
+        vfmadd213pd %ymm5, %ymm3, %ymm6
+
+/* N = Y - RS : right shifter sub */
+        vsubpd    %ymm5, %ymm6, %ymm1
+
+/* SignRes = Y<<63 : shift LSB to MSB place for result sign */
+        vpsllq    $63, %ymm6, %ymm5
+
+/* R = X' - N*Pi1 */
+        vmovapd   %ymm3, %ymm0
+        vfnmadd231pd %ymm1, %ymm7, %ymm0
+        vcmpnle_uqpd __dRangeVal(%rax), %ymm3, %ymm3
+
+/* R = R - N*Pi2 */
+        vfnmadd231pd __dPI2_FMA(%rax), %ymm1, %ymm0
+
+/* R = R - N*Pi3 */
+        vfnmadd132pd __dPI3_FMA(%rax), %ymm0, %ymm1
+
+/*
+  POLYNOMIAL APPROXIMATION:
+  R2 = R*R
+ */
+        vmulpd    %ymm1, %ymm1, %ymm0
+
+/* R = R^SignRes : update sign of reduced argument */
+        vxorpd    %ymm5, %ymm1, %ymm6
+        vmovupd __dC7(%rax), %ymm1
+        vfmadd213pd __dC6(%rax), %ymm0, %ymm1
+        vfmadd213pd __dC5(%rax), %ymm0, %ymm1
+        vfmadd213pd __dC4(%rax), %ymm0, %ymm1
+
+/* Poly = C3+R2*(C4+R2*(C5+R2*(C6+R2*C7))) */
+        vfmadd213pd __dC3(%rax), %ymm0, %ymm1
+
+/* Poly = R2*(C1+R2*(C2+R2*Poly)) */
+        vfmadd213pd __dC2(%rax), %ymm0, %ymm1
+        vfmadd213pd __dC1(%rax), %ymm0, %ymm1
+
+/* SignX - sign bit of X */
+        vandnpd   %ymm4, %ymm2, %ymm7
+        vmulpd    %ymm0, %ymm1, %ymm2
+
+/* Poly = Poly*R + R */
+        vfmadd213pd %ymm6, %ymm6, %ymm2
+        vmovmskpd %ymm3, %ecx
+
+/*
+  RECONSTRUCTION:
+  Final sign setting: Res = Poly^SignX
+ */
+        vxorpd    %ymm7, %ymm2, %ymm0
+        testl     %ecx, %ecx
+        jne       .LBL_1_3
+
+.LBL_1_2:
+        cfi_remember_state
+        movq      %rbp, %rsp
+        cfi_def_cfa_register (%rsp)
+        popq      %rbp
+        cfi_adjust_cfa_offset (-8)
+        cfi_restore (%rbp)
+        ret
+
+.LBL_1_3:
+        cfi_restore_state
+        vmovupd   %ymm4, 320(%rsp)
+        vmovupd   %ymm0, 384(%rsp)
+        je        .LBL_1_2
+
+        xorb      %dl, %dl
+        xorl      %eax, %eax
+        vmovups   %ymm8, 224(%rsp)
+        vmovups   %ymm9, 192(%rsp)
+        vmovups   %ymm10, 160(%rsp)
+        vmovups   %ymm11, 128(%rsp)
+        vmovups   %ymm12, 96(%rsp)
+        vmovups   %ymm13, 64(%rsp)
+        vmovups   %ymm14, 32(%rsp)
+        vmovups   %ymm15, (%rsp)
+        movq      %rsi, 264(%rsp)
+        movq      %rdi, 256(%rsp)
+        movq      %r12, 296(%rsp)
+        cfi_offset_rel_rsp (12, 296)
+        movb      %dl, %r12b
+        movq      %r13, 288(%rsp)
+        cfi_offset_rel_rsp (13, 288)
+        movl      %ecx, %r13d
+        movq      %r14, 280(%rsp)
+        cfi_offset_rel_rsp (14, 280)
+        movl      %eax, %r14d
+        movq      %r15, 272(%rsp)
+        cfi_offset_rel_rsp (15, 272)
+        cfi_remember_state
+
+.LBL_1_6:
+        btl       %r14d, %r13d
+        jc        .LBL_1_12
+
+.LBL_1_7:
+        lea       1(%r14), %esi
+        btl       %esi, %r13d
+        jc        .LBL_1_10
+
+.LBL_1_8:
+        incb      %r12b
+        addl      $2, %r14d
+        cmpb      $16, %r12b
+        jb        .LBL_1_6
+
+        vmovups   224(%rsp), %ymm8
+        vmovups   192(%rsp), %ymm9
+        vmovups   160(%rsp), %ymm10
+        vmovups   128(%rsp), %ymm11
+        vmovups   96(%rsp), %ymm12
+        vmovups   64(%rsp), %ymm13
+        vmovups   32(%rsp), %ymm14
+        vmovups   (%rsp), %ymm15
+        vmovupd   384(%rsp), %ymm0
+        movq      264(%rsp), %rsi
+        movq      256(%rsp), %rdi
+        movq      296(%rsp), %r12
+        cfi_restore (%r12)
+        movq      288(%rsp), %r13
+        cfi_restore (%r13)
+        movq      280(%rsp), %r14
+        cfi_restore (%r14)
+        movq      272(%rsp), %r15
+        cfi_restore (%r15)
+        jmp       .LBL_1_2
+
+.LBL_1_10:
+        cfi_restore_state
+        movzbl    %r12b, %r15d
+        shlq      $4, %r15
+        vmovsd    328(%rsp,%r15), %xmm0
+        vzeroupper
+
+        call      sin@PLT
+
+        vmovsd    %xmm0, 392(%rsp,%r15)
+        jmp       .LBL_1_8
+
+.LBL_1_12:
+        movzbl    %r12b, %r15d
+        shlq      $4, %r15
+        vmovsd    320(%rsp,%r15), %xmm0
+        vzeroupper
+
+        call      sin@PLT
+
+        vmovsd    %xmm0, 384(%rsp,%r15)
+        jmp       .LBL_1_7
+
+END (_ZGVdN4v_sin_avx2)
diff --git a/sysdeps/x86/fpu/bits/math-vector.h b/sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S
similarity index 51%
copy from sysdeps/x86/fpu/bits/math-vector.h
copy to sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S
index b3ef833..ba63102 100644
--- a/sysdeps/x86/fpu/bits/math-vector.h
+++ b/sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S
@@ -1,4 +1,4 @@
-/* Platform-specific SIMD declarations of math functions.
+/* Multiple versions of vectorized sin.
    Copyright (C) 2014-2015 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
@@ -16,21 +16,24 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#ifndef _MATH_H
-# error "Never include <bits/math-vector.h> directly;\
- include <math.h> instead."
-#endif
+#include <sysdep.h>
+#include <init-arch.h>
 
-/* Get default empty definitions for simd declarations.  */
-#include <bits/libm-simd-decl-stubs.h>
+	.text
+ENTRY (_ZGVeN8v_sin)
+        .type   _ZGVeN8v_sin, @gnu_indirect_function
+        cmpl    $0, KIND_OFFSET+__cpu_features(%rip)
+        jne     1f
+        call    __init_cpu_features
+1:      leaq    _ZGVeN8v_sin_skx(%rip), %rax
+        testl   $bit_AVX512DQ_Usable, __cpu_features+FEATURE_OFFSET+index_AVX512DQ_Usable(%rip)
+        jnz     3f
+2:      leaq    _ZGVeN8v_sin_knl(%rip), %rax
+        testl   $bit_AVX512F_Usable, __cpu_features+FEATURE_OFFSET+index_AVX512F_Usable(%rip)
+        jnz     3f
+        leaq    _ZGVeN8v_sin_avx2_wrapper(%rip), %rax
+3:      ret
+END (_ZGVeN8v_sin)
 
-#if defined __x86_64__ && defined __FAST_MATH__
-# if defined _OPENMP && _OPENMP >= 201307
-/* OpenMP case.  */
-#  define __DECL_SIMD_x86_64 _Pragma ("omp declare simd notinbranch")
-#  undef __DECL_SIMD_cos
-#  define __DECL_SIMD_cos __DECL_SIMD_x86_64
-#  undef __DECL_SIMD_cosf
-#  define __DECL_SIMD_cosf __DECL_SIMD_x86_64
-# endif
-#endif
+#define _ZGVeN8v_sin _ZGVeN8v_sin_avx2_wrapper
+#include "../svml_d_sin8_core.S"
diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S b/sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S
new file mode 100644
index 0000000..c01ad1f
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S
@@ -0,0 +1,465 @@
+/* Function sin vectorized with AVX-512, KNL and SKX versions.
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include "svml_d_sin_data.h"
+#include "svml_d_wrapper_impl.h"
+
+	.text
+ENTRY (_ZGVeN8v_sin_knl)
+#ifndef HAVE_AVX512_ASM_SUPPORT
+WRAPPER_IMPL_AVX512 _ZGVdN4v_sin
+#else
+/*
+   ALGORITHM DESCRIPTION:
+
+      ( low accuracy ( < 4ulp ) or enhanced performance
+      ( half of correct mantissa ) implementation )
+
+      Argument representation:
+      arg = N*Pi + R
+
+      Result calculation:
+      sin(arg) = sin(N*Pi + R) = (-1)^N * sin(R)
+      sin(R) is approximated by corresponding polynomial
+ */
+        pushq     %rbp
+        cfi_adjust_cfa_offset (8)
+        cfi_rel_offset (%rbp, 0)
+        movq      %rsp, %rbp
+        cfi_def_cfa_register (%rbp)
+        andq      $-64, %rsp
+        subq      $1280, %rsp
+        movq      __svml_dsin_data@GOTPCREL(%rip), %rax
+        movq      $-1, %rdx
+        vmovups __dAbsMask(%rax), %zmm6
+        vmovups __dInvPI(%rax), %zmm1
+
+/*
+ * ARGUMENT RANGE REDUCTION:
+ * X' = |X|
+ */
+        vpandq    %zmm6, %zmm0, %zmm12
+        vmovups __dPI1_FMA(%rax), %zmm2
+        vmovups __dC7(%rax), %zmm7
+
+/* SignX - sign bit of X */
+        vpandnq   %zmm0, %zmm6, %zmm11
+
+/* R = X' - N*Pi1 */
+        vmovaps   %zmm12, %zmm3
+
+/* Y = X'*InvPi + RS : right shifter add */
+        vfmadd213pd __dRShifter(%rax), %zmm12, %zmm1
+        vcmppd    $22, __dRangeVal(%rax), %zmm12, %k1
+        vpbroadcastq %rdx, %zmm13{%k1}{z}
+
+/* N = Y - RS : right shifter sub */
+        vsubpd __dRShifter(%rax), %zmm1, %zmm4
+
+/* SignRes = Y<<63 : shift LSB to MSB place for result sign */
+        vpsllq    $63, %zmm1, %zmm5
+        vptestmq  %zmm13, %zmm13, %k0
+        vfnmadd231pd %zmm4, %zmm2, %zmm3
+        kmovw     %k0, %ecx
+        movzbl    %cl, %ecx
+
+/* R = R - N*Pi2 */
+        vfnmadd231pd __dPI2_FMA(%rax), %zmm4, %zmm3
+
+/* R = R - N*Pi3 */
+        vfnmadd132pd __dPI3_FMA(%rax), %zmm3, %zmm4
+
+/*
+ * POLYNOMIAL APPROXIMATION:
+ * R2 = R*R
+ */
+        vmulpd    %zmm4, %zmm4, %zmm8
+
+/* R = R^SignRes : update sign of reduced argument */
+        vpxorq    %zmm5, %zmm4, %zmm9
+        vfmadd213pd __dC6(%rax), %zmm8, %zmm7
+        vfmadd213pd __dC5(%rax), %zmm8, %zmm7
+        vfmadd213pd __dC4(%rax), %zmm8, %zmm7
+
+/* Poly = C3+R2*(C4+R2*(C5+R2*(C6+R2*C7))) */
+        vfmadd213pd __dC3(%rax), %zmm8, %zmm7
+
+/* Poly = R2*(C1+R2*(C2+R2*Poly)) */
+        vfmadd213pd __dC2(%rax), %zmm8, %zmm7
+        vfmadd213pd __dC1(%rax), %zmm8, %zmm7
+        vmulpd    %zmm8, %zmm7, %zmm10
+
+/* Poly = Poly*R + R */
+        vfmadd213pd %zmm9, %zmm9, %zmm10
+
+/*
+ * RECONSTRUCTION:
+ * Final sign setting: Res = Poly^SignX
+ */
+        vpxorq    %zmm11, %zmm10, %zmm1
+        testl     %ecx, %ecx
+        jne       .LBL_1_3
+
+.LBL_1_2:
+        cfi_remember_state
+        vmovaps   %zmm1, %zmm0
+        movq      %rbp, %rsp
+        cfi_def_cfa_register (%rsp)
+        popq      %rbp
+        cfi_adjust_cfa_offset (-8)
+        cfi_restore (%rbp)
+        ret
+
+.LBL_1_3:
+        cfi_restore_state
+        vmovups   %zmm0, 1152(%rsp)
+        vmovups   %zmm1, 1216(%rsp)
+        je        .LBL_1_2
+
+        xorb      %dl, %dl
+        kmovw     %k4, 1048(%rsp)
+        xorl      %eax, %eax
+        kmovw     %k5, 1040(%rsp)
+        kmovw     %k6, 1032(%rsp)
+        kmovw     %k7, 1024(%rsp)
+        vmovups   %zmm16, 960(%rsp)
+        vmovups   %zmm17, 896(%rsp)
+        vmovups   %zmm18, 832(%rsp)
+        vmovups   %zmm19, 768(%rsp)
+        vmovups   %zmm20, 704(%rsp)
+        vmovups   %zmm21, 640(%rsp)
+        vmovups   %zmm22, 576(%rsp)
+        vmovups   %zmm23, 512(%rsp)
+        vmovups   %zmm24, 448(%rsp)
+        vmovups   %zmm25, 384(%rsp)
+        vmovups   %zmm26, 320(%rsp)
+        vmovups   %zmm27, 256(%rsp)
+        vmovups   %zmm28, 192(%rsp)
+        vmovups   %zmm29, 128(%rsp)
+        vmovups   %zmm30, 64(%rsp)
+        vmovups   %zmm31, (%rsp)
+        movq      %rsi, 1064(%rsp)
+        movq      %rdi, 1056(%rsp)
+        movq      %r12, 1096(%rsp)
+        cfi_offset_rel_rsp (12, 1096)
+        movb      %dl, %r12b
+        movq      %r13, 1088(%rsp)
+        cfi_offset_rel_rsp (13, 1088)
+        movl      %ecx, %r13d
+        movq      %r14, 1080(%rsp)
+        cfi_offset_rel_rsp (14, 1080)
+        movl      %eax, %r14d
+        movq      %r15, 1072(%rsp)
+        cfi_offset_rel_rsp (15, 1072)
+        cfi_remember_state
+
+.LBL_1_6:
+        btl       %r14d, %r13d
+        jc        .LBL_1_12
+
+.LBL_1_7:
+        lea       1(%r14), %esi
+        btl       %esi, %r13d
+        jc        .LBL_1_10
+
+.LBL_1_8:
+        addb      $1, %r12b
+        addl      $2, %r14d
+        cmpb      $16, %r12b
+        jb        .LBL_1_6
+
+        kmovw     1048(%rsp), %k4
+        movq      1064(%rsp), %rsi
+        kmovw     1040(%rsp), %k5
+        movq      1056(%rsp), %rdi
+        kmovw     1032(%rsp), %k6
+        movq      1096(%rsp), %r12
+        cfi_restore (%r12)
+        movq      1088(%rsp), %r13
+        cfi_restore (%r13)
+        kmovw     1024(%rsp), %k7
+        vmovups   960(%rsp), %zmm16
+        vmovups   896(%rsp), %zmm17
+        vmovups   832(%rsp), %zmm18
+        vmovups   768(%rsp), %zmm19
+        vmovups   704(%rsp), %zmm20
+        vmovups   640(%rsp), %zmm21
+        vmovups   576(%rsp), %zmm22
+        vmovups   512(%rsp), %zmm23
+        vmovups   448(%rsp), %zmm24
+        vmovups   384(%rsp), %zmm25
+        vmovups   320(%rsp), %zmm26
+        vmovups   256(%rsp), %zmm27
+        vmovups   192(%rsp), %zmm28
+        vmovups   128(%rsp), %zmm29
+        vmovups   64(%rsp), %zmm30
+        vmovups   (%rsp), %zmm31
+        movq      1080(%rsp), %r14
+        cfi_restore (%r14)
+        movq      1072(%rsp), %r15
+        cfi_restore (%r15)
+        vmovups   1216(%rsp), %zmm1
+        jmp       .LBL_1_2
+
+.LBL_1_10:
+        cfi_restore_state
+        movzbl    %r12b, %r15d
+        shlq      $4, %r15
+        vmovsd    1160(%rsp,%r15), %xmm0
+        call      sin@PLT
+        vmovsd    %xmm0, 1224(%rsp,%r15)
+        jmp       .LBL_1_8
+
+.LBL_1_12:
+        movzbl    %r12b, %r15d
+        shlq      $4, %r15
+        vmovsd    1152(%rsp,%r15), %xmm0
+        call      sin@PLT
+        vmovsd    %xmm0, 1216(%rsp,%r15)
+        jmp       .LBL_1_7
+#endif
+END (_ZGVeN8v_sin_knl)
+
+ENTRY (_ZGVeN8v_sin_skx)
+#ifndef HAVE_AVX512_ASM_SUPPORT
+WRAPPER_IMPL_AVX512 _ZGVdN4v_sin
+#else
+/*
+   ALGORITHM DESCRIPTION:
+
+      ( low accuracy ( < 4ulp ) or enhanced performance
+       ( half of correct mantissa ) implementation )
+
+      Argument representation:
+      arg = N*Pi + R
+
+      Result calculation:
+      sin(arg) = sin(N*Pi + R) = (-1)^N * sin(R)
+      sin(R) is approximated by corresponding polynomial
+ */
+        pushq     %rbp
+        cfi_adjust_cfa_offset (8)
+        cfi_rel_offset (%rbp, 0)
+        movq      %rsp, %rbp
+        cfi_def_cfa_register (%rbp)
+        andq      $-64, %rsp
+        subq      $1280, %rsp
+        movq      __svml_dsin_data@GOTPCREL(%rip), %rax
+        vpbroadcastq .L_2il0floatpacket.14(%rip), %zmm14
+        vmovups __dAbsMask(%rax), %zmm7
+        vmovups __dInvPI(%rax), %zmm2
+        vmovups __dRShifter(%rax), %zmm1
+        vmovups __dPI1_FMA(%rax), %zmm3
+        vmovups __dC7(%rax), %zmm8
+
+/*
+  ARGUMENT RANGE REDUCTION:
+  X' = |X|
+ */
+        vandpd    %zmm7, %zmm0, %zmm13
+
+/* SignX - sign bit of X */
+        vandnpd   %zmm0, %zmm7, %zmm12
+
+/* Y = X'*InvPi + RS : right shifter add */
+        vfmadd213pd %zmm1, %zmm13, %zmm2
+        vcmppd    $18, __dRangeVal(%rax), %zmm13, %k1
+
+/* SignRes = Y<<63 : shift LSB to MSB place for result sign */
+        vpsllq    $63, %zmm2, %zmm6
+
+/* N = Y - RS : right shifter sub */
+        vsubpd    %zmm1, %zmm2, %zmm5
+
+/* R = X' - N*Pi1 */
+        vmovaps   %zmm13, %zmm4
+        vfnmadd231pd %zmm5, %zmm3, %zmm4
+
+/* R = R - N*Pi2 */
+        vfnmadd231pd __dPI2_FMA(%rax), %zmm5, %zmm4
+
+/* R = R - N*Pi3 */
+        vfnmadd132pd __dPI3_FMA(%rax), %zmm4, %zmm5
+
+/*
+  POLYNOMIAL APPROXIMATION:
+  R2 = R*R
+ */
+        vmulpd    %zmm5, %zmm5, %zmm9
+
+/* R = R^SignRes : update sign of reduced argument */
+        vxorpd    %zmm6, %zmm5, %zmm10
+        vfmadd213pd __dC6(%rax), %zmm9, %zmm8
+        vfmadd213pd __dC5(%rax), %zmm9, %zmm8
+        vfmadd213pd __dC4(%rax), %zmm9, %zmm8
+
+/* Poly = C3+R2*(C4+R2*(C5+R2*(C6+R2*C7))) */
+        vfmadd213pd __dC3(%rax), %zmm9, %zmm8
+
+/* Poly = R2*(C1+R2*(C2+R2*Poly)) */
+        vfmadd213pd __dC2(%rax), %zmm9, %zmm8
+        vfmadd213pd __dC1(%rax), %zmm9, %zmm8
+        vmulpd    %zmm9, %zmm8, %zmm11
+
+/* Poly = Poly*R + R */
+        vfmadd213pd %zmm10, %zmm10, %zmm11
+
+/*
+  RECONSTRUCTION:
+  Final sign setting: Res = Poly^SignX
+ */
+        vxorpd    %zmm12, %zmm11, %zmm1
+        vpandnq   %zmm13, %zmm13, %zmm14{%k1}
+        vcmppd    $3, %zmm14, %zmm14, %k0
+        kmovw     %k0, %ecx
+        testl     %ecx, %ecx
+        jne       .LBL_2_3
+
+.LBL_2_2:
+        cfi_remember_state
+        vmovaps   %zmm1, %zmm0
+        movq      %rbp, %rsp
+        cfi_def_cfa_register (%rsp)
+        popq      %rbp
+        cfi_adjust_cfa_offset (-8)
+        cfi_restore (%rbp)
+        ret
+
+.LBL_2_3:
+        cfi_restore_state
+        vmovups   %zmm0, 1152(%rsp)
+        vmovups   %zmm1, 1216(%rsp)
+        je        .LBL_2_2
+
+        xorb      %dl, %dl
+        xorl      %eax, %eax
+        kmovw     %k4, 1048(%rsp)
+        kmovw     %k5, 1040(%rsp)
+        kmovw     %k6, 1032(%rsp)
+        kmovw     %k7, 1024(%rsp)
+        vmovups   %zmm16, 960(%rsp)
+        vmovups   %zmm17, 896(%rsp)
+        vmovups   %zmm18, 832(%rsp)
+        vmovups   %zmm19, 768(%rsp)
+        vmovups   %zmm20, 704(%rsp)
+        vmovups   %zmm21, 640(%rsp)
+        vmovups   %zmm22, 576(%rsp)
+        vmovups   %zmm23, 512(%rsp)
+        vmovups   %zmm24, 448(%rsp)
+        vmovups   %zmm25, 384(%rsp)
+        vmovups   %zmm26, 320(%rsp)
+        vmovups   %zmm27, 256(%rsp)
+        vmovups   %zmm28, 192(%rsp)
+        vmovups   %zmm29, 128(%rsp)
+        vmovups   %zmm30, 64(%rsp)
+        vmovups   %zmm31, (%rsp)
+        movq      %rsi, 1064(%rsp)
+        movq      %rdi, 1056(%rsp)
+        movq      %r12, 1096(%rsp)
+        cfi_offset_rel_rsp (12, 1096)
+        movb      %dl, %r12b
+        movq      %r13, 1088(%rsp)
+        cfi_offset_rel_rsp (13, 1088)
+        movl      %ecx, %r13d
+        movq      %r14, 1080(%rsp)
+        cfi_offset_rel_rsp (14, 1080)
+        movl      %eax, %r14d
+        movq      %r15, 1072(%rsp)
+        cfi_offset_rel_rsp (15, 1072)
+        cfi_remember_state
+
+.LBL_2_6:
+        btl       %r14d, %r13d
+        jc        .LBL_2_12
+
+.LBL_2_7:
+        lea       1(%r14), %esi
+        btl       %esi, %r13d
+        jc        .LBL_2_10
+
+.LBL_2_8:
+        incb      %r12b
+        addl      $2, %r14d
+        cmpb      $16, %r12b
+        jb        .LBL_2_6
+
+        kmovw     1048(%rsp), %k4
+        kmovw     1040(%rsp), %k5
+        kmovw     1032(%rsp), %k6
+        kmovw     1024(%rsp), %k7
+        vmovups   960(%rsp), %zmm16
+        vmovups   896(%rsp), %zmm17
+        vmovups   832(%rsp), %zmm18
+        vmovups   768(%rsp), %zmm19
+        vmovups   704(%rsp), %zmm20
+        vmovups   640(%rsp), %zmm21
+        vmovups   576(%rsp), %zmm22
+        vmovups   512(%rsp), %zmm23
+        vmovups   448(%rsp), %zmm24
+        vmovups   384(%rsp), %zmm25
+        vmovups   320(%rsp), %zmm26
+        vmovups   256(%rsp), %zmm27
+        vmovups   192(%rsp), %zmm28
+        vmovups   128(%rsp), %zmm29
+        vmovups   64(%rsp), %zmm30
+        vmovups   (%rsp), %zmm31
+        vmovups   1216(%rsp), %zmm1
+        movq      1064(%rsp), %rsi
+        movq      1056(%rsp), %rdi
+        movq      1096(%rsp), %r12
+        cfi_restore (%r12)
+        movq      1088(%rsp), %r13
+        cfi_restore (%r13)
+        movq      1080(%rsp), %r14
+        cfi_restore (%r14)
+        movq      1072(%rsp), %r15
+        cfi_restore (%r15)
+        jmp       .LBL_2_2
+
+.LBL_2_10:
+        cfi_restore_state
+        movzbl    %r12b, %r15d
+        shlq      $4, %r15
+        vmovsd    1160(%rsp,%r15), %xmm0
+        vzeroupper
+        vmovsd    1160(%rsp,%r15), %xmm0
+
+        call      sin@PLT
+
+        vmovsd    %xmm0, 1224(%rsp,%r15)
+        jmp       .LBL_2_8
+
+.LBL_2_12:
+        movzbl    %r12b, %r15d
+        shlq      $4, %r15
+        vmovsd    1152(%rsp,%r15), %xmm0
+        vzeroupper
+        vmovsd    1152(%rsp,%r15), %xmm0
+
+        call      sin@PLT
+
+        vmovsd    %xmm0, 1216(%rsp,%r15)
+        jmp       .LBL_2_7
+#endif
+END (_ZGVeN8v_sin_skx)
+
+	.section .rodata, "a"
+.L_2il0floatpacket.14:
+	.long	0xffffffff,0xffffffff
+	.type	.L_2il0floatpacket.14,@object
diff --git a/sysdeps/x86_64/fpu/test-double-vlen4-avx2.c b/sysdeps/x86_64/fpu/svml_d_sin2_core.S
similarity index 77%
copy from sysdeps/x86_64/fpu/test-double-vlen4-avx2.c
copy to sysdeps/x86_64/fpu/svml_d_sin2_core.S
index fcc4b64..c619dab 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen4-avx2.c
+++ b/sysdeps/x86_64/fpu/svml_d_sin2_core.S
@@ -1,4 +1,4 @@
-/* Tests for AVX2 ISA versions of vector math functions.
+/* Function sin vectorized with SSE2.
    Copyright (C) 2014-2015 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
@@ -16,13 +16,14 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include "test-double-vlen4.h"
+#include <sysdep.h>
+#include "svml_d_wrapper_impl.h"
 
-#undef VEC_SUFF
-#define VEC_SUFF _vlen4_avx2
+	.text
+ENTRY (_ZGVbN2v_sin)
+WRAPPER_IMPL_SSE2 sin
+END (_ZGVbN2v_sin)
 
-#define TEST_VECTOR_cos 1
-
-#define REQUIRE_AVX2
-
-#include "libm-test.c"
+#ifndef USE_MULTIARCH
+ libmvec_hidden_def (_ZGVbN2v_sin)
+#endif
diff --git a/sysdeps/x86_64/fpu/test-double-vlen4-avx2.c b/sysdeps/x86_64/fpu/svml_d_sin4_core.S
similarity index 76%
copy from sysdeps/x86_64/fpu/test-double-vlen4-avx2.c
copy to sysdeps/x86_64/fpu/svml_d_sin4_core.S
index fcc4b64..f650d46 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen4-avx2.c
+++ b/sysdeps/x86_64/fpu/svml_d_sin4_core.S
@@ -1,4 +1,4 @@
-/* Tests for AVX2 ISA versions of vector math functions.
+/* Function sin vectorized with AVX2, wrapper version.
    Copyright (C) 2014-2015 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
@@ -16,13 +16,14 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include "test-double-vlen4.h"
+#include <sysdep.h>
+#include "svml_d_wrapper_impl.h"
 
-#undef VEC_SUFF
-#define VEC_SUFF _vlen4_avx2
+	.text
+ENTRY (_ZGVdN4v_sin)
+WRAPPER_IMPL_AVX _ZGVbN2v_sin
+END (_ZGVdN4v_sin)
 
-#define TEST_VECTOR_cos 1
-
-#define REQUIRE_AVX2
-
-#include "libm-test.c"
+#ifndef USE_MULTIARCH
+ libmvec_hidden_def (_ZGVdN4v_sin)
+#endif
diff --git a/sysdeps/x86_64/fpu/test-double-vlen2.c b/sysdeps/x86_64/fpu/svml_d_sin4_core_avx.S
similarity index 79%
copy from sysdeps/x86_64/fpu/test-double-vlen2.c
copy to sysdeps/x86_64/fpu/svml_d_sin4_core_avx.S
index 03e2046..a21ffaf 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen2.c
+++ b/sysdeps/x86_64/fpu/svml_d_sin4_core_avx.S
@@ -1,4 +1,4 @@
-/* Tests for SSE ISA versions of vector math functions.
+/* Function sin vectorized in AVX ISA as wrapper to SSE4 ISA version.
    Copyright (C) 2014-2015 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
@@ -16,8 +16,10 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include "test-double-vlen2.h"
+#include <sysdep.h>
+#include "svml_d_wrapper_impl.h"
 
-#define TEST_VECTOR_cos 1
-
-#include "libm-test.c"
+	.text
+ENTRY (_ZGVcN4v_sin)
+WRAPPER_IMPL_AVX _ZGVbN2v_sin
+END (_ZGVcN4v_sin)
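
WRAPPER_IMPL_AVX (from svml_d_wrapper_impl.h) builds the 4-lane entry
point out of two calls to the 2-lane kernel.  Conceptually it amounts
to the following (a C sketch under that assumption; the real macro
does this on the stack in assembly):

    #include <immintrin.h>

    extern __m128d _ZGVbN2v_sin (__m128d);

    static __m256d
    sin4_via_sin2 (__m256d x)   /* illustrative name */
    {
      __m128d lo = _mm256_castpd256_pd128 (x);      /* lanes 0-1 */
      __m128d hi = _mm256_extractf128_pd (x, 1);    /* lanes 2-3 */
      lo = _ZGVbN2v_sin (lo);
      hi = _ZGVbN2v_sin (hi);
      return _mm256_insertf128_pd (_mm256_castpd128_pd256 (lo), hi, 1);
    }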
diff --git a/sysdeps/x86_64/fpu/test-double-vlen2.c b/sysdeps/x86_64/fpu/svml_d_sin8_core.S
similarity index 79%
copy from sysdeps/x86_64/fpu/test-double-vlen2.c
copy to sysdeps/x86_64/fpu/svml_d_sin8_core.S
index 03e2046..2e78b5e 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen2.c
+++ b/sysdeps/x86_64/fpu/svml_d_sin8_core.S
@@ -1,4 +1,4 @@
-/* Tests for SSE ISA versions of vector math functions.
+/* Function sin vectorized with AVX-512, wrapper to AVX2 version.
    Copyright (C) 2014-2015 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
@@ -16,8 +16,10 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include "test-double-vlen2.h"
+#include <sysdep.h>
+#include "svml_d_wrapper_impl.h"
 
-#define TEST_VECTOR_cos 1
-
-#include "libm-test.c"
+	.text
+ENTRY (_ZGVeN8v_sin)
+WRAPPER_IMPL_AVX512 _ZGVdN4v_sin
+END (_ZGVeN8v_sin)
diff --git a/sysdeps/x86_64/fpu/svml_d_sin_data.S b/sysdeps/x86_64/fpu/svml_d_sin_data.S
new file mode 100644
index 0000000..e5e1ff7
--- /dev/null
+++ b/sysdeps/x86_64/fpu/svml_d_sin_data.S
@@ -0,0 +1,82 @@
+/* Data for vectorized sin.
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "svml_d_sin_data.h"
+
+	.section .rodata, "a"
+	.align 64
+
+/* Data table for vector implementations of function sin.
+   The table may contain polynomial, reduction, lookup coefficients and other constants
+   obtained through different methods of research and experimental work.  */
+
+	.globl __svml_dsin_data
+__svml_dsin_data:
+
+/* General purpose constants:
+   absolute value mask */
+double_vector __dAbsMask 0x7fffffffffffffff
+
+/* working range threshold */
+double_vector __dRangeVal 0x4170000000000000
+
+/* 1/PI */
+double_vector __dInvPI 0x3fd45f306dc9c883
+
+/* right-shifter constant */
+double_vector __dRShifter 0x4338000000000000
+
+/* 0.0 */
+double_vector __dZero 0x0000000000000000
+
+/* -0.0 */
+double_vector __lNZero 0x8000000000000000
+
+/* Range reduction PI-based constants:
+   PI high part */
+double_vector __dPI1 0x400921fb40000000
+
+/* PI mid  part 1 */
+double_vector __dPI2 0x3e84442d00000000
+
+/* PI mid  part 2 */
+double_vector __dPI3 0x3d08469880000000
+
+/* PI low  part */
+double_vector __dPI4 0x3b88cc51701b839a
+
+/* Range reduction PI-based constants if FMA available:
+   PI high part (FMA available) */
+double_vector __dPI1_FMA 0x400921fb54442d18
+
+/* PI mid part  (FMA available) */
+double_vector __dPI2_FMA 0x3ca1a62633145c06
+
+/* PI low part  (FMA available) */
+double_vector __dPI3_FMA 0x395c1cd129024e09
+
+/* Polynomial coefficients (relative error 2^(-52.115)): */
+double_vector __dC1 0xbfc55555555554a8
+double_vector __dC2 0x3f8111111110a573
+double_vector __dC3 0xbf2a01a019a659dd
+double_vector __dC4 0x3ec71de3806add1a
+double_vector __dC5 0xbe5ae6355aaa4a53
+double_vector __dC6 0x3de60e6bee01d83e
+double_vector __dC7 0xbd69f1517e9f65f0
+	.type	__svml_dsin_data,@object
+	.size __svml_dsin_data,.-__svml_dsin_data
diff --git a/sysdeps/x86_64/fpu/svml_d_sin_data.h b/sysdeps/x86_64/fpu/svml_d_sin_data.h
new file mode 100644
index 0000000..76ab508
--- /dev/null
+++ b/sysdeps/x86_64/fpu/svml_d_sin_data.h
@@ -0,0 +1,53 @@
+/* Offsets for data table for vectorized sin.
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef D_SIN_DATA_H
+#define D_SIN_DATA_H
+
+/* Offsets for data table */
+#define __dAbsMask                    	0
+#define __dRangeVal                   	64
+#define __dInvPI                      	128
+#define __dRShifter                   	192
+#define __dZero                       	256
+#define __lNZero                      	320
+#define __dPI1                        	384
+#define __dPI2                        	448
+#define __dPI3                        	512
+#define __dPI4                        	576
+#define __dPI1_FMA                    	640
+#define __dPI2_FMA                    	704
+#define __dPI3_FMA                    	768
+#define __dC1                         	832
+#define __dC2                         	896
+#define __dC3                         	960
+#define __dC4                         	1024
+#define __dC5                         	1088
+#define __dC6                         	1152
+#define __dC7                         	1216
+
+.macro double_vector offset value
+.if .-__svml_dsin_data != \offset
+.err
+.endif
+.rept 8
+.quad \value
+.endr
+.endm
+
+#endif
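
The offsets are byte offsets into __svml_dsin_data: every constant
occupies a full 64-byte slot holding eight copies of the same double,
so the SSE (2-lane), AVX (4-lane) and AVX512 (8-lane) kernels can
each load a full-width vector from the same offset.  The double_vector
macro emits the eight copies and aborts the build (.err) if a table
entry drifts out of sync with this header.  In C terms the layout is
roughly (an illustration, not code from the commit):

    struct dsin_slot { double v[8]; };      /* one 64-byte slot */
    struct dsin_data
    {
      struct dsin_slot dAbsMask;            /* offset    0 */
      struct dsin_slot dRangeVal;           /* offset   64 */
      struct dsin_slot dInvPI;              /* offset  128 */
      /* ... continuing through dC7 at offset 1216.  */
    };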
diff --git a/sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c b/sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c
index bcfccb9..347aab5 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c
@@ -23,3 +23,4 @@
 #define VEC_TYPE __m128d
 
 VECTOR_WRAPPER (WRAPPER_NAME (cos), _ZGVbN2v_cos)
+VECTOR_WRAPPER (WRAPPER_NAME (sin), _ZGVbN2v_sin)
diff --git a/sysdeps/x86_64/fpu/test-double-vlen2.c b/sysdeps/x86_64/fpu/test-double-vlen2.c
index 03e2046..353b680 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen2.c
+++ b/sysdeps/x86_64/fpu/test-double-vlen2.c
@@ -19,5 +19,6 @@
 #include "test-double-vlen2.h"
 
 #define TEST_VECTOR_cos 1
+#define TEST_VECTOR_sin 1
 
 #include "libm-test.c"
diff --git a/sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c b/sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c
index 69e3fb1..006c795 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c
@@ -26,3 +26,4 @@
 #define VEC_TYPE __m256d
 
 VECTOR_WRAPPER (WRAPPER_NAME (cos), _ZGVdN4v_cos)
+VECTOR_WRAPPER (WRAPPER_NAME (sin), _ZGVdN4v_sin)
diff --git a/sysdeps/x86_64/fpu/test-double-vlen4-avx2.c b/sysdeps/x86_64/fpu/test-double-vlen4-avx2.c
index fcc4b64..51247b7 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen4-avx2.c
+++ b/sysdeps/x86_64/fpu/test-double-vlen4-avx2.c
@@ -22,6 +22,7 @@
 #define VEC_SUFF _vlen4_avx2
 
 #define TEST_VECTOR_cos 1
+#define TEST_VECTOR_sin 1
 
 #define REQUIRE_AVX2
 
diff --git a/sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c b/sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c
index cd9c0bb..b87454e 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c
@@ -23,3 +23,4 @@
 #define VEC_TYPE __m256d
 
 VECTOR_WRAPPER (WRAPPER_NAME (cos), _ZGVcN4v_cos)
+VECTOR_WRAPPER (WRAPPER_NAME (sin), _ZGVcN4v_sin)
diff --git a/sysdeps/x86_64/fpu/test-double-vlen4.c b/sysdeps/x86_64/fpu/test-double-vlen4.c
index dd77b70..4c1aefa 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen4.c
+++ b/sysdeps/x86_64/fpu/test-double-vlen4.c
@@ -19,5 +19,6 @@
 #include "test-double-vlen4.h"
 
 #define TEST_VECTOR_cos 1
+#define TEST_VECTOR_sin 1
 
 #include "libm-test.c"
diff --git a/sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c b/sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c
index 9381360..b789f5e 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c
@@ -23,3 +23,4 @@
 #define VEC_TYPE __m512d
 
 VECTOR_WRAPPER (WRAPPER_NAME (cos), _ZGVeN8v_cos)
+VECTOR_WRAPPER (WRAPPER_NAME (sin), _ZGVeN8v_sin)
diff --git a/sysdeps/x86_64/fpu/test-double-vlen8.c b/sysdeps/x86_64/fpu/test-double-vlen8.c
index e4da2bd..9998280 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen8.c
+++ b/sysdeps/x86_64/fpu/test-double-vlen8.c
@@ -19,6 +19,7 @@
 #include "test-double-vlen8.h"
 
 #define TEST_VECTOR_cos 1
+#define TEST_VECTOR_sin 1
 
 #define REQUIRE_AVX512F
 

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                                          |   31 ++
 NEWS                                               |    2 +-
 bits/libm-simd-decl-stubs.h                        |    4 +
 math/bits/mathcalls.h                              |    2 +-
 sysdeps/unix/sysv/linux/x86_64/libmvec.abilist     |    4 +
 sysdeps/x86/fpu/bits/math-vector.h                 |    2 +
 sysdeps/x86_64/fpu/Makefile                        |    4 +-
 sysdeps/x86_64/fpu/Versions                        |    1 +
 sysdeps/x86_64/fpu/libm-test-ulps                  |   12 +
 sysdeps/x86_64/fpu/multiarch/Makefile              |    6 +-
 sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S    |   38 ++
 .../x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S   |  229 ++++++++++
 sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S    |   38 ++
 .../x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S   |  210 +++++++++
 sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S    |   39 ++
 .../x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S |  465 ++++++++++++++++++++
 sysdeps/x86_64/fpu/svml_d_sin2_core.S              |   29 ++
 sysdeps/x86_64/fpu/svml_d_sin4_core.S              |   29 ++
 sysdeps/x86_64/fpu/svml_d_sin4_core_avx.S          |   25 +
 sysdeps/x86_64/fpu/svml_d_sin8_core.S              |   25 +
 sysdeps/x86_64/fpu/svml_d_sin_data.S               |   82 ++++
 sysdeps/x86_64/fpu/svml_d_sin_data.h               |   53 +++
 sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c    |    1 +
 sysdeps/x86_64/fpu/test-double-vlen2.c             |    1 +
 .../x86_64/fpu/test-double-vlen4-avx2-wrappers.c   |    1 +
 sysdeps/x86_64/fpu/test-double-vlen4-avx2.c        |    1 +
 sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c    |    1 +
 sysdeps/x86_64/fpu/test-double-vlen4.c             |    1 +
 sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c    |    1 +
 sysdeps/x86_64/fpu/test-double-vlen8.c             |    1 +
 30 files changed, 1333 insertions(+), 5 deletions(-)
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_sin2_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_sin4_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_sin4_core_avx.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_sin8_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_sin_data.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_sin_data.h


hooks/post-receive
-- 
GNU C Library master sources

