This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.



Re: [RFC][PATCH] New expf, exp2f, logf, log2f and powf implementations




On 8/30/2017 10:04 AM, Corinna Vinschen wrote:
Hi Szabolcs,

On Aug 30 15:18, Szabolcs Nagy wrote:
Based on code from https://github.com/ARM-software/optimized-routines/

This patch adds a highly optimized generic implementation of expf,
exp2f, logf, log2f and powf.  The new functions are not only
significantly faster, but are also smaller, more accurate and fix
several correctness issues.  In order to achieve this, the algorithm
uses double precision arithmetic for accuracy, avoids divisions and
uses small table lookups to minimize the polynomials.  Special cases
are handled inline, which avoids the overhead of wrapper functions,
and errno is set as POSIX requires.

The new functions are added under newlib/libm/common and are
currently enabled for AArch64 and AArch32 with VFP format.  The
new functions are written in C99 and do not support non-IEEE
representations, mixed endian doubles or non-standard errno setting.
Targets can enable the new math code by undefining OBSOLETE_MATH.
Targets with a single precision FPU may still prefer the old
implementation.  It is not clear what the best way is to add such
alternative math implementations to newlib, so this is an RFC
patch and feedback is welcome.

I'm not confident enough with math stuff to review this, so I'd
rather like to defer to Jeff here.

However, two points:

- Please send the patch as generated by `git format-patch'.  Ideally
   as a patchset of smaller patches.

- Why the OBSOLETE_MATH stuff?  If these implementations are generic
   and sufficiently tested, wouldn't it make sense to enable them by
   default without forcing other targets to enable the OBSOLETE_MATH
   setting?

Any comment on the difference in code and data size? Just curious
if we are considering it on all targets.

What capabilities does it require of the target architecture? You
mentioned VFP. Does it require vector operations? If the code
assumes certain capabilities and a target does not meet that
minimum, how well does this code perform?



Thanks,
Corinna


--joel

