This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.



openlibm and libm: mutation testing


Hello,

I am addressing developers and maintainers of both libm and openlibm
(copy: https://github.com/JuliaLang/openlibm/issues/172).

I will start with the tl;dr part.

I am a developer of a mutation testing system called
[Mull](https://github.com/mull-project/mull).

Our development involves analysis of open source projects. This
analysis helps us develop Mull further and teaches us how various
projects are tested in the wild.

I have recorded a small demo from my sessions. It covers libm, which
I was able to compile in a 64-bit Ubuntu Docker image, and it is very
similar to what I am doing with openlibm on macOS.

[Analyzing sqrt function from libm C library with
Mull](https://www.youtube.com/watch?v=_ipprdCdVHc&t=9s&list=PLmo9Xl8PKe5sfM4gTYSUVaxJCJdQ06hDP&index=1)

---

Some time ago I started to analyze
[openlibm](https://github.com/JuliaLang/openlibm/issues/new) and
[libm](https://sourceware.org/newlib/libm.html). I was specifically
looking for something with lots of computational code, so libm and
then openlibm turned out to be very good targets for my testing of
so-called "math mutation operators", "scalar value mutation
operators", "replace function call mutation operators" etc.

Based on my analysis, **both openlibm and libm lack mutation testing
coverage**: you can modify chunks of functions like `acos` or `sqrt`
and still see all of the tests pass.
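To make the idea concrete, here is a minimal sketch of a surviving mutant. The function and both test suites are hypothetical toys, not code from libm or openlibm, and the `+` to `-` swap only illustrates the kind of change a math mutation operator might make.

```c
#include <stdbool.h>

/* Hypothetical toy function; not taken from libm or openlibm. */
double poly_original(double x) { return x * x + x; }

/* Mutant: the same function with `+` replaced by `-`. */
double poly_mutant(double x) { return x * x - x; }

typedef double (*unary_fn)(double);

/* Weak suite: checks only x = 0, where original and mutant agree. */
bool weak_suite_passes(unary_fn f) { return f(0.0) == 0.0; }

/* Stronger suite: also checks x = 2, where the two versions differ. */
bool strong_suite_passes(unary_fn f) {
    return f(0.0) == 0.0 && f(2.0) == 6.0;
}
```

The weak suite passes for both versions, so the mutant survives. The stronger suite fails for `poly_mutant`, so the mutant is killed.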

These are my observations about the code:

- The code in both libraries seems to use `__ieee754_*` functions that
carry a `Copyright (C) 1993 by Sun Microsystems` notice.
- `libm/test/**` has many more tests than `openlibm`. For
example, `acos` is tested against ~280 test cases. They pass on
my machine with some tweaks even though the test suite has not been
maintained since 2002.
- `openlibm/test` suite works out of the box on macOS but has very few tests.
- `Mull` reports roughly `50-60%` mutation coverage on functions
like `acos` or `sqrt`, which means that neither `libm` nor `openlibm`
is tested well enough.

There are a few things I would like to ask:

1) It looks like both libraries have the same origin, but their code
is very different these days. What is the reason for this difference in
implementations, and what are the criteria for deciding which one
performs better after all these years?

2) libm uses a union to convert a double to a pair of `uint32_t` values and back.

```c
typedef union __ieee_double_shape_type {
  double value;
  ...
  struct {
    uint32_t lsw;
    uint32_t msw;
  } parts;
  ...
} __ieee_double_shape_type;
```

And it uses these pairs to write the test cases like:

```
{64,13, 37,__LINE__, 0x40500000, 0x00000000, 0xbff33333, 0x33333333},
/* 64.0000=f(-1.20000)*/
```

Here `0x40500000, 0x00000000` is the expected result and `0xbff33333,
0x33333333` is the argument.
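As a sanity check on that reading, the word pairs can be turned back into doubles. This is a minimal sketch that assumes IEEE 754 binary64 doubles sharing byte order with `uint64_t`. It uses `memcpy` instead of libm's union, and the helper names `double_from_words` and `words_from_double` are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Rebuild a double from its most and least significant 32-bit words.
   Assumes IEEE 754 binary64 and matching byte order for double/uint64_t. */
double double_from_words(uint32_t msw, uint32_t lsw) {
    uint64_t bits = ((uint64_t)msw << 32) | lsw;
    double value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Split a double into its most and least significant 32-bit words. */
void words_from_double(double value, uint32_t *msw, uint32_t *lsw) {
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);
    *msw = (uint32_t)(bits >> 32);
    *lsw = (uint32_t)(bits & 0xffffffffu);
}
```

Under those assumptions, `double_from_words(0x40500000, 0x00000000)` gives `64.0` and `double_from_words(0xbff33333, 0x33333333)` gives `-1.2`, matching the comment in the test case.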

Why does `libm` do this? Why not work with `double` directly? And if this
is good practice, why does `openlibm` not follow the same approach?

3) Are libm and openlibm known to be used in critical software? I am
wondering how people prove the correctness of both libraries,
given that:
- `libm`'s tests have not been maintained since 2002. Its test coverage of
~280 cases for each function seems good, but even a first pass with
Mull reveals a lot of code that appears not to contribute
to the implementation **if we judge only by that test suite**.
- `openlibm` compiles and runs its tests nicely right from master but
has so few of them that it is hard to consider its test suite
one that gives confidence.

4) Our long-term plans for Mull include building an open-source service
for mutation testing of OSS projects, similar to
[google/oss-fuzz](https://github.com/google/oss-fuzz). Does anybody
see potential in libm/openlibm being tested against a mutation
coverage metric, the way [oss-fuzz
projects](https://github.com/google/oss-fuzz/tree/master/projects) are
tested using fuzzers?

Thanks for your attention.

