This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
[2.20] [6/6] Do not terminate default test runs on test failure
- From: "Joseph S. Myers" <joseph at codesourcery dot com>
- To: <libc-alpha at sourceware dot org>
- Date: Fri, 10 Jan 2014 02:14:50 +0000
- Subject: [2.20] [6/6] Do not terminate default test runs on test failure
- Authentication-results: sourceware.org; auth=none
- References: <Pine dot LNX dot 4 dot 64 dot 1401100208000 dot 9412 at digraph dot polyomino dot org dot uk>
Normal practice for software testsuites is that rather than
terminating immediately when a test fails, they continue running and
report at the end on how many tests passed or failed.
The principle behind the glibc testsuite stopping on failure was
probably that the expected state is no failures and so any failure
indicates a problem such as miscompilation. In practice, while this
is fairly close to true for native testing on x86_64 and x86 (kernel
bugs and race conditions can still cause intermittent failures), it's
less likely to be the case on other platforms, and so people testing
glibc run the testsuite with "make -k" and then examine the logs to
determine whether the failures are what they expect to fail on that
platform, possibly with some automation for the comparison.
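As an illustration of the kind of automated comparison mentioned above, here is a minimal sketch; it is not part of this patch. It assumes a results file with one "RESULT: test-name" line per test (the format written by scripts/evaluate-test.sh) and a hand-maintained list of expected failures with one test name per line. The function name and both file arguments are hypothetical.

```shell
#!/bin/sh
# Sketch only: report FAIL lines that are not in a list of expected failures.
# Both file arguments are assumptions of this example, not part of glibc.

# usage: unexpected_failures results_file expected_file
# results_file: lines such as "PASS: dir/test" or "FAIL: dir/test".
# expected_file: one test name per line (hypothetical, maintained by hand).
unexpected_failures () {
  results_file=$1
  expected_file=$2
  work=$(mktemp -d) || return 2
  # Extract the names of failing tests and sort them for comm.
  sed -n 's/^FAIL: //p' "$results_file" | LC_ALL=C sort -u > "$work/actual"
  LC_ALL=C sort -u "$expected_file" > "$work/expected"
  # comm -23 keeps lines that appear only in the first file.
  LC_ALL=C comm -23 "$work/actual" "$work/expected"
  rm -rf "$work"
}
```

An empty result would mean every failure was already on the expected list for that platform.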
This patch switches the glibc testsuite to the normal convention of
not stopping on failure. By default, a failing test no longer
terminates the run; instead, the summary tests.sum may contain tests
that FAILed. At the end of the test run, any FAIL or ERROR lines from
tests.sum are printed, followed by a count of each result type, and
"make check" then exits with error status if there were any ERROR
lines (meaning a directory had no test results). If you use
stop-on-test-failure=y, the testsuite behaves essentially as it did
before (and does not generate overall test summaries on failure).
Build failures still cause the test run to stop. The justification is
that those *do* indicate serious problems that should be promptly
fixed and aren't generally hard to fix (apart from that, avoiding the
build stopping on those failures seems harder).
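For readers who want to try the end-of-run reporting outside of make, the following is a rough standalone rendering of the recipe lines this patch adds to the top-level Makefile. The function name summarize_results is made up for this example; the commands themselves mirror the patch.

```shell
#!/bin/sh
# Sketch only: the end-of-run reporting from this patch as a shell function.

# usage: summarize_results sum_file
summarize_results () {
  sum_file=$1
  # Show the problem lines first; grep exits nonzero when nothing matches,
  # which is not an error here.
  grep '^ERROR:' "$sum_file" || true
  grep '^FAIL:' "$sum_file" || true
  echo "Summary of test results:"
  # Strip everything after the result keyword and count each keyword.
  sed 's/:.*//' < "$sum_file" | sort | uniq -c
  # Only ERROR lines make the run as a whole fail.
  if grep -q '^ERROR:' "$sum_file"; then
    return 1
  fi
  return 0
}
```

With this logic a results file containing only PASS and FAIL lines gives status 0, and any ERROR line gives status 1.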
Questions:
* Should "make check" also exit with error status if there were any
FAILs, not just if there were ERRORs? I didn't do this because it
seems useful to make the exit status distinguish the case where the
build is broken (early termination or an ERROR) from the case where
there are simply some test failures.
* There are various cases of miscellaneous files that are dependencies
of tests, the build of which involves running some code from the
newly built libc and associated programs, but which aren't counted
as tests themselves (aren't in tests-special, don't use
evaluate-test) - for example, timezone files generated with zic.
Should these more systematically change to be counted as tests?
They may not be that interesting as tests, but doing this would
increase the chances of getting a complete log of test results when
cross-testing if the host system is flaky - ensuring that only
problems on the build system will terminate a test run early, not
problems on the host used for running the newly built libc.
(Ideally, in all cases of a test depending on another test - and
there are plenty already of one test using the .out file from
another test, whether or not we make a rule that running code on the
host means it's a test - failure in the dependency would result in
UNRESOLVED not FAIL from the test depending on the failed test. But
that may be hard to implement, and I'm not particularly concerned
about one test failure causing others as fallout.)
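On the first question: if the exit status stays as in this patch, a caller that wants to treat FAILs as failure can still do so by looking at tests.sum after the run. A possible wrapper is sketched below; it is not part of the patch, and the function name and the three result words are invented for illustration.

```shell
#!/bin/sh
# Sketch only: classify a finished test run from make's exit status and
# the summary file. Names and result words are assumptions of this example.

# usage: classify_run make_status sum_file
classify_run () {
  make_status=$1
  sum_file=$2
  if [ "$make_status" -ne 0 ] || [ ! -f "$sum_file" ]; then
    # Early termination, or no summary was produced at all.
    echo broken
  elif grep -q '^ERROR:' "$sum_file"; then
    echo broken
  elif grep -q '^FAIL:' "$sum_file"; then
    # The run completed, but some tests failed.
    echo failures
  else
    echo clean
  fi
}
```

It would be called with the status of "make check" and the path to tests.sum in the build directory.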
Tested x86_64.
2014-01-10 Joseph Myers <joseph@codesourcery.com>
* scripts/evaluate-test.sh: Handle fourth argument to determine
whether test run should stop on failure.
* Makeconfig (stop-on-test-failure): New variable.
(evaluate-test): Pass fourth argument to evaluate-test.sh based on
$(stop-on-test-failure).
* Makefile (tests): Give a summary of results from testing and
exit with failure status if they include an ERROR.
(xtests): Likewise.
* manual/install.texi (Configuring and compiling): Mention
stop-on-test-failure=y.
* INSTALL: Regenerated.
diff --git a/INSTALL b/INSTALL
index bfa692d..852d192 100644
--- a/INSTALL
+++ b/INSTALL
@@ -192,11 +192,15 @@ an appropriate numeric parameter to `make'. You need a recent GNU
To build and run test programs which exercise some of the library
facilities, type `make check'. If it does not complete successfully,
-do not use the built library, and report a bug after verifying that the
-problem is not already known. *Note Reporting Bugs::, for instructions
-on reporting bugs. Note that some of the tests assume they are not
-being run by `root'. We recommend you compile and test the GNU C
-Library as an unprivileged user.
+without reporting any unexpected failures or errors in its final
+summary of results, do not use the built library, and report a bug
+after verifying that the problem is not already known. (You can
+specify `stop-on-test-failure=y' when running `make check' to make it
+stop immediately when a failure occurs rather than finishing running
+the tests then reporting all problems found.) *Note Reporting Bugs::,
+for instructions on reporting bugs. Note that some of the tests assume
+they are not being run by `root'. We recommend you compile and test
+the GNU C Library as an unprivileged user.
Before reporting bugs make sure there is no problem with your system.
The tests (and later installation) use some pre-existing files of the
diff --git a/Makeconfig b/Makeconfig
index ea40b38..5f39df2 100644
--- a/Makeconfig
+++ b/Makeconfig
@@ -601,6 +601,12 @@ run-built-tests = yes
endif
endif
+# Whether to stop immediately when a test fails. Nonempty means to
+# stop, empty means not to stop.
+ifndef stop-on-test-failure
+stop-on-test-failure =
+endif
+
# How to run a program we just linked with our library.
# The program binary is assumed to be $(word 2,$^).
built-program-file = $(dir $(word 2,$^))$(notdir $(word 2,$^))
@@ -1092,6 +1098,7 @@ endif
# XPASS or XFAIL rather than PASS or FAIL.
evaluate-test = $(..)scripts/evaluate-test.sh $(test-name) $$? \
$(if $(test-xfail-$(@F:.out=)),true,false) \
+ $(if $(stop-on-test-failure),true,false) \
> $(objpfx)$(@F:.out=).test-result
endif # Makeconfig not yet included
diff --git a/Makefile b/Makefile
index 8434c00..129f443 100644
--- a/Makefile
+++ b/Makefile
@@ -326,10 +326,20 @@ tests: $(tests-special)
$(..)scripts/merge-test-results.sh -t $(objpfx) subdir-tests.sum \
$(sort $(subdirs) .) \
> $(objpfx)tests.sum
+ @grep '^ERROR:' $(objpfx)tests.sum || true
+ @grep '^FAIL:' $(objpfx)tests.sum || true
+ @echo "Summary of test results:"
+ @sed 's/:.*//' < $(objpfx)tests.sum | sort | uniq -c
+ @if grep -q '^ERROR:' $(objpfx)tests.sum; then exit 1; fi
xtests:
$(..)scripts/merge-test-results.sh -t $(objpfx) subdir-xtests.sum \
$(sort $(subdirs)) \
> $(objpfx)xtests.sum
+ @grep '^ERROR:' $(objpfx)xtests.sum || true
+ @grep '^FAIL:' $(objpfx)xtests.sum || true
+ @echo "Summary of test results for extra tests:"
+ @sed 's/:.*//' < $(objpfx)xtests.sum | sort | uniq -c
+ @if grep -q '^ERROR:' $(objpfx)xtests.sum; then exit 1; fi
# The realclean target is just like distclean for the parent, but we want
# the subdirs to know the difference in case they care.
diff --git a/manual/install.texi b/manual/install.texi
index c0b8d9e..4644a64 100644
--- a/manual/install.texi
+++ b/manual/install.texi
@@ -224,11 +224,15 @@ GNU @code{make} version, though.
To build and run test programs which exercise some of the library
facilities, type @code{make check}. If it does not complete
-successfully, do not use the built library, and report a bug after
-verifying that the problem is not already known. @xref{Reporting Bugs},
-for instructions on reporting bugs. Note that some of the tests assume
-they are not being run by @code{root}. We recommend you compile and
-test @theglibc{} as an unprivileged user.
+successfully, without reporting any unexpected failures or errors in
+its final summary of results, do not use the built library, and report
+a bug after verifying that the problem is not already known. (You can
+specify @samp{stop-on-test-failure=y} when running @code{make check}
+to make it stop immediately when a failure occurs rather than
+finishing running the tests then reporting all problems found.)
+@xref{Reporting Bugs}, for instructions on reporting bugs. Note that
+some of the tests assume they are not being run by @code{root}. We
+recommend you compile and test @theglibc{} as an unprivileged user.
Before reporting bugs make sure there is no problem with your system.
The tests (and later installation) use some pre-existing files of the
diff --git a/scripts/evaluate-test.sh b/scripts/evaluate-test.sh
index be156df..0b37030 100755
--- a/scripts/evaluate-test.sh
+++ b/scripts/evaluate-test.sh
@@ -17,11 +17,12 @@
# License along with the GNU C Library; if not, see
# <http://www.gnu.org/licenses/>.
-# usage: evaluate-test.sh test_name rc xfail
+# usage: evaluate-test.sh test_name rc xfail stop_on_failure
test_name=$1
rc=$2
xfail=$3
+stop_on_failure=$4
if [ $rc -eq 0 ]; then
result="PASS"
@@ -35,4 +36,8 @@ if $xfail; then
fi
echo "$result: $test_name"
-exit $rc
+if $stop_on_failure; then
+ exit $rc
+else
+ exit 0
+fi
--
Joseph S. Myers
joseph@codesourcery.com