This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
[2.20] [6/6] Do not terminate default test runs on test failure

From: "Joseph S. Myers" <joseph at codesourcery dot com>
To: <libc-alpha at sourceware dot org>
Date: Fri, 10 Jan 2014 02:14:50 +0000
Subject: [2.20] [6/6] Do not terminate default test runs on test failure
Authentication-results: sourceware.org; auth=none
References: <Pine dot LNX dot 4 dot 64 dot 1401100208000 dot 9412 at digraph dot polyomino dot org dot uk>

Normal practice for software testsuites is that rather than
terminating immediately when a test fails, they continue running and
report at the end on how many tests passed or failed.

The principle behind the glibc testsuite stopping on failure was
probably that the expected state is no failures and so any failure
indicates a problem such as miscompilation.  In practice, while this
is fairly close to true for native testing on x86_64 and x86 (kernel
bugs and race conditions can still cause intermittent failures), it's
less likely to be the case on other platforms, and so people testing
glibc run the testsuite with "make -k" and then examine the logs to
determine whether the tests that failed are the ones expected to fail
on that platform, possibly with some automation for the comparison.
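
As an illustration of the kind of automation mentioned above, here is
a minimal shell sketch that compares the FAIL lines of a summary file
against a list of expected failures.  The function name
unexpected_failures and the expected-failures file (one
"FAIL: test-name" line per entry) are hypothetical and not part of
glibc; the "STATUS: test-name" line format is the one used by
tests.sum in this patch series.

```shell
# Hypothetical helper, not part of glibc: print the FAIL lines of a
# summary file that do not appear in a file of expected failures.
# Both files are assumed to hold one "FAIL: test-name" line per entry.
unexpected_failures () {
  sum_file=$1
  expected_file=$2
  # -F: fixed strings, -x: match whole lines, -v: keep non-matching
  # lines, -f: read the patterns from the expected-failures file.
  grep '^FAIL:' "$sum_file" | grep -v -x -F -f "$expected_file"
}
```

Empty output would mean that every failure was an expected one; the
function then returns nonzero, because the final grep selected no
lines.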

This patch switches the glibc testsuite to the normal convention of
not stopping on failure - unless you use stop-on-test-failure=y, in
which case it behaves essentially as it did before (and does not
generate overall test summaries on failure).  By default, the summary
tests.sum may now contain tests that FAILed.  At the end of the test
run, any ERROR or FAIL lines from tests.sum are printed, followed by a
count of results for each status, and then the run exits with error
status if there were any ERROR lines (meaning a directory had no test
results).  Build failures will still cause the test run to stop - this
has the justification that those *do* indicate serious problems that
should be promptly fixed and aren't generally hard to fix (and, apart
from that, avoiding the build stopping on those failures seems
harder).
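
The end-of-run reporting described above can be sketched as a
standalone shell function.  The commands mirror the ones this patch
adds to the Makefile; the function name summarize_results and passing
the summary file as an argument are illustrative assumptions.

```shell
# Illustrative only: the same steps as the new Makefile recipe lines,
# wrapped in a hypothetical function that takes the summary file.
summarize_results () {
  sum_file=$1
  # Show problem lines first; "|| true" keeps going when none match.
  grep '^ERROR:' "$sum_file" || true
  grep '^FAIL:' "$sum_file" || true
  echo "Summary of test results:"
  # Strip everything after the status word, then count each status.
  sed 's/:.*//' < "$sum_file" | sort | uniq -c
  # Only ERROR lines turn the overall result into a failure.
  if grep -q '^ERROR:' "$sum_file"; then
    return 1
  fi
  return 0
}
```

With a summary file that contains only PASS and FAIL lines this
returns 0; a single ERROR line makes it return 1.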

Questions:

* Should "make check" also exit with error status if there were any
  FAILs, not just if there were ERRORs?  I didn't do this because it
  seems useful to make the exit status distinguish the case where the
  build is broken (early termination or an ERROR) from the case where
  there are simply some test failures.

* There are various cases of miscellaneous files that are dependencies
  of tests, the build of which involves running some code from the
  newly built libc and associated programs, but which aren't counted
  as tests themselves (aren't in tests-special, don't use
  evaluate-test) - for example, timezone files generated with zic.
  Should these more systematically change to be counted as tests?
  They may not be that interesting as tests, but doing this would
  increase the chances of getting a complete log of test results when
  cross-testing if the host system is flaky - ensuring that only
  problems on the build system will terminate a test run early, not
  problems on the host used for running the newly built libc.

  (Ideally, in all cases of a test depending on another test - and
  there are plenty already of one test using the .out file from
  another test, whether or not we make a rule that running code on the
  host means it's a test - failure in the dependency would result in
  UNRESOLVED not FAIL from the test depending on the failed test.  But
  that may be hard to implement, and I'm not particularly concerned
  about one test failure causing others as fallout.)
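
On the first question: with the behaviour in this patch, and assuming
stop-on-test-failure is left empty, a caller can tell the two cases
apart by combining the exit status of "make check" with the contents
of tests.sum.  A minimal sketch, in which the function name
classify_run, its arguments and the printed labels are all
hypothetical:

```shell
# Hypothetical wrapper logic, not part of glibc.
# $1: exit status of "make check"; $2: path to tests.sum.
classify_run () {
  make_status=$1
  sum_file=$2
  if [ "$make_status" -ne 0 ]; then
    # Early termination (e.g. a build failure) or an ERROR line.
    echo "broken"
  elif grep -q '^FAIL:' "$sum_file"; then
    # The run completed, but some tests failed.
    echo "test-failures"
  else
    echo "clean"
  fi
}
```

It would be called right after the test run, for example as
classify_run $? path/to/tests.sum, where the path is a placeholder
for wherever the build wrote the summary.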

Tested x86_64.

2014-01-10  Joseph Myers  <joseph@codesourcery.com>

	* scripts/evaluate-test.sh: Handle fourth argument to determine
	whether test run should stop on failure.
	* Makeconfig (stop-on-test-failure): New variable.
	(evaluate-test): Pass fourth argument to evaluate-test.sh based on
	$(stop-on-test-failure).
	* Makefile (tests): Give a summary of results from testing and
	exit with failure status if they include an ERROR.
	(xtests): Likewise.
	* manual/install.texi (Configuring and compiling): Mention
	stop-on-test-failure=y.
	* INSTALL: Regenerated.
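
To illustrate the evaluate-test.sh change listed above: the fourth
argument only affects the exit status, not the result line that is
printed.  The following is a simplified sketch, not the script itself.
The function name evaluate is hypothetical, xfail handling is left
out, and mapping a nonzero rc to FAIL is an assumption, since that
branch is not shown in the diff.

```shell
# Simplified, hypothetical model of scripts/evaluate-test.sh after
# this patch; xfail handling is deliberately omitted.
# $1: test name; $2: exit status of the test; $3: "true" or "false".
evaluate () {
  test_name=$1
  rc=$2
  stop_on_failure=$3
  if [ "$rc" -eq 0 ]; then
    result="PASS"
  else
    result="FAIL"   # assumed; this branch is not shown in the diff
  fi
  echo "$result: $test_name"
  if $stop_on_failure; then
    return "$rc"    # propagate the failure so that make stops
  else
    return 0        # record the result and let make continue
  fi
}
```

Because the recipe's exit status is now 0 in the default case, make
carries on to the next test without needing "-k", while the FAIL line
still ends up in the .test-result file.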

diff --git a/INSTALL b/INSTALL
index bfa692d..852d192 100644
--- a/INSTALL
+++ b/INSTALL
@@ -192,11 +192,15 @@ an appropriate numeric parameter to `make'.  You need a recent GNU
 
    To build and run test programs which exercise some of the library
 facilities, type `make check'.  If it does not complete successfully,
-do not use the built library, and report a bug after verifying that the
-problem is not already known.  *Note Reporting Bugs::, for instructions
-on reporting bugs.  Note that some of the tests assume they are not
-being run by `root'.  We recommend you compile and test the GNU C
-Library as an unprivileged user.
+without reporting any unexpected failures or errors in its final
+summary of results, do not use the built library, and report a bug
+after verifying that the problem is not already known.  (You can
+specify `stop-on-test-failure=y' when running `make check' to make it
+stop immediately when a failure occurs rather than finishing running
+the tests then reporting all problems found.)  *Note Reporting Bugs::,
+for instructions on reporting bugs.  Note that some of the tests assume
+they are not being run by `root'.  We recommend you compile and test
+the GNU C Library as an unprivileged user.
 
    Before reporting bugs make sure there is no problem with your system.
 The tests (and later installation) use some pre-existing files of the
diff --git a/Makeconfig b/Makeconfig
index ea40b38..5f39df2 100644
--- a/Makeconfig
+++ b/Makeconfig
@@ -601,6 +601,12 @@ run-built-tests = yes
 endif
 endif
 
+# Whether to stop immediately when a test fails.  Nonempty means to
+# stop, empty means not to stop.
+ifndef stop-on-test-failure
+stop-on-test-failure =
+endif
+
 # How to run a program we just linked with our library.
 # The program binary is assumed to be $(word 2,$^).
 built-program-file = $(dir $(word 2,$^))$(notdir $(word 2,$^))
@@ -1092,6 +1098,7 @@ endif
 # XPASS or XFAIL rather than PASS or FAIL.
 evaluate-test = $(..)scripts/evaluate-test.sh $(test-name) $$? \
 		  $(if $(test-xfail-$(@F:.out=)),true,false) \
+		  $(if $(stop-on-test-failure),true,false) \
 		  > $(objpfx)$(@F:.out=).test-result
 
 endif # Makeconfig not yet included
diff --git a/Makefile b/Makefile
index 8434c00..129f443 100644
--- a/Makefile
+++ b/Makefile
@@ -326,10 +326,20 @@ tests: $(tests-special)
 	$(..)scripts/merge-test-results.sh -t $(objpfx) subdir-tests.sum \
 	  $(sort $(subdirs) .) \
 	  > $(objpfx)tests.sum
+	@grep '^ERROR:' $(objpfx)tests.sum || true
+	@grep '^FAIL:' $(objpfx)tests.sum || true
+	@echo "Summary of test results:"
+	@sed 's/:.*//' < $(objpfx)tests.sum | sort | uniq -c
+	@if grep -q '^ERROR:' $(objpfx)tests.sum; then exit 1; fi
 xtests:
 	$(..)scripts/merge-test-results.sh -t $(objpfx) subdir-xtests.sum \
 	  $(sort $(subdirs)) \
 	  > $(objpfx)xtests.sum
+	@grep '^ERROR:' $(objpfx)xtests.sum || true
+	@grep '^FAIL:' $(objpfx)xtests.sum || true
+	@echo "Summary of test results for extra tests:"
+	@sed 's/:.*//' < $(objpfx)xtests.sum | sort | uniq -c
+	@if grep -q '^ERROR:' $(objpfx)xtests.sum; then exit 1; fi
 
 # The realclean target is just like distclean for the parent, but we want
 # the subdirs to know the difference in case they care.
diff --git a/manual/install.texi b/manual/install.texi
index c0b8d9e..4644a64 100644
--- a/manual/install.texi
+++ b/manual/install.texi
@@ -224,11 +224,15 @@ GNU @code{make} version, though.
 
 To build and run test programs which exercise some of the library
 facilities, type @code{make check}.  If it does not complete
-successfully, do not use the built library, and report a bug after
-verifying that the problem is not already known.  @xref{Reporting Bugs},
-for instructions on reporting bugs.  Note that some of the tests assume
-they are not being run by @code{root}.  We recommend you compile and
-test @theglibc{} as an unprivileged user.
+successfully, without reporting any unexpected failures or errors in
+its final summary of results, do not use the built library, and report
+a bug after verifying that the problem is not already known.  (You can
+specify @samp{stop-on-test-failure=y} when running @code{make check}
+to make it stop immediately when a failure occurs rather than
+finishing running the tests then reporting all problems found.)
+@xref{Reporting Bugs}, for instructions on reporting bugs.  Note that
+some of the tests assume they are not being run by @code{root}.  We
+recommend you compile and test @theglibc{} as an unprivileged user.
 
 Before reporting bugs make sure there is no problem with your system.
 The tests (and later installation) use some pre-existing files of the
diff --git a/scripts/evaluate-test.sh b/scripts/evaluate-test.sh
index be156df..0b37030 100755
--- a/scripts/evaluate-test.sh
+++ b/scripts/evaluate-test.sh
@@ -17,11 +17,12 @@
 # License along with the GNU C Library; if not, see
 # <http://www.gnu.org/licenses/>.
 
-# usage: evaluate-test.sh test_name rc xfail
+# usage: evaluate-test.sh test_name rc xfail stop_on_failure
 
 test_name=$1
 rc=$2
 xfail=$3
+stop_on_failure=$4
 
 if [ $rc -eq 0 ]; then
   result="PASS"
@@ -35,4 +36,8 @@ if $xfail; then
 fi
 
 echo "$result: $test_name"
-exit $rc
+if $stop_on_failure; then
+  exit $rc
+else
+  exit 0
+fi

-- 
Joseph S. Myers
joseph@codesourcery.com
References:
- [2.20] [0/6] Generate test summaries with PASS / FAIL status
  - From: Joseph S. Myers