This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.

Re: ARM port of testsuite and general testsuite fixes

From: Quentin Barnes <qbarnes at urbana dot css dot mot dot com>
To: David Smith <dsmith at redhat dot com>
Cc: systemtap at sources dot redhat dot com
Date: Wed, 6 Jun 2007 12:11:38 -0500
Subject: Re: ARM port of testsuite and general testsuite fixes
References: <20070606003646.GB20115@urbana.css.mot.com> <4666DA5D.3060403@redhat.com>

On Wed, Jun 06, 2007 at 11:01:33AM -0500, David Smith wrote:

Quentin Barnes wrote:

The ARM version of the test suite of 20070602 just completed.  I'll
post the specifics later.  With the earlier patches plus the patch
below, here's the summary using a 2.6.21.1 kernel in case anyone's
curious:

=======
               === systemtap Summary ===

# of expected passes            277
# of unexpected failures        37
# of unexpected successes       1
# of expected failures          129
# of known failures             7
# of untested testcases         30
# of unsupported tests          3
=======

Of course it is amazing that you got this working at all, but 37 failures is still quite high.


Some of them are due to my kernel configuration.  I run a stripped
down kernel with many things configured off.  I've noticed some
failures were relying on features being enabled.  I've gone back and
turned on some of these, but not all.  Also, I don't know how many
of them are due to the tests needing updating for later kernels.

I see more detailed summaries posted that list specific failures.
Are these produced by a script, or do people hand-massage the
output from the log each time?

Here's the list of my test failures:
=======
$ egrep '^(FAIL|XPASS)' systemtap.log
FAIL: BASIC2 wasn't cached
FAIL: OPTION2 wasn't cached
FAIL: BULK2 wasn't cached
FAIL: MERGE2 wasn't cached
FAIL: RUNTIME2 wasn't cached
FAIL: BASIC4 wasn't cached
FAIL: systemtap.base/deref.stp startup (eof)
FAIL: systemtap.base/optim.stp shutdown (eof)
FAIL: OVERLOAD2 didn't receive expected overload
FAIL: probefunc:kernel.statement(0xc0035c54).absolute shutdown (eof)
FAIL: prologues -P
FAIL: prologues no-P
FAIL: buildok/eighteen.stp
FAIL: buildok/four.stp
FAIL: buildok/memory.stp
FAIL: buildok/scsi.stp
FAIL: buildok/tcp_test.stp
FAIL: buildok/twenty.stp
FAIL: buildok/twentyfive.stp
FAIL: buildok/twentythree.stp
FAIL: buildok/udp_test.stp
XPASS: semko/thirtytwo.stp
FAIL: semok/twelve.stp
FAIL: semok/twenty.stp
FAIL: systemtap.samples/lket(semantic error)
FAIL: pfaults (0)
FAIL: profile
FAIL: systemtap.samples/tcptest.stp compilation
FAIL: transport fill staging buffer - relayfs (0)
FAIL: systemtap.stress/current.stp compilation
FAIL: 32-bit acct
FAIL: 32-bit forkwait
FAIL: 32-bit mmap
FAIL: 32-bit net1
FAIL: 32-bit openclose
FAIL: 32-bit stat
=======
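
As a rough illustration of how a listing like the one above could be
summarized by a script rather than by hand, here is a minimal Python
sketch.  It is not an existing systemtap tool; the function names and
the grouping rule (text before the first "/") are assumptions made for
the example.  The sample text reuses a few lines from this message.

```python
import re
from collections import Counter

# A few lines copied from this message; the summary line is included
# only to show that non-matching lines are skipped.
SAMPLE_LOG = """\
# of expected passes            277
FAIL: BASIC2 wasn't cached
FAIL: buildok/four.stp
FAIL: buildok/scsi.stp
XPASS: semko/thirtytwo.stp
FAIL: semok/twelve.stp
FAIL: 32-bit acct
"""

# Same selection as: egrep '^(FAIL|XPASS)' systemtap.log
RESULT_LINE = re.compile(r"^(FAIL|XPASS): (.*)$")


def unexpected_results(log_text):
    """Return (status, test name) pairs for FAIL and XPASS lines."""
    results = []
    for line in log_text.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            results.append((match.group(1), match.group(2)))
    return results


def group_of(name):
    """Hypothetical grouping: the directory part of the name, if any."""
    return name.split("/", 1)[0] if "/" in name else "(other)"


results = unexpected_results(SAMPLE_LOG)
by_status = Counter(status for status, _ in results)
by_group = Counter(group_of(name) for _, name in results)

for status, count in sorted(by_status.items()):
    print(f"{status}: {count}")
for group, count in sorted(by_group.items()):
    print(f"  {group}: {count}")
```

Grouping by directory makes it easier to see at a glance whether the
failures cluster in one area (buildok, semok, and so on).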

If someone wants to make a quick pass over this list to flag which
failures definitely are unexpected for my kernel and architecture,
I'll give those priority in analyzing.

I just didn't want to waste a lot of time dealing with already known
problems.

Below is the patch necessary to fix all timeout problems when running
on an ARM processor and other porting issues related to ARM.

I didn't go with a strategy that makes the hardcoded timeout values
variable.  Once I fixed various testsuite bugs, the timeout values
didn't increase all that much after all.  I felt the difference
wasn't enough to warrant switching to a new scheme.  If people feel
otherwise, we should discuss it further.

Hmm. Looking over the timeout value changes/additions, I see several different values:

20 (1 instance)
60 (1 instance)
120 (4 instances)
150 (17 instances)
180 (3 instances)
240 (4 instances)
360 (1 instance)
400 (1 instance)
1800 (1 instance)


There's also a couple of "-1"s in there.  I put these in when the
expect statement had no timeout clause, but they all have timeouts.
Someone might want to review and tweak these if there is ever a
chance that those could get stuck and need an explicit timeout.

It seems like we ought to standardize a bit and have 4 (or some other number) standard timeout values (that we could customize per platform if needed). I think you (or someone else) suggested this earlier.


I tried to make them on some rounded boundaries that I knew would
work based on the timing data from the run but not too large.  I
generally went for rounded up multiples of 30.
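
To make the standardization idea concrete, here is a small Python
sketch.  It is only an illustration: the tier names, the tier values,
and the per-platform scale factor are all hypothetical, and the real
testsuite is written in expect rather than Python.  The observed values
and instance counts are the ones listed above.

```python
import math
from collections import Counter

# Timeout values (seconds) mapped to instance counts, from the list above.
OBSERVED = {20: 1, 60: 1, 120: 4, 150: 17, 180: 3,
            240: 4, 360: 1, 400: 1, 1800: 1}

# Hypothetical set of four standard tiers (seconds).
TIERS = {"short": 60, "medium": 180, "long": 420, "stress": 1800}


def round_up(seconds, step=30):
    """Round a timeout up to the next multiple of `step` seconds."""
    return math.ceil(seconds / step) * step


def tier_for(seconds):
    """Pick the smallest standard tier that still covers `seconds`."""
    for name, limit in sorted(TIERS.items(), key=lambda item: item[1]):
        if seconds <= limit:
            return name
    raise ValueError(f"no tier covers {seconds} seconds")


def platform_timeout(tier, scale=1.0):
    """Scale a tier for a slower platform, keeping the 30 s rounding."""
    return round_up(TIERS[tier] * scale)


# How many of the existing timeouts would land in each tier.
usage = Counter()
for value, count in OBSERVED.items():
    usage[tier_for(value)] += count

print(dict(usage))
print(platform_timeout("medium", scale=1.5))
```

With these made-up tiers, most of the existing values fall into the
middle tier, which suggests a handful of standard values could cover
the patch without much loss of precision.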

(Note that I don't have a problem with you checking your patch in as is and then we can go back later and improve things.)

There are various bug fixes scattered throughout the patch.  Please
review them carefully, but most should be self-explanatory.

One question I have is that you changed several instances of "." to "\." in regular expressions. Out of curiosity did you actually see a problem here or were you just cleaning up?


Nope, just cleanup.  If I was studying a line related to a failure I
was analyzing and I saw an unescaped meta character that should have
been escaped, I escaped it.  Then, I would grep for similar lines
in the rest of the scripts figuring on yank-and-put propagation of
the mistake and fix those too.  The change from if to switch that
I mentioned was just for robustness, but the other pattern changes
involving ".*" to "[^\r]*" and the rewrite in system_func.exp were
actual bug fixes.
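
For anyone unfamiliar with why these two kinds of pattern change
matter, here is a small Python sketch.  It only illustrates the general
regex behavior, not the specific testsuite bugs; the "count = ..."
output lines are made up, and re.DOTALL is used on the assumption that
the engine in question lets "." match line terminators as well.

```python
import re

# An unescaped "." matches any character, so the pattern is looser
# than the literal file name it was meant to match.
loose = re.compile(r"deref.stp")
strict = re.compile(r"deref\.stp")

print(bool(loose.search("derefXstp")))    # True: "." matched "X"
print(bool(strict.search("derefXstp")))   # False
print(bool(strict.search("deref.stp")))   # True

# Two hypothetical output lines, each terminated by "\r\n".
buffer = "count = 5\r\ncount = 7\r\n"

# Greedy ".*" can run past the end of the first line and swallow the
# second one before backtracking to the last "\r\n".
greedy = re.search(r"count = (.*)\r\n", buffer, re.DOTALL)

# "[^\r]*" cannot cross a carriage return, so it stays on one line.
bounded = re.search(r"count = ([^\r]*)\r\n", buffer)

print(repr(greedy.group(1)))    # '5\r\ncount = 7'
print(repr(bounded.group(1)))   # '5'
```

The bounded form captures only the first line's value, which is
usually what a per-line match in a test script is after.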

--
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

Quentin

Follow-Ups:
- Re: ARM port of testsuite and general testsuite fixes
  - From: David Smith

References:
- ARM port of testsuite and general testsuite fixes
  - From: Quentin Barnes
- Re: ARM port of testsuite and general testsuite fixes
  - From: David Smith
