This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: testsuite and hardcoded timeouts


Quentin Barnes <qbarnes@urbana.css.mot.com> writes:

[...]
Ah, maybe there is some middle ground here.  Instead of putting the
effort into figuring out some portable method for dynamic timeouts,
just change the behavior for a timeout to be user-settable [...]

It can be even easier than that. dejagnu's "timeout" tcl variable is exactly the default timeout duration in seconds.

The "timeout" variable is an expect feature. It is already set in stap_run.exp and stap_run2.exp, but timeouts are also manually specified in various expect statements sprinkled through the testsuite. Those are the ones that cause me the most headaches. Otherwise, tinkering with just two files would be trivial.

The .exp files under
testsuite/config or even testsuite/lib could set this global variable
based on the "ishost" predicate - leave it for i686, double it for
s390x, dedicule (!) it for arm.  Then we just need to police the test
cases to avoid messing with this value.

It's not that simple. For example, my setup is really, really slow because it uses NFS-mounted root and swap with a small amount of RAM. Another ARM system could easily run 5x-10x faster than mine just by having more memory or a real hard disk.

Rather than create an "ishost" rule, I suggested that it would
probably be better to use the MHz or BogoMIPS number from
/proc/cpuinfo.  But even that's a heuristic, because it only takes
into account the CPU speed, not the overall system speed, which can
be choked by I/O limitations.
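
As a rough illustration of the BogoMIPS idea, here is a minimal sketch in Python (not the expect/Tcl the testsuite itself uses). The reference value of 4000 and the function names are assumptions made up for the example, not anything in the testsuite, and the rule "scale up for slower CPUs, never down" is just one possible policy.

```python
import re

# Hypothetical baseline: the BogoMIPS of a machine on which the
# default timeouts are known to be adequate. Not from the testsuite.
REFERENCE_BOGOMIPS = 4000.0


def parse_bogomips(cpuinfo_text):
    """Return the first BogoMIPS value in /proc/cpuinfo-style text, or None."""
    # The key is spelled "bogomips" on x86 and "BogoMIPS" on ARM,
    # so match case-insensitively at the start of a line.
    match = re.search(r"^bogomips\s*:\s*([0-9]+(?:\.[0-9]+)?)",
                      cpuinfo_text, re.IGNORECASE | re.MULTILINE)
    return float(match.group(1)) if match else None


def scaled_timeout(base_seconds, cpuinfo_text, reference=REFERENCE_BOGOMIPS):
    """Lengthen base_seconds for CPUs slower than the reference."""
    bogomips = parse_bogomips(cpuinfo_text)
    if not bogomips or bogomips <= 0:
        # No usable number: fall back to the unscaled timeout.
        return base_seconds
    # Never shorten the timeout on fast machines; only lengthen it.
    factor = max(1.0, reference / bogomips)
    return int(round(base_seconds * factor))
```

On a real system the text would come from something like `open("/proc/cpuinfo").read()`. This still only measures the CPU, so a box throttled by NFS or swap would get too short a timeout.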


What I'd like to know is whether it is really necessary to have fatal timeouts. How often does a test suite run truly hang in a way that the timeout feature gets it unstuck?

I've found that if my system has taken too long, it's due to a bug
and the kernel is no longer stable.  However, I don't work on the
stap translator.  I suspect bugs in it are what cause recoverable
test hangs.

- FChE

Quentin


