This is the mail archive of the mailing list for the GDB project.

Re: [patch/rfc] Remove all setup_xfail's from testsuite/gdb.mi/

I don't think making it a requirement that we go out and analyze all the
existing XFAILs is reasonable, although it is patently something we
need to do.  That's not the same as ripping them out and introducing
failures in the test results without addressing those failures.

As a specific example, the i386 has an apparently low failure rate. That rate is badly misleading: the real number of failures is much higher :-( Those failures have simply been [intentionally] camouflaged using xfail. It would be unfortunate if people, for the i386, tried to use that false result (almost zero fails) when initially setting the bar.

Have you reviewed the list of XFAILs?  None of them are related to the
i386.  One, in signals.exp, is either related to GDB's handling of
signals or to a longstanding limitation in most operating system
kernels, depending how you look at it.  The rest are pretty much
platform independent.
I've been through the files and looked at the actual xfail markings. They are dominated by what look like CPU-specific cases (rs6000 and HP are especially bad at this).

I've also noticed cases where simply yanking the xfail doesn't make sense: those where the failure has already been analyzed (easy to spot, since they are conditional on the debug info or compiler version).

This is also why I think the xfails should simply be yanked. Doing so acts as a one-time reset of gdb's test results, restoring them to their true values. While this may cause the bar to start out lower than some would like, I think that is far better and far more realistic than trying to start with a bar falsely set too high.

This is a _regression_ testsuite.  I've been trying for months to get
it down to zero failures without compromising its integrity, and I've
just about done it for one target, by judicious use of KFAILs (and
fixing bugs!).  The existing XFAILs all look to me like either
legitimate XFAILs or things that should be KFAILed.  If you're going
to rip up my test results, please sort them accordingly first.
No one is ripping up your individual and personal test results.

Several years ago some maintainers were intentionally xfailing many of the bugs that they had no intention of fixing. That was wrong, and that needs to be fixed.

An unfortunate consequence of that action is that the zero you've been shooting for is really only a local minimum. The real zero is further out; that zero was a mirage :-(

It doesn't need to be done all at once.  We can put markers in .exp
files saying "xfails audited".  But I think that we should audit
individual files, not yank madly.
(Which reminds me: the existing xfail references to bug reports need to be ripped out, since they refer to Red Hat and HP bug databases :-( )

Am I the only one who considers well-categorized results important?
Of course not. All the good developers on this list take the test results, and their analysis, very seriously.

If you introduce seventy failures, then that's another couple of weeks I
can't just look at the results, see "oh, two failures in threads and
that's it, I didn't break anything".
People doing proper test analysis should be comparing the summary files and not the final numbers. A summary analysis would show 70 XFAIL->FAIL changes, but no real regressions.
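
As a rough illustration of that kind of comparison, here is a minimal sketch that diffs two DejaGnu summary files by test name instead of by totals. It assumes the usual "RESULT: test name" line format of a .sum file. The function names and the choice of which transitions count as regressions are hypothetical, not part of the GDB testsuite.

```python
import re
from collections import Counter

# Summary lines are assumed to look like "FAIL: gdb.mi/some-file.exp: test name".
RESULT_RE = re.compile(
    r"^(PASS|FAIL|XPASS|XFAIL|KPASS|KFAIL|UNRESOLVED|UNTESTED|UNSUPPORTED): (.*)$"
)


def parse_sum(text):
    """Map each test name to its result; a repeated name keeps the last result."""
    results = {}
    for line in text.splitlines():
        match = RESULT_RE.match(line.strip())
        if match:
            results[match.group(2)] = match.group(1)
    return results


def transitions(old_text, new_text):
    """Count (before, after) result changes for tests present in both runs."""
    old, new = parse_sum(old_text), parse_sum(new_text)
    changes = Counter()
    for name, before in old.items():
        after = new.get(name)
        if after is not None and after != before:
            changes[(before, after)] += 1
    return changes


def regressions(changes):
    """Count tests that used to PASS and now FAIL or are UNRESOLVED (an assumed definition)."""
    return sum(
        count
        for (before, after), count in changes.items()
        if before == "PASS" and after in ("FAIL", "UNRESOLVED")
    )
```

Fed the text of a summary taken before the xfails are removed and one taken after, transitions() would report the removals as ("XFAIL", "FAIL") entries, while regressions() stays at zero unless a test that used to pass has stopped passing.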


If the existing (bogus) xfail PR numbers are _all_ ripped out, and all new xfails are then required to include a corresponding bug report, I think there is a way forward.

