This is the mail archive of the
gdb@sources.redhat.com
mailing list for the GDB project.
Re: Checking function calls
- From: Michael Elizabeth Chastain <mec at shout dot net>
- To: fredrik at dolda2000 dot cjb dot net, gdb at sources dot redhat dot com
- Date: Wed, 4 Dec 2002 22:51:51 -0600
- Subject: Re: Checking function calls
Hi Fredrik,
I'm throwing out a bunch of ideas here; take whatever looks useful
and discard the rest.
> Therefore, the failure has to be that a called
> function doesn't restore EBX correctly, on rare occasions, right?
I have seen this happen in a mixed programming environment,
with a Cygwin program that used a Windows DLL. The Windows DLL
had subtly different calling conventions where it did not preserve
%ebx, %esi, and %edi across function calls. Perhaps you have some
kind of third party library in your program which has a similar
compatibility issue?
> My question is thus: Is there any way of debugging this with GDB? Can I
> make GDB check that EBX is the same before and after every function call
> from that frame in this thread to isolate the failing function? The
> frame never exits (until the program exits, that is), if that helps.
You could set a bunch of conditional breakpoints with "break LOCATION if
$ebx != saved_ebx", where you add code to your program to initialize
saved_ebx. Or you could say "break LOCATION if $ebx < 0x1000" or some
other convenient constant.
You could also try forcing your variable to be on the stack instead of a
register. Remove the "register" attribute from the declaration of "next"
if you have one. Then add a "do_nothing(&next)" call to your function,
to force "next" to be on the stack instead of in a register. If the
symptoms go away then it's more likely to really be a register clobber.
If the symptoms remain then it's more likely to be a memory clobber
(or you have a really sick low-level function that clobbers random words
on the stack, but this does not feel like that).
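
A rough sketch of the "force it onto the stack" experiment, in case it helps.
The node type, sum_list(), the visit callback, and escape_sink are names I made
up for illustration; only "next" and do_nothing() come from the suggestion
above. Ideally do_nothing() lives in a separate source file so the compiler
cannot see through it; storing the address in a volatile pointer is a fallback
that should have a similar effect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node; the real program's type will differ. */
struct node {
    int value;
    struct node *next;
};

/* Storing the address in a volatile object makes it "escape", so the
   compiler has to keep the pointed-to variable in memory across calls
   it cannot see into. */
static void *volatile escape_sink;

void do_nothing(void *p)
{
    assert(p != NULL);
    escape_sink = p;
}

/* Walks the list, calling visit (if non-NULL) on each node, and returns
   the sum of the values. "next" is read before the call, as in a loop
   that may delete the current node. */
int sum_list(struct node *head, void (*visit)(struct node *))
{
    struct node *cur;
    struct node *next = NULL;   /* note: no "register" keyword */
    int total = 0;

    do_nothing(&next);          /* forces "next" to have a stack slot */

    for (cur = head; cur != NULL; cur = next) {
        next = cur->next;
        total += cur->value;
        if (visit != NULL)
            visit(cur);         /* the call suspected of clobbering state */
    }
    return total;
}
```

If the crash still shows up with "next" living in memory, a clobbered
callee-saved register becomes a much less likely explanation.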
> At first I was expecting that another thread somehow gets there and
> modifies the storage memory of next.
I still suspect this. It's more likely that memory gets clobbered rather
than a register value.
Perhaps you need a function that locks the whole list and walks it for
a sanity check, without deleting anything?
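
Something along these lines is what I have in mind for the sanity check. This
is only a sketch under assumptions: I am guessing that you use pthreads and a
singly linked list, and struct cnode, struct clist, and list_check() are
invented names. The 0x1000 threshold is the same "convenient constant" as
above; it would catch the small bogus pointer values you are seeing.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node and list types, for illustration only. */
struct cnode {
    struct cnode *next;
    int payload;
};

struct clist {
    pthread_mutex_t lock;
    struct cnode *head;
};

/* Locks the list and walks it without modifying anything.
   Returns 0 if every link looks plausible, otherwise the 1-based
   position of the first suspicious node. max_nodes bounds the walk
   so that a cycle is reported instead of looping forever. */
size_t list_check(struct clist *l, size_t max_nodes)
{
    size_t pos = 0;
    size_t bad = 0;
    struct cnode *n;

    assert(l != NULL);
    pthread_mutex_lock(&l->lock);
    for (n = l->head; n != NULL; n = n->next) {
        uintptr_t p = (uintptr_t)n->next;

        pos++;
        /* Check the link before following it. */
        if (pos > max_nodes ||
            (p != 0 && p < 0x1000) ||
            p % sizeof(void *) != 0) {
            bad = pos;
            break;
        }
    }
    pthread_mutex_unlock(&l->lock);
    return bad;
}
```

Calling this at a few strategic points (or from GDB with "call") may narrow
down when the list first goes bad.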
Here is another wild lead: perhaps a block somehow gets freed and then
you read it. Many implementations of malloc keep housekeeping information
in the first word or two of a freed block. That would explain why the
value is always 0x10 to 0x30 (that could be a block size, especially if it
is rounded up to a multiple of 4 or 8) and why only 1-2 words are clobbered.
If you manage your blocks with malloc/free, you could try turning on any
malloc debugging facilities that you have.
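
If you have no malloc debugging facility handy, a poor man's version is to
poison each block just before freeing it. The sketch below assumes your nodes
are freed through one function; struct pnode, POISON_BYTE, node_poison(),
node_is_poisoned(), and node_free() are all hypothetical names. The writes go
through a volatile pointer because a compiler may otherwise drop stores to
memory that is about to be freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type, for illustration only. */
struct pnode {
    struct pnode *next;
    int payload;
};

#define POISON_BYTE 0xDD

/* Fill the whole node with a recognizable pattern. */
void node_poison(struct pnode *n)
{
    volatile unsigned char *p = (volatile unsigned char *)n;
    size_t i;

    assert(n != NULL);
    for (i = 0; i < sizeof *n; i++)
        p[i] = POISON_BYTE;
}

/* Returns 1 if every byte of the node carries the poison pattern. */
int node_is_poisoned(const struct pnode *n)
{
    const unsigned char *p = (const unsigned char *)n;
    size_t i;

    for (i = 0; i < sizeof *n; i++)
        if (p[i] != POISON_BYTE)
            return 0;
    return 1;
}

/* Use this in place of free() for list nodes. */
void node_free(struct pnode *n)
{
    if (n == NULL)
        return;
    node_poison(n);
    free(n);
}
```

A "next" pointer made of 0xDD bytes then points at a stale node. A small
value such as 0x10 to 0x30 would instead fit the allocator having written its
own housekeeping over the start of the freed block.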
Hope this helps,
Michael C