This is the mail archive of the
mailing list for the binutils project.
[RFC/RFA] dangling bfd pointer in archive cache...
- From: Joel Brobecker <brobecker at adacore dot com>
- To: binutils at sourceware dot org
- Cc: gdb-patches at sourceware dot org
- Date: Tue, 2 Oct 2012 07:14:06 -0700
- Subject: [RFC/RFA] dangling bfd pointer in archive cache...
I am trying to fix a crash that's occurring when running a program
from GDB on ppc-aix. Take any program that uses threading, and:
Starting program: /[...]/task_switch
zsh: 46458 segmentation fault (core dumped) gdb-head -q task_switch
This is related to a recent patch that started counting references
to BFDs, and closing them when the reference count reached zero.
Here is what is happening in chronological order:
During the startup phase, GDB receives notification that libthread.a
has been mapped. It creates an archive BFD for it, and starts going
through it object files. It looks at the first one by calling:
bfd *result = bfd_openr_next_archived_file (archive, previous);
(where previous is NULL).
Following what archive.c:bfd_openr_next_archived_file does, we
find that it calls coff-rs6000.c:_bfd_xcoff_openr_next_archived_file,
which eventually calls archive.c:_bfd_get_elt_at_filepos. This
routine first checks the archive's cache for our bfd, and creates
a new one if not found. At the end of the element's creation, it
then adds it to the archive BFD's cache:
if (_bfd_add_bfd_to_archive_cache (archive, filepos, n_nfd))
Back to GDB, GDB looks at our elt bfd, finds that it's not the one
it is looking for, gets the next one using the same function, and
then unref's it. As the ref count of that first objfile reached zero,
it therefore calls bfd_close.
This is when things start going wrong, as bfd_close frees the memory
allocated to our elt bfd, but does not remove it from the archive's
cache. As a result, the next time we query the first elt of our archive,
we find the reference in the cache, and return that - a pointer to
free'ed memory, which eventually leads to a crash.
Looking further into this, I went back and forth between different
approaches, until I found that archive.c defines a function that
the cleanup: archive.c:_bfd_archive_close_and_cleanup. I don't
think it should be hooked up to the target vector to be called
automatically, since it's not entirely a target properly, more like
a "construct" property. So I've simply added a call to it from
This fixes the problem on ppc-aix, tested using the gdb-testsuite
on ppc-aix and x86_64-linux. I'd be happy to do more testing if
someone told me which testsuites to run for this type of change.
But before doing so, I thought I'd make sure I am making the correct
type of change...
* opncls.c (bfd_close); Add call to _bfd_archive_close_and_cleanup.
diff --git a/bfd/opncls.c b/bfd/opncls.c
index fdccba3..b9b3743 100644
@@ -719,6 +719,17 @@ bfd_close (bfd *abfd)
if (! BFD_SEND (abfd, _close_and_cleanup, (abfd)))
+ /* If ABFD is an archive bfd, or a bfd contained in an archive,
+ do any associated cleanup. This is necessary to avoid leaving
+ a dangling pointer somewhere. For instance, if ABFD is an object
+ contained in an archive, the archive's cache might have a reference
+ to ABFD that needs to be removed before we free ABFD.
+ _bfd_archive_close_and_cleanup does nothing if ABFD is neither
+ an archive BFD nor a BFD of an object contained in an archive,
+ so it is safe to call it without checking the bfd kind. */
+ _bfd_archive_close_and_cleanup (abfd);
ret = abfd->iovec->bclose (abfd);