This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.



Re: How to associate Elf with Dwfl_Module returned by dwfl_report_module


Hi Milian,

On Wed, Mar 21, 2018 at 02:01:41PM +0100, Milian Wolff wrote:
> Here's the code for the perf tools:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/unwind-libdw.c?h=perf/core#n52
> 
> Here's the code for the perfparser:
> 
> http://code.qt.io/cgit/qt-creator/perfparser.git/tree/app/perfsymboltable.cpp#n479
> 
> Let's concentrate on perf for now, but perfparser has similar logic:
> 
> We parse the mmap events in the perf.data file and store that information. 
> Note that the perf.data file does not contain events for munmap calls. Then 
> while unwinding the call stack of a perf sample, we look up the most recent mmap 
> event for every given instruction pointer address, and ensure that the 
> corresponding ELF was registered with libdw.

So, modules are never deregistered?
In that case, that might explain the issue.
But I see there is a check whether there is already something at the
address. The interface to "remove" a module might not be immediately clear.
The idea is that if modules need to be removed, you call
dwfl_report_begin, possibly dwfl_report_elf for any new module, and then
dwfl_report_end, which takes a callback that is invoked for each old
module that was not re-reported. The callback decides whether to
re-report the module; if it does not, the module gets removed. You might
want to experiment with doing that and not re-reporting any module that
overlaps with the new module. (See the libdwfl.h documentation for a
hopefully clearer description.)

> > Specifically are you using false for the add_p_vaddr argument?
> 
> Yes, we are.
> 
> > And could you provide some example where the reported address is
> > wrong/different from the start address of the Dwfl_Module?
> 
> I don't think it's the start address that is wrong, rather it's the end 
> address. But it's hard for me to come up with a small self-contained example at 
> this stage. I am regularly seeing broken backtraces for samples where I have 
> the gut feeling that missing reported ELFs are to blame. But we report 
> everything, except for scenarios where the mmap events seemingly overlap. This 
> overlapping is, as far as I can see, actually a side effect of remapping 
> taking place in the dynamic linker (i.e. a single dlopen/dynamically linked 
> library can yield multiple mmap events). One way or another, we end up with a 
> situation where we cannot report an ELF to dwfl due to two issues:
> 
> a) either ELF tells us we are overlapping some module and just stops, which is 
> bad, since we would actually much prefer the newly reported ELF to take 
> precedence
> 
> b) we find an mmap event with a non-zero pgoff, and have no clue how to 
> call dwfl_report_elf and just give up.
> 
> In both cases, I was hoping for dwfl_report_module to help since it seemingly 
> allows me to exactly recreate the mapping that was traced originally.

If you could add some logging and post that plus the eu-readelf -l
output of the ELF file, that might help track down what is really going
on.

Cheers,

Mark
