This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.
Re: Ensure correct symbol-file when attaching to a (remote) process
On 12-12-21 08:38 PM, John Gilmore wrote:
>> Interesting timing. I have just posted
>> http://sourceware.org/ml/gdb-patches/2012-12/msg00776.html addressing
>> this issue.
> That's a useful patch. It's smart to read the headers from the
> "symbol_file" bfd (well, in this case the solib bfd used for symbols)
> rather than from the "exec_file" bfd.
>
> But doesn't it read the ELF header twice? In
> svr4_validate_hdrs_match, it reads it once in the call to
> read_elf_header_from_target, and then reads it again by calling
> read_program_headers_from_target which again calls
> read_elf_header_from_target. I suspect that you should pass the
> already-read ELF header into each of the read_program_headers
> functions.
Yes, it reads it twice, but our target memory could (and probably
should) cache this. Note that we are talking about sizeof (Elf32_Ehdr) or
sizeof (Elf64_Ehdr), which is not a lot of data.
The way it is proposed in the patch matches the corresponding
read_*_from_bfd functions, which do not require passing the ELF header
in order to read the phdrs.
This is not to say my proposal is the only way of doing it; what you are
suggesting is a valid approach as well. However, I would prefer to leave
it as it is in the patch.
> Also, in read_elf_header_from_bfd, BFD has probably already read the
> ELF header, when it first opened the file, and has it lying around.
> This code should not be (1) reading it again, nor (2) seeking to file
> offset "0" (not even a #define, not even "0L", but an int constant!)
> to get it.
The ELF header is always at 0, so I'm not sure what define you would
prefer to see there instead. The ELF header is specifically located at
offset 0; this is mandated by the System V specification of the ELF
format. I can put 0L, but in this case, does it really matter?
> Also, it might be simpler in read_program_headers_from_target to just
> translate the ELF header into its internal BFD struct form - then all
> that manual messing with byte orders and field lengths could be
> eliminated. See how read_program_headers_from_bfd does it.
Note that this is because we read directly from target memory.
> You could also just call bfd_get_elf_phdrs (and before that,
> bfd_get_elf_phdr_upper_bound to size the buffer) which would avoid
> spreading details into GDB about these headers and how they are read
> and translated into local byte order. BFD even has a
> bfd_elf_bfd_from_remote_memory function for creating a BFD by portably
> reading the target's ELF header and phdrs out of memory.
I will have to look at how BFD could be used to read from target memory,
but the idea is to compare 'raw' data. There is no need to translate
into internal form: the in-memory data needs to match that from the
file, so no cross-endianness handling happens here at all (except, of
course, for getting the phdr offset directly from the in-memory ELF
header; I intentionally did not want to use BFD's internal
representation for that, as it corresponds to the header read from the
file residing on the host, which does not necessarily match the
in-memory one, and that is exactly what we are checking here).
> Also, is "validate" the right name for this? That works for checking
> shared libraries, but it's not as obvious what this function should
> do when e.g. checking the argument to symbol_file, or immediately
> after doing "target attach remote". If "validation" fails, should
> we refuse to read that symbol file, or refuse to attach to the remote
> target? What exactly are we validating? Perhaps "compare_headers"
> or "do_symbols_and_target_match" are better names.
I am not a native English speaker and as such I will accept better
proposals, but I thought 'validate' would be the right name. We are at
the point where we are reading the symbol file. 'Symbol file' is generic
here (and 'symbol' may be somewhat confusing: the file we found on the
host could be completely stripped, so that only "minimal symbols" (those
that can be derived from .dynsym/.dynstr) are available, but that is
irrelevant to what is being validated here); we are really opening a
file on the host that corresponds to an object loaded in target memory.
Therefore, attach is only one way of getting into this situation; it may
also be provoked by e.g. noshared ... shared. Regardless of how we got
here, we found a file that GDB thinks corresponds to the one in the
target's memory. We are now validating it, answering the question: "Is
the found file valid?".

I chose 'validate' as a generic concept; someone may decide to implement
some other type of validation, e.g. a bit-for-bit comparison of the text
segment, or someone might decide to open the file using target fileio
and read it all. My proposal is a check that is practical: it is almost
always sufficient and a lot faster than doing a full (i.e. exhaustive)
comparison of the segments. A full segment comparison could not include
anything that might have been modified by the dynamic linker, so we
would be restricted to a full text compare (still not exhaustive, but
slightly better than what is in the proposal).
In effect, a bfd corresponding to an object that does not pass
validation gets thrown away, and we do not read symbols from it. We had
done some work on it by then, such as relocating sections and general
bfd setup, but that's about it. This is all host-side work, so I don't
think it affects performance much. If this is a concern, the validation
could take place earlier, but I don't see a need for that at the moment.
> With those things addressed, it's a good start. Then, shouldn't this
> call be implemented for non-shared-library symbol files (like
> Mr. Zulliger was hoping for), and for object file formats other than
> ELF?
Not for non-ELF formats, but the API is there; it can be implemented (I
did not plan to do that).

I'm not sure I understand the first part of the question: it doesn't
matter whether the object is shared or not. What matters is that it is
an object loaded by the dynamic linker (therefore this validation is
performed only for dynamic processes, which typically link shared
objects).

I believe we already have executable comparison in place, but I will
double-check.
> Has this code been tested when the target uses a different byte
> order than the host? Probably not, since it is only currently invoked
> for shared library loads, which usually only happen native. Is
> there a test case written for it?
This was implemented primarily for remote scenarios, as a library
mismatch is a lot more likely in that use case. I believe I addressed
differing endianness where applicable. I also put in a comment on why a
bit comparison of the 'raw' headers is fine: we are reading a library
that resides on the host but is built for the target. Therefore, the
endianness in the headers must match, and there is no need to translate
it.
Testcase: I haven't written one yet.
> Could it be easily extended to check a build-id when one exists? Or
> would that require different function arguments, or need to happen at
> a different time?
This is stronger than a build-id check. Note also that this does not
presume that any particular section header (nor section headers in
general) is present in the loaded image. The build-id is not prescribed
to be mapped into a loadable segment and cannot be counted on to be
present in target memory. I specifically did not want to require any
file operations on the target. Notice that the data read from the
target relies _only_ on data that _must_ be present for the dynamic
linker, and consequently must reside in target memory. It does not
attempt to use section headers (e.g. we strip them for some of our
libraries) nor bits that are optional.
> Should "target attach" do this kind of validation traffic to and from
> the target? Or should checking this be a separate step initiated by
> the user? The target might be in a relatively sensitive state -- or
> the user might not want to wait for this check -- immediately after
> attaching. We could add yet another GDB setting for automatic or
> manual validation, defaulting to "check", and requiring those who want
> to bypass the check to set that setting. Is that too much
> complication for the gain?
The traffic should be fairly small. An Elf64_Ehdr is 64 bytes. The
phdrs may vary in size as their number varies, but from my observations
today (on native 64-bit GNU/Linux), glibc has phdrs that total less
than 1K. This is one-time work, so I think it won't be detrimental to
performance, while providing very useful functionality (those who have
had to deal with library mismatches know what I am talking about).
As stated above, this check happens on shared library load. My example
from the patch is just an illustration of a real (but admittedly less
common) scenario that can happen when debugging natively.

I do not think there is a need for another option; for example, we
perform the "separate symbol file" CRC calculation without an option to
disable it (and that *could* be time consuming).
> John
Thanks for looking at the patch and for your feedback,
Aleksandar