This is the mail archive of the binutils at sourceware dot org
mailing list for the binutils project.
non-representable symbols in PE executables
- From: "Jan Beulich" <JBeulich at suse dot com>
- To: <binutils at sourceware dot org>
- Date: Tue, 28 Jun 2016 00:38:13 -0600
- Subject: non-representable symbols in PE executables
In the build process of the Xen hypervisor EFI binary we've run into
a situation where an assembler value truncation warning (in the
generated source file producing the internally used symbol table,
similar to the Linux kernel's kallsyms one) is actually caused by ld
emitting truncated symbol values to the PE (COFF) symbol table.
I'd therefore like to find out whether stripping such symbols (as
done in the patch below) wouldn't be the better route, in order to
avoid misleading consumers (in our case, nm) - I'm generally of
the opinion that having some piece of information missing is
preferable to the information being present but wrong, and that if
potentially wrong information does get emitted, it should at least be
accompanied by some diagnostic.
As to the seemingly unrelated parts of the patch: at the same time
I've tried to limit the number of resulting warnings, namely for
linker generated symbols like __image_base__ / __ImageBase. That
in turn required marking such symbols as linker generated. I'm of
course open to suggestions on how to do this in a less ad hoc way -
I can imagine that the (ab)use of lineno for this purpose could be
controversial.
While of course there's no way to get symbol values in range for
absolute symbols, these wouldn't have been a problem for the
specific purposes of Xen. Instead we ran into them as a result of
section_for_dot() preferring the following section for symbols
defined outside of any section, after a . adjustment. Therefore
I wonder whether, on top of the patch below, that behavior (or
that of the relevant caller update_definedness()) shouldn't be
changed. According to the comments there this behavior is
really based on an assumption rather than formally established
requirements, and the assumption (assignment to . setting the
address for the following section) turned out wrong in our case:
We place a section end label after aligning to page size and only
then establish the following section's address (2M aligned).
Obviously with the current logic this results in a negative section
offset, which - values being unsigned - gets turned into a huge
positive one. nm as the consumer then gets the symbol address
off by 4GiB.
The simplest adjustment I could think of would be to associate
symbols with the following section only if they're right at the start
of that section. But of course I have no idea what other users of
ld would break with a change to heuristics like this one.
@@ -2611,6 +2611,16 @@ _bfd_coff_write_global_sym (struct bfd_h
if (! obj_pe (flaginfo->output_bfd))
isym.n_value += sec->vma;
+ if (isym.n_value > (bfd_vma)0xffffffff)
+ if (! h->root.linker_def)
+ _("%B: symbol '%s' value %"BFD_VMA_FMT"x is not representable, stripped"),
+ output_bfd, h->root.root.string, isym.n_value);
+ return TRUE;
@@ -1181,7 +1181,7 @@ exp_fold_tree_1 (etree_type *tree)
h->type = bfd_link_hash_defined;
h->u.def.value = expld.result.value;
h->u.def.section = expld.result.section;
- h->linker_def = 0;
+ h->linker_def = ! tree->assign.type.lineno;
if (tree->type.node_class == etree_provide)
tree->type.node_class = etree_provided;
@@ -39,9 +39,8 @@
yylex and yyparse (indirectly) both check this. */
-/* Line number in the current input file.
- (FIXME Actually, it doesn't appear to get reset for each file?) */
-unsigned int lineno = 1;
+/* Line number in the current input file. */
+unsigned int lineno;
/* The string we are currently lexing, or NULL if we are reading a
@@ -459,7 +458,10 @@ V_IDENTIFIER [*?.$_a-zA-Z\[\]\-\!\^\\]([
if (include_stack_ptr == 0)
- yyterminate ();
+ lineno = 0;
+ yyterminate ();