This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: Feature request: improved build-id generation


On Thu, 2018-03-15 at 00:45 -0700, Cary Coutant wrote:
> > > To inject explicit out-of-band data into the hash computation, you
> > > could insert an object with nothing but a note section, or even use
> > > --defsym to create a symbol table entry with your extra key(s).
> > 
> > Fedora wants to insert extra data into the build-id of all packages in
> > its repository, and it does so right now by post-processing the
> > package.  This is ugly, and it flat-out doesn't work for the kernel,
> > which is apparently breaking a legitimate debugging use case.
> > 
> > I think Fedora should be able to ask its tool chain to insert the
> > extra data rather than hacking it in after the fact.  Asking Fedora to
> > use --defsym for this purpose is IMO a non-starter, as is asking
> > Fedora to come up with some magic .o file and linking it into every
> > object.
> 
> I understand your objection to the magic .o file, but why exactly is
> --defsym a "non-starter"? It's pretty close to what you're asking for
> (except for the spelling), it's already available, and it has the
> advantage of adding the extra data to the binary in a form where it
> can be easily extracted (e.g., with "nm").
> 
> I don't understand the need for forcing two otherwise-identical
> binaries to have different build ids, simply because they're part of
> different distributions. Perhaps it would help if you could explain
> why you need that.

In theory, two different build environments could produce an identical
binary. Even different source files could, if they express the same
algorithm and the compiler optimizes them the same way. But you might
still want to identify which binary came from which build. The build-id
is used to identify the build (environment) that produced the binary.
If the build environment is identical, then in theory it should produce
identical binaries and build-ids. The converse doesn't necessarily
hold, though: identical binaries can come from different builds. For a
distro it is nice to have unique build-ids for different package
version builds. Then, if all you have is e.g. a core file with a
build-id in it, you can map it back to the package build it came from.

The linker doesn't see the whole build environment, but it often has
enough to differentiate builds if there is debuginfo involved, which
contains the source file paths, compiler versions, command line
options, etc. I think that in the case Andy is interested in, the vdso
just isn't "unique" enough (it contains mostly assembly without
debuginfo). The vdso is also slightly special because it is built and
then "inserted" into the kernel image (so it can be mapped into the
process space at runtime), which makes it difficult to "post-process".

If using new command line flags is out of the question, could the
linker use an environment variable as a seed for the build-id hash
computation? Then a package build could just set e.g.
BUILD_ID_ENV="<package-version>". That could also be used to capture
anything else from the build environment that might make a build
unique.
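
To make the idea concrete, here is a rough sketch in Python of what
"seeding" the hash could look like. This is only an illustration, not
how ld or gold actually compute the build-id: the function names are
made up, SHA-1 over the raw image bytes is an assumption, and
BUILD_ID_ENV is just the hypothetical variable name from above (no
linker reads it today).

```python
import hashlib
import os


def seeded_build_id(image: bytes, seed: str = "") -> str:
    """Hash a linked image together with an out-of-band seed string.

    Hypothetical helper: real linkers choose their own input for the
    build-id hash; this only shows the seeding idea.
    """
    h = hashlib.sha1()
    seed_bytes = seed.encode("utf-8")
    # Length-prefix the seed so the seed/image boundary is unambiguous.
    h.update(len(seed_bytes).to_bytes(8, "little"))
    h.update(seed_bytes)
    h.update(image)
    return h.hexdigest()


def seeded_build_id_from_env(image: bytes, env=os.environ) -> str:
    """Take the seed from BUILD_ID_ENV (assumed name), empty if unset."""
    return seeded_build_id(image, env.get("BUILD_ID_ENV", ""))
```

With something like this, two byte-identical vdso images built for
different package versions would get different build-ids, while a
rebuild with the same seed would still get the same one.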

Cheers,

Mark

