This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



Re: Prelinking of shared libraries


On Saturday 05 May 2001 06:29, Martin v. Loewis wrote:
> > You forget about the ones in the .data section (due to vtables) In
> > libkdeui.so.3.0.0 (Latest KDE CVS) I have 15547 relocations of the type
> > R_386_32 and 12759 of those refer to qt symbols. Most of those seem to
> > come out of the .data section.
>
> [...]
>
> > No. What currently happens is that the .text and .plt of a lib are shared
> > between processes but that each process has its own .got and .data.
>
> So it appears that you claim that "most" overhead comes from the loss
> of sharing, and the need to page-in (or copy-on-write) the data
> sections, and the got.

Let me make some points clear first: 
1) How well the linker performs depends very much on what you use it for, so 
it's impossible to say something as generic as "most overhead comes from xyz"; 
it depends entirely on the situation. I will identify a few situations and 
then show what kind of problems occur in each. (See below)

2) There are two issues, speed and memory usage. Some problems relate to 
speed, some to memory and some to both. 

The approach that I described (basically the Nelson paper) solves everything 
because it means that you don't need to link/relocate your libraries at all 
anymore, so any problem associated with relocation disappears as well. I 
guess that you would now like to know whether that is indeed needed, or 
whether a less far-reaching solution would suffice.

> If this is indeed your claim, can you give some proof to support it?
> E.g. how many pages need to be copied? In kdeui.so.2, .got has just 3
> pages of memory (.data has 5 pages) ...

I'll try to give an overview of the various aspects:

1) Exception handling. 
Exception handling causes a lot of relocations of type R_386_RELATIVE. These 
relocations can be done relatively fast but they do cause memory usage to 
increase. For an average KDE application this amounted to about 800Kb when I 
checked this about a year ago. I don't have recent data on this, since KDE 
compiles without exceptions nowadays. I don't know if all the overhead is 
caused by the relocations or whether exception handling does some 
initialisation during runtime as well which allocates/touches memory. In the 
paper that I wrote, exception handling wasn't taken into account. 

2) vtables
As far as I know, every entry in a vtable requires an R_386_32 relocation. 
That is slow because of the symbol lookup that is associated with it. These 
relocations also cause memory usage to increase since the page gets touched.
In Table 5 of my paper I showed as an example that every class derived from 
QWidget introduces 109 relocations: 105 of them are R_386_32 relocations and 
4 of them are R_386_GLOB_DAT. 

Looking at the relocation entries in libqt, libkdecore and libkdeui, it 
_seems_ that all R_386_32 relocations are due to vtables: glancing over them 
shows nothing but virtual functions. (How do I check that they are indeed 
part of vtables? Does it matter?)

3) Other data structures
The index part of lookup tables ends up in the .data section. These are all 
R_386_RELATIVE relocations (abbreviated R_386_REL below), so the issue is 
mostly the memory that they need. 

4) The .got section.
This should mostly (only?) contain relocations of type R_386_JUMP_SLOT, which 
can be done lazily, so only the memory aspect matters.
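
For anyone who wants to reproduce counts per relocation type like the ones discussed here, a minimal sketch follows. It assumes the text layout printed by `readelf -r` for an i386 shared object; the sample excerpt, offsets and symbol names are invented for illustration and are not taken from any KDE library.

```python
import re
from collections import Counter

# Hypothetical excerpt in the layout printed by `readelf -r`.
# Offsets and symbol names are invented for illustration only.
SAMPLE = """\
Relocation section '.rel.dyn' at offset 0x1000 contains 4 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00012340  00000008 R_386_RELATIVE
00012344  00000501 R_386_32          00000000   example_virtual_fn
00012348  00000501 R_386_32          00000000   example_virtual_fn
0001135c  00000606 R_386_GLOB_DAT    00000000   example_typeinfo

Relocation section '.rel.plt' at offset 0x1020 contains 1 entry:
 Offset     Info    Type            Sym.Value  Sym. Name
00011368  00000707 R_386_JUMP_SLOT   00000000   example_call
"""

# A relocation line starts with two 8-digit hex fields (offset, info)
# followed by the relocation type name.
RELOC_LINE = re.compile(r"^[0-9a-fA-F]{8}\s+[0-9a-fA-F]{8}\s+(R_386_\w+)")


def count_relocation_types(readelf_output: str) -> Counter:
    """Count relocation entries per type in `readelf -r` style text."""
    counts = Counter()
    for line in readelf_output.splitlines():
        match = RELOC_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for rtype, n in count_relocation_types(SAMPLE).most_common():
        print(f"{rtype:18} {n}")
```

In practice the text would come from running `readelf -r` on each library and summing the counters over all of them.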

OK, I will now look at "kedit" as a whole and at libqt, libkdecore and 
libkdeui separately. I have also added libXft because it has a surprisingly 
large .bss section. "kedit" links to a total of 28 libraries:

/ext/kde-head/lib/kde2/kedit.so
/ext/kde-head/lib/libkspell.so.3
/ext/kde-head/lib/libkfile.so.3
/ext/kde-head/lib/libksycoca.so.3
/ext/kde-head/lib/libkio.so.3
/ext/kde-head/lib/libkdeui.so.3
/ext/kde-head/lib/libkdesu.so.1
/ext/kde-head/lib/libkdecore.so.3
/ext/kde-head/lib/libkdefakes.so.3
/lib/libdl.so.2
/ext/kde-head/lib/libDCOP.so.1
/ext/cvs/qt-copy/lib/libqt.so.2
/usr/lib/libpng.so.2
/usr/lib/libjpeg.so.62
/usr/X11R6/lib/libXext.so.6
/usr/X11R6/lib/libX11.so.6
/usr/X11R6/lib/libSM.so.6
/usr/X11R6/lib/libICE.so.6
/lib/libutil.so.1
/lib/libz.so.1
/usr/local/lib/libfam.so.0
/usr/lib/libstdc++-libc6.2-2.so.3
/lib/libm.so.6
/lib/libc.so.6
/usr/lib/libstdc++-libc6.1-2.so.3
/usr/X11R6/lib/libXft.so.1
/lib/ld-linux.so.2
/usr/X11R6/lib/libXrender.so.1

I will now use the sum over all these libs and refer to that as "kedit".

            .got   .data  .bss   R_386_REL  R_386_32  R_386_JUMP_SLOT
kedit       129Kb  308Kb  160Kb  21021      43311     25866
libqt       44Kb   131Kb  19Kb   5124       17090     8515
libkdecore  15Kb   17Kb   7Kb    813        1719      3331
libkdeui    24Kb   76Kb   4Kb    684        15547     4687
libXft      1Kb    5Kb    72Kb   2074       46        189

Assuming that each R_386_REL, R_386_32 and R_386_JUMP_SLOT affects 4 bytes of 
data that translates into:

            .got   .data  .bss   R_386_REL  R_386_32  R_386_JUMP_SLOT
kedit       129Kb  308Kb  160Kb  82Kb       169Kb     101Kb
libqt       44Kb   131Kb  19Kb   20Kb       67Kb      33Kb
libkdecore  15Kb   17Kb   7Kb    3Kb        7Kb       13Kb
libkdeui    24Kb   76Kb   4Kb    3Kb        61Kb      18Kb
libXft      1Kb    5Kb    72Kb   8Kb        0Kb       1Kb
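
The conversion from relocation counts to touched memory can be checked with a few lines of Python. This is only a sketch of the arithmetic: the counts are the ones from the first table, and the 4 bytes per relocation is the assumption stated above, not a measured value.

```python
# Assumption from the text: each relocation patches one 4-byte word.
BYTES_PER_RELOC = 4

# (R_386_REL, R_386_32, R_386_JUMP_SLOT) counts copied from the first table.
RELOC_COUNTS = {
    "kedit":      (21021, 43311, 25866),
    "libqt":      (5124, 17090, 8515),
    "libkdecore": (813, 1719, 3331),
    "libkdeui":   (684, 15547, 4687),
    "libXft":     (2074, 46, 189),
}


def touched_kb(count: int) -> int:
    """Memory patched by `count` relocations, rounded to whole kilobytes."""
    return round(count * BYTES_PER_RELOC / 1024)


if __name__ == "__main__":
    for name, counts in RELOC_COUNTS.items():
        rel, abs32, jump = (touched_kb(c) for c in counts)
        print(f"{name:11} {rel:4}Kb {abs32:4}Kb {jump:4}Kb")
```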

So an application like kedit has a total .data section of 308Kb; 82Kb of 
that is touched by R_386_REL relocations and 169Kb by R_386_32 relocations, 
which leaves 57Kb unaccounted for. (This assumes that all R_386_REL and 
R_386_32 relocations are bound to .data. Is that so? How can I check?)
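
One possible way to check which section a relocation is bound to: in a shared object the offset field of each relocation entry is the address of the word being patched, so it can be compared against the address ranges in the section header table. The sketch below assumes that approach; the section table and relocation list are invented sample data, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical section table as (name, start address, size in bytes), the
# kind of information `readelf -S` prints. All values are invented.
SECTIONS = [
    (".data", 0x10000, 0x2000),
    (".got",  0x12000, 0x0400),
    (".bss",  0x12400, 0x1000),
]

# Hypothetical relocations as (patched address, relocation type).
RELOCS = [
    (0x10010, "R_386_32"),
    (0x10014, "R_386_32"),
    (0x10800, "R_386_RELATIVE"),
    (0x12010, "R_386_JUMP_SLOT"),
    (0x1200c, "R_386_GLOB_DAT"),
]


def section_of(address, sections):
    """Return the name of the section whose range contains `address`."""
    for name, start, size in sections:
        if start <= address < start + size:
            return name
    return "(none)"


def relocations_per_section(relocs, sections):
    """Count relocations per (section, relocation type) pair."""
    counts = Counter()
    for address, rtype in relocs:
        counts[(section_of(address, sections), rtype)] += 1
    return counts


if __name__ == "__main__":
    for (section, rtype), n in sorted(
        relocations_per_section(RELOCS, SECTIONS).items()
    ):
        print(f"{section:6} {rtype:16} {n}")
```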

The .got sections together amount to 129Kb, of which 101Kb is touched by 
R_386_JUMP_SLOT relocations, leaving 28Kb unaccounted for. (Does .got 
contain something else besides jump slots?)

All of the above is without exception handling, which apparently would add an 
extra bunch of R_386_32 relocations to all this in the eh-sections.

I must say that I am still missing some memory, because starting kedit leaves 
me with 220 dirty pages, or 880Kb, but the .got, .data and .bss together only 
account for 597Kb. VmData reports 176Kb; is .bss included in that?

Something else that I noted: R_386_REL+R_386_32 amounts to 64332 relocations, 
but according to LD_DEBUG=statistics I only get 50346 relocations when I 
start kedit (with lazy binding). Any idea?

> > Besides I doubt whether you have enough control over the layout of
> > the .data section to pull that off.
>
> Not sure what "that" is here. If significant speed improvements can be
> achieved by systematically re-arranging the elements of the .data
> section, I think gcc and/or ld could be taught to execute such
> control.

Yes, I thought that a vtable might contain both R_386_REL and R_386_32 
relocations, and you can't of course split the vtable, but it seems that it 
consists of R_386_32 entries almost entirely (plus 4 R_386_GLOB_DAT 
relocations; I am not sure whether they are part of the vtable itself or 
whether they are used to point to the vtable/type info structs).

But rearranging wouldn't do much for the speed; it would mostly improve the 
page sharing. (But then again, having fewer dirty pages improves speed as well 
of course)
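
To illustrate why rearranging helps page sharing, here is a small sketch that counts how many pages a set of relocations touches. It assumes 4Kb pages (consistent with 220 dirty pages being 880Kb) and uses invented offsets; it models only the page-touching effect, not symbol lookup time.

```python
PAGE_SIZE = 4096  # assumed page size: 4Kb, as on i386


def touched_pages(offsets, page_size=PAGE_SIZE):
    """Number of distinct pages written to when patching these offsets."""
    return len({offset // page_size for offset in offsets})


# Invented layouts: 64 relocated words in both cases.
# Scattered: one relocated word every 1024 bytes, spread over 64Kb of data.
scattered = [i * 1024 for i in range(64)]
# Clustered: the same 64 words packed back to back in 256 bytes.
clustered = [i * 4 for i in range(64)]

if __name__ == "__main__":
    print("scattered:", touched_pages(scattered), "pages")
    print("clustered:", touched_pages(clustered), "pages")
```

With the same number of relocations, the scattered layout dirties 16 pages and the clustered one a single page, so the remaining pages could stay shared between processes.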

Cheers,
Waldo
-- 
bastian@kde.org | SuSE Labs KDE Developer | bastian@suse.com

