This is the mail archive of the
binutils@sources.redhat.com
mailing list for the binutils project.
Re: parallelized 'ld'?
- From: marcov at dragon dot stack dot nl (Marco van de Voort)
- To: binutils at sources dot redhat dot com
- Date: Sat, 26 Jul 2003 01:44:42 +0200 (CEST)
- Subject: Re: parallelized 'ld'?
Sorry for being late, BUT:
If my boss told me to speed up ld with a minimal effort I'd:
- check whether LD's memory usage isn't very close to the maximum amount of memory on the
target systems.
- check if LD allocates a lot of small parts of mem (e.g. per symbol).
If both assumptions are true, I'd simply change malloc to an awfully dumb
implementation that doesn't really deallocate any memory and doesn't merge
anything. If the pseudocode below looks bad, that's because my C is a bit
rusty; what matters is that the idea is good :-)
void *start_of_free_datasegment;  /* initialised to the beginning of free
                                     space, enlarged in large chunks */

void *malloc(size_t size)         /* directives to inline this procedure */
{
    char *ret;
    if (!(size + sizeof(size_t) < sizeleft_in_data_segment()))
        allocate_larger_size_datasegment();       /* sbrk() or mmap() */
    *(size_t *)start_of_free_datasegment = size;  /* needed for realloc */
    ret = (char *)start_of_free_datasegment + sizeof(size_t);
    start_of_free_datasegment = ret + size;
    return ret;
}

void free(void *ptr) {}           /* inline me too: do nothing */
Same for realloc, but that simply always moves to a new block if the size
is larger.
Ugly, I'll frankly admit. But the OS doesn't care, and the speed gains when
doing a lot of fine-grained allocations can be flabbergasting, at the
expense of some memory.
I'm not used to profiling C code, but with the compiler I usually use, I start
doing such optimisations if profiling shows 5-15% of time spent in the memory
allocation routine (the total time spent in allocation is then already
much larger, due to the time eaten by the procedures the allocator calls), and
the requirements are met (single pass, then exit; not a memory-usage profile of
large allocates, large frees, then again large allocates).