This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Scheduling x86 dispatch windows


On Thu, Jun 10, 2010 at 11:05 AM, Quentin Neill
<quentin.neill.gnu@gmail.com> wrote:
> Cross-posting Reza's call for feedback to the binutils list since it
> is relevant -
> see the last few paragraphs regarding how to "solve the alignment problem".
>
> Original thread: http://gcc.gnu.org/ml/gcc/2010-06/threads.html#00402
>
> Not sure if followups should occur on one list or both.
> --
> Quentin Neill
>
>
> On Thu, Jun 10, 2010 at 12:20 PM, reza yazdani <yazdani_reza@yahoo.com> wrote:
>> Hi,
>>
>> We are in the process of adding a feature to GCC to take advantage
>> of a new hardware feature in the latest AMD micro processor. This
>> feature requires a certain mix, ordering and alignments in
>> instruction sequences to obtain the expected hardware performance.
>>
>> I am asking the community to review this high level implementation
>> design and give me direction or advice.
>>
>> The new hardware issues two windows of the size N bytes of
>> instructions in every cycle. It goes into accelerate mode if the
>> windows have the right combination of instructions or alignments. Our
>> goal is to maximize the IPC by proper instruction scheduling and
>> alignments.
>>
>> Here is a summary of the most important requirements:
>>
>> a) Maximum of N instructions per window.
>> b) An instruction may cross the first window.
>> c) Each window can have maximum of x memory loads and y memory
>> ? ?stores .
>> d) The total number of immediate constants in the instructions
>> ? ?of a window should not exceed k.
>> e) The first window must be aligned on 16 byte boundary.
>> f) A Window set terminates when a branch exists in a window.
>> g) The number of allowed prefixes varies for instructions.
>> h) A window set needs to be padded by prefixes in instructions
>> ? ?or terminated by nops to ensure adherence to the rules.
>>
>> We have the following implementation plan for GCC:
>>
>> 1) Modify the Haifa scheduler to make the desired arrangement of
>> ? ?instructions for the two dispatch windows. The scheduler is called
>> ? ?once before and once after register allocation as usual. In both
>> ? ?cases it performs dispatch scheduling along with its normal job of
>> ? ?instruction scheduling.
>>
>> The advantage of doing it before register allocation is avoiding
>> extra dependencies caused by register allocation which may become
>> an obstacle to movement of instructions. ?The advantage of doing
>> it after register allocation is a consideration for spilling code
>> which may be generated by the register allocator.
>>
>> The algorithm we use is:
>>
>> a) Considering the current dispatch window set, choose the first
>> ? ?instruction from ready queue that does not violate dispatch rules.
>> b) When an instruction is selected and scheduled, inform the
>> ? ?dispatcher code about the instruction. This step keeps track of the
>> ? ?instruction content of windows for future evaluation. It also manages
>> ? ?the window set by closing and opening new virtual dispatch windows.
>>
>> 2) Insertion of alignment code.
>>
>> In x86 alignment is done by inserting prefixes or by generating
>> nops. As the object code is generated by the assembler in GCC, some
>> information such as sizes of branches are unknown until assembly or
>> link time. To do alignments related to dispatch correctly in GCC,
>> we need to iteratively compute prefixes and branch sizes until
>> its convergence. This pass currently does not exist in GCC, but it
>> exists in the assembler.
>>
>> There are two possible approaches to solve alignment problem.
>>
>> a) ?Let the assembler performs the alignments and padding needed
>> ? ? to adhere with the new machine dispatching rules and avoid an extra
>> ? ? pass in GCC.
>> b) ?Add a new pass to mimic what assembler does before generating
>> ? ? the assembly listing in GCC and insert the required alignments.
>>
>> I appreciate your comments on the proposed implementation procedure
>> and the choices a or b above.

I don't this should be done in assembler. Assembler should just assemble
the assembly input.

-- 
H.J.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]