This is the mail archive of the
mailing list for the CGEN project.
Re: "just in time" compiler/translator for the simulators.
- To: cgen at sources dot redhat dot com
- Subject: Re: "just in time" compiler/translator for the simulators.
- From: graydon at venge dot net
- Date: Sat, 15 Sep 2001 14:07:53 -0400
- References: <Pine.GSO.4.33.0109151521580.7857-100000@night>
- Reply-To: graydon at pobox dot com
On Sat, Sep 15, 2001 at 03:33:18PM +0200, Johan Rydberg wrote:
> The idea is to translate the simulated insns into native insns
> and run them on the host machine. Insns that can not be translated
> will be simulated in `the old fashion way'.
I had a very similar conversation with fche a couple months ago, so I'll just
regurgitate what he said and tailor it a bit to the current proposal.
when the simulator is generated, we have a static description of the insn's
semantics, but it is at least partially abstract: it has "holes" into which the
actual flags, operand values, etc. will be placed, when a given instance of the
insn is decoded and extracted. if you're lucky, the chosen semantics won't
depend on the target CPU's dynamic state, so we'll assume that for now.
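to make the "holes" idea concrete, here is a minimal C sketch (names are
illustrative, not CGEN's generated code): the semantic function fixes the
operation statically, while the register numbers and immediate are parameters
that a decoded instance of the insn supplies later.

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
  int32_t regs[8];
  int zflag;              /* one of the "actual flags" plugged in later */
} cpu_state;

/* rd, rs and imm are the holes; decode/extract fills them in for each
   instance of the insn. */
static void
sem_addi (cpu_state *cpu, int rd, int rs, int32_t imm)
{
  cpu->regs[rd] = cpu->regs[rs] + imm;
  cpu->zflag = (cpu->regs[rd] == 0);
}
```

specialization, discussed below, amounts to baking some or all of these
parameters into the function itself.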
once an insn is decoded and extracted, in our present simulators, a record is
kept in a hashtable indicating the decoded semantic function (a function
pointer) and the extracted operand values. the table is hashed on the pc value
of the insn, so if the insn is returned to (say in a loop) the same record is
fetched and fed into the semantic function for subsequent execution. if we're
being very ambitious we chain such records together into pseudo basic blocks,
jumping directly from semantics to semantics.
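a sketch of such a decode cache, assuming hypothetical names (this is the
shape of the idea, not CGEN's actual data structures): each record holds the
chosen semantic function pointer and the pre-extracted operands, hashed on pc.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

typedef struct scache_entry scache_entry;
typedef void (*semantic_fn) (scache_entry *);

struct scache_entry
{
  uint32_t pc;            /* hash key: address of the insn */
  semantic_fn semantics;  /* decoded semantic function */
  int32_t operands[3];    /* pre-extracted operand values */
  scache_entry *next;     /* chain to the next insn of the pseudo basic block */
};

#define CACHE_SIZE 1024
static scache_entry cache[CACHE_SIZE];

/* look up the record for PC; on a miss the caller decodes, extracts and
   fills in the entry before executing it. */
static scache_entry *
scache_lookup (uint32_t pc)
{
  scache_entry *e = &cache[pc % CACHE_SIZE];
  return (e->pc == pc && e->semantics != NULL) ? e : NULL;
}

/* trivial example semantic, so an entry can be installed. */
static void
sem_nop (scache_entry *e)
{
  (void) e;
}
```

re-executing a loop body then costs one hash probe and an indirect call per
insn, and following the `next` chain gives the pseudo-basic-block jumping
mentioned above.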
now, keep in mind that semantic functions can be specialized arbitrarily. for
instance, say we have one semantic function representing a three-operand "mul"
insn. we may specialize this into eight functions: each of the 3 operand
"holes" can hold either an immediate or an indirect operand (2^3 = 8). so we'd
have "mul_imm-imm-imm", "mul_imm-imm-ind", "mul_imm-ind-imm", etc. when
decoding and extracting, we could set the semantic function pointer to the
appropriate variant within this space of 8 mul functions, and save ourselves
from ever executing any operand-mode switching logic inside the function.
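two of the eight hypothetical mul variants might look like this in C (again a
sketch with made-up names): the operand modes are fixed per function, so the
decoder picks the right function pointer once and execution never re-tests
modes.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int32_t regs[8]; } cpu_state;

/* fully specialized: all three operands immediate. */
static int32_t
mul_imm_imm_imm (cpu_state *cpu, int32_t a, int32_t b, int32_t c)
{
  (void) cpu;
  return a * b * c;
}

/* specialized: first two immediate, third indirect (a register number). */
static int32_t
mul_imm_imm_ind (cpu_state *cpu, int32_t a, int32_t b, int32_t c)
{
  return a * b * cpu->regs[c];
}

/* ... six more variants cover the remaining mode combinations (2^3 = 8). */
```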
but that's just one specialization; we could in fact specialize semantic
functions into "small immediate" vs. "large immediate", into "power-of-two" vs.
"general integer", even all the way down to the individual bit-pattern level.
i.e., in a 16-bit insn word machine, we could generate 2^16 semantic functions,
one for each possible opcode _and operand_. obviously this becomes a little
unwieldy on large insn-word machines, not to mention inefficient on
sparsely-coded insn sets. but the thing to keep in mind is that the
specialization itself can be performed statically, during simulator generation,
when we have a lot of time on our hands. gcc generates code "one function at a
time", so it will not run out of memory or anything processing an excessively
large set of semantic functions, and you're only ever going to load into memory
those functions which are demand-paged in by nature of being used. so it's not
as unmanageable as it sounds.

what you're proposing (jit simulation in general) is to delay the task of
specializing semantic functions until the moment of execution (or perhaps
slightly before, say during loading). this has the advantage that you only ever
generate the specialized variant when it occurs (avoiding 2^32 functions), so
you can probably specialize all the way down to the bit level, i.e. perform
a reasonably full "translation".
the disadvantage is that you're essentially taking on the burden of a compiler
backend. you need to do host insn selection, scheduling, register allocation,
dataflow optimization, and assembly for every host platform you want to work
with. the only credible tool I can imagine using for this "live" is MLRISC,
which means you're coding in SML; not a terrible burden, but something to keep
in mind.
another, slightly weirder approach is to scan your target program's insns and
emit fully-specialized semantic C for those insns alone, and feed them into
gcc, essentially pre-decoding and pre-extracting the entire set of functions
used by your program alone. then you could feed gperf the set of insn bit
patterns you encountered, and get a nice direct dispatch table into your
semantic functions. this would be comparatively easier than jitting, as you'd
just be guiding the existing specialization concept by the set of insns which
actually occurs in your program, and leaving all the backend work to gcc.
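the emitted C might look like the following sketch, for one invented insn word
(a plain switch stands in here for the gperf-generated perfect-hash dispatch):
every value the extractor would compute is already a constant in the function
body.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int32_t regs[8]; } cpu_state;

/* insn word 0x1234, pretend-decoded as "add r1, r2, #4": fully
   pre-decoded and pre-extracted, nothing left to switch on. */
static void
sem_0x1234 (cpu_state *cpu)
{
  cpu->regs[1] = cpu->regs[2] + 4;
}

/* direct dispatch over the bit patterns that actually occur in the
   target program; anything else falls back to the ordinary simulator. */
static int
dispatch (cpu_state *cpu, uint16_t insn)
{
  switch (insn)
    {
    case 0x1234: sem_0x1234 (cpu); return 1;
    default: return 0;
    }
}
```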
the downside would be that you'd need to re-do all this stuff for each target
program; similar to your jit proposal, you'd want to do it into a temporary file
at program-load time. loading a really big program could take a while.
many mixtures of these strategies are of course possible. I wouldn't fully
endorse jitting carte blanche, but it might be a good strategy in some settings.