This is the mail archive of the
binutils@sources.redhat.com
mailing list for the binutils project.
Re: RFC: MIPS: Workaround the "daddi" and "daddiu" errata
- From: Richard Sandiford <rsandifo at redhat dot com>
- To: "Maciej W. Rozycki" <macro at ds2 dot pg dot gda dot pl>
- Cc: gcc-patches at gcc dot gnu dot org, binutils at sources dot redhat dot com
- Date: Sat, 06 Mar 2004 11:02:35 +0000
- Subject: Re: RFC: MIPS: Workaround the "daddi" and "daddiu" errata
- References: <Pine.LNX.4.55.0403031531360.3561@jurand.ds.pg.gda.pl>
[Most of this is related to the gcc side of things, but I've kept
binutils@ cc'ed since some comments apply to gas too.]
"Maciej W. Rozycki" <macro@ds2.pg.gda.pl> writes:
> As you may know there are errata in the R4000 and the initial revision of
> the R4400 processor, that lead to an incorrect execution of the "daddi"
> and the "daddiu" instructions whan a two-complement overflow happens. I
> believe the only way to work this problem around is to assume the affected
> processors don't support these instructions at all. Fortunately, this can
> be quite easily done by pretending these instructions are actually
> macros and expanding them to a sequence of a "li" ("addiu" to $0) and a
> "dadd" or a "daddu" instruction with an aid of a temporary register, like
> this:
>
> daddiu $2,$3,0x1234 -> addiu $1,$0,0x1234; daddu $2,$3,$1
> daddi $2,$3,0x1234 -> addiu $1,$0,0x1234; dadd $2,$3,$1
>
> This works quite well when $1 is available and could be implemented with
> gas exclusively for that case. Unfortunately, "daddiu" is quite commonly
> used in expansions of address references in macros.
I guess there are two main ways of approaching an errata like this:
(a) Stop gas & gcc from ever using daddiu.
(b) Try to justify whether each use of daddiu is safe. Use an
alternative if not.
You've gone for (a) here, which I suppose you could argue is safer.
Problem is, the patch ends up touching far more code than (b) would,
and introduces far more special cases. I think it's going to be very
hard to maintain the patch in the long-term. From that point of view,
it could be argued that it's actually safer to restrict the workaround
to the cases that really matter. There's less chance of things silently
breaking in future.
A quick look at the gcc patch suggests that the changes fall into
three categories:
(1) daddiu used to add an integer constant whose value is known to gcc
Related hunks:
* adddi3_internal_3
* changes to other patterns that I don't think are necessary
(2) daddiu used to add %lo(x) to an incomplete address
Hunks related to general addresses:
* mips_symbol_insns
* mips_secondary_reload_class
* all(?) 64-bit non-mips16 .md patterns that have a load or store
alternative
* various reload patterns
Hunks related to branch handling:
* mips_conditional_register_usage
* mips_output_load_label
* mips_output_conditional_branch
(3) daddiu used to allocate and deallocate stack space in gcc
Related hunks:
* ASM_OUTPUT_REG_PUSH
* ASM_OUTPUT_REG_POP
And the errata we're working around is that:
daddiu dst,src,xxxx
will give the incorrect result in the following cases:
(+) src = 0x7fff_ffff_ffff_yyyy && xxxx + yyyy > 0x10000
RESULT: dst will be 0x0000_0000_0000_zzzz
rather than 0x8000_0000_0000_zzzz
(-) src = 0x8000_0000_0000_yyyy && -xxxx > yyyy
RESULT: dst will be 0xffff_ffff_ffff_zzzz
rather than 0x7fff_ffff_ffff_zzzz
So looking at each of these in turn:
(1) Definitely needs a workaround, no question about that.
(2) Adding %lo(X) to the high part of X can only trigger case (-)
since yyyy (the low 16 bits of src) will be 0. In this case,
X must be of the form 0x7fff_ffff_ffff_8???, which is an invalid
address, right? So there's no way this bug could trigger if X
refers to a valid in-memory object.
Of course, X could still be some kind of constant defined in a
linker script, but, well... don't do that if you care about
targets with a buggy daddiu. It hardly seems to justify the
kind of code bloat it causes in the common case, nor the big
maintenance burden.
The same argument applies when "src" contains the high part of
X plus an index. If X refers to an in-memory object, no valid
index can cause the daddiu to overflow.
Unfortunately, I don't know of any specific code in gcc to stop:
&foo[y - x]
from being expanded as:
...load high part of foo into dst...
daddu dst,dst,y
daddiu dst,dst,%lo(foo)
dsubu dst,dst,x
so perhaps we could trip over the errata when x and y are both
humungous values. But I don't think this case justifies treating
"daddiu ...,%lo(...)" as a complete no-no. Again, it leads to
large code bloat and has a high maintenance burden.
If you can demonstrate that &foo[y - x] is indeed a problem then
I'd prefer a patch that addresses it head-on.
(3) In the case of ASM_OUTPUT_REG_PUSH, you're catering for case (-);
i.e. the case where the stack pointer goes from 0x8000... to
0x7fff.... Since this represents the stack pointer moving from
xkphys to an invalid address, it surely represents a program gone
wrong. Even the original stack pointer value is a bit suspect ;)
Perhaps you could argue that, since 0xffff... is a valid address,
the behaviour of the broken daddiu is less friendly for error-detection
and debugging purposes. But I don't think that's enough to justify
the extra gcc maintenance burden, nor the cost of using two instructions
to allocate small chunks of stack.
Similar arguments apply for the POP case.
Your patch also includes changes to patterns like ffsdi. This seems
a bit over-the-top; there's no way counting from 0 to 64 can trigger
overflow.
Anyway, concentrating on (1):
> @@ -1030,17 +1066,41 @@
> (match_dup 3)))]
> "")
>
> -(define_insn "adddi3_internal_3"
> +(define_expand "adddi3_internal_3"
> [(set (match_operand:DI 0 "register_operand" "=d,d")
> (plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ,dJ")
> (match_operand:DI 2 "arith_operand" "d,Q")))]
> "TARGET_64BIT && !TARGET_MIPS16"
> - "@
> - daddu\t%0,%z1,%2
> - daddiu\t%0,%z1,%2"
> + "")
> +
> +(define_insn "adddi3_internal_3a"
> + [(set (match_operand:DI 0 "register_operand" "=d")
> + (plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ")
> + (match_operand:DI 2 "register_operand" "d")))]
> + "TARGET_64BIT && !TARGET_MIPS16"
> + "daddu\t%0,%z1,%2"
> + [(set_attr "type" "darith")
> + (set_attr "mode" "DI")])
> +
> +(define_insn "adddi3_internal_3b"
> + [(set (match_operand:DI 0 "register_operand" "=d")
> + (plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ")
> + (match_operand:DI 2 "const_arith_operand" "Q")))]
> + "TARGET_64BIT && !TARGET_MIPS16 && !TARGET_NO_DADDI"
> + "daddiu\t%0,%z1,%2"
> [(set_attr "type" "darith")
> (set_attr "mode" "DI")])
>
> +(define_insn "adddi3_internal_3c"
> + [(set (match_operand:DI 0 "register_operand" "=d")
> + (plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ")
> + (match_operand:DI 2 "const_arith_operand" "Q")))]
> + "TARGET_64BIT && !TARGET_MIPS16 && TARGET_NO_DADDI"
> + "%[addiu\\t%@,%.,%2\;daddu\t%0,%z1,%@%]"
> + [(set_attr "type" "darith")
> + (set_attr "mode" "DI")
> + (set_attr "length" "8")])
> +
> ;; For the mips16, we need to recognize stack pointer additions
> ;; explicitly, since we don't have a constraint for $sp. These insns
> ;; will be generated by the save_restore_insns functions.
Please don't split the existing pattern. One pattern with several
alternatives gives reload more freedom than several patterns with
single alternatives.
I suggest the TARGET_NO_DADDIU version look something like:
(define_insn "adddi3_internal_3_nodaddiu"
[(set (match_operand:DI 0 "register_operand" "=d,d")
(plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ,dJ")
(match_operand:DI 2 "arith_operand" "d,Q")))]
"TARGET_64BIT && TARGET_NO_DADDIU"
{
if (which_alternative == 0)
return "daddu\t%0,%z1,%2";
if (operands[1] == stack_pointer_rtx)
return "daddiu\t%0,%z1,%2";
return "%[addiu\t%@,%.,%2\;daddu\t%0,%z1,%@%]";
}
[(set_attr "type" "darith")
(set_attr "length" "8,4")])
...completely untested, might not even be syntactically valid.
This overestimates the length of the stack pointer case,
but that's not a problem.
Richard