This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: RFC: MIPS: Workaround the "daddi" and "daddiu" errata


[Most of this is related to the gcc side of things, but I've kept
 binutils@ cc'ed since some comments apply to gas too.]

"Maciej W. Rozycki" <macro@ds2.pg.gda.pl> writes:
>  As you may know there are errata in the R4000 and the initial revision of 
> the R4400 processor, that lead to an incorrect execution of the "daddi" 
> and the "daddiu" instructions whan a two-complement overflow happens.  I 
> believe the only way to work this problem around is to assume the affected 
> processors don't support these instructions at all.  Fortunately, this can 
> be quite easily done by pretending these instructions are actually 
> macros and expanding them to a sequence of a "li" ("addiu" to $0) and a 
> "dadd" or a "daddu" instruction with an aid of a temporary register, like 
> this:
>
> daddiu $2,$3,0x1234 -> addiu $1,$0,0x1234; daddu $2,$3,$1
> daddi  $2,$3,0x1234 -> addiu $1,$0,0x1234; dadd  $2,$3,$1
>
>  This works quite well when $1 is available and could be implemented with 
> gas exclusively for that case.  Unfortunately, "daddiu" is quite commonly 
> used in expansions of address references in macros.

I guess there are two main ways of approaching an errata like this:

  (a) Stop gas & gcc from ever using daddiu.
  (b) Try to justify whether each use of daddiu is safe. Use an
      alternative if not.

You've gone for (a) here, which I suppose you could argue is safer.
Problem is, the patch ends up touching far more code than (b) would,
and introduces far more special cases.  I think it's going to be very
hard to maintain the patch in the long-term.  From that point of view,
it could be argued that it's actually safer to restrict the workaround
to the cases that really matter.  There's less chance of things silently
breaking in future.

A quick look at the gcc patch suggests that the changes fall into
three categories:

  (1) daddiu used to add an integer constant whose value is known to gcc

        Related hunks:
        * adddi3_internal_3
        * changes to other patterns that I don't think are necessary

  (2) daddiu used to add %lo(x) to an incomplete address

        Hunks related to general addresses:
        * mips_symbol_insns
        * mips_secondary_reload_class
        * all(?) 64-bit non-mips16 .md patterns that have a load or store
          alternative
        * various reload patterns

        Hunks related to branch handling:
        * mips_conditional_register_usage
        * mips_output_load_label
        * mips_output_conditional_branch

  (3) daddiu used to allocate and deallocate stack space in gcc

        Related hunks:
        * ASM_OUTPUT_REG_PUSH
        * ASM_OUTPUT_REG_POP

And the errata we're working around is that:

      daddiu dst,src,xxxx

will give the incorrect result in the following cases:

  (+) src = 0x7fff_ffff_ffff_yyyy && xxxx + yyyy > 0x10000

      RESULT: dst will be 0x0000_0000_0000_zzzz
              rather than 0x8000_0000_0000_zzzz

  (-) src = 0x8000_0000_0000_yyyy && -xxxx > yyyy

      RESULT: dst will be 0xffff_ffff_ffff_zzzz
              rather than 0x7fff_ffff_ffff_zzzz

So looking at each of these in turn:

  (1) Definitely needs a workaround, no question about that.

  (2) Adding %lo(X) to the high part of X can only trigger case (-)
      since yyyy (the low 16 bits of src) will be 0.  In this case,
      X must be of the form 0x7fff_ffff_ffff_8???, which is an invalid
      address, right?  So there's no way this bug could trigger if X
      refers to a valid in-memory object.

      Of course, X could still be some kind of constant defined in a
      linker script, but, well... don't do that if you care about
      targets with a buggy daddiu.  It hardly seems to justify the
      kind of code bloat it causes in the common case, nor the big
      maintenance burden.

      The same argument applies when "src" contains the high part of
      X plus an index.  If X refers to an in-memory object, no valid
      index can cause the daddiu to overflow.

      Unfortunately, I don't know of any specific code in gcc to stop:

          &foo[y - x]

      from being expanded as:

          ...load high part of foo into dst...
          daddu dst,dst,y
          daddiu dst,dst,%lo(foo)
          dsubu dst,dst,x

      so perhaps we could trip over the errata when x and y are both
      humungous values.  But I don't think this case justifies treating
      "daddiu ...,%lo(...)" as a complete no-no.  Again, it leads to
      large code bloat and has a high maintenance burden.

      If you can demonstrate that &foo[y - x] is indeed a problem then
      I'd prefer a patch that addresses it head-on.

  (3) In the case of ASM_OUTPUT_REG_PUSH, you're catering for case (-);
      i.e. the case where the stack pointer goes from 0x8000... to
      0x7fff....  Since this represents the stack pointer moving from
      xkphys to an invalid address, it surely represents a program gone
      wrong.  Even the original stack pointer value is a bit suspect ;)

      Perhaps you could argue that, since 0xffff... is a valid address,
      the behaviour of the broken daddiu is less friendly for error-detection
      and debugging purposes.  But I don't think that's enough to justify
      the extra gcc maintenance burden, nor the cost of using two instructions
      to allocate small chunks of stack.

      Similar arguments apply for the POP case.

Your patch also includes changes to patterns like ffsdi.  This seems
a bit over-the-top; there's no way counting from 0 to 64 can trigger
overflow.

Anyway, concentrating on (1):

> @@ -1030,17 +1066,41 @@
>  		 (match_dup 3)))]
>    "")
>  
> -(define_insn "adddi3_internal_3"
> +(define_expand "adddi3_internal_3"
>    [(set (match_operand:DI 0 "register_operand" "=d,d")
>  	(plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ,dJ")
>  		 (match_operand:DI 2 "arith_operand" "d,Q")))]
>    "TARGET_64BIT && !TARGET_MIPS16"
> -  "@
> -    daddu\t%0,%z1,%2
> -    daddiu\t%0,%z1,%2"
> +  "")
> +
> +(define_insn "adddi3_internal_3a"
> +  [(set (match_operand:DI 0 "register_operand" "=d")
> +	(plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ")
> +		 (match_operand:DI 2 "register_operand" "d")))]
> +  "TARGET_64BIT && !TARGET_MIPS16"
> +  "daddu\t%0,%z1,%2"
> +  [(set_attr "type"	"darith")
> +   (set_attr "mode"	"DI")])
> +
> +(define_insn "adddi3_internal_3b"
> +  [(set (match_operand:DI 0 "register_operand" "=d")
> +	(plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ")
> +		 (match_operand:DI 2 "const_arith_operand" "Q")))]
> +  "TARGET_64BIT && !TARGET_MIPS16 && !TARGET_NO_DADDI"
> +  "daddiu\t%0,%z1,%2"
>    [(set_attr "type"	"darith")
>     (set_attr "mode"	"DI")])
>  
> +(define_insn "adddi3_internal_3c"
> +  [(set (match_operand:DI 0 "register_operand" "=d")
> +	(plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ")
> +		 (match_operand:DI 2 "const_arith_operand" "Q")))]
> +  "TARGET_64BIT && !TARGET_MIPS16 && TARGET_NO_DADDI"
> +  "%[addiu\\t%@,%.,%2\;daddu\t%0,%z1,%@%]"
> +  [(set_attr "type"	"darith")
> +   (set_attr "mode"	"DI")
> +   (set_attr "length"	"8")])
> +
>  ;; For the mips16, we need to recognize stack pointer additions
>  ;; explicitly, since we don't have a constraint for $sp.  These insns
>  ;; will be generated by the save_restore_insns functions.

Please don't split the existing pattern.  One pattern with several
alternatives gives reload more freedom than several patterns with
single alternatives.

I suggest the TARGET_NO_DADDIU version look something like:

(define_insn "adddi3_internal_3_nodaddiu"
  [(set (match_operand:DI 0 "register_operand" "=d,d")
	(plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ,dJ")
		 (match_operand:DI 2 "arith_operand" "d,Q")))]
  "TARGET_64BIT && TARGET_NO_DADDIU"
{
  if (which_alternative == 0)
    return "daddu\t%0,%z1,%2";

  if (operands[1] == stack_pointer_rtx)
    return "daddiu\t%0,%z1,%2";

  return "%[addiu\t%@,%.,%2\;daddu\t%0,%z1,%@%]";
}
  [(set_attr "type"	"darith")
   (set_attr "length"   "8,4")])

...completely untested, might not even be syntactically valid.
This overestimates the length of the stack pointer case,
but that's not a problem.

Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]