This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.



Re: fast additive copy method


On Mon, 2014-08-11 at 11:04 +0000, Kilian, Jens wrote:
> > -----Original Message-----
> > From: Joël Krähemann [mailto:weedlight@gmail.com]
> > Sent: Sunday, 10 August, 2014 22:44
> > To: Carlos O'Donell
> > Cc: libc-help@sourceware.org
> > Subject: Re: fast additive copy method
> 
> [...]
> 
> > Hi, I'm writing a soft synth, so audio buffers are copied in RAM in a
> > repining way. The function ags_audio_signal_copy_buffer_to_buffer()
> > should be optimized.
> 
> First, you seem to be adding (short) ints with wraparound (0x7fff + 1 -> -0x8000).  For audio signals, a saturating addition (0x7fff + 1 -> 0x7fff) may be more appropriate.
> Second, you want to look into whether your compiler supports vectorized operations (MMX, SSE, etc.), either via autovectorization or via special intrinsic functions (which are less portable).
> 
> Hope this helps,
> 
> 	Jens.
I'm using gcc. Which header do I need to include to get the __m128i
type on Debian GNU/Linux?

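  signed short s1[64] __attribute__((aligned(128)));
  signed short s2[64] __attribute__((aligned(128)));

  /* number of 64-frame blocks, rounded up; the buffers must be padded
     to a multiple of 64 frames */
  size = (guint) ceil((float) size / 64.0);

  for(; 0 < size; size--){
    __m128i a;
    __m128i b;
    signed short *offset;
    guint i;

    offset = destination;

    /* gather 64 interleaved destination samples into s1 */
    for(i = 0; i < 64; i++){
      s1[i] = *destination;
      destination += dchannels;
    }

    /* gather 64 interleaved source samples into s2 */
    for(i = 0; i < 64; i++){
      s2[i] = *source;
      source += schannels;
    }

    /* add 8 samples per step with signed saturation */
    for(i = 0; i < 64; i += 8){
      a = _mm_load_si128((__m128i *) (s1 + i));
      b = _mm_load_si128((__m128i *) (s2 + i));

      _mm_store_si128((__m128i *) (s1 + i), _mm_adds_epi16(a, b));
    }

    /* scatter the sums back to the interleaved destination */
    destination = offset;

    for(i = 0; i < 64; i++){
      *destination = s1[i];
      destination += dchannels;
    }
  }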
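
For reference, a minimal sketch of a saturating additive copy, assuming
an x86 target with SSE2. With gcc the SSE2 intrinsics and the __m128i
type are declared in <emmintrin.h>; on 32-bit x86 the -msse2 option is
typically needed as well. The names additive_copy_s16() and
sat_add_s16() are hypothetical and are not part of glibc or of the
original program. The vector path is only taken when both strides are 1;
interleaved buffers and the tail fall back to the scalar saturating add.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_adds_epi16 */
#include <stddef.h>
#include <stdint.h>

/* Scalar saturating add: 0x7fff + 1 stays 0x7fff instead of wrapping. */
static int16_t sat_add_s16(int16_t x, int16_t y)
{
  int32_t sum = (int32_t) x + (int32_t) y;

  if (sum > INT16_MAX) return INT16_MAX;
  if (sum < INT16_MIN) return INT16_MIN;
  return (int16_t) sum;
}

/* Hypothetical helper: dst[i * dstride] += src[i * sstride], saturating,
   for i in [0, frames). Strides are counted in samples. */
void additive_copy_s16(int16_t *dst, size_t dstride,
                       const int16_t *src, size_t sstride,
                       size_t frames)
{
  size_t i = 0;

  if (dstride == 1 && sstride == 1) {
    /* Contiguous case: 8 samples per iteration. Unaligned loads and
       stores are used so the caller's buffers need no special alignment. */
    for (; i + 8 <= frames; i += 8) {
      __m128i a = _mm_loadu_si128((const __m128i *) (dst + i));
      __m128i b = _mm_loadu_si128((const __m128i *) (src + i));

      _mm_storeu_si128((__m128i *) (dst + i), _mm_adds_epi16(a, b));
    }
  }

  /* Strided case and remaining tail samples. */
  for (; i < frames; i++) {
    dst[i * dstride] = sat_add_s16(dst[i * dstride], src[i * sstride]);
  }
}
```

Note that _mm_adds_epi16 saturates signed 16-bit values, whereas
_mm_adds_epu16 treats the lanes as unsigned, which does not match
signed short samples.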


