This is the mail archive of the gsl-discuss@sourceware.cygnus.com mailing list for the GSL project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

suggest to add 'heapsort'.



Hello,
I want to suggest the 'heapsort' algorithm to 'gsl'.
The algorithm is described in 'Sedgewick, Algorithms in C', though I
use the german version.
Compared with 'quicksort' the behaviour of 'heapsort':

disadvantages:
a) it is slower for random data

advantages:
a) faster on presorted data
b) a true in place algorithm
c) the algorithm is short and easy
d) easy to modify for special needs

Curently I wrote the implementation following the template feature in
'gsl/statistics'.  I implemented direct and indirect sorting (== sort
only an index array to the data, the order of the original data is not
changed).  The code works, but needs some clean ups and the test code
is currently not ready.

If there is some interest, where shall I send the source?  I assume
doing it with this list is some waste of band width because of the
size of some files.

Hints are welcome.

Bye
Thomas

PS:
 following the heart of the source, that's really all.

static void
FUNCTION(my,downheap) (BASE data[],
		       const MY_INDX_T N,
		       MY_INDX_T k)
{
  BASE v = data[k];

  while (k <= N / 2)
    {
      MY_INDX_T j = k + k;

      if ((j < N) && (data[j] < data[j + 1]))
	{
	  j++;
	}

      if (v >= data[j])
	{
	  break;
	}

      data[k] = data[j];
      k = j;
    }

  data[k] = v;
}


void
FUNCTION(gsl_stats, heapsort_data) (BASE data[],
				   MY_INDX_T N)
{
  /*
   * Sort the array data[i=0...N-1] (original description sorts data[i=1...N])
   * In ascending order.
   * This is a true inplace algorithm with N log N operations
   * Worst case (feed it with an already sorted array) is something like 20% slower
   */

  MY_INDX_T k;
  
  /*
   * We have N-data, last element is at 'N-1', first at '0'
   */
  N--;

#if defined(USE_SIGNED_INDIZES)

  /*
   * This is the original and does not work if
   * the type of 'k' is 'unsigned' because
   * 'k >= 0' is always true.
   */
  for (k = N / 2; k >= 0; k--)
    {
      FUNCTION(my,downheap) (data, N, k);
    }

#else  /* !USE_SIGNED_INDIZES */

  /*
   * To try to avoid this (negative array indizes are strange)
   * recode it using a 'do-while-loop'.
   */
  k = N / 2;
  k++;				/* Compensate the first use of 'k--' */
  do
    {
      k--;
      FUNCTION(my,downheap) (data, N, k);
    }
  while (k > 0);

#endif  /* !USE_SIGNED_INDIZES */

  while (N > 0)
    {
      BASE t = data[0]; data[0] = data[N]; data[N] = t;
      FUNCTION(my,downheap) (data, --N, 0);
    }
}

-- 
Latest news:
Einbruch bei Microsoft, Bill Gates hat Windows offen gelassen.

----------------------------------------------
Dipl. Phys. Thomas Walter
Inst. f. Physiklische Chemie II
Egerlandstr. 3				Tel.: ++9131-85 27326 / 27330
91058 Erlangen, Germany			email: walter@pctc.chemie.uni-erlangen.de

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]