[GO] qsort Fast and stable sort modified from quicksort ss18

◎ Inconvenience of old qsort

Well-known C libraries include newlib and glibc. I think the qsort function in it has been used for over 20 years. There are some inconveniences.

1.newlib The newlib qsort only sorts directly and does not automatically switch to indirect sorting. Therefore, it is slow when the elements of the original array are large. (Indirect sort: Create a pointer array for each element and sort it. Finally, move the elements of the original array.)

2.glibc The glibc qsort is actually a merge sort. So it's fast when the keys are unique (like employee numbers), It is slow when there are duplicates in the key (like man / woman). Also, since it is a merge sort, it is a stable sort. However, when the work area allocation (malloc) fails, Do a normal quicksort. Therefore, "stability" cannot be guaranteed.

The ss18 released this time is a fast stable sort. If you want "fast sorting because it's unstable" We recommend ss15 (http://ww51.tiki.ne.jp/~srr-cake/qsort/qs15/).

newlib ・ glibc I haven't researched much unexpectedly. Please let me know if there is a faster comparison sort.

◎ Work area

ss18 requires a work area (order $ O (n) $) (= auxiliary array). glibc's qsort uses the same workspace $ O (n) $ even though its stability is not guaranteed. The author thinks this is fine in an era when main memory is cheap.

◎ About calculation time

In qsort, the worst calculation time is $ O (n ^ 2) $. However, ss18 carefully selects the pivot, so Almost no incompatible sortable sequences (occur by chance or unlucky) thinking about.

However, the calculation time is $ O (n ^ 2) $ for a maliciously created array to be sorted. Can be. As a countermeasure, "select the pivot more carefully" You may. (However, it is not implemented in the current version) Specifically, "When the first ss18 is called in the process, with the time () function etc. Generate an irreproducible random number and fluctuate the pivot with this value. " This method is especially useful for ssort, which is a stable sort. Even if the pivot fluctuates This is because the target array will always be the same after sorting. It can be different in qsort.

Modified quicksort to create stable sort (ssort) (ss18)

It is a completely different algorithm from the previously published ss14. It's more compact and faster. Roughly speaking ss18 repeats the following movements in the auxiliary array (ptr2) and the target array (ptr).

   ptr2:...............  →  ptr2:5555......77787  →  ptr2:..........77787
   ptr :357358257257237  A  ptr:332223.........  B  ptr:3322235555.....

Simplified version for algorithm explanation (element size fixed to 8 bytes) (ss18c4d) The measured processing time is shown below. The simplified version of the program is listed at the end.

Key type Number of elements Size Number of repetitions Processing time(Seconds) 
    qs_glibc    d=-3      e=10000   s=8      R2463      T=5.74
    ss18c4d     d=-3      e=10000   s=8      R2463      T=4.73
                                                              
    qs_glibc    d=100     e=10000   s=8      R7035      T=15.47
    ss18c4d     d=100     e=10000   s=8      R7035      T=6.25
                                                              
    qs_glibc    d=2       e=10000   s=8      R24630     T=35.19
    ss18c4d     d=2       e=10000   s=8      R24630     T=2.35
                                                              
      d=-3:No duplicate keys d=100:100 types of keys d=2:The key is men and women, etc.
Number of iterations: The number of times the sort was repeated to reduce the measurement error of the processing time.

qs_glibc is a modification of the qsort function of the C library "glibc" for performance measurement. qs_glibc works as a merge sort unless malloc () fails. Merge sort is a stable sort (https://ja.wikipedia.org/wiki/sort).   In all cases, ss18c4d is faster than qs_glibc.     The regular version (ss18e8b) can be found at http://ww51.tiki.ne.jp/~srr-cake/qsort/ss18/.

I posted the method of the regular version benchmark test in readme2.txt. The regular version also measures the processing time of qsnewlib. (See http://ww51.tiki.ne.jp/~srr-cake/qsort/ss18/bench-sample.txt)

qsnewlib is the qsort function of the C library "newlib". Unstable sort. So it's not surprising that qsnewlib is the fastest, but to indirect sorting Qsnewlib is slow when the element size is large because there is no automatic switching.

Comparing qsnewlib (unstable) qs_glibc (stable) ss18e8b (stable) On 64-bit systems, ss18e8b is generally the fastest.

/**********   ssort (stable sort)   ss18    *******/
/* 
Simplified version of ss18(ss18 is a stable comparison sort. Faster than merge sort and ss14.)

Target array(ptr)And auxiliary arrays(ptr2)(Same size as ptr)Perform a stable sort using.
Move the element in AB2 stage as follows. Here "5" is the pivot(Split element)Represents.

   ptr2:...............  →  ptr2:5555......77787  →  ptr2:..........77787
   ptr :357358257257237  A  ptr:332223.........  B  ptr:3322235555.....

In move A, collect the pivots at the left edge of ptr2, collect elements less than 5 at the left edge of ptr2,
Collect elements greater than 5 at the right edge of ptr2. At this time, 77787 is in the reverse order of the initial order.
In the case of movement B, if 5555 is in reverse order, the 5555 immediately after B is moved in forward order.
At this point, the 5555 is already in the sorted completed position.
Then 332223{Start position / end position / forward / reverse order}To the stack.
After that, the same process is repeated for 77787. If the partial array is 2 or less in length, the partial sort is finished.
From the stack{Start position / end position / forward / reverse order}Is taken out and processing is continued.
The process ends when the stack becomes empty.
(Since the above is a simplified version, the storage position of the pivot is different from the regular version)
*/

#include <stdio.h>
//typedef long unsigned int size_t;
void *malloc(size_t); void free(void*); void exit(int);

static void assert(int x, char *s)  //If necessary, uncomment the line below
{ /* if (x) return; fprintf(stderr, "++++ %s ++++\n", s); exit(1); */ }

#define  F(x) (*((long long int *)(x)))
#define  COPY(x,y) {                      F(x)=F(y);        }  /*Copy processing of array elements*/
#define  SWAP(x,y) {long long int v=F(x); F(x)=F(y); F(y)=v;}  /*Array element swap processing*/
//The above three lines are related to the movement of elements. Can be rewritten to another one.


//Below is the sort program by ss18
typedef struct { char *LLss, *RRss; int Rv; } stack_node;   /*L,Stack structure for stacking R*/
#define PUSH(llss,rrss,rv) {top->LLss = (llss); top->RRss = (rrss); top->Rv = (rv); ++top;} /*Stack*/
#define POP(llss,rrss)     {--top; llss = top->LLss; rrss = top->RRss; rev = top->Rv;}      /*return*/

int ssort( void *base, size_t nel, size_t size,  int (*cmp)(void *a, void *b) )
              /* base :Pointer to the array you want to sort*/
              /* nel  :Number of elements in array base*/
              /* size :Element size of array base (in bytes)*/
              /* cmp  :A pointer to a function that compares the size of elements*/
{
 char *l,*l0,*r,*r0,*L,*R; // l0,r0,L,R is the starting point on the left and right of the array. The element between L and R is l,r,Sort to m
 char *ptr,*ptrE,*ptr2;    // ptr,ptr2 is L ~ R,The first transmission and reception of the l0 to r0 array is switched every time.
 char *m,*m0,*p;           //p moves between L and R*Accumulate pivots in m
 int  t, rev;              //t is for work rev==When it is 0, the order between L and R is normal.
 long long int Z;          //Z is the distance between arrays(Can be positive or negative)
 stack_node stack[32], *top = stack;     /*32 is enough for now*/

 assert(sizeof(          int) == 4, "sizeof(          int) == 4");
 assert(sizeof(long long int) == 8, "sizeof(long long int) == 8");
 assert(size == 8 , "Since it is a simplified version, size is fixed at 8."       );
 //Since it is a simplified version, the base alignment is sizeof(char*)Suppose it is

 ptr2 = malloc( nel * size );      //ptr2 is an auxiliary array for sorting
 if (ptr2 == NULL) {assert(0,"malloc failed"); return 1;}
 ptr = base;
 Z = ptr2 - ptr;  //Z has both positive and negative distances between two arrays and auxiliary arrays
 ptrE = ptr + nel * size;  //ptrE is used to determine whether L to R are in the target sequence or auxiliary sequence.
 L=ptr; R=&ptr[size*(nel-1)]; rev=0;
 goto LOOP;
 //So far, one call to ssort is executed only once


nxt:
 if (stack == top) {free(ptr2); return 0;}  /*Exit when the stack is empty*/
 POP(L,R);                                  /*Take out the value on the stack*/

LOOP:
 if (R<L) {                                           /*No element*/
   goto nxt;
 }
 if (L==R) {                                          /*Number of elements 1*/
   if (ptr<=L && L<ptrE) {           }  // L<Judgment of ptr2 is normal, but ptr2<There is also a base system
   else                  {COPY(L-Z,L)}
   goto nxt;
 }
 if (L+size==R) {                                     /*Number of elements 2*/
   if ((t=cmp(L,R))<0) if (ptr<=L && L<ptrE) {                       }
                       else                  {COPY(L-Z,L) COPY(R-Z,R)}
   else       if (t>0) if (ptr<=L && L<ptrE) {SWAP(L,R)              }
                       else                  {COPY(L-Z,R) COPY(R-Z,L)}
   else                if (ptr<=L && L<ptrE) if (rev==0) {                       }
                                             else        {SWAP(L,R)              }
                       else                  if (rev==0) {COPY(L-Z,L) COPY(R-Z,R)}
                                             else        {COPY(L-Z,R) COPY(R-Z,L)}
   goto nxt;      
 }
 
 if (ptr<=L && L<ptrE) {l=l0=L  ; r=r0=R+Z; m=m0=L+Z;}  // l :Where to put elements smaller than the pivot next
 else                  {l=l0=L-Z; r=r0=R-Z; m=m0=L;  }  // r :Where to put elements larger than the pivot next
                                                        // m :Where to put the same element as the pivot next

 COPY(m,L) m+=size;  //Since it is a simplified version, simply pivot the first element
 for (p=L+size; p<=R; p+=size) {
   if ((t=cmp(p,m0))<0) {COPY(l,p) l+=size;}
   else        if (t>0) {COPY(r,p) r-=size;}
   else                 {COPY(m,p) m+=size;}
 }
 assert((l-l0)+(m-m0)+(r0-r)==(R-L+size),"(l-l0)+(m-m0)+(r0-r)Is strange");

 if (rev==0) {
   p=l; do {COPY(p,m0) m0+=size; p+=size;} while (m0<m);  //There is at least one in m0-m
   if (l-l0 < r0-r) {PUSH(r+size,r0,1) L=l0; R=l-size; rev=0;} /*Sort from left to right*/
   else             {PUSH(l0,l-size,0) L=r+size; R=r0; rev=1;} /*Sort from right to first*/
 }else{
   p=l; do {m-=size;   COPY(p,m) p+=size;} while (m0<m);  //There is at least one in m0-m
   if (l-l0 < r0-r) {PUSH(r+size,r0,0) L=l0; R=l-size; rev=1;} /*Sort from left to right*/
   else             {PUSH(l0,l-size,1) L=r+size; R=r0; rev=0;} /*Sort from right to first*/
 }
 goto LOOP;
}

Recommended Posts