Chapter 12: Elementary Sorting Methods

This page is a generated reference surface for selective reading. It exists to keep the learner apps guide-first while still preserving source access.

Learning objectives

Explain the main ideas and vocabulary in Elementary Sorting Methods.
Work through the source examples for Elementary Sorting Methods without depending on raw chunk order.
Use Elementary Sorting Methods as selective reference when learner modules point back to Algorithms Sedgewick.

Prerequisites

Earlier prerequisite concepts leading into Chapter 12: Elementary Sorting Methods.

Module targets

module-02-sorting-searching-structures

AI companion modes

Explain simply
Socratic tutor
Quiz me
Challenge my understanding
Diagnose my confusion
Generate extra practice
Revision mode
Connect forward / backward

Source-of-truth note

This unit is anchored to Algorithms Sedgewick and the source chapter "Chapter 12: Elementary Sorting Methods". Use external resources only to clarify, extend, or modernize details without replacing the chapter's conceptual spine.

External enrichment

No chapter-specific enrichment resources are curated yet. Add them in the unit manifest when a source clearly improves learning.

Source provenance

Primary source: Algorithms Sedgewick
Source chapter 12: Chapter 12: Elementary Sorting Methods
Raw source file: 028-elementary-sorting-methods.md
Raw source file: 029-elementary-sorting-methods.md
Raw source file: 030-elementary-sorting-methods.md
Raw source file: 031-exercises.md

Merged source

Elementary Sorting Methods

8. Elementary Sorting Methods

As our first excursion into the area of sorting algorithms, we'll study some "elementary" methods which are appropriate for small files or files with some special structure. There are several reasons for studying these simple sorting algorithms in some detail. First, they provide a relatively painless way to learn terminology and basic mechanisms for sorting algorithms so that we get an adequate background for studying the more sophisticated algorithms. Second, there are a great ma.ny applications of sorting where it's better to use these simple methods than the more powerful general-purpose methods. Finally, some of the simple methods extend to better generalpurpose methods or can be used to improve the efficiency of more powerful methods. The most prominent example of this is seen in recursive sorts which "divide and conquer" big files into many small ones. Obviously, it is advantageous to know the best way to deal with small files in such situations. As mentioned above, there are several sorting applications in which a relatively simple algorithm may be the method of choice. Sorting programs are often used only once (or only a few times). If the number of items to be sorted is not too large (say, less than five hundred elements), it may well be more efficient just to run a simple method than to implement and debug a complicated method. Elementary metho'ds are always suitable for small files (say, less than fifty elements); it is unlikely that a sophisticated algorithm would be justified for a small file, unless a very large number of such files are to be sorted. Other types of files that are relatively easy to sort are ones that are already almost sorted (or already sorted!') or ones that contain large numbers of equal keys. Simple methods can do much better on such well-structured files than general-purpose methods. As a rule, the elementary methods that we'll be discussing take about N2 steps to sort N randomly arranged items. If N is small enough, this may not be a problem, and if the items are not randomly arranged, some of the

methods might run much faster than more sophisticated ones. However, it must be emphasized that these methods (with one notable exception) should not be used for large, randomly arranged files.

Before considering some specific algorithms, it will be useful to discuss some general terminology and basic assumptions for sorting algorithms. We'll be considering methods of sorting files of records containing keys. The keys, which are only part of the records (often a small part), are used to control the sort. The objective of the sorting method is to rearrange the records so that their keys are in order according to some well-defined ordering rule (usually numerical or alphabetical order).

If the file to be sorted will fit into memory (or, in our context, if it will fit into a Pascal array), then the sorting method is called internal. Sorting files from tape or disk is called external sorting. The main difference between the two is that any record can easily be accessed in an internal sort, while an external sort must access records sequentially, or at least in large blocks. We'll look at a few external sorts in Chapter 13, but most of the algorithms that we'll consider are internal sorts.

As usual, the main performance parameter that we'll be interested in is the running time of our sorting algorithms. As mentioned above, the elementary methods that we'll examine in this chapter require time proportional to N2 to sort N items, while more advanced methods can sort N items in time proportional to N log N. It can be shown that no sorting algorithm can use less than N log N comparisons between keys, but we'll see that there are methods that use digital properties of keys to get a total running time proportional to N.

The amount of extra memory used by a sorting algorithm is the second important factor we'll be considering. Basically, the methods divide into three types: those that sort in place and use no extra memory except perhaps for a small stack or table; those that use a linked-list representation and so use N extra words of memory for list pointers; and those that need enough extra memory to hold another copy of the array to be sorted.

A characteristic of sorting methods which is sometimes important in practice is stability: a sorting method is called stable if it preserves the relative order of equal keys in the file. For example, if an alphabetized class list is sorted by grade, then a stable method will produce a list in which students with the same grade are still in alphabetical order, but a non-stable method is likely to produce a list with no evidence of the original alphabetic order. Most of the simple methods are stable, but most of the well-known sophisticated algorithms are not. If stability is vital, it can be forced by appending a

small index to each key before sorting or 5y lengthening the sort key in some other way. It is easy to take stability for granted: people often react to the unpleasant effects of instability with disbelief. Actually there are few methods which achieve stability without using significant extra time or space.

The following program, for sorting three records, is intended to illustrate
neral conventions that we'll be using. (In particular, the main program is
liar way to exercise a program that is known to work only for N = 3: the
is that most of the sorting programs we'll consider could be substituted
rt3 in this "driver" program.)

program threesort(input, output);
con& maxN==100;
var a: array [l..maxN] of integer;

     N, i: integer;
procedure sort3;

var t : integer;
begin
if a[l]>a[2] then

   begin t:-=a[l]; a[1 ]:=a[2]; a[2]:=t end
if a[l]>a[J] then

   begin t:=a[l]; a[l]:=a[3]; a[3]:=t end;
if a[2]>a[3] then

      begin t:=a[2]; a[2]:=a[3]; a[3]:=t end;
   end;
begin
readln (N) ;
for i:=l to N do read(a[i]);
if N=3 then sort3;
for i:=l to N do write(a[i]);
wri teln
end.

The three assignment statements following each if actually implement an "exchange" operation. We'll write out the code for such exchanges rather than use a procedure call because they're fundamental to many sorting programs and often fall in the inner loop.

In order to concentrate on algorithmjc issues, we'll work with algorithms that simply sort arrays of integers into numerical order. It is generally straightforward to adapt such algorithms for use in a practical application involving large keys or records. Basically, sorting programs access records in one of two ways: either keys are accessed for comparison, or entire records are accessed

to be moved. Most of the algorithms that we will study can be recast in terms of performing these two operations on arbitrary records. If the records to be sorted are large, it is normally wise to do an "indirect sort": here the records themselves are not necessarily rearranged, but rather an array of pointers (or indices) is rearranged so that the first pointer points to the smallest record, etc. The keys can be kept either with the records (if they are large) or with the pointers (if they are small).

By using programs which simply operate on a global array, we're ignoring "packaging problems" that can be troublesome in some programming environments. Should the array be passed to the sorting routine as a parameter? Can the same sorting routine be used to sort arrays of integers and arrays of reals (and arrays of arbitrarily complex records)? Even with our simple assumptions, we must (as usual) circumvent the lack of dynamic array sizes in Pascal by predeclaring a maximum. Such concerns will be easier to deal with in programming environments of the future than in those of the past and present. For example, some modern languages have quite well-developed facilities for packaging together programs into large systems. On the other hand, such mechanisms are not truly required for many applications: small programs which work directly on global arrays have many uses; and some operating systems make it quite easy to put together simple programs like the one above, which serve as "filters" between their input and their output. Obviously, these comments apply to many of the other algorithms that we'll be examining, though their effects are perhaps most acutely felt for sorting algorithms.

Some of the programs use a few other global variables. Declarations which are not obvious will be included with the program code. Also, we'll sometimes assume that the array bounds go to 0 or iV+1, to hold special keys used by some of the algorithms. We'll frequently use letters from the alphabet rather than numbers for examples: these are handled in the obvious way using Pascal's ord and chr "transfer functions" between integers and characters.

The sort3 program above uses an even more constrained access to the file: it is three instructions of the form "compare two records and exchange them if necessary to put the one with the smaller key first." Programs which use only this type of instruction are interesting because they are well suited for hardware implementation. We'll study this issue in more detail in Chapter 35.

Elementary Sorting Methods

One of the simplest sorting algorithms works as follows: first find the smallest element in the array and exchange it with the element in the first position, then find the second smallest element and exchange it with the element in

the second position, continuing in this way until the entire array is sorted. This method is called selection sort because it works by repeatedly "selecting" the smallest remaining element. The following program sorts a [1..N] into numerical order:

procedure selection;
   var i, j, min, t: integer;
   begin
   for i:=l to N do
      begin
       min:=i;
      for j:=i+l to N do
          if ab]<a[min] then min:=j;
       t:=a[min]; a[min]:=a[i]; a[i]:=t
       end ;
   end ;

This is among the simplest of sorting methods, and it will work very well for small files. Its running time is proportional to N2: the number of comparisons between array elements is about N2/2 since the outer loop (on i) is executed N times and the inner loop (on j) is executed about N/2 times on the average. It turns out that the statement min:=j is executed only on the order of N log N times, so it is not part of the inner loop

Despite its simplicity, selection sort has a quite important application: it is the method of choice for sorting files with very large records and small keys. If the records are M words long (but the keys are only a few words long), then the exchange takes time proportional to M, so the total running time is proportional to N2 (for the comparisons) plus NM (for the exchanges). If M is proportional to N then the running time is linear in the amount of data input, which is difficult to beat even with an advanced method. Of course if it is not absolutely required that the records be actually rearranged, then an "indirect sort" can be used to avoid the NM term entirely, so a method which uses less comparisons would be justified. Still selection sort is quite attractive for sorting (say) a thousand lOOO-word records on one-word keys.

Elementary Sorting Methods

An algorithm almost as simple as selection sort but perhaps more flexible is insertion sort. This is the method often used by people to sort bridge hands: consider the elements one at a time, inserting each in its proper place among those already considered (keeping them s.orted). The element being considered is inserted merely by moving larger elements one position to the right, then

96                                            ChXPTER 8

inserting the element into the vacated position. The code for this algorithm is straightforward:

procedure insertion;         j:=j-1 end;

var i, j, v: integer;

begin
for i:=2 to N do

begin
v:=a[i]; j:=i;
while ab-1]>v do

   begin ab] :=ab-11;
ab] :=v
end ;

end;

As is, this code doesn't work, because the while will run past the left end of the array if t is the smallest element in the array. One way to fix this is to put a "sentinel" key in a[O], making it at least as small as the smallest element in the array. Using sentinels in situations like this is common in sorting programs to avoid including a test (in this case j>l) which almost always succeeds within the inner loop. If for some reason it is inconvenient to use a sentinel and the array really must have the bounds [1..N], then standard Pascal does not allow a clean alternative, since it does not have a "conditional" and instruction: the test while (j>l) and (a1j-l]>v) won't work because even when j=l, the second part of the and will be evaluated and will cause an o&of-bounds array access. A goto out of the loop seems to be required. (Some programmers prefer to goto some lengths to avoid goto instructions, for example by performing an action within the loop to ensure that the loop terminates. In this case, such a solution seems hardly justified, since it makes the program no clearer, and it adds extra overhead everytime through the loop to guard against a rare event.)

On the average, the inner loop of insertion sort is executed about N2/2 times: The "average" insertion goes about halfway into a subfile of size N/2. This is inherent in the method. The point of insertion can be found more efficiently using the searching techniques in Chapter 14, but N2/2 moves (to make room for each element being inserted) are still required; or the number of moves can be lowered by using a linked list instead of an array, but then the methods of Chapter 14 don't apply and N2/2 comparisons are required (to find each insertion point).

Shellsort

Insertion sort is slow because it exchang,es only adjacent elements. For example, if the smallest element happens to be at the end of the array, it takes N steps to get it where it belongs. Shellsort is a simple extension of insertion sort which gets around this problem by allowing exchanges of elements that are far apart.

If we replace every occurrence of "1" by "h" (and "2" by "h+l") in insertion sort, the resulting program rearranges a file to give it the property that taking every hth element (starting anywhere) yields a sorted file. Such a file is said to be h-sorted. Put another way, an h-sorted file is h independent sorted files, interleaved together. By h-sorting for some large values of h, we can move elements in the array long distances and thus make it easier to h-sort for smaller values of h. Using such a procedure for any sequence of values of h which ends in 1 will produce a sorted file: this is Shellsort.

The following example shows how a sample file of fifteen elements is sorted using the increments 13, 4, 1:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
AS0RTINGEXAMPLE
A E 0 R T I N C: E X A M P L S
AEAGE INMPL0RTXS
A A E E G I L h4 N 0 P R S T X

In the first pass, the A in position 1 is compared to the L in position 14, then the S in position 2 is compared (and exchanged) with the E in position 15. In the second pass, the A T E P in positions 1, 5, 9, and 13 are rearranged to put A E P T in those positions, and similarly for positions 2, 6, 10, and 14, etc. The last pass is just insertion sort, but no element has to move very far.

The above description of how Shellsort gains efficiency is necessarily imprecise because no one has been able to analyze the algorithm. Some sequences of values of h work better than others, but no explanation for this has been discovered. A sequence which has been shown empirically to do well is...,1093,364,121,40,13,4,1, as in the following program:

procedure shellsort;
   label 0;
   var i, j, h, v: integer;
   begin
    h:=l; repeat h:=3*h+l until h>N;
   repeat
       h:=h div 3;
       for i:=h+l to N do
       begin
          v:=a[i]; j:=i;
          while ab-h]>v do
             begin
             a[j]:=ab-h]; j : = j - h ;
             if j<=h then goto 0
             end ;
       0: ab]:=v
       end ;
    until h= 1;
    end ;

Note that sentinels are not used because there would have to be h of them, for the largest value of h used.

The increment sequence in this program is easy to use and leads to an efficient sort. There are many other increment sequences which lead to a more efficient sort (the reader might be amused to try to discover one), but it is difficult to beat the above program by more than 20% even for relatively large N. (The possibility that much better increment sequences exist is still, however, quite real.) On the other hand, there are some bad increment sequences. Shellsort is sometimes implemented by starting at h=N (instead of initializing so as to ensure the same sequence is always used as above). This virtually ensures that a bad sequence will turn up for some N.

Comparing Shellsort with other methods analytically is difficult because the functional form of the running time for Shellsort is not, even known (and depends on the increment sequence). For the above program, two conjectures a r e N(logN)2 a n d 1N25. The running time is not particularly sensitive to the initial ordering of the file, especially in contrast to, say, insertion sort, which is linear for a file already in order but quadratic for a file in reverse order.

Shellsort is the method of choice for many sorting applications because it
ceptable running time even for moderately large files (say, five thousand
ts) and requires only a very srnall amount of code, which is easy to get

working. We'll see methods that are more efficient in the next few chapters, but they're perhaps only twice as fast (if that much) except for large N, and they're significantly more complicated. In short, if you have a sorting problem, use the above program, then determine vvhether the extra effort required to replace it with a sophisticated method will be worthwhile. (On the other hand, the Quicksort algorithm of the next chapter is not that much more difficult to implement... )

Digression: Bubble Sort

An elementary sorting method that is often taught in introductory classes is bubble sort: keep passing through the file, exchanging adjacent elements, if necessary; when no exchanges are required on some pass, the file is sorted. An implementation of this method is given below.

procedure bubblesort;                      a[j]:=t end

var j, t: integer;
begin
repeat

t:=a[l];
for j:=2 to N do

if ab--l]>alj] then
   begin t:=:a[j-I]; ab-1]:=ab];

until t=a[l];
end ;

It takes a moment's reflection to convince oneself first that this works at all, second that the running time is quadratic. It is not clear why this method is so often taught, since insertion sort seems simpler and more efficient by almost any measure. The inner loop of bubble sort has about twice as many instructions as either insertion sort or selection sort.

Exercises

A very special situation for which there is a simple sorting algorithm is the following: "sort a file of N records whose keys are distinct integers between 1 and N." The algorithm for this problem is

for i:=l to N do t[a[i]]:=a[i];
for i:=l to N do a[i]:=t[i];

This algorithm uses a temporary array t. It is possible (but much more complicated) to solve this problem without an auxiliary array.

A more realistic problem solved by an algorithm in the same spirit is: "sort a file of N records whose keys are integers between 0 and M - 1." If M is not too large, an algorithm called distribution counting can be used to solve this problem. The idea is to count the number of keys with each value, then use the counts to move the records into position on a second pass through the file, as in the following code:

for j:=O to M-l do countlj]:=O;
for i:=l to Ndo

    count[a[i]]:=count[a[i]]+l;
for j:=l to M-l do

    count b] :=count Ij-l]+count b] ;
for i:=N downto 1 do

    begin
    t[count[a[i]]]:=a[i];
   count[a[i]]:=count[a[i]]-1
    end ;
for i:=l to N do a[i]:=t[i];

To see how this code works, consider the following sample file of integers:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 3 110 3 7 5 5 2 4 2 10 2 6 4

The first for loop initializes the counts to 0; the second produces the counts

0 1234567

This says that there are two O's, three l's, etc. The third for loop adds these numbers to produce

2 5 8 10 12 14 15 16

That is, there are two numbers less than 1, five numbers less than 2, etc. Now, these can be used as addresses to sort the array:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 0 0 1112 2 2 3 3 4 4 5 5 6 7

For example, when the 4 at the end of the file is encountered, it's put into location 12, since count[4] says that there are 12 keys less than or equal to 4. Then count[4] is decremented, since there's now one less key less than or equal to 4. The inner loop goes from N down to 1 so that the sort will be stable. (The reader may wish to check this.)

This method will work very well for the type of files postulated. Furthermore, it can be extended to produce a much more powerful method that we'll examine in Chapter 10.

We usually think of sorting files that are in some arbitrarily scrambled order. However, it is quite often the case that we have a lot of information about a file to be sorted. For example, one often wants to add a few elements to a sorted file and thus produce a larger sorted file. One way to do so is to simply append the new elements to the end of t.he file, then call a sorting algorithm. General-purpose sorts are commonly mi'sused for such applications; actually, elementary methods can take advantage of the order present in the file.

For example, consider the operation of insertion sort on a file which is already sorted. Each element is immediately determined to be in its proper in the file, and the total running time is linear. The same is true for bubble sort, but selection sort is still quadratic. (The leading term in the running time of selection sort does not depend on the order in the file to be sorted.)

Even if a file is not completely sorted, insertion sort can be quite useful because the running time of insertion sort depends quite heavily on the order present in the file. The running time depends on the number of inversions: for each element count up the number of e:iements to its left which are greater. This is the distance the elements have to move when inserted into the file during insertion sort. A file which has some order in it will have fewer inversions in it than one which is arbitrarily scrambled.

The example cited above of a file formed by tacking a few new elenients onto a large sorted file is clearly a case where the number of the inversions is low: a file which has only a constant number of elements out of place will have only a linear number of inversions. Another example is a file where each element is only a constant distance front its final position. Files like this can be created in the initial stages of some advanced sorting methods: at a certain point it is worthwhile to switch over to jnsertion sort.

In short, insertion sort is the method of choice for "almost sorted" files
ew inversions: for such files, it will outperform even the sophisticated
s in the next few chapters.

Exercises

Give a sequence of "compare-exchange" operations for sorting four records.
Which of the three elementary methods runs fastest for a file which is already sorted?
Which of the three elementary methods runs fastest for a file in reverse order?
Test the hypothesis that selection sort is the fastest of the three elementary methods, then insertion sort, then bubble sort.

5. Give a good reason why it might be inconvenient to use a sentinel key for
     insertion sort (aside from the one that comes up in the implementation
     of Shellsort).

How many comparisons are used by Shellsort to 7-sort, then S-sort the keys EASYQUESTION?
Give an example to show why 8,4,2,1 would not be a good way to finish off a Shellsort increment sequence.
Is selection sort stable? How about insertion sort and bubble sort? 9. Give a specialized version of distribution counting for sorting files where

elements have only one of two values (x or y).

periment with different increment sequences for Shellsort: find one that
runs faster than the one given for a random file of 1000 elements.

Learning objectives​

Prerequisites​

Module targets​

AI companion modes​

Source-of-truth note​

External enrichment​

Source provenance​

Merged source​

Elementary Sorting Methods​

Elementary Sorting Methods

8. Elementary Sorting Methods​

Elementary Sorting Methods​

Elementary Sorting Methods

Elementary Sorting Methods​

Elementary Sorting Methods

Exercises​

Exercises

Exercises​

Learning objectives

Prerequisites

Module targets

AI companion modes

Source-of-truth note

External enrichment

Source provenance

Merged source

Elementary Sorting Methods

8. Elementary Sorting Methods

Elementary Sorting Methods

Elementary Sorting Methods

Exercises

Exercises