$30
CMPT 435
Assignment Two – Sorting Algorithms
Applying Sorting Algorithms to a list of Strings
Contents
1 Introduction 2
2 Ontology of the Sort class 2
3 Insertion Sort 3
3.1 Understanding Insertion Sort . . . . . . . . . . . . . . . . . . . . 3
3.2 Implementation of Insertion Sort . . . . . . . . . . . . . . . . . . 3
4 Selection Sort 5
4.1 Understanding Selection Sort . . . . . . . . . . . . . . . . . . . . 5
4.2 Implementation of Selection Sort . . . . . . . . . . . . . . . . . . 5
5 Merge Sort 6
5.1 Understanding Merge Sort . . . . . . . . . . . . . . . . . . . . . . 6
5.2 Implementation of Merge Sort . . . . . . . . . . . . . . . . . . . . 7
6 Quick Sort 10
6.1 Understanding Quick Sort . . . . . . . . . . . . . . . . . . . . . . 10
6.2 Implementation of Quick Sort . . . . . . . . . . . . . . . . . . . . 10
7 Setup and Execution 12
8 Conclusion: The Results 13
9 Appendix 15
page 1 of 15
1 Introduction
This document describes four different methods for sorting items, and implementations of those methods, or algorithms, in C++. The four methods are
known as insertion sort, selection sort, merge sort, and quick sort. Merge sort
and quick sort are more efficient as they implement recursion within their sorting processes, as seen in the chart above. The reason for this will be discussed
later on.
2 Ontology of the Sort class
The Sort class seen below contains all of the different sorts as static functions,
with the idea that the main running function would then be able to have easy access to the methods. This class contains the sorting algorithms, helper functions
such as swap() and printList(), and helper functions for the two recursive
sorting algorithms.
1 #pragma once
2 #include <iostream>
3 #include <string>
4
5 class Sort {
6 public:
7
8
9 static void swap(std::string &a, std::string &b);
10 static void printList(std::string s[], int n);
11 static int Insertion(std::string s[], int n);
12 static int Selection(std::string s[], int n);
13 static int subMSort(std::string s[], int l, int m, int r, int cnt);
14 static void mergeSort(std::string s[], int lInd, int rInd, int &cnt);
15 static void quickSort(std::string s[], int p, int r, int &cnt);
16 static int Partition(std::string s[], int p, int r, int &cnt);
17
18 };
A Quick Note
To make sense of these algorithms, for each one I will start by giving an
overview of the process including a diagram. After, I will show my implementation of the algorithm followed by a detailed description of each step with
references to specific lines in the code. This is done to avoid the clutter of describing the algorithm itself and my implementation of said algorithm at the
same time.
page 2 of 15
3 Insertion Sort
3.1 Understanding Insertion Sort
Insertion sort operates by stepping through each element of a list and comparing
as it goes. Each current element will compare itself to each element that was
before it in the order of the list, and swap. The site Programiz.com has a better
description of the method.
Insertion sort works similarly as we sort cards in our hand in a card
game.
We assume that the first card is already sorted then, we select an
unsorted card. If the unsorted card is greater than the card in hand,
it is placed on the right otherwise, to the left. In the same way, other
unsorted cards are taken and put at their right place.
A similar approach is used by insertion sort.
Insertion sort is a sorting algorithm that places an unsorted element
at its suitable place in each iteration.
–programiz.com
3.2 Implementation of Insertion Sort
Insertion sort is probably the simplest of the sorting algorithms out of the four.
This does not mean however that it is the most efficient, as will be evident later
on.
1 int Sort::Insertion(std::string s[], int n)
2 {
3 int counter = 0;
4 std::string last;
page 3 of 15
5 for (int i = 1; i < n; i++)
6 {
7 last = s[i];
8 int j = i - 1;
9 while (j >= 0 && s[j] > last)
10 {
11 counter++;
12
13 s[j + 1] = s[j];
14
15 j = j - 1;
16 }
17
18 s[j + 1] = last;
19 }
20
21 //printList(s, n); //debugging
22 return counter;
23 }
Code Breakdown
1. For each index in the array s[] after index one(Line 5)
2. Compare the current element to the one before it (line 9)
3. If the current element is smaller than the one before it, keep comparing
until the current element is bigger than the next one being compared. At
that point, shift all elements that were previously compared up by one.
(lines 9-16)
4. Place the current element in its place which was created by the movement
of the other elements (line 18)
5. Continue the same process until the end of the array.
page 4 of 15
4 Selection Sort
4.1 Understanding Selection Sort
Selection sort is a sorting algorithm similar to Insertion sort. Although, due
to its differences, it is usually slower and less efficient. The process is quite
simple. For every index in the array, if there is any value further in the array
that is smaller than the current index, swap the current index with the lowest
of those values.
4.2 Implementation of Selection Sort
The simplicity of selection sort can be seen below, as the total number of lines
of code is even less than insertion sort.
1 int Sort::Selection(std::string s[], int n)
2 {
3 int low;
4 int counter = 0;
5 for (int i = 0; i < n; i++)
6 {
7 low = i;
8 for (int j = i + 1; j < n; j++)
9 {
10 counter++;
11
12 if (s[j] < s[low])
13 {
14 low = j;
15 }
page 5 of 15
16 }
17 Sort::swap(s[low], s[i]);
18 }
19 //printList(s, n); //debugging
20 return counter;
21 }
Code Breakdown
1. for each element in the array s[] starting at index 0 (lines 5,6):
2. Start by assuming that this current index i is the lowest value by assigning
the index number to the low variable (line 7).
3. For each element after i, compare it to element s[low] (lines 8-12)
4. Whenever a lower element is found, mark that as the new lowest index
(line 14). Now, all elements after this new element will be compared to
the new lowest value, until the true lowest value is found.
5. Once the inner iterator reaches the end of the array (line 8), swap the
value of the current element s[i] with the newly found lowest element
denoted by s[low] (line 17).
5 Merge Sort
5.1 Understanding Merge Sort
Refresher on Recursion
Before we progress any further, we should quickly refresh oursleves on the
concept of recursion. Simply put, when a function is recursive, this means that
it calls itself.
1 def fact(n):
2 if n == 0:
3 return 1
4 else:
5 return n * fact(n-1)
The Python code above shows a simple use of recursion to calculate factorials.
Since a factorial is just a number multiplied by each integer below it in succession, this can be expedited by recursion. Notice how on line 5 the function
calls itself by passing n-1 to itself. With this, the function can continue. The
next part of recursion is its base case of 0. Without this, the program will never
finish, and possibly will cause a stack overflow error. By having a base case,
there is a breakpoint for the code where it will begin to move back up the chain
of calculation, freeing up the stack as it progresses.
page 6 of 15
Upon first glance of the graphic above, one may think that merge sort is
much more complex than the previous two algorithms discussed. This person
would be correct. Although, it is easier to understand after it is broken into its
parts. These parts being the splitting, and the sorting. To begin, we will discuss
the splitting function, which also happens to be the recursive function.
5.2 Implementation of Merge Sort
1 void Sort::mergeSort(std::string s[], int lInd, int rInd, int &cnt)
2 {
3
4 int m = 0;
5
6 if (lInd < rInd)
7 {
8 m = lInd + (rInd - lInd) / 2;
9 mergeSort(s, lInd, m, cnt);
10 mergeSort(s, (m + 1), rInd, cnt);
page 7 of 15
11
12 cnt += Sort::subMSort(s, lInd, m, rInd, cnt);
13 }
14 else
15 {
16 //int n = sizeof(*s) / sizeof(s[0]);
17 //printList(s, n);
18 }
19 }
Unforeseen Issues
Up until now, I have not acknowledged the lines of code dedicated to counting comparisons, because they are not integral to the function of the algorithms.
In this sorting method though, it was a particular challenge. The reason being
that this function is recursive, and therefore I could not simply run the function, count the comparisons as it goes, and then return it. Because of this, in
this algorithm along with Quick Sort, I passed a reference to a count variable I
called cnt. This way I could have the algorithms increment the counter while
not having to worry about maintaining the variable within themselves. Now on
to the description.
Code Breakdown
1. Check that there is actually enough range to sort by seeing if the left index
is still before the right index (line 6)
2. set m, our midpoint variable, to lInd + (rInd - lInd) / 2, in order to
account for odd numbers (line 8).
3. Call mergeSort on the first half (0-m) and the second half (m+1 to end)
(lines 9, 10)
4. Sort the array within the described bounds and add to the count (line 12).
If the process still doesn’t make sense, focus on the left and right indexes
that are being passed to mergeSort on lines 9 and 10.
mergeSort(s, lInd, m, cnt);
mergeSort(s, (m + 1), rInd, cnt);
Since this function will continue running until the left and right indexes
match up, or when the bounds of the array given equal one index, these two
lines will continue to take the given bounds and keep splitting them up into
quarters, eighths, and so on until that base case is achieved. Now onto the
sorting portion of the algorithm.
1 int Sort::subMSort(std::string s[], int l, int m, int r, int cnt)
2 {
3 int counter = 0;
page 8 of 15
4 int len1 = m - l + 1;
5 int len2 = r - m;
6
7 std::string *left = new std::string[len1];
8 std::string *right = new std::string[len2];
9
10 for (int i = 0; i < len1; i++)
11 {
12 left[i] = s[l + i];
13 }
14 for (int j = 0; j < len2; j++)
15 {
16 right[j] = s[m + 1 + j];
17 }
18 int i = 0, j = 0;
19 int k = l;
20 while (i < len1 && j < len2)
21 {
22 if (left[i] < right[j])
23 {
24 s[k++] = left[i++];
25 counter++;
26 }
27 else
28 {
29 s[k++] = right[j++];
30 counter++;
31 }
32 }
33 while (i < len1)
34 {
35 s[k++] = left[i++];
36 }
37 while (j < len2)
38 {
39 s[k++] = right[j++];
40 }
41 return counter;
42 }
1. Allocate space for the left side of the given portion of the array with the
correct size (lines 4, 7).
2. Allocate space for the right size of the given portion of the array with the
correct size (lines 5, 8).
3. Put the first half of the items in the given portion of the array to the left
array, and the second half to the right array (lines 10-17)
4. Set index counter i for the left array and j for the right array, both at
page 9 of 15
0. also set an index counter for the actual array as k, initiated with l, the
left bound index for the array.(lines 18-19)
5. Compare both the left and right arrays by index starting 0, whichever
value at the current index is larger, place that value at the next index in
the main array s[]. Increment the index counter on the sub array that the
value was pulled from, as well as the index counter for the whole main
array. This way, the sorting can continue (lines 20-32).
6. Continue comparing until the end of either sub array is reached (line 20).
7. At this point, append the remaining unsorted items after the last sorted
index of s[] (lines 33-40).
6 Quick Sort
6.1 Understanding Quick Sort
Quick sort is even faster than merge sort. The reason being that much of
the sorting is being done at the same time that the array is being broken up.
This process will seem very similar to merge sort. Quick sort uses pivot values
to choose where to compare values. The basic structure is to choose a pivot
/ partition value to start, and split at that point. Next, a new pivot value is
chosen, and all values below the pivot value are put below the pivot value, and
all values greater are put in front of the value. From here, a new pivot value is
chosen and the process continues until each element is sorted in its place. The
decision of what pivot value to use is widley discussed, although in my version
I start with the last element.
6.2 Implementation of Quick Sort
Much like merge sort, quick sort calls itself recursivley on two halves of the
given bounds. Although this time, the ”midpoint” value is determined by the
page 10 of 15
partition function.
1 void Sort::quickSort(std::string s[], int p, int r, int &cnt)
2 {
3 if (p < r)
4 {
5 int q = Sort::Partition(s, p, r, cnt);
6 Sort::quickSort(s, p, (q - 1), cnt);
7 Sort::quickSort(s, (q + 1), r, cnt);
8 }
9 }
1. Ensure that the given bounds account for more than one index (line 3)
2. Use the Partition() function to sort the given bounds and return a new
midpoint. (line 5)
3. Call quickSort() on all elements from index 0 to q as defined by the
Partition function.
Below is the definition of the Partition function. It is most likely the most
difficult to understand. At a very high level, it is placing elements in front of
or behind the pivot element (chosen as the last element in the unsorted array)
depending on if the value of the element is higher or lower than the pivot. Also,
it returns the index of the next partition index.
1 int Sort::Partition(std::string s[], int p, int r, int &cnt)
2 {
3 std::string x = s[r];
4 int i = p - 1;
5 for (int j = p; j < (r - 1); j++)
6 {
7 if (s[j] <= x)
8 {
9 i++;
10 cnt++;
11 Sort::swap(s[i], s[j]);
12 }else{
13 cnt++;
14 }
15 }
16 Sort::swap(s[(i + 1)], s[r]);
17 return ++i;
18 }
1. set the last element in the range as the pivot element x (line 3).
2. variable i, used to point to the last sorted element below the pivot is
initialized to (p - 1)(line 4).
3. begin an for loop with an iterator called j. Initiate j with p and continue
until (r - 1) (line 5).
page 11 of 15
4. If value s[j] <= x, then increment i, and swap the values s[i] and s[j]
(lines 7-12)
5. After exiting the for loop, swap s[(i + 1)] and s[r].(line 16)
6. Finally, Partition returns i after incrementing it one final time. With
this, the Partition function returns the new pivot index (line 17).
7 Setup and Execution
Finally, Here is a brief description of the actual execution of these algorithms.
1 const int len = 666; //normal mode finishes in ~15 - 35 ms
2
3 std::string arr[4][len];
4 //std::string arr[len];
5 void setup()
6 {
7 ...
8 ...
9 ...
10 arr[i][count++] = ln;
11 ...
12 ...
13 ...
14 }
I have excluded most of the setup() function, as it is a direct copy from the
AlanParse() function. Briefly, it reads in the .txt file line by line to an array
for use by the functions. The only difference this time is that the list needs to
be reused. In the beginning, this was done by using a function called reset()
to erase all of the content from the array each time. This was used between
executions of the different algorithms.
Although When playing around with multithreading (which never was completed) I realized that I could not have just one array. The reason being that
there would be memory issues if four separate threads were accessing the same
memory. So, on line 10, it is clear to see that I made a multidimensional 4x666
array of strings. Each subarray for each algorithm. Although the dream of
having multithreading, and testing on greater numbers of strings was never
realized, this was a much cleaner organization of the data.
1 int main()
2 {
3 setup();
4 std::cout << "Insertion sort made " << std::to_string(Sort::Insertion(arr[0], len)) << " comparisons." << std::endl;
5
6 std::cout << "Selection sort made " << std::to_string(Sort::Selection(arr[1], len)) << " comparisons." << std::endl;
7
8 int cnt = 0;
9 Sort::mergeSort(arr[2], 0, (len - 1), cnt);
page 12 of 15
10 std::cout << "Merge sort made " << std::to_string(cnt) << " comparisons." << std::endl;
11
12 int qCnt = 0;
13 Sort::quickSort(arr[3], 1, len, qCnt);
14 std::cout << "Quick sort made " << std::to_string(qCnt) << " comparisons." << std::endl;
15 /* Sort::printList(arr[0], len); //test Insertion Sort
16 Sort::printList(arr[1], len); //test Selection Sort
17 Sort::printList(arr[2], len); //test Merge Sort
18 Sort::printList(arr[3], len); //test Quick Sort */
19 }
The first to algorithms, insertion sort and selection sort, have nothing notable about their execution. They both are passed the array itself and the length
of the array, and return an integer which is converted into a string to display.
The execution of merge sort and quick sort are more complicated. Both are
passed their array just as insertion and selection are, although these two get
low and high bounds, as well as a count variable. As previously discussed, the
bounds are very important to these sorts, as they are recursive. Without them,
this process would be much more complicated.
8 Conclusion: The Results
Finally, the results tell the real story. It is clear to see that the recursion helped to
make Merge Sort and Quick Sort considerably faster. This is easy to understand
as both functions operate with O((n)log(n)) complexity. Logarithmic algorithms
are much quicker already, as logarithms grow very slowly. Compared to Selection sort and Insertion sort, their complexity of O(n
2
) stands no chance when
it comes to speed. Although in the real world, this does not make much of a
difference. On my personal laptop, a moderately powerful machine, I clocked
the whole program to run in just around 30 milliseconds.
Sort Comparisons
Insertion Sort 114309
Selection Sort 221445
Merge Sort 5445
Quick Sort 7155
page 13 of 15
Closing Thoughts
For someone not in the field of computer science, it may seem trivial to sort
a list of numbers, or sentences, or anything else. Although to me, this was and
still is one of the hardest things to grasp. Beyond that, this assignment began
to get my wheels turning thinking about multi threading. I am sure that with
more time, I would have had a working test case for sorting over 1,000,000
items per algorithm, each being on a separate core. This can wait until the next
assignment though.
{S.A.}
page 14 of 15
9 Appendix
References
I n s e r t i o n S o r t Algori thm , Programiz . com , www. programiz . com/ dsa / i n s e r t i o n −s o r t .
I n s e r t i o n So r t , Geeks forGeeks , 25 J uly 2020 , www. g ee k s f o r ge e k s . org / i n s e r t i o n −s o r t / .
Merge So r t , GeeksForGeeks , 7 June 2020 , www. g ee k s f o r ge e k s . org /merge−s o r t / .
Quick So r t , Geeks forGeeks , 4 Sep t . 2020 , www. gee k s f o r ge e k s . org / quick−s o r t /? r e f=lbp .
S e l e c t i o n S o r t Algori thm , Top Si t e s , www. t h e t o p s i t e s . ne t / a r t i c l e /51681938. sh tml .
Used for Graphics and quotes
Another interesting metric is to compare the total lines of code
per algorithm (in C++) compared to their number of
comparisons. It should be noted that in this instance quick sort is
less efficient than merge sort. This is truly down to the way that
the pivot index is chosen. If the index was chosen at random, or
some other way, the algorithm may be more efficient.
Insertion Sort
Selection Sort
Merge Sort
Quick Sort
1.14 · 105
2.21 · 105
5,445
7,155
22
26
59
26
Complexity of Sorting Algorithms For 666 Items
Comparisons
Lines of code
page 15 of 15