$30
1
CSE 2320 - Homework 9
Total points: 100 Topics: Graph adjacency matrix representation, connected components, hash tables, quicksort, radix sort, count sort, Timsort.
P1 (40 pts) Assume a graph represents a social network: the vertices are people and the edges are friendship relation between them.
Implement a program that does the following:
- reads the graph with input redirection.
- prints the graph to verify it was loaded correctly.
- find the connected components and label them with numbers (starting at 1) and
- print the connected components. In particular print the names of the people in that connected component.
You can assume the graph is an undirected graph.
The graph data is given in the order:
- N (number of vertices)
- Vertices: N lines each containing one string with a vertex name.
- Edges: A number of lines in format: “name1 name 2”. The line “-1 -1” indicates the end of lines with edges.
Files: data1.txt, run1.txt.
Save the program in a file called graph.c.
Hint: you can adapt the DFS (Depth First Search) algorithm to label connected components.
P2. (7 points) Is Quick_Sort (as given in CLRS, page 171) stable?
If yes, prove it. If no, give an example array, A, (however small or big), sort it with Quick_Sort, and show what the algorithm does that makes it not
stable. Use the original array and the final, sorted array to base your proof (do not base your proof on a partially sorted array).
Hint: Focus on the pivot jump.
2
P3. (7 points) Given the array A = < 8, 6, 9, 2, 7, 1, 5, 10, 6 , using the Quicksort Lecture or Figure 7.1, CLRS page 172, as a model, show the
execution of the Partition function. Show the <= partition by circling the last element of it and show the partition by putting a square around the
last element of it.
0 1 2 3 4 5 6 7 8
Original
array:
8 6 9 2 7 1 5 10 6
P4. (9 points) (Radix sort)
Show how LSD radix sort sorts the following numbers in the given representation (base 10). Show the numbers after each complete round of count
sort.
Index: 0 1 2 3 4 5 6
Original
Array:
513 145 320 235 141 433 2
3
P5. (10 points) Count sort, radix sort
Estimate the performance of count sort and radix sort for the following problems and set-ups. Assume you are dealing with numbers on b=32 bits.
Only positive values are allowed (so there is no sign bit, all 32 bits are used to represent the positive value). Let:
- b – number of bits
- k - number of different possible values of keys
- r – number of bits used for a specific radix
- N – size of array to be sorted.
- d – number of digits in a certain base (on r-bits) representation.
Continue to fill-in the table. Express every term as a power of 2. In the Θ use the values for the terms. E.g. N2
for N=8 would be replaced with
2
6
(=64). In order to compare quantities easily, we express everything as a power of 2 in the table below (see 7≈2
2.8).
N Count sort Radix sort for r = 5 bits Optimal radix sort (with optimal r) Best method
N k Θ Dominant
term
k d Θ Dominant
term
r k d Θ Dominant
term
Smallest
dominant
term
Method
8=23
2
32 2
3
+232 2
32 2
5
7≈22.8 2
2.8(23
+2
5
) 2
7.8
1015≈250
P6. (10 points) self study of Timsort Read this article, and answer the Timsort questions below based on it.
(2pt) Timsort was created by:………………………….. It is used as the default sorting algorithm for: ……………………………..………………..
(3pts) Time complexity: Best case: Ω(………….) Average case: Θ(………………..) Worst case: O(………………)
(1pts) What two sorting algorithms does it combine? ……………………………………………………………………………….
(2 pt) Circle your answer: Is it stable? YES / No Does it do well on arrays with preexisting structure? YES / No
(1 pt) What data does “~ sort” indicate in Tim Peters’s introduction to Timsort found here? …………………………………………………………..
(1pt) This Wikipedia article discusses a bug found in Timsort. What Java error does that bug produce? …………………………………………..
4
P6. (5 points) Hashing
The images below show the occupancy of two hash tables. Both tables have the same size, the same items hashed in them, and both use open
addressing. However they differ in the way they find an available slot in the table. Which one is a better hash table and why? (You do NOT have to
deduce how they find the next available slot. You simply have to judge which one would behave better based on this image.)
P7. (12 points) Hashing
a) (5 points) Give an example of a bad hash function for strings (that generates many collisions). Justify why it is bad: find some strings that will hash
to the same cell.
b) (7 points) In the hash table below * and + indicate the cells probed when trying to insert two different items. These items are originally hashed to
the same slot (see +* at index 5). The star, *, shows the probed cells for item and the plus, +, shows the probed cells for the other item. You can
assume that the table size is very big (e.g. more than 1000) and the slots shown did not require a mod (%) operation (that is we did not have to wrap
around).
5
1) What type of open addressing was used? Justify.
2) Give the next slots to be checked for each item (show where the next * and where the next + will be).
Remember to include your name at the top.
Write your answers in this document or a new document called 2320_H9.pdf. It can be hand-written and scanned, but it must be uploaded
electronically. Place 2320_H9.pdf and graph.c in a folder called hw9, zip that and send it.
Index
… …
5 + *
… …
8 +
9 *
… …
11 +
12
13 *
14 +
… …
17 *
… …