Starting from:

$30

Algorithms for Data Science Homework 5


685.621 Algorithms for Data Science
Homework 5
Assigned at the start of Module 9
Due at the end of Module 11
Total Points 100/100
You will be self enrolling into one of the collaborative groups, either problem 3 or 4. Problem 3 will
have a part a and b while Problem 4 will have a part a and b as well as either JAVA, Python or R. You
will not be required to complete both problem 3 and 4, rather decide if you want to complete Problem
3 which is a theory problem or Problem 4 that is a coding problem. Collaboration groups have been
set up in Blackboard. Make sure your group starts an individual thread for each collaborative problem
and subproblem. You are required to participate in each of the collaborative problem and subproblem.
Do not directly post a complete solution, the goal is for the group to develop a solution after everyone
has participated.
Problems for Grading
1. [25 points] The following problem is known as the Dutch Flag Problem. In this problem, the
task is to rearrange an array of characters R, W, and B (for red, white, and blue, which are the
colors of the Dutch national flag) so that all the R’s come first, all the W’s come next, and all
the B’s come last. Design a linear, in-place algorithm that solves this problem.
2. [25 points] Give an O(n lg k)-time algorithm to merge k sorted lists into one sorted list, where n
is the total number of elements in all the input lists. (Hints: Use a heap for k-way merging.)
3. [50 points] Note this is a Collaborative Problem
(a) [25 points] Note this is a Collaborative Problem
Consider the following algorithm for doing a postorder traversal of a binary tree with root
vertex root.
Algorithm 1 Postorder Traversal
function Postorder(root)
if root 6= null then
Postorder(root.lef t)
Postorder(root.right)
visit root
end if
end function
Prove that this algorithm run in time Θ(n) when the input is an n-vertex binary tree.
(b) [25 points] Note this is a Collaborative Problem
We define an AVL binary search tree to be a tree with the binary search tree property
where, for each node in the tree, the height of its children differs by no more than 1. For 1
this problem, assume we have a team of biologists that keep information about DNA sequences in an AVL binary search tree using the specific weight (an integer) of the structure
as the key. The biologists routinely ask questions of the type, “Are there any structures in
the tree with specific weight between a and b, inclusive?” and they hope to get an answer
as soon as possible. Design an efficient algorithm that, given integers a and b, returns true
if there exists a key x in the tree such that a ≤ x ≤ b, and false if no such tree exists in the
tree. Describe your algorithm in pseudocode and English. What is the time complexity of
your algorithm? Explain.
4. [50 points] Note this is a Collaborative Problem
(a) [25 points] Note this is a Collaborative Problem
Using either the Radial Basis Neural Network, Feed Forward Neural Network or Probabilistic Neural Network, classify the IRIS data set and the data generated in Problem 3 from
HW 4 to determine how well your classifier works on both data sets.
(b) [25 points] Note this is a Collaborative Problem
Using the Support Vector Machine, classify the IRIS data set and the data generated in
Problem 3 from HW 4 to determine how well your classifier works on both data sets.
5. [10 Bonus Points]
Using your implementation of EM or the built-in EM algorithm add the Bayes Classifier of
Equation 8 in the Machine Learning I document. Use the IRIS data set and the data generated
in Problem 3 from HW 4 to determine how well your classifier works. Compare the results against
the Neural Network algorithm and the Support Vector Machine.
2

More products