$35
Project 7 - Priority Queues & Heaps
Description
Fig A: Priority Queue
Fig B: Min-Heap and Max-Heap
For this project, you will build a priority queue that is implemented as both a binary min-heap and binary max-heap. It will default to a min-heap. Essentially, a priority queue is a min-heap or max-heap, but instead of ordering itself based on value, it orders itself based on priority. You will then use this priority queue to implement a max-heap. The min-heap is taken care of for you, and is based on your max-heap implementation. As in, if your max-heap is correct, your min-heap will be too.
Min-heaps are a data structure that is a complete tree satisfying the heap property: Every child node must be larger than or equal to it's parent. This means the smallest node is always at the root of the tree. Max-heaps are the same as min-heaps, with a different heap property: Every child node must be smaller than or equal to it's parent. This means the largest node is always at the root of the tree.
You will be using these two data structures to implement the priority queue, which is a complete tree with nodes having a key-value pair. The key denotes the node's priority, and value denotes the data the node holds. For a min-heap, the node with the smallest priority will always be at the root of the tree, and vice versa for the max heap.
The priority queue will use a list to store data. Despite the underlying data structure using a list, it can be represented as a tree (Fig. A). The list will hold MinNodes or MaxNodes (which you will implement). The node's ordering in the list determines the tree hierarchy: root is at index 0. Its left child is at index 1, and the right child at index 2. Etc.
Assignment Overview + Clarification of Classes
First, you will implement MinNode and MaxNode classes.
MinNode vs MaxNode:
MinNode:
Is a PriorityNode - has a priority and value.
Despite the name, this class would be used within a priority queue that uses a min-heap, not within a normal min-heap. Normal min-heaps do not need nodes typically.
Smaller numbers signify higher priority. Thus, nodes with smaller priorities and values are higher in the priority queue.
MaxNode:
Is a PriorityNode - has a priority and value.Despite the name, this class would be used within a priority queue that uses a max-heap, not within a normal max-heap.
Larger numbers signify higher priority. Thus, nodes with larger priorities and values are higher in the priority queue.
A PriorityNode has a priority and a value. The value is the node's data, while the priority is used to determine the node's ordering in the priority queue. MinNodes and MaxNodes are PriorityNodes - they have a priority and a value - but they change how the node is sorted within the priority queue. They do this by overriding the __lt__ and __gt__ magic functions.
Both of these nodes are to be used within a priority queue. A MinNode is used to satisfy the min-heap property, and a MaxNode is used to satisfy the max-heap property. This is done so that the priority queue can be implemented as both a min-heap or max-heap simply by changing the type of node it uses (MinNode vs MaxNode).
A MinNode wants lower priority nodes to be higher in the priority queue (children greater than the parent). A MaxNode wants to be sorted so that higher priority nodes will be higher in the priority queue (children less than parent). If the priorities are the same, the ordering is based on the value of each node - as in the value will act as the priority. If the node's have the same priority and value, then the nodes are already in the proper order relative to each other.
Next, you will be implementing a priority queue as a min-heap and max-heap, but with PriorityNodes (MinNode and MaxNode). You will then be implementing class MaxHeap, which will alter how values are pushed and popped from the heap (from the priority queue). It uses a PriorityQueue. You do not have to implement MinHeap. This is done for you, and will work if you properly implement MaxHeap.
Heaps vs Priority Queue:
Heaps:
Is either a complete tree satisfying the min-heap or a max-heap properties
Values are pushed onto the "heap"
Common implementation's use a list to store valuesThe list is known as the "heap"
Does not typically use nodes, but rather just the values. We use Nodes where priority = value
Order is determined by the values
Priority Queue:
Complete tree implemented as either a min-heap, a max-heap, or both
Uses a min-heap or max-heap to push Nodes onto the "heap"
Common implementation's use a list to store values.The list is known as the "heap"
Uses nodes to keep track of a priority and a value (data)
Order is determined by the node's priorities. (If priorities are the same, common implementations sort on value, which was added first, or arbitrary (undefined). We use value as the second condition for ordering)
PriorityQueue will take a parameter at initialization (is_min) to determine whether it uses/is a min-heap or a max-heap. This parameter determines if a MinNode (min-heap) or a MaxNode (max-heap) should be used within the priority queue. The priority queue pushes priorities and values onto the heap by creating MinNodes and MaxNodes and pushing those to the heap (both are PriorityNodes, just with different behavior based on __lt__ and __gt__).
The MaxHeap will use a PriorityQueue, initializing the priority queue to act as a MaxHeap. MaxHeap does not use priorities. To do properly push onto the PriorityQueue, values should be pushed in a way that ignores priority. When popping from the MaxHeap, it will not return a node, just the value that was pushed onto the heap to begin with. Peeking at the heap will also need to return a value instead of a node. Essentially, the MaxHeap is abstracting away (hiding) the priority aspect of PriorityQueue.
Assignment Specifications
Node classes:
PriorityNode (Do Not Modify)
| ---- MinNode (Implement)
| ---- MaxNode (Implement)
class PriorityNode:
Implementation of a Priority Node - a Heap Node with a Priority
DO NOT MODIFY the following attributes/functions
Attributespriority: Node's priority - determines its ordering in the priority queue.
value: Node's value - Data associated with node. Can be anything. Int, float, dict, str, or another object.
__init__(self, Priority: Any, val: Any)Abstract function - you are not able to initialize a PriorityNode. Instead you must initialize the subclasses below (MaxNode and MinNode)
priority: Priority to be stored in the node
value: Data to be stored in the node
__str__(self) -> str and __repr__(self) -> strThis method is inherited by MinNode and MaxNode. Hence, they are use this as their __str__ method.
Represents nodes as a string in the form <priority_of_node, value_data_held_by_node>. Thus, <1, 10> indicates a Min or Max node object with a priority of 1 holding a value of 10 as an integer, whereas <1, None> indicates a node with a priority of 1 holding a value of None.
Note that Python will automatically invoke this function when using printing a Node to the console, and PyCharm will automatically invoke this function when displaying a Node in the debugger.
Call this with str(node) (rather than node.__str__()).
Returns: str
__eq__(self, other: "PriorityNode") -> boolEquality comparator for when priority nodes are equal
other: second node to compare self to
Returns: True if the nodes are equal, False if otherwise
class MaxNode:
Implementation of a priority node for a priority queue max-heap.Nodes with higher priority values are at the top of the priority queue.
Will help to implement MinNode first.
DO NOT MODIFY the following attributes/functions
Attributespriority: Node's priority - determines its ordering in the priority queue.
value: Node's value - Data associated with node. Can be anything. Int, float, dict, str, or another object.
__init__(self, Priority: Any, val: Any)Upcalls to PriorityNode constructor. Constructs a MaxNode.
Object constructor - this function is called when initializing an instance of this class. Create a MaxNode by doing:node = MaxNode(priority, value)
priority: Priority to be stored in the node
value: Data to be stored in the node
IMPLEMENT the following functions
__lt__(self, other: "PriorityNode") -> boolLess than comparator. Determines if self is greater than other node, and thus should be sorted lower than other node in priority queue.
Defines behavior for the less-than operator "<". Compare nodes by doing:node1 < node2
Tip: Because PriorityQueue defaults to a min-heap, this function does the opposite of what is expected from a __lt__ function. Read below:
To implement this properly, it should return the opposite of what is expected from __lt__ / the opposite as MinNode. As in, if self is less than other, this function should return false.
Less means that self has a priority lower in magnitude than other, or if the priorities are the same, self has a lower value than other. If this is the case, return False.
Examples:node <5, 1> is greater than node <2, 2> because priority 5 is greater than priority 2. Thus, return True.
node <1, 5> is greater than node <1, 2> because the priorities are the same and value 5 is greater than value 2. Thus return True.
node <1, 1> is not greater than node <1, 1> because they have equal priorities and values.
other: node to compare self to
Returns: bool - False if self is less than other, True if self is greater than other
Time Complexity: O(1)
Space Complexity: O(1)
__gt__(self, other: "PriorityNode") -> bool
Greater than comparator. Determines if self node is less than other node, and thus should be sorted higher than other node in priority queue.
Defines behavior for the greater-than operator ">". Compare nodes by doing:node1 > node2
Tip: Because PriorityQueue defaults to a min-heap, this function does the opposite of what is expected from a __gt__ function. Read below:
To implement this properly, it should return the opposite of what is expected from __gt__ / the opposite as MinNode. As in, if self is greater than other, this function should return false.Greater means that self has a priority greater in magnitude than other, or if the priorities are the same, self has a greater value than other. If this is the case, return False.
Examples:node <1, 1> is less than node <2, 2> because priority 1 is less than priority 2. Thus, return True.
node <1, 1> is less than node <1, 2> because the priorities are the same and value 1 is less than value 2. Thus return True.
node <1, 1> is not less than node <1, 1> because they have equal priorities and values.
other: node to compare self to
Returns: bool - False if self is greater than other, True if self is less than other
Time Complexity: O(1)
Space Complexity: O(1)
class MinNode:
Implementation of a priority node for a priority queue min-heap.Nodes with lower priority values are at the top of the priority queue.
DO NOT MODIFY the following attributes/functions
Attributespriority: Node's priority - determines its ordering in the priority queue.
value: Node's value - Data associated with node. Can be anything. Int, float, dict, str, or another object.
__init__(self, Priority: Any, val: Any)Upcalls to PriorityNode constructor. Constructs a MinNode.
priority: Priority to be stored in the node
value: Data to be stored in the node
IMPLEMENT the following functions
__lt__(self, other: "PriorityNode") -> bool
Less than comparator. Determines if self node is less than other node, and thus should be sorted higher than other node in priority queue.
Defines behavior for the less-than operator "<". Compare nodes by doing:node1 < node2
Less means that self has a priority lower in magnitude than other, or if the priorities are the same, self has a lower value than other. If this is the case, the function returns true.Examples:node <1, 1> is less than node <2, 2> because priority 1 is less than priority 2. Thus, return True.
node <1, 1> is less than node <1, 2> because the priorities are the same and value 1 is less than value 2. Thus return True.
node <1, 1> is not less than node <1, 1> because they have equal priorities and values.
other: node to compare self to
Returns: bool - True if self is less than other, False is self is greater than other
Time Complexity: O(1)
Space Complexity: O(1)
__gt__(self, other: "PriorityNode") -> boolGreater than comparator. Determines if self node is greater than other node, and thus should be sorted lower than other node in priority queue.
Defines behavior for the greater-than operator ">". Compare nodes by doing:node1 > node2
Greater means that self has a priority higher in magnitude than other, or if the priorities are the same, self has a higher value than other. If this is the case, the function returns true.Examples:node <5, 1> is greater than node <2, 2> because priority 5 is greater than priority 2. Thus, return True.
node <1, 5> is greater than node <1, 2> because the priorities are the same and value 5 is greater than value 2. Thus return True.
node <1, 1> is not greater than node <1, 1> because they have equal priorities and values.
other: node to compare self to
Returns: bool - True if self is greater than other, False if other is less than other
Time Complexity: O(1)
Space Complexity: O(1)
Priority Queue/ Heap Classes:
PriorityQueue (Implement)
MaxHeap (Implement)
| ---- MinHeap (Do Not Modify)
class PriorityQueue:
Implementation of a priority queue - the highest/lowest priority elements are at the front (root). Can act as a min or max-heap.
DO NOT MODIFY the following attributes/functions
Attributes_data: List that holds all nodes on heap
_is_min: Bool - True if priority queue is a min-heap, False if max-heap
__init__(self, is_min: bool = True)Constructs the priority queue
Is_min: If the priority queue acts as a priority min or max-heap.
__str__(self) -> str and __repr__(self) -> str
Represents the priority queue as a string
Call this with str(pqueue) (rather than pqueue.__str__()).
Returns: str - string representation of the heap
Time Complexity: O(N)
Space Complexity: O(N)
to_tree_str(self) -> strRepresents the priority queue as a string in the format of a tree:
Example:
root
/ \
left_child right_child
Returns: str - String to print
Time Complexity: O(N)
Space Complexity: O(N)
is_min_heap(self) -> boolCheck if priority queue is a min or a max-heap
Returns: True if min-heap, False if max-heap
Time Complexity: O(1)
Space Complexity: O(1)
IMPLEMENT the following functions
__len__(self) -> intDetermine the amount of nodes on the heap
Call this with len(pqueue) (rather than pqueue.__len__()).
Returns: The amount of nodes in the priority queue.
Time Complexity: O(1)
Space Complexity: O(1)
empty(self) -> boolPublic method
Checks if the heap is empty
Returns: bool - True if Empty, else False
Time Complexity: O(1)
Space Complexity: O(1)
peek(self) -> PriorityNode
Public method
Gets the root node (min or max node)
Returns: MinNode or MaxNode - None if heap is empty, else root node
Time Complexity: O(1)
Space Complexity: O(1)
get_left_child_index(self, index: int) -> int
Gets the specified parent node's left child index
Index: Index of parent node
Returns: Index of left child or None if it does not exist (invalid)
Time Complexity: O(1)
Space Complexity: O(1)
get_right_child_index(self, index: int) -> intDesc: Gets the specified parent node's right child index
Index: Index of parent node
Returns: Index of right child or None if it does not exist (invalid)
Time Complexity: O(1)
Space Complexity: O(1)
get_parent_index(self, index: int) -> intGets the specified child node's parent index
Index: Index of child node
Returns: Index of parent or None if does not exist (root)
Time Complexity: O(1)
Spacestring representation of the heap Complexity: O(1)
push (self, priority: Any, val: Any) -> NonePublic method
Inserts a node with the specified priority/value pair onto the heap
Hint: Use one of the percolate functions here
Priority: Node's priority
Val: Node's value
Returns: None
Time Complexity: O(log[N])*
Space Complexity: O(1)
pop(self) -> PriorityNode
Public method
Removes the top priority node from heap (min or max element)
Hint: Use one of the percolate functions here
Returns: MinNode or MaxNode - The root node of the heap (min or max element)
Time Complexity: O(log(N))*
Space Complexity: O(1)
get_minmax_child_index(self, index: int) -> intGets the specified parent's min (min-heap) or max (max-heap) child index
Index: Index of parent element
Returns: Index of min child (if min-heap) or max child (if max-heap) or None if invalid
Time Complexity: O(1)
Space Complexity: O(1)
percolate_up(self, index: int) -> NoneMoves a node in the queue/heap up to its correct position (level in the tree).
Index: Index of node to be percolated up
Returns: None
Time Complexity: O(log(N))
Space Complexity: O(1)
percolate_down(self, index: int) -> NoneMoves a node in the queue/heap down to its correct position (level in the tree).
Index: Index of node to be percolated down
Returns: None
Time Complexity: O(log(N))
Space Complexity: O(1)
class MaxHeap:
Implementation of a max-heap - the highest value is at the front (root).
Initializes a PriorityQueue with is_min set to False.
Uses the priority queue to satisfy the min heap properties by initializing the priority queue as a max-heap, and then using value as both the priority and value.
DO NOT MODIFY the following attributes/functions
Attributes_pqueue: PriorityQueue to use as a MaxHeap
__init__(self)
Constructs the max-heap
Upcalls PriorityQueue constructor with _is_min set to False
__str__(self) -> str and __repr__(self) -> strRepresents the max-heap as a string
Call this with str(maxheap) (rather than maxheap.__str__()).
Returns: str - String representation of the heap
Time Complexity: O(N)
Space Complexity: O(N)
to_tree_str(self) -> strRepresents the priority queue as a string in the format of a tree:
Example:
root
/ \
left_child right_child
Returns: str - String to print
Time Complexity: O(N)
Space Complexity: O(N)
__len__(self) -> int
Determine the amount of nodes on the heap
Call this with len(maxheap) (rather than maxheap.__len__()).
Returns: The amount of nodes in the priority queue.
Time Complexity: O(1)
Space Complexity: O(1)
empty(self) -> boolPublic method
Checks if the heap is empty
Returns: bool - True if Empty, else False
Time Complexity: O(1)
Space Complexity: O(1)
IMPLEMENT the following functions
peek(self) -> AnyPublic method
Gets the max element's value (root node's value)
You may not access private member variables of PriorityQueue. Doing so will result in a 0 for this function and any related testcases manual or otherwise.
Returns: None if heap is empty, else root's value
Time Complexity: O(1)
Space Complexity: O(1)
push (self, Val: Any) -> NonePublic method
Inserts a node with the specified value onto the heap
You may not access private member variables of PriorityQueue. Doing so will result in a 0 for this function and any related testcases manual or otherwise.
Val: Node's value
Returns: None
Time Complexity: O(log[N])*
Space Complexity: O(1)
pop(self) -> AnyPublic method
Removes the max element from the heap
You may not access private member variables of PriorityQueue. Doing so will result in a 0 for this function and any related testcases manual or otherwise.
Returns: Value of max element
Time Complexity: O(log(N))*
Space Complexity: O(1)
class MinHeap(MaxHeap):
Implementation of a max-heap - the highest value is at the front (root).
Initializes a PriorityQueue with is_min set to True.
Inherits from MaxHeap because it uses the same exact functions, but instead has a priority queue with a min-heap.
DO NOT MODIFY the following attributes/functions
Attributes_pqueue: PriorityQueue to use as a MinHeap
__init__(self)
Constructs the max-heap
Upcalls PriorityQueue constructor with _is_min set to False
__str__(self) -> str and __repr__(self) -> str
Represents the max-heap as a string
Call this with str(maxheap) (rather than maxheap.__str__()).
Returns: str - String representation of the heap
Time Complexity: O(N)
Space Complexity: O(N)
*Amortized Time Complexity
Heap Sort:
Heap sort is an algorithm that sorts using a Binary Heap data structure. The root node of a heap is guaranteed to be the next smallest element (min-heap) or next largest element (max-heap) which is a useful property for sorting a list.
heap_sort(array: List[Any]) -> None
Sort array in-place using heap sort algorithm w/ max-heap
You are not allowed to create a new list and append values onto it. You must modify the original array (in-place).
CREATING A NEW LIST WILL RESULT IN ZERO POINTS FOR THIS FUNCTION (ITS TEST CASES AND THE MANUAL GRADING)
You may not access private member variables of PriorityQueue, MinHeap, or MaxHeap. Doing so will result result in a 0 points for this function's test case.
Array: List to be sorted
Returns: None
Time Complexity: O(Nlog(N))
Space Complexity: O(N) for MaxHeap only, you are not allowed to create a list.
Application Problem
Angelo wants to determine how project difficulty changes throughout the semester. To do this, he wants to review all student feedback in real-time so he can determine the course's difficulty in realtime, and at any previous point in the semester.
Course difficulty is measured by the median amount of time student's spend on the projects. If students spend more time on a project, the median will be higher, and one can assume the project is more difficult. The median is used because it is less affected by outliers. The amount of time student's spend on a project is recorded every time they submit their README feedback.
You are being tasked with writing this program. You need to calculate and record the median at every point in time - AKA every time a student submits their README feedback. This will allow for Angelo to graph course difficulty over time.
You are given a list of values to read in one by one. Find an efficient way to keep track of the median after each value is read in. Return this as a list of medians in the order calculated.
The median of a list of values is defined as the middle number after sorting them in order. If the there is an odd number of values the median is the average of the middle two numbers.
Example:
current_medians( [2, 8, 35, 9] ) -> [2, 5, 8, 8.5]
As each number from the data was read in, a median of current data was calculated and added to the return list.
After 2 was read in, the median was 2
After 8 was read in, the median was 5 because the median of 2 and 8 is 5
After 35 was read in, the median was 8 because the median of 2, 8, and 35 is 8
And so on....
current_medians(array: List[int]) -> List[int]
Must use PriorityQueue, MinHeap, and/or MaxHeap. Failure to do so will result in a zero for this function.
You may not access private member variables of PriorityQueue, MinHeap, or MaxHeap.
You may not use heap_sort.
Array: A list of numeric values
Returns: List of current medians in order data was read in
Time Complexity: O(Nlog(N))N is the number of values in all lists
Function must strictly be of the given time complexity
Space Complexity: O(N]
Turning It In/ Deliverables
Be sure to upload the following deliverables in a .zip folder to Mimir by 11:59p ET on Friday, April 2nd.
Project7.zip
|— Project7/
|— __init__.py (for proper Mimir testcase loading)
|— README.xml (for project feedback)
|— PriorityQueue.py (your solution source code for pqueues, heaps, application)
|— PriorityNode.py (your solution source code for min and max node)
__init__.py, a python3 file.
PriorityQueue.py, a python3 file.
PriorityNode.py, a python3 file.
README.xml, a text file that includes:Your name
Feedback on the project
How long it took to complete
Project difficulty
A list of any external resources that you used, especially websites (make sure to include the URLs) and which function you used this information for. Please note that you can not use outside resources to solve the entire project, or use outside sources for multiple functions that leads to the solution of the project, then you are submitting someone else's work not yours. Outside sources use should not be used for more than one function, and it should not be more than few lines of code to solve that function. Please note that there is no penalty for using the programming codes shared by Onsay in Lectures and posted on D2L.
Assignment Notes
No use of the module heapq allowed.
You must write docstrings for every completed function.
The only containers allowed are lists to be used in the Application Problem and heap_sort functions.
You may not access member variables with leading underscores outside their class.
Methods/attributes with leading underscores cannot be called outside of the class definition.
Keys and values can be ints, strings, or floats.
String methods are provided for debugging purposes.
Some heaps start at an index of one. However, for this assignment, indexing will start at index zero, just like normal lists.
Using/doing any of the following will result in a 0 for the function/class/assignment:
heapq module
sorted()
Initializing a list in heap_sort (besides the list that is initialized within MaxHeap)
Important Resources
Magic Methods
Inheritance overview:Base classes + inheritance
Extending base class with new methods
Method overriding of base class methods
super():Used to upcall in base classes.
Amortized Time Complexity
Rubric/ Grading
Tests(70)Coding Standard: __/5
MinNode & MaxNode: (2)
Min/Max Nodes: __/2
PriorityQueue: (40)Simple: __/1
Left/Right Child Index: __/1
Parent & Min/Max Child Index: __/1
Percolate Up: __/4
Percolate Down: __/4
Push: __/2
Push Comprehensive: __/6
Pop: __/2
Pop Comprehensive: __/6
Comprehensive: __/8
MaxHeap: (3)MaxHeap: __/3
HeapSort: (5)HeapSort: __/5
Current Medians: (15)
Simple: __/2
Current Medians: __/5
Comprehensive: __/8
Manual (30)
Time Complexity: (15)
__len__, empty, peek __/1 (All or nothing)
get_left_child_index __/0.5
get_right_child_index __/0.5
get_parent_index __/0.5
get_minmax_child_index __/0.5
percolate_up __/1
percolate_down __/1
pop __/2
push __/2
heap_sort __/3
current_medians __/3
Space Complexity: (10)
All non-listed methods* __/2 (All or nothing)
pop __/1
push __/1
heap_sort __/3
current_medians __/3
README.xml is completely filled out with (1) Name, (2) Feedback, (3) Difficulty, (4) Time to Completion, (5) Citations: __/5
* __len__, empty, peek, get_left_child_index, get_right_child_index, get_parent_index, get_minmax_child_index.