$35
Project 5: Graphs
Assignment Overview
Graphs are particularly useful data structures for modeling connections and relationships among objects. In fact, you've likely used an application that relies on graphical modeling today - a few examples include
Facebook / Twitter / InstagramUsers are modeled as vertices storing posts, photos, videos, etc.
Edges are modeled as "friendships," "followers," "likes," "favorites," etc.
Google / Apple / Bing MapsIntersections, cities, and other points of interest are modeled by vertices
Road segments between intersections are modeled by weighted edges, where weights represent the relative speed/traffic of the road segment
You will be implementing a directed, weighted Graph ADT using the adjacency map design, in which a graph object consists of a map (ordered dictionary) of vertices, and each vertex holds its own map (ordered dictionary) of adjacent vertices, i.e. vertices to which that vertex is connected by an outgoing edge.
In some ways, this project also serves as a capstone to the course- in completing it one utilizes recursion, queues, two-dimensional arrays, hash maps, dynamic programming, and more. You may also notice that trees, linked lists, and even heaps are special cases of the general graph structure and that many graph algorithms can be applied to these earlier structures without modification. To highlight this inheritance, consider the inorder traversal algorithm we applied to AVL trees - really, it was nothing more than a graph depth-first search with a tendency to go left before right.
The goal of this project is to introduce the versatile and flexible nature of graphs, along with the operations and search algorithms that make them so useful.
Assignment Notes
A plotting function is provided to help you visualize the progression of various search algorithmsBe sure to read the specs explaining plot()
If you don't want to use it, just comment out the related import statements and plot() function
Python allows representation of the value infinity using float('inf')
No negative edge weights will ever be added to the graphAll edge weights are numeric values greater than or equal to zero
Time complexities are specified in terms of V and E, where V represents the number of vertices in the graph and E represents the number of edges in a graphRecall that E is bounded above by V^2; a graph has E = V^2 edges if and only if every vertex is connected to every other vertex
Recall that list.insert(0, element) and list.pop(0) are both O(N) calls on a Python listRecall that python's 'lists' are not lists in the more common sense of the word: linked lists. They are dynamically managed tuples, stored in memory as contiguous arrays of pointers to elements elsewhere in memory. This allows indexing into a 'list' in constant time. The downside of this, however, is that adding to a python 'list' at a specific index, i, requires shifting the pointer to every element past i by one in the underlying array: a linear operation.
Be careful when implementing bfs, dfs, and the Application Problem to ensure you do not break time complexity by popping or inserting from the front of a list when reconstructing a path
Instead of inserting into / popping from the front of the list, simply append to or pop from the end, then reverse the list once at the endIf you have N calls to list.insert(0, element), that is O(N^2)
If you instead have N calls to list.append(element), followed by a single call to list.reverse(), that is O(N)
Both methods will result in the same list being constructed, but the second is far more efficient
Assignment Specifications
class Vertex:
Represents a vertex object, the building block of a graph.
DO NOT MODIFY the following attributes/functions
Attributesid: A string used to uniquely identify a vertex
adj: A dictionary of type {other_id : number} which represents the connections of a vertex to other vertices; the existence of an entry with key other_id indicates connection from this vertex to the vertex with id other_id by an edge with weight numberNote that as of Python 3.7, insertion ordering in normal dictionaries is guaranteed and ensures traversals will select the next neighbor to visit deterministically
visited: A boolean flag used in search algorithms to indicate that the vertex has been visited
x: The x-position of a vertex (used in Application Problem) (defaults to zero)
y: The y-position of a vertex (used in Application Problem) (defaults to zero)
__init__(self, idx: str, x: float=0, y: float=0) -> None:
Constructs a Vertex object
__eq__(self, other: Vertex) -> bool:Compares this vertex for equality with another vertex
__repr__(self) -> str:Represents the vertex as a string for debugging
__str__(self) -> str:Represents the vertex as a string for debugging
__hash__(self) -> int:Allows the vertex to be hashed into a set; used in unit tests
IMPLEMENT the following functions
degree(self) -> int:Returns the number of outgoing edges from this vertex; i.e., the outgoing degree of this vertex
Time Complexity: O(1)
Space Complexity: O(1)
get_edges(self) -> Set[Tuple[str, float]]:Returns a set of tuples representing outgoing edges from this vertex
Edges are represented as tuples (other_id, weight) where other_id is the unique string id of the destination vertex
weight is the weight of the edge connecting this vertex to the other vertex
Returns an empty set if this vertex has no outgoing edges
Time Complexity: O(degV)
Space Complexity: O(degV)
euclidean_distance(self, other: Vertex) -> float:Returns the euclidean distance [based on two-dimensional coordinates] between this vertex and vertex other
Used in Application problem
Time Complexity: O(1)
Space Complexity: O(1)
taxicab_distance(self, other: Vertex) -> float:Returns the taxicab distance [based on two-dimensional coordinates] between this vertex and vertex other
Used in Application problem
Time Complexity: O(1)
Space Complexity: O(1)
class Graph:
Represents a graph object
DO NOT MODIFY the following attributes/functions
Attributessize: The number of vertices in the graph
vertices: A dictionary of type {id : Vertex} storing the vertices of the graph, where id represents the unique string id of a Vertex objectNote that as of Python 3.7, insertion ordering in normal dictionaries is guaranteed and ensures get_edges(self) and get_vertices(self) will return deterministically ordered lists
plot_show: If true, calls to plot() display a rendering of the graph in matplotlib; if false, all calls to plot() are ignored (see plot() below)
plot_delay: Length of delay in plot() (see plot() below)
__init__(self, plt_show: bool=False) -> None:
Constructs a Graph object
Sets self.plot_show to False by default
__eq__(self, other: Graph) -> bool:Compares this graph for equality with another graph
__repr__(self) -> str:Represents the graph as a string for debugging
__str__(self) -> str:Represents the graph as a string for debugging
add_to_graph(self, start_id: str, dest_id: str=None, weight: float=0) -> float:Adds a vertex / vertices / edge to the graphAdds a vertex with id start_id to the graph if no such vertex exists
Adds a vertex with id dest_id to the graph if no such vertex exists and dest_id is not None
Adds an edge with weight weight if dest_id is not None
If a vertex with id start_id or dest_id already exists in the graph, this function will not overwrite that vertex with a new one
If an edge already exists from vertex with id start_id to vertex with id dest_id, this function will overwrite the weight of that edge
matrix2graph(self, matrix: Matrix) -> None:Constructs a graph from a given adjacency matrix representation
matrix is guaranteed to be a square 2D list (i.e. list of lists where # rows = # columns), of size [V+1] x [V+1]matrix[0][0] is None
The first row and first column of matrix hold string ids of vertices to be added to the graph and are symmetric, i.e. matrix[i][0] = matrix[0][i] for i = 1, ..., n
matrix[i][j] is None if no edge exists from the vertex matrix[i][0] to the vertex matrix[0][j]
matrix[i][j] is a number if an edge exists from the vertex matrix[i][0] to the vertex matrix[0][j] with weight number
graph2matrix(self) -> None:Constructs and returns an adjacency matrix from a graph
The output matches the format of matrices described in matrix2graph
If the graph is empty, returns None
graph2csv(self, filepath: str) -> None:Encodes the graph (if non-empty) in a csv file at the given location
USE the following function however you'd like
plot(self) -> None:Renders a visual representation of the graph using matplotlib and displays graphic in PyCharmFollow this tutorial to install matplotlib and numpy if you do not have them, or follow the tooltip auto-suggested by PyCharm
Provided for use in debugging
If you call this in your searches and self.plot_show is true, the search process will be animated in successive plot renderings (with time between frames controlled by self.plot_delay)
Not tested in any testcasesAll testcase graphs are constructed with self.plot_show set to False
If vertices have (x,y) coordinates specified, they will be plotted at those locations
If vertices do not have (x,y) coordinates specified, they will be plotted at a random point on the unit circle
To install the necessary packages (matplotlib and numpy), follow the auto-suggestions provided by PyCharm
Vertices and edges are labeled; edges are color-coded by weightIf a bi-directional edge exists between vertices, two color-coded weights will be displayed
IMPLEMENT the following functions
reset_vertices(self) -> None:Resets visited flags of all vertices within the graph
Used in unit tests to reset graph between searches
Time Complexity: O(V)
Space Complexity: O(V)
get_vertex(self, vertex_id: str) -> Vertex:
Returns the Vertex object with id vertex_id if it exists in the graph
Returns None if no vertex with unique id vertex_id exists
Time Complexity: O(1)
Space Complexity: O(1)
get_vertices(self) -> Set[Vertex]:
Returns a set of all Vertex objects held in the graph
Returns an empty set if no vertices are held in the graph
Time Complexity: O(V)
Space Complexity: O(V)
get_edge(self, start_id: str, dest_id: str) -> Tuple[str, str, float]:
Returns the edge connecting the vertex with id start_id to the vertex with id dest_id in a tuple of the form (start_id, dest_id, weight)
If the edge or either of the associated vertices does not exist in the graph, returns None
Time Complexity: O(1)
Space Complexity: O(1)
get_edges(self) -> Set[Tuple[str, str, float]]:Returns a set of tuples representing all edges within the graph
Edges are represented as tuples (start_id, other_id, weight) where start_id is the unique string id of the starting vertex
other_id is the unique string id of the destination vertex
weight is the weight of the edge connecting the starting vertex to the destination vertex
Returns an empty set if the graph is empty
Time Complexity: O(V+E)
Space Complexity: O(E)
bfs(self, start_id: str, target_id: str) -> Tuple[List[str], float]:Perform a breadth-first search beginning at vertex with id start_id and terminating at vertex with id end_id
As you explore from each vertex, iterate over neighbors using vertex.adj (not vertex.get_edges()) to ensure neighbors are visited in proper order
Returns tuple of the form ([path], distance) where[path] is a list of vertex id strings beginning with start_id, terminating with end_id, and including the ids of all intermediate vertices connecting the two
distance is the sum of the weights of the edges along the [path] traveled
If no path exists from vertex with id start_id to vertex with end_id or if one of the vertices does not exist, returns tuple ([],0)
Guaranteed that start_id != target_id (since that would be a trivial path)
Because our adjacency maps use insertion ordering, neighbors will be visited in a deterministic order, and thus you do not need to worry about the order in which you visit neighbor vertices at the same depth
Use the SimpleQueue class to guarantee O(1) pushes and pops on the queue
Time Complexity: O(V+E)
Space Complexity: O(V)
dfs(self, start_id: str, target_id: str) -> Tuple[List[str], float]:Wrapper function for dfs_inner, which MUST BE CALLED within this functionThe majority of the work of dfs should be done in dfs_inner
This function makes it simpler for client code to call for a dfs
This function makes it possible to avoid inserting vertex ids at the front of the path list on path reconstruction, which leads to sub-optimal performance (see Assignment Notes)Hint: construct the path in reverse order in dfs_inner, then reverse the path in this function to optimize time complexity
Hint: call dfs_inner with current_id as start_id, then reverse the path here and return it
Time Complexity: O(V+E) (including calls to dfs_inner)
Space Complexity: O(V) (including calls to dfs_inner)
dfs_inner(self, current_id: str, target_id: str, path: List[str]=[]) -> Tuple[List[str], float]Performs the recursive work of depth-first search by searching for a path from vertex with id current_id to vertex with id target_id
MUST BE RECURSIVE
As you explore from each vertex, iterate over neighbors using vertex.adj (not vertex.get_edges()) to ensure neighbors are visited in proper order
Returns tuple of the form ([path], distance) where[path] is a list of vertex id strings beginning with start_id, terminating with end_id, and including the ids of all intermediate vertices connecting the two
distance is the sum of the weights of the edges along the [path] traveled
If no path exists from vertex with id current_id to vertex with target_id or if one of the vertices does not exist, returns tuple ([],0)
Guaranteed that start_id != target_id (since that would be a trivial path)
Because our adjacency maps use insertion ordering, neighbors will be visited in a deterministic order, and thus you do not need to worry about the order in which you visit neighbor vertices
Time Complexity: O(V+E)
Space Complexity: O(V)
detect_cycle(self) -> bool:returns True if the graph contains a cycle, otherwise returns False
Be careful with time and space complexity, they are easy to violate here
The testcase should give a good idea of what constitutes a cycle and what does not
Time Complexity: O(V+E)
Space Complexity: O(V)
Application Problem
Welcome to Among Us, the wildly popular 2018 online multiplayer social deduction game! If you aren't familiar with how Among Us is played, see here. After gathering several friends, you decide to join a lobby and start a game of Among Us!
It looks like you're a Crewmate (and always will be for the sake of this assignment)! Off to go do your tasks!
Oh, my! After completing a task, you just saw the Pink Crewmate murder somebody! They must be the impostor! You want to report the body, but Pink is too close to it and you don't want to end up like your late friend there! All hope is lost until you remember that there is an emergency button in one of the rooms on the spaceship that would also allow you to call an emergency meeting. If you are able to call an emergency meeting, then surely you would be able to convince the crew that Pink is "sus" and have them voted off for the victory!
Uh oh, Pink sees you and knows you will attempt to push the emergency button and is on their way to the button right now! You can't let them get there before you or else you won't be able to push the button! Unfortunately, you are quite new to the game and are very bad with navigating the ship; in addition, it seems that the room containing the emergency button is different every time it is pressed! However, you do have a graph representing the layout of the ship (which also tells you the current room with the emergency button) and one all-powerful ace in the hole: your incredible knowledge of path-finding algorithms!
The objective is simple: you must find the shortest path between the room you are currently in (where you witnessed a murder) and the current room with the emergency button (it is not always the same room). Everyone knows that the impostor can use vents to traverse the ship much faster than you, so your path-finding algorithm needs to be as fast as possible! Depth-first and breadth-first search won't cut it. In fact, even Dijkstra's algorithm falls short of your high-performance standards.
Your only option is to implement A* Search (A-Star Search), an algorithm that accounts for both straight-line and along-edge distance to find the shortest path in fewer iterations. Unlike Dijkstra's algorithm, A* will tend to avoid searching vertices that are close but in the wrong direction (gif from this video). In fact, you're 100% sure that if you do not use A* Search, the impostor will beat you to the emergency button leading to your demise.
What is A* Search?
Instead of searching the "next closest" vertex as is done in Dijkstra's algorithm, A* Search picks the vertex which is "next closest to the goal" by weighting vertices more cleverly.
Recall that in Dijkstra's algorithm, vertices are stored in a priority queue with a priority key equal to the current shortest path to that vertex. If we denote the current shortest path to a vertex v by g(v), then on each iteration of Dijkstra's algorithm, we search on the vertex with min(g(v)).
A* search takes the same approach to selecting the next vertex, but instead of setting the priority key of a vertex equal to g(v) and selecting min(g(v)), it uses the value f(v) and selects the vertex with min(f(v)) where
f(v) = g(v) + h(v)
= current_shortest_path_to_v + estimated_distance_between_v_and_target
In English, Please...
A* Search prioritizes vertices v to search based on the value f(v), which is the sum of
g(v), or the current shortest known path to vertex v, and
h(v), which is the estimated (Euclidean or Taxicab) distance between the vertex v and the target vertex you're searching for
The result is that A* prioritizes vertices to search that (1) are close to the origin along a known path AND which (2) are in the right direction towards the target. Vertices with a small g(v) are close to the origin along a known path and vertices with a small h(v) are in the right direction towards the target, so we pick vertices with the smallest sum of these two values.
We strongly recommend you watch this video to build your intuition behind A* Search!
[A* is extremely versatile. Here we use Euclidean and Taxicab distances to prioritize certain directions of search, but note that any metric h(v, target) could be used should the need arise. See here for more information on situations where different metrics may be practical.]
Your Task
Implement A* Search on your Graph ADT according to the following specifications
a_star(self, start_id, target_id, metric)Perform an A* search beginning at vertex with id start_id and terminating at vertex with id end_id
As you explore from each vertex, iterate over neighbors using vertex.adj (not vertex.get_edges()) to ensure neighbors are visited in proper order
Use the metric parameter to estimate h(v), the remaining distance, at each vertexThis is a callable parameter and will always be Vertex.euclidean_distance or Vertex.taxicab_distance
Returns tuple of the form ([path], distance) where[path] is a list of vertex id strings beginning with start_id, terminating with end_id, and including the ids of all intermediate vertices connecting the two
distance is the sum of the weights of the edges along the [path] traveled
If no path exists from vertex with id start_id to vertex with end_id or if one of the vertices does not exist, returns tuple ([],0)
Guaranteed that start_id != target_id (since that would be a trivial path)
Recall that vertices are prioritized in increasing order of f(v) = g(v) + h(v), whereg(v) is the shortest known path to this vertex
h(v) is the Euclidean distance from v to the target vertex
Use the given AStarPriorityQueue class to simplify priority key updates in a search priority queue
Implementations of this function which do not utilize the heuristic metric will not receive any credit, visible and hidden testcases included.Do not simply implement Dijkstra's Algorithm
Time Complexity: O(V+E)
Space Complexity: O(V)
To simplify the operation of updating a priority key in your search priority queue, use the following given class.
class AStarPriorityQueue:
DO NOT MODIFY the following attributes/functions
Attributesdata: Underlying data list of priority queue
locator: Dictionary to locate vertices within the priority queue
counter: Used to break ties between vertices
__init__(self)
Constructs an AStarPriorityQueue object
__repr__(self)Represents the priority queue as a string for debugging
__str__(self)Represents the priority queue as a string for debugging
empty(self)Returns boolean indicating whether priority queue is empty
push(self, priority, vertex)
Push the vertex object onto the priority queue with a given priority key
This priority queue has been specially designed to hold Vertex objects as values ranked by priority keys; be sure you push Vertex objects and NOT vertex id strings onto the queue
pop(self)
Visit, remove and return the Vertex object with the highest priority (i.e. lowest priority key) as a (priority, vertex) tuple
update(self, new_priority, vertex)Update the priority of the vertex object in the queue to have a new_priority
Example
To test your algorithm's performance, you use it to find optimal paths between locations on Among Us's most popular map: The Skeld. Below is a map of The Skeld,
and below is a modified and simplified version of The Skeld represented as a graph. Fun fact: the graph below was made with the plot() function we provide you (see above). The yellow points represent visited vertices from the top example testcase below. Be sure to initialize your Graph object with "plt_show=True" before calling plot().
a_star('Reactor', 'Cafeteria') would return(['Reactor', 'A', 'B', 'G', 'J', 'M', 'Cafeteria'], 22)
This finds the same optimal path as Dijkstra
a_star('Reactor', 'Communications') would return(['Reactor', 'A', 'B', 'G', 'H', 'D', 'E', 'Communications'], 35)
This finds the optimal path (note that BFS would find a sub-optimal path of path length 37)
a_star('Cafeteria', 'Med Bay') would return(['Cafeteria', 'M', 'J', 'K', 'Med Bay'], 8)
Although two equally-optimal paths exist, A* chooses the one that's closer to the straight line connecting the Cafeteria to the Med Bay
Note: maps/graphs may not be accurate to actual Among Us maps.
Submission:
Deliverables:
Be sure to upload the following deliverables in a .zip folder to Mimir by 8:00 pm EST, on Thursday, December 2nd.
Your .zip folder can contain other files (for example, description.md and tests.py), but must include (at least) the following:
|- Project5.zip
|— Project5/
|— README.xml (for project feedback)
|— __init__.py (for proper Mimir testcase loading)
|— Graph.py (contains your solution source code)
Grading
Tests (70)
Manual (30)Time & Space Complexity (30) M2 - degree, get_edges, distances, reset_vertices __/3
M3 - get_vertex, get_vertices, get_edge, get_edges __/3
M4 - bfs __/6
M5 - dfs __/6
M6 - detect_cycle __/6
M7 - a_star __/6
Extra:
Get to that emergency button before the impostor! You can do it!