Starting from:

$29

Data Structures Programming Assignment 3


CENG 242
Data Structures

Programming Assignment 3

1 Introduction
In this assignment, you are going to practice on hash table and graph implementations in two stages. First, you are asked to
implement a generic hash table along with its accompanying hash function using C++. Notice that the runtime of insertion,
deletion, and retrieval operations are expected to be constant time for hash tables. Afterwards, you need to develop a
program that allows users to do some operations on the world metal trade dataset in 1994 which is provided on Cengclass.
For this purpose, you will first construct a weighted, directed graph from the dataset and then implement the given methods
for users to be able to analyze the dataset.
Keywords: Hash table, Quadratic probing, Directed graphs, C++
2 World Metal Trade Dataset
The dataset contains information on trade miscellaneous manufactures of metal among 80 countries in 1994. All countries
with entries in the paper version of the Commodity Trade Statistics published by the United Nations were included, but
for some countries, the 1993 data (Austria, Seychelles, Bangladesh, Croatia, and Barbados) or 1995 data (South Africa and
Ecuador) were used because they were not available for 1994. You can read more about the dataset at:
http://vlado.fmf.uni-lj.si/pub/networks/data/esna/metalWT.htm
The nodes represent countries with its additional information such as the country id, country name, continent, and gross
domestic product (gdp). The edges represent imports by one country from another for the class of commodities designated
as ’miscellaneous manufactures of metal’, which represents high technology products or heavy manufacture. The absolute
value of imports (in 1000 US$) is used but imports with values less than 1% of the country’s total imports were omitted.
There is no self loop within the dataset. Each field is separated by tab character in the dataset. The file structure is given
below:
∗ Vertices 80 // he ade r f o r the v e r t i c e s , 80 i n d i c a t e s the number o f v e r t i c e s i n the d a t a s e t
1 Algeria Africa 1209 // 1 i s the c oun t ry id , ” Al g e ri a ” i s the c oun t r y name , ” A f ri c a ” i s the ←-
c o n ti n e n t o f Al g e ri a , ”1209” i s the gdp pe r c a pi t a f o r Al g e ri a
.
.
.
∗ Edges // he ade r f o r the ed ge s
78 25 556 // t h e r e i s a d i r e c t e d edge o r i g i n a t e d from 78 towards 25 where both 78 and 25 a r e ←-
the c oun t ry i d s . 556 i s the edge wei gh t t h a t r e p r e s e n t s the metal import v al u e
. . .
In this assignment, you are supposed to use the following class definitions to create nodes and edges. Note that, node
and edge objects are created in the main function while parsing the dataset, so you are just expected to use the generated
objects in your hash table and graph implementations. You are not allowed to modify these classes.
c l a s s Node {
p u bli c :
// C o n s t r u c t o r s
Node ( ) : country ( ”” ) , continent ( ”” ) {}
1
Node ( i n t vid_ , c o n s t std : : string &country_ , c o n s t std : : string &continent_ , l o n g gdp_ ) :
vid ( vid_ ) , country ( country_ ) , continent ( continent_ ) , gdp ( gdp_ ) {}
// G e t t e r s & s e t t e r s
i n t getVid ( ) c o n s t { r e t u r n vid ; }
v oid setVid ( i n t vid ) { Node : : vid = vid ; }
c o n s t std : : string &getCountry ( ) c o n s t { r e t u r n country ; }
v oid setCountry ( c o n s t std : : string &country_ ) { country = country_ ; }
c o n s t std : : string &getContinent ( ) c o n s t { r e t u r n continent ; }
v oid setContinent ( c o n s t std : : string &continent_ ) { continent = continent_ ; }
l o n g getGdp ( ) c o n s t { r e t u r n gdp ; }
v oid setGdp ( l o n g gdp_ ) { gdp = gdp_ ; }
p r i v a t e :
i n t vid ; // c oun t ry i d
std : : string country ; // c oun t ry name
std : : string continent ; // c o n ti n e n t o f the c oun t r y
l o n g gdp ; // g r o s s d ome s tic p r oduc t pe r c a pi t a
} ;
c l a s s Edge {
p u bli c :
// C o n s t r u c t o r s
Edge ( ) {}
Edge ( c o n s t Node &tailNode , l o n g import_ ) : tailNode ( tailNode ) , import ( import_ ) {}
// G e t t e r s & s e t t e r s
c o n s t Node &getTailNode ( ) c o n s t { r e t u r n tailNode ; }
v oid setTailNode ( c o n s t Node &tailNode_ ) { tailNode = tailNode_ ; }
l o n g getImport ( ) c o n s t { r e t u r n import ; }
v oid setImport ( l o n g import_ ) { import = import_ ; }
p r i v a t e :
Node tailNode ; // the c oun t ry t h a t imp o r t s the metal
l o n g import ; // import v al u e ( i n 1000 US$)
} ;
3 Part 1: Hash Table Implementation (30 POINTS)
In the first step, you will design a generic hash table that should be able to perform the following operations:
• Inserting an item with a key into the hash table
• Deleting an item with a key from the hash table
• Getting an item with a key from the hash table
3.1 Specifications
• In this hash table, for every possible place in the table, you are supposed to store at most 2 values, called a bucket. If
the bucket for a key is used, then your hash table should use the standard open addressing (quadratic probing) to find
the next bucket for that key. Note that you should perform quadratic probing only for inter-bucket traversal. Since a
bucket consists of only two entries, you do not need such operation within the bucket.
2
Example
There is a sample hash table implementation given above. For simplicity, we assume that the hash function is h(key) =
key % 10. Therefore, (13, ”Afyon”), (23, ”Bitlis”) and (33, ”Mersin”) should be located in the same bucket. However,
since the bucket 3 is full, (33, ”Mersin”) tuple is located in Bucket 4 using quadratic probing.
• You are given a utility class, ”HashUtils” which contains two hash functions for the keys of type string and int
respectively, and a function for finding the next appropriate size for the hash table. It should not be modified in any
case.
// Returns the hash v al u e o f the gi v e n key
i n t Hash ( c o n s t std : : string& key ) ;
// Returns the hash v al u e o f the gi v e n key
i n t Hash ( i n t key ) ;
// Finds the next a p p r o p ri a t e hash t a bl e s i z e from a gi v e n number
i n t NextCapacity ( i n t prev ) ;
• You will implement the functions (marked with ”TODO”) in the ”HashTable.h”, and must use open addressing with
quadratic probing as a collision resolution strategy. The bare header file, ”HashTable.h”, is given below. You are
free to add any private helper variables/functions to it. However, your hash table should support at least the given
functions.
c l a s s ItemNotFoundException : p u bli c std : : exception {
p u bli c :
c o n s t ch a r ∗ what ( ) c o n s t throw ( ) { r e t u r n ”Item Not Found i n the Hash Table ! ” ; }
} ;
templ a te <c l a s s K , c l a s s T
c l a s s HashTable {
s t r u c t Entry {
K Key ; // the key o f the e n t r y
T Value ; // the v al u e o f the e n t r y
b o ol Deleted ; // f l a g i n d i c a t i n g whether t h i s e n t r y i s d el e t e d
b o ol Active ; // f l a g i n d i c a t i n g whether t h i s item i s c u r r e n t l y used
Entry ( ) : Key ( ) , Value ( ) , Deleted ( f a l s e ) , Active ( f a l s e ) {}
} ;
s t r u c t Bucket {
Entry entries [ 2 ] ;
} ;
i n t _capacity ; // INDICATES THE TOTAL CAPACITY OF THE TABLE
i n t _size ; // INDICATES THE NUMBER OF ITEMS IN THE TABLE
Bucket ∗ _table ; // THE HASH TABLE
3
// == DEFINE HELPER METHODS & VARIABLES BELOW ==
p u bli c :
// TODO: IMPLEMENT THESE FUNCTIONS.
// CONSTRUCTORS, ASSIGNMENT OPERATOR, AND THE DESTRUCTOR
HashTable ( ) ;
// COPY THE WHOLE CONTENT OF RHS INCLUDING THE KEYS THAT WERE SET AS DELETED
HashTable ( c o n s t HashTable<K , T& rhs ) ;
HashTable<K , T& o p e r a t o r =( c o n s t HashTable<K , T& rhs ) ;
˜ HashTable ( ) ;
// TODO: IMPLEMENT THIS FUNCTION.
// INSERT THE ENTRY IN THE HASH TABLE WITH THE GIVEN KEY & VALUE
// IF THE GIVEN KEY ALREADY EXISTS , THE NEW VALUE OVERWRITES
// THE ALREADY EXISTING ONE. IF THE LOAD FACTOR OF THE TABLE IS GREATER THAN 0 . 6 ,
// RESIZE THE TABLE WITH THE NEXT PRIME NUMBER.
v oid Insert ( c o n s t K& key , c o n s t T& value ) ;
// TODO: IMPLEMENT THIS FUNCTION.
// DELETE THE ENTRY WITH THE GIVEN KEY FROM THE TABLE
// IF THE GIVEN KEY DOES NOT EXIST IN THE TABLE, THROW ItemNotFoundException ( )
v oid Delete ( c o n s t K& key ) ;
// TODO: IMPLEMENT THIS FUNCTION.
// IT SHOULD RETURN THE VALUE THAT CORRESPONDS TO THE GIVEN KEY.
// IF THE KEY DOES NOT EXIST, THROW ItemNotFoundException ( )
T& Get ( c o n s t K& key ) c o n s t ;
// TODO: IMPLEMENT THIS FUNCTION.
// AFTER THIS FUNCTION IS EXECUTED THE TABLE CAPACITY MUST BE
// EQUAL TO newCapacity AND ALL THE EXISTING ITEMS MUST BE REHASHED
// ACCORDING TO THIS NEW CAPACITY.
// WHEN CHANGING THE SIZE , YOU MUST REHASH ALL OF THE ENTRIES
v oid Resize ( i n t newCapacity ) ;
// THE IMPLEMENTATION OF THESE FUNCTIONS ARE GIVEN TO YOU
// DO NOT MODIFY!
i n t Capacity ( ) c o n s t ; // RETURN THE TOTAL CAPACITY OF THE TABLE
i n t Size ( ) c o n s t ; // // RETURN THE NUMBER OF ACTIVE ITEMS
v oid getKeys ( K∗ keys ) ; // PUTS THE ACTIVE KEYS TO THE GIVEN INPUT PARAMETER
} ;
• You can assume that country names and country ids for each record will be unique.
• You have to call hash method (with either string or integer keys) in the ”HashUtils.h” to calculate the
hash value of an item.
• ”NextCapacity” function takes an unsigned integer and returns a bigger prime number. You have to
call this method while resizing the table.
• While calculating the load factor, you should use bucket size (2) ∗ capacity as the total number of slots in the hash
table.
• In this part, you are not allowed to use vector or any other standard libraries except the ones that are included.
3.2 Entry Struct
• Each ”Entry” object stores information regarding a single key-value pair in the hash table.
• ”Key” variable should store the original key variable given in the ”Insert” method of the entry
• ”Value” variables stores the value given in the ”Insert” method.
• ”Active” variable denotes that the key-value pair stored in this entry is valid. For example, 0th entry of the 0th bucket
in the example above is not valid, however 0th entries of the 1st, 2nd, 3rd and 4th bucket are valid and their ”Active”
fields are set as true. Initially, ”Active” variable of each ”Entry” object is set to false.
• ”Deleted” variable stores whether this entry has been deleted before. Initially, every ”Entry” object has this variable
as false.
4
4 Part 2: Graph Implementation (70 POINTS)
In the second phase, you will implement the ”TODO” functions in ”Graph.cpp”. In the graph class, each vertex represents
a country and each directed edge represents imports by one country from another. The edge should be directed towards the
country that imports the commodity. Each node has a country id, country name, continent, and gdp. Each edge has an
import value in the US dollar which you should consider as the edge weight. The adjacency list of the graph should actually
be stored in a hash map as follows:
HashT able < string, list < Edge adjList
You can assume that country id and country name fields are unique. Therefore you can also select country id as the key.
In other words, you can change the types of key-value pairs of ”adjList”, it depends on your design. You can declare more
than one hash tables, to store different key-value pairs which may facilitate your work. In graph class, you are expected
to implement constructor, copy constructor, assignment operator, destructor and 8 more functions whose specifications are
given below:
• void addNode(const Node& node);
addNode function gets a node as a parameter and then adds this node to the hash table, where country name (or
country id) is the key and an empty Edge list (list<Edge) is the value.
• void addConnection(const Node& headNode, const Node& tailNode, int import);
addConnection function adds a new Edge to the end of list of headNode using tailNode and import data. You must add
the new edge to the back of headNode’s list<Edge. For example, lets create 3 nodes for Algeria, Tunisia and Turkey
respectively. Then, each node is added to the graph using addNode function. At this stage, their adjacency lists are
empty. Finally, we execute the last two statements and insert new edges from Turkey to Tunisia and to Algeria.
The table representation for Turkey is given below. Since the relationship between Turkey and Tunisia is created first,
it appears in front.
• list<Node getAdjacentNodes(const Node& node);
getAdjacentNodes function gets a node as a parameter and returns all adjacent nodes of the given node as a list of
Node. You should only consider the given node’s adjacency list. Therefore, only consider the edges that are originated
from the given node. If the input parameter does not exist in the graph, throw ItemNotFoundException().
• long getTotalImports(const Node& node);
getTotalImports function gets a node as a parameter and returns the total imports from the given node. In other
words, it returns the sum of the values of the edges that are originated from the given node.
• void deleteNode(const Node& node);
deleteNode function deletes the given node from the graph together with its incident connections (edges). You may
use ”getKeys” function of ”HashTable” to retrieve all active keys and delete the given node from their adjacency lists.
• list<string findLeastCostPath(const Node& srcNode, const Node& destNode);
findLeastCostPath function gets two nodes as parameter, calculates the least cost path from srcNode to destNode
and returns the calculated path as a list of country names from srcNode to destNode. For this function, you should
consider import values as the edge cost. If there are more than one shortest path between these nodes, it is enough
to return only one of them. You may implement Dijkstra’s weighted shortest path algorithm using priority queue of
C++ Standard Template Library (STL). For instance the leastCostPath from Honduras to Guatemala is
Honduras → El Salvador → Panama → Guatemala
5
• bool isCyclic();
If the graph contains any cycles return true, otherwise return false. A cycle is a path v1, v2, . . . , vk−1, vk in which
v1 = vk and k 2. For instance, in the given figure below, 0,1,2,3 and 2,4 are the graph cycles. Therefore, for this
graph, isCyclic function must return true. You can use either DFS or BFS to traverse and you should mark which
nodes you have visited so far. You may use additional data structures such as queue, stack, vector of stl library that
are included in the Graph.h.
• list<string getBFSPath(const Node& srcNode, const Node& destNode);
getBFSPath function takes srcNode and destNode as parameters and returns the BFS path from srcNode to destNode
as a list of country names. You may use queue of stl library to find the path. Note that in order to find correct result,
in addConnection function, you should insert new edges to the end of the list. For example, lets find the BFS path
from Turkey to Honduras. The adjacency list of Turkey is as follows:
[Tunisia, Algeria, Greece, Romania, Jordan]
The order of the adjacency list is preserved by the order of insertion. The only connection of Tunisia is Morocco and
Algeria has no connections. Next, we need to check Greece’s incident edges. Greece’s adjaceny list is as follows:
[Cyprus, Honduras, Romania, Jordan]
Therefore, the output of getBFSPath should look like (in this order):
Turkey, Tunisia, Algeria, Greece, Romania, Jordan, Morocco, Cyprus, Honduras
5 Regulations
1. You will use C++.
2. You are not allowed to use map or any other standard libraries except the included ones.
3. Use the classes that are currently included; do not include any class (other than those specified) from the C++ standard
library
4. You are not allowed to modify already implemented functions and variables.
5. You can define private methods and variables if necessary.
6. If your submission fails to follow the specifications or does not compile, there will be a ”significant” penalty of grade
reduction.
7. In HashTable.h, those who do the operations (insert, delete, resize, get) without utilizing the hashing, there will be a
”significant” penalty of grade reduction.
8. In Graph class, those who do not store adjacency list as HashTable, there will be a ”significant” penalty of grade
reduction.
9. Those who modify already implemented functions and those who insert other data variables or public functions and
those who change the prototype of given functions will be penalized as well.

6 Submission
1. Submission will be done via Moodle (cengclass.ceng.metu.edu.tr).
2. Do not submit a main function in any of your source files.
3. A test environment will be ready in Moodle.
• You can submit your source files to Moodle and test your work with a subset of evaluation inputs and outputs.
• Another set of test cases will be used for evaluation of your final grade. So, your actual grades may be different
than the ones you get in Moodle.
• Only the last submission before the deadline will be graded.
6

More products