$30
INF 551
Homework #4
100 points
1. [40 points] Consider the following B+tree for the search key “age”. Suppose the degree d of the tree
= 2, that is, each node (except for root) must have at least two keys and at most 4 keys.
a. Describe the process of finding keys for the query condition “age = 10 and age <= 50”. How
many blocks I/O’s are needed for the process?
b. Draw the updated B+tree after inserting 14, 16, and 15 into the tree. Show the tree after all
insertions. How many blocks I/O’s are needed for the process?
c. Draw the updated tree after deleting all odd ages (one by one) from the leaf nodes of the
tree obtained in part b. Follow this order in deletion from left to right: 5, 7, … Show the tree
after all deletions. How many blocks I/O’s are needed for the process?
2. [60 points] Consider natural-joining tables R(a, b) and S(a,c). Suppose we have the following
scenario.
i. R is a clustered relation with 20,000 blocks and 200,000 tuples
ii. S is a clustered relation with 10,000 blocks and 200,000 tuples
iii. S has a clustered index on the join attribute a
iv. V(S, a) = 100 (recall that V(S, a) is the number of distinct values of a in S)
v. 102 pages available in main memory for the join.
vi. Assume the output of join is given to the next operator in the query execution plan
(instead of writing to the disk) and hence the cost of writing the output is ignored.
Describe the steps (including input, output at each step, and their sizes) in each of the following
join algorithms. What is the total number of block I/O’s needed for each algorithm? Which
algorithm is most efficient?
a. Nested-loop join with R as the outer relation
b. Nested-loop join with S as the outer relation
c. Sort-merge join (assume only 100 pages used for sorting and 101 pages for merging). When
the number of runs of a relation is too large for merging, the runs will be further merged
first. Select the relation with larger number of runs for further merging if both have too
many runs.
d. Simple sort-based join (same assumption as above)
INF 551 – Fall 2019
2
e. Partitioned-hash join (assume 101 pages used in partitioning of relations and no hash table
used for lookup in joining buckets)
f. Index join (ignore the cost of index lookup)
Submission Criteria:
• You may submit pdf copies of your answers.
• Make sure you specifically answer ALL the sub-questions asked in every question.