Starting from:

$30

Database Systems Homework #3

CS143: Database Systems
Homework #3
Problem 1. We want to store the table created by the following SQL statement into a disk.
CREATE TABLE Class(
dept CHAR(2),
cnum INTEGER,
sec INTEGER,
unit INTEGER,
year INTEGER,
quarter INTEGER,
title CHAR(30),
instructor CHAR(20)
)
We need to store tuples for 1,000 classes that have been offered so far. 10 classes are offered every
year. The tuples are stored in random order (i.e., they are not sequenced by any attribute). A disk of
the following parameters is used for storing the table.
• 3 platters (6 surfaces)
• 10,000 cylinders
• 500 sectors per track
• 1024 bytes per sector
• 6,000 RPM rotational speed
• 10ms average seek time
(a) What is the capacity of this disk?
(b) What is the average time to read a random sector from the disk?
(c) Assume one disk block corresponds to one disk sector. How many disk blocks are needed to store
the above table with 1,000 tuples?
(d) We want to run the following query by scanning the entire table.
SELECT * FROM Class WHERE year = 2005
Assuming that all blocks for the table is allocated sequentially, how long will it take to run the
query? Assume that the disk head is not on the same track where the first block of the table is
stored.
(e) Now assume that due to frequent updates to the table, disk blocks are allocated such that, on
average, sequentiality is broken every three blocks. That is, the table is stored in 24 randomly
located “clusters” of 3 consecutive blocks. Assuming that we scan the entire table to execute the
above query, how long will it take?
(f) Now assume that we have a B+tree on the year attribute and the tree has already been loaded
into main memory. None of the disk blocks containing the Class table has been cached in main
memory. What is the expected time to run the above query? Is it helpful to create a B+tree to
run this query?
1
Problem 2: Indexes
The table taken(StNo, CourseID, Year, Quarter, Sec, Grade, Remarks) contains the
grades for the courses completed by UCLA students during the last 20 years. If 10,000 new
students enter UCLA every year, we can assume that in taken there are 200,000 different
students, each identified by a StudentID. Thus will assume there are 40,000 students enrolled each quarter, and that each student takes four classes per quarter (160,000 classes
taken by students each quarter) and that there are three quarters in each year (480,000
classes taken every year). Thus we get a total of 9,600,000 tuples recording the grades of
all students over the last 20 years. Also assume that the average number of students per
class is 100; this implies that 4,800 classes are offered each year.
On this table, we have a sparse index on StNo and a dense index on the combination:
(CourseID,Year,Quarter,Sec,StNo). Both indexes are implemented as B+ trees where
CourseNo, Year, Quarter, Sec, and StNo take 8 bytes each, and the pointers used in the
B+ trees take 10 bytes.
B1 If the file blocks have 4096 bytes and each tuple in taken requires 100 bytes, how
many blocks will be needed to store the unspanned tuples of this relation ?
B2 Compute the levels and the number of blocks at each level of the B+ tree, assuming
a worst-case scenario.
B3 How many blocks of B+ tree and file will the DBMS retrieve from disk to answer the
following query: Find the average grade in a given class (e.g. find the average grade
for: CS143, 2010, Fall, sec. 1). Assume the worst-case scenario, and that all the
buffers are initially empty.
B4 We now want to compute the average grade over the (480,000 or so) classes taken in
year 2011 (assume that they all have the same credit). Explain how the DBMS will
go about searching and retrieving blocks from disk for this query, and estimate the
number of blocks the system will have to fetch if those 200,000 students each took 48
classes on the average. Assume the worst-case scenario, and that all the buffers are
initially empty.
Problem3: Joins and Optimization
In addition to the table taken whose schema and index have been described previously,
we also have the table student(StNo,Level,FirstName,LastName,Major) describing our
200,000 students. This table has a primary B+ index on StNo which is the key for this
relation and a foreign key for taken.There are five different levels: freshman, softmore,
junior, senior, others.
C1 How many blocks will student use, if each tuple requires 100 bytes and each block
contains 4096 bytes?
C2 For the query πStNo(σLevel=“others”(student)) ./ taken estimate the size of
the results (i.e., how many tuples). Also estimate the cost of implementing the query
measured by the number if blocks read from disks. You can assume that the join
takes advantage of the sparse index on taken.StNo (Also, for the sake of simplicity,
assume that all indexes used are already in main memory).
i

More products