Assignment 3
CMPT 215
Total: 35
Please show all your work to get full marks
Problem 1.
(4 marks) Describe the modifications to the single-cycle datapath that
would be needed to implement the jal instruction, and give the control signal
settings that would be required for this instruction.
Problem 2.
(4 marks) Determine the clock cycle at which each of the instructions in
the sequence given below would be completed, assuming the 5-stage pipeline
without forwarding, and numbering the clock cycle at which the first of the
instructions is fetched as clock cycle 1. Assume that if one instruction reads
a register during the same clock cycle as another instruction is writing it,
the new value will be read. Do not reorder the instructions. Instructions
are fetched and executed exactly in the order given below, with the pipeline
stalling if necessary.
lw $s1, 0($s2)
lw $t1, 0($s1)
addi $s3, $s1, 4
addi $s1, $t1, -1
add $t0, $s3, $s1
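The timing rule described above can be sketched in a few lines of Python. This is only an illustrative model, not part of the assignment: the function name completion_cycles and the (destination, sources) encoding are assumptions made here, and the example uses a hypothetical two-instruction sequence rather than the one above. It assumes every instruction takes the stages IF, ID, EX, MEM, WB in order, that an instruction waits in ID until each producer of its source registers has reached WB, and that a write in WB and a read in ID may share a clock cycle.

```python
def completion_cycles(instrs):
    """Return the WB (completion) cycle of each instruction.

    instrs is a list of (dest, sources) pairs in program order.
    Model: 5-stage pipeline, no forwarding, and a register written in WB
    can be read by an instruction in ID during the same clock cycle.
    """
    ready = {}      # register name -> cycle in which its producer is in WB
    prev_id = 1     # the first instruction is fetched in cycle 1, decoded in 2
    result = []
    for dest, sources in instrs:
        # In-order issue: decode no earlier than one cycle after the previous ID.
        id_cycle = prev_id + 1
        # Stall in ID until every source register has been written back.
        for reg in sources:
            id_cycle = max(id_cycle, ready.get(reg, 0))
        wb_cycle = id_cycle + 3     # EX, MEM, WB follow ID
        if dest is not None:
            ready[dest] = wb_cycle
        result.append(wb_cycle)
        prev_id = id_cycle
    return result


# Hypothetical example (not the assignment's sequence):
#   add $t0, $t1, $t2
#   sub $t3, $t0, $t4
example = [("$t0", ["$t1", "$t2"]), ("$t3", ["$t0", "$t4"])]
print(completion_cycles(example))  # [5, 8]
```

In this hypothetical example the sub stalls for two cycles in ID, so it completes in cycle 8 instead of cycle 6.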
Problem 3.
(4 marks) Repeat Problem 2, but now assume forwarding.
Problem 4.
(9 marks) Consider the following code fragment:
            li   $s1, 1
            li   $s3, 6
            add  $s4, $zero, $zero
outerloop:  add  $s2, $zero, $zero
innerloop:  addi $s2, $s2, 1
            mul  $t0, $s2, $s1
            add  $s4, $s4, $t0
innertest:  bne  $s2, $s1, innerloop
            addi $s1, $s1, 1
            bne  $s1, $s3, outerloop
This code simply runs the inner loop i times on each pass of the outer loop,
with i going from 1 to 5, meaning the innertest branch will be executed
1+2+3+4+5 = 15 times. Give the branch prediction accuracy for the innertest
branch under each of the following schemes:
i Branch prediction of always Not Taken
ii 1-bit branch prediction, starting at Not Taken
iii 2-bit branch prediction, starting at Strongly Not Taken
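As a reminder of how these schemes behave, here is a minimal Python sketch. The function names and the outcome sequence are hypothetical and chosen only for illustration; the sequence is not the one produced by the code above. The 2-bit scheme is assumed to be the usual saturating counter, in which states 0 and 1 predict Not Taken and states 2 and 3 predict Taken.

```python
def always_not_taken(outcomes):
    """Count correct predictions when every branch is predicted Not Taken."""
    return sum(1 for taken in outcomes if not taken)


def one_bit(outcomes, start_taken=False):
    """1-bit predictor: predict whatever the branch did last time."""
    prediction = start_taken
    correct = 0
    for taken in outcomes:
        if prediction == taken:
            correct += 1
        prediction = taken          # remember the most recent outcome
    return correct


def two_bit(outcomes, start_state=0):
    """2-bit saturating counter (assumed variant).

    States: 0 = Strongly Not Taken, 1 = Weakly Not Taken,
            2 = Weakly Taken,       3 = Strongly Taken.
    """
    state = start_state
    correct = 0
    for taken in outcomes:
        if (state >= 2) == taken:
            correct += 1
        # Move one step toward the actual outcome, saturating at 0 and 3.
        state = min(3, state + 1) if taken else max(0, state - 1)
    return correct


# Hypothetical outcome sequence: True = taken, False = not taken.
outcomes = [True, True, False, True, True, False]
print(always_not_taken(outcomes), "of", len(outcomes))  # 2 of 6
print(one_bit(outcomes), "of", len(outcomes))           # 2 of 6
print(two_bit(outcomes), "of", len(outcomes))           # 1 of 6
```

Accuracy is the number of correct predictions divided by the number of times the branch is executed.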
Problem 5.
(4 marks) Suppose that a program performs read operations on the following
memory addresses: 96 and 508. Give the number of the memory block that each
of these addresses belongs to, for each of the following memory block sizes.
Remember that these are byte addresses.
i block size of one word (4 bytes)
ii block size of four words (16 bytes)
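The relationship between a byte address and its block number can be sketched as follows. The helper name block_number and the address 200 are hypothetical, used here only to illustrate the integer division involved.

```python
def block_number(byte_address, block_size_bytes):
    """Memory block that contains the given byte address."""
    return byte_address // block_size_bytes


# Hypothetical address, not one of the addresses in this problem.
print(block_number(200, 4))    # 50  (one-word blocks)
print(block_number(200, 16))   # 12  (four-word blocks)
```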
Problem 6.
(8 marks) Give the position (or set) in the cache that would be checked on
each of the read operations in Problem 5, for each of the following
caches.
i Direct-mapped cache with total capacity of 16 one-word blocks
ii Direct-mapped cache with total capacity of 4 four-word blocks
iii 4-way set-associative cache with total capacity of 16 one-word blocks
iv 2-way set-associative cache with total capacity of 4 four-word blocks
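For reference, the usual mapping is (block number) modulo (number of sets), where a direct-mapped cache is treated as having one block per set. A minimal sketch follows; the function name cache_set, the 8-block cache and the address 220 are hypothetical and do not correspond to any of the configurations above.

```python
def cache_set(byte_address, block_size_bytes, total_blocks, ways=1):
    """Index of the cache position (or set) checked for a byte address.

    ways=1 models a direct-mapped cache.
    """
    block = byte_address // block_size_bytes    # memory block number
    num_sets = total_blocks // ways             # sets in the cache
    return block % num_sets


# Hypothetical cache holding 8 one-word (4-byte) blocks, address 220 (block 55).
print(cache_set(220, 4, 8))           # 7  (direct-mapped, 8 positions)
print(cache_set(220, 4, 8, ways=2))   # 3  (2-way, 4 sets)
```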
Problem 7.
(2 marks) Consider a computer system in which a physical page number is
24 bits, a virtual page number is 52 bits, and a virtual address is 64 bits.
What is the maximum amount of physical memory, in GiB, that this system
could have?
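The reasoning can be sketched as: page offset bits = virtual address bits minus virtual page number bits, and physical address bits = physical page number bits plus page offset bits. The function name and the example parameters below are hypothetical and deliberately differ from the values in this problem.

```python
def max_physical_memory_gib(virtual_address_bits, vpn_bits, ppn_bits):
    """Maximum physical memory in GiB for a byte-addressed paged system."""
    offset_bits = virtual_address_bits - vpn_bits      # bits within a page
    physical_address_bits = ppn_bits + offset_bits     # full physical address
    return 2 ** physical_address_bits / 2 ** 30        # bytes -> GiB


# Hypothetical system: 32-bit virtual address, 20-bit VPN, 18-bit PPN.
print(max_physical_memory_gib(32, 20, 18))   # 1.0
```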
Bonus:
Problem 8.
(5 marks) Given two threads that share memory:
int a = 0;
int b = 4;
int c = 0;
int d = 0;
int z = 0;
Thread 1:        Thread 0:
c = a + b;       d = a + b;
c = 0;           z = a + d;
a = d + b;       a = c + d;
end
What is the value of each variable at the end of the program? What is the
technical term for what is going on? What are two ways to enforce mutual
exclusion?