Assignment 2 02155 - Computer Architecture and Engineering
02155 - Assignment 2
Practical information
This second deliverable assignment consists of three exam problems (one per page)
from previous years.
For this assignment you have to write a small report in English with detailed problem
solutions (not only the results). The grade you will obtain for the report will be part
of the final grade. This deliverable assignment is individual, hence there should be one
report per person.
The report should have a front page similar to the one used in Assignment 1. There are
no other requirements on the template to be used for the report.
The report must be submitted in electronic format only (PDF), using the Assignment
utility in CampusNet.
Feel free to ask or send an e-mail to the TA if you have questions regarding the practical
information and the assignment in general.
Problems
Exercise A2.1 - Exam problem (fall 2009)
Consider the following processors:
P1: single-cycle RISC-V processor (each instruction is executed in one clock cycle) with
Tclock = 5 ns.
P2: 5-stage pipelined RISC-V processor (stages: F, D, E, M, W) without data-forwarding,
with Tclock = 1 ns. Branches are assumed not taken. For beq, the decision on taking the
branch is made at the end of stage E.
P3: as P2, but with data-forwarding implemented.
The processors are used to execute the following fragment of RISC-V code.
        add  x9, x0, x0
        lw   x3, 0(x5)
        addi x4, x3, 1
        beq  x1, x0, skip
        sub  x6, x5, x7
        add  x2, x5, x6
skip:   add  x6, x5, x7
        sub  x2, x5, x6
        addi x9, x6, -4
a) Assuming that at execution time the content of register x1 = 0 (branch is taken),
for each processor:
i. Show the timing diagram¹ (instructions executed in each clock cycle) of the
execution of the above fragment of RISC-V code.
ii. Compute the execution time of the fragment of RISC-V code.
b) What is the speed-up of P3 over P1 and P2?
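The quantities asked for in a) ii. and b) follow from the cycle count and the clock period. The sketch below is only an illustration of that arithmetic: the function names are hypothetical, the cycle-count formula assumes an ideal pipeline that completes one instruction per cycle once it is full, and the numbers are placeholders that are deliberately not the ones from this exercise.

```python
def pipelined_cycles(instructions, stall_cycles, stages=5):
    """Cycles for a pipelined run: the first instruction needs `stages` cycles,
    every later one finishes one cycle after its predecessor, and each stall
    cycle (data-hazard bubble or flushed instruction) adds one more cycle."""
    return stages + (instructions - 1) + stall_cycles


def execution_time_ns(cycles, t_clock_ns):
    """Execution time = number of clock cycles * clock period."""
    return cycles * t_clock_ns


def speedup(t_old, t_new):
    """Speed-up of the new design over the old one."""
    return t_old / t_new


# Placeholder values (assumptions, not the exercise data):
# 10 executed instructions, 3 stall cycles, 1 ns pipelined clock,
# compared with a single-cycle design at 5 ns per instruction.
t_pipelined = execution_time_ns(pipelined_cycles(10, 3), 1)
t_single_cycle = execution_time_ns(10, 5)
print(t_pipelined, t_single_cycle, round(speedup(t_single_cycle, t_pipelined), 2))
```

For the single-cycle processor the cycle count is simply the number of instructions actually executed, so instructions skipped by a taken branch do not count.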
¹ For the timing diagram you should use the following compact representation, for example:
                    1  2  3  4  5  6  7  ...
  --------------------------------------------
  add x9, x0, x0    F  D  E  M  W
  lw  x3, 0(x5)        F  D  E  M  W
  ...                     ...
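If you prefer to generate the compact representation rather than align it by hand, a small helper along these lines may be convenient. It is a minimal sketch: `render_diagram` and its row format are hypothetical, and the stage list of each row (including any repeated stage used to show a stall) must still be worked out by you.

```python
def render_diagram(rows, width=18):
    """Render a pipeline timing diagram as text.

    rows: list of (instruction_text, first_cycle, stages), where `stages`
    holds one stage label per clock cycle starting at `first_cycle`,
    e.g. ["F", "D", "D", "E", "M", "W"] to show one stall cycle in D.
    """
    last_cycle = max(first + len(stages) - 1 for _, first, stages in rows)
    # Header row with the clock cycle numbers, three characters per column.
    header = " " * width + "".join(f"{c:>3}" for c in range(1, last_cycle + 1))
    lines = [header]
    for text, first, stages in rows:
        # Empty cells before the instruction is fetched, then one cell per cycle.
        cells = ["   "] * (first - 1) + [f"{s:>3}" for s in stages]
        lines.append(f"{text:<{width}}" + "".join(cells))
    return "\n".join(lines)


# The two rows shown in the footnote example.
example = [
    ("add x9, x0, x0", 1, list("FDEMW")),
    ("lw x3, 0(x5)", 2, list("FDEMW")),
]
print(render_diagram(example))
```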
Exercise A2.2 - Exam problem (fall 2011)
Consider the following processors and compilers:
P1: 5-stage pipelined RISC-V processor (stages: F, D, E, M, W) without data-forwarding,
with Tclock = 1 ns.
P2: as P1, but with data-forwarding implemented.
P1R: as P1, but with compiler that can re-arrange instructions to reduce stalls.
P2R: as P2, but with compiler that can re-arrange instructions to reduce stalls.
The processors are used to execute the following fragment of RISC-V code.
    add x5, x0, x0
    lw  x4, 0(x3)
    add x5, x5, x4
    lw  x6, 4(x3)
    add x5, x5, x6
    sw  x5, 8(x3)
a) Assuming that at execution time the content of register x3 is available, for P1, P2,
P1R and P2R:
i. Show the timing diagram¹ (instructions executed in each clock cycle) of the
execution of the above fragment of RISC-V code. For P1R and P2R
you are free to re-arrange the instructions to obtain fewer stalls.
ii. Compute the execution time of the fragment of RISC-V code.
b) What is the speed-up of P2R over P1 and P2?
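Before re-arranging instructions it helps to list the read-after-write (RAW) dependencies between nearby instructions, since only these can cause data-hazard stalls. The sketch below does that for the fragment above. The tuple encoding and the name `raw_dependencies` are hypothetical, and the window of two instructions is an assumption (it presumes the register file can be written and read in the same clock cycle); the sketch does not tell you how many stall cycles each dependency costs on a given processor.

```python
# Each entry: (text, destination register or None, source registers).
PROGRAM = [
    ("add x5, x0, x0", "x5", ("x0", "x0")),
    ("lw  x4, 0(x3)",  "x4", ("x3",)),
    ("add x5, x5, x4", "x5", ("x5", "x4")),
    ("lw  x6, 4(x3)",  "x6", ("x3",)),
    ("add x5, x5, x6", "x5", ("x5", "x6")),
    ("sw  x5, 8(x3)",  None, ("x5", "x3")),
]


def raw_dependencies(program, max_distance=2):
    """Return (producer_index, consumer_index, register) for every source
    register whose most recent writer is at most `max_distance` instructions
    earlier. x0 is ignored because it is hard-wired to zero."""
    deps = []
    for j, (_, _, sources) in enumerate(program):
        for reg in sorted(set(sources) - {"x0"}):
            # Walk backwards through the window to the most recent writer.
            for i in range(j - 1, max(j - max_distance, 0) - 1, -1):
                if program[i][1] == reg:
                    deps.append((i, j, reg))
                    break
    return deps


for producer, consumer, reg in raw_dependencies(PROGRAM):
    print(f"{PROGRAM[consumer][0]}  needs {reg} from  {PROGRAM[producer][0]}")
```

Any re-ordering you propose for P1R and P2R must preserve every dependency in the program, not only the ones inside the window.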
Exercise A2.3 - Exam problem (fall 2012)
To reduce power dissipation in modern processors, voltage scaling and frequency scaling
are used.
A single-core processor can operate at the following frequencies:
1.2 GHz, 1.6 GHz, 2.0 GHz, 2.4 GHz
And it can operate at the following supply voltages:
0.8 V, 0.9 V, 1.0 V
We run a C program on the processor and we want to determine its performance under
different combinations of frequency and voltage scaling.
a) If the C program execution time is 52 ms when the frequency is set to 2.4 GHz, what
is the execution time when only the frequency is scaled to 2.0 GHz? What is the
speed-up?
b) When the voltage is scaled, the delay of the circuits increases. If by scaling the
voltage from 1.0 V to 0.9 V the delay of the slowest stage in the processor becomes
600 ps, to what value must the frequency be scaled? What is the execution time of
the C program under these voltage and frequency scaling modes?
c) The C program is compiled into machine language and, when it runs on the processor,
83.2 million instructions are executed. What is the CPI?
d) The C program is re-compiled with a higher optimization level and the number of
instructions is reduced by 18%. The CPI does not change. What is the lowest
frequency that can be selected with this optimized recompilation to maintain the
execution time of 52 ms (or less)?
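All four questions rest on the relation execution time = instruction count * CPI / clock frequency, together with the fact that the clock period cannot be shorter than the delay of the slowest stage. The sketch below illustrates these relations only. The function names are hypothetical and every number is a placeholder, deliberately different from the figures in this exercise.

```python
def time_after_frequency_change(t_old_s, f_old_hz, f_new_hz):
    """Same program, same cycle count: time scales with f_old / f_new."""
    return t_old_s * f_old_hz / f_new_hz


def max_frequency_hz(slowest_stage_delay_s):
    """Upper bound on the clock frequency: the period must cover the slowest stage."""
    return 1.0 / slowest_stage_delay_s


def cpi(exec_time_s, frequency_hz, instruction_count):
    """Average clock cycles per instruction = total cycles / instructions."""
    return exec_time_s * frequency_hz / instruction_count


def lowest_frequency_meeting(deadline_s, instruction_count, cpi_value, available_hz):
    """Lowest available frequency whose execution time fits the deadline,
    or None if no setting is fast enough."""
    for f in sorted(available_hz):
        if instruction_count * cpi_value / f <= deadline_s:
            return f
    return None


# Placeholder values (assumptions, not the exercise data).
print(time_after_frequency_change(10e-3, 3.0e9, 1.5e9))   # time at the lower clock
print(max_frequency_hz(500e-12))                          # bound from a 500 ps stage
print(cpi(10e-3, 3.0e9, 20e6))                            # cycles per instruction
print(lowest_frequency_meeting(10e-3, 16e6, 1.5, [1.5e9, 2.0e9, 2.5e9, 3.0e9]))
```

Note that the bound returned by `max_frequency_hz` is generally not one of the supported settings, so the frequency actually selected is the highest supported one that does not exceed it.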