Starting from:

$30

Computer Organization Assignment 1Β 


CS3350B, Computer Organization
Assignment 1 
Useful Facts:
1 𝐺𝐻𝑧 = 1 × 109 𝐻𝑧
1 𝑏𝑦𝑑𝑒 = 8 𝑏𝑖𝑑𝑠
1 𝐾𝑏𝑦𝑑𝑒 (𝐾𝐡) = 1024 𝑏𝑦𝑑𝑒𝑠
1
Exercise 1. [6 Marks] Define the processor-memory gap. How have computer architectures changed to combat this gap? What metric looks to be minimized by these changes?
Be specific with this metric; of course minimizing one metric might also decrease another,
higher-level metric.
Exercise 2. [8 Marks] Consider the following function fib, which computes a Fibonacci
number of a given index. Discuss the instruction locality of the function. Is there good
locality or bad locality? Is there spatial locality or temporal locality? Explain why. For ease
of discussion, you may consider a particular execution of the function, say, fib(4).
int fib(int n) {
if (n < 2) {
return n;
}
int n1 = fib(n-1);
int n2 = fib(n-2);
return n1 + n2;
}
Exercise 3. [6 Marks] Discern a single formula for the average memory access time for
a computer with three levels of cache. The hit rates of those levels are 90%, 85%, and 80%,
respectively. The latency of main memory is 500 clock cycles. Leave anything in the AMAT
formula which has not been given a specific value as a parameter.
2
Exercise 4. [30 Marks] The following tables summarizes the instructions present in some
program (Table 1) as well as two different processors (Table 2).
π‘‚π‘π‘’π‘Ÿπ‘Žπ‘‘π‘–π‘œπ‘› πΌπ‘›π‘ π‘‘π‘Ÿπ‘’π‘π‘‘π‘–π‘œπ‘› πΆπ‘œπ‘’π‘›π‘‘
ALU 2000
Load 1000
Store 600
Branch 400
Table 1: Program Instructions
π‘‚π‘π‘’π‘Ÿπ‘Žπ‘‘π‘–π‘œπ‘› 𝑃 π‘Ÿπ‘œπ‘π‘’π‘ π‘ π‘œπ‘Ÿ 1 𝑃 π‘Ÿπ‘œπ‘π‘’π‘ π‘ π‘œπ‘Ÿ 2
ALU 3 1
Load 4 6
Store 3 3
Branch 2 2
Table 2: Processor Instruction Cycles
(a) [5 Marks] Calculate the ideal CPI of this program running on Processor 1.
(b) [5 Marks] Calculate the ideal CPI of this program running on Processor 2.
(c) [5 Marks] Assuming that Processor 1 operates at 2.8 GHz and Processor 2 operates at
2.4 GHz calculate the CPU time for the program when run on each processor. On which
processor is the program faster? (Hint: for easy calculations, 1𝐺𝐻𝑧 = 1 × 109𝐻𝑧)
(d) [5 Marks] In terms of the three factors calculating CPU time (instruction count, CPI,
clock frequency) explain why the processor from part (c) is the faster one. How could a
programmer change the program described in Table 1 so that it would run faster on the
other processor?
(e) [10 Marks] Calculate the CPU time of the program described in Table 1 running on
Processor 1 (at 2.8 GHz) when also considering memory stall cycles (and thus have CPU
time calculated using CPIstall). Assume this processor has one level of cache, an instruction
miss rate of 2%, a data miss rate of 5% and a miss penalty of 100 cycles.
3
Exercise 5. [30 Marks] Consider a 32-bit computer with a simplified memory hierarchy.
This hierarchy contains a single cache and an unbounded backing memory. The cache has
the following characteristics:
• Direct-Mapped, Write-through, Write allocate.
• Cache blocks are 16 words each.
• The cache has 128 sets.
(a) [4 Marks] Calculate the cache’s size in bytes.
(b) [12 Marks] Consider the following code fragment in the C programming language to be
run on the described computer. Assume that: program instructions are not stored in cache,
arrays are cache-aligned (the beginning of the array aligns with the beginning of a cache
line), ints are 32 bits, and all other variables are stored only in registers.
int N = 16384;
int A[N];
for (int i = 1; i < N; i += 2) {
A[i-1] = A[i];
}
Determine the following:
(i) The number of cache misses.
(ii) The cache miss rate.
(iii) The type of cache misses which occur.
(c) [14 Marks] Consider the following code fragment in the C programming language to
be run on the described computer. In addition to the assumptions of part (b), assume that
that the array B immediately follows the array A in memory.
int N = 16384;
int A[N];
int B[N];
for (int i = 0; i < N; ++i) {
A[i] = B[i];
}
Determine the following:
(i) The number of cache misses.
(ii) The cache miss rate.
(iii) The type of cache misses which occur.
4
Exercise 6. [20 Marks] Below is a sequence of references to 16-bit memory word addresses.
1, 18, 14, 16, 2, 3, 17, 18, 14, 15, 20, 21
(a) Consider an initially empty, direct-mapped cache with 2-word cache lines and a capacity
of 32 bytes. Using the sequence of address references above, determine if each address
referenced results in a hit or a miss. If the reference results in a cache miss, say which
type of cache miss (cold, conflict, capacity) You may use the table below to help answering
this question, but for the purposes of marking, we do not care about the tag, index, or block
offset.
(b) Consider an initially empty, 2-way set associative cache with 2-word cache lines, a capacity of 32 bytes, and an LRU (least recently used) eviction policy. Using the sequence of
address references above, determine if each address referenced results in a hit or a miss. If the
reference results in a cache miss, say which type of cache miss (cold, conflict, capacity) You
may use the table below to help answering this question, but for the purposes of marking,
we do not care about the tag, index, or block offset.
Word
Address Tag Index Block Offset Hit/Miss Type of Miss
1
18
14
16
2
3
17
18
14
15
20
21
5

More products