EEL 6764 Principles of Computer Architecture
Homework #6
1 Problems
Total points: 210
1 Complete 5.1 (10 pts each) at the end of Chapter 5
2 For each part of this exercise, assume the initial cache and memory state in Figure 5.38. Each
part of this exercise specifies a sequence of one or more CPU operations of the form:
P#: <op> <address> [ <-- <value> ]
where P# designates the CPU (e.g., P0,0), <op> is the CPU operation (e.g., read or write),
<address> denotes the memory address, and <value> indicates the new word to be assigned on a
write operation. What is the final state (i.e., coherence state, sharers/owners, tags, and data) of
the caches and memory after the given sequence of CPU operations has completed? Also, what
value is returned by each read operation? (10 pts each)
(a) P0,0: read 100
(b) P0,0: read 128
(c) P0,0: write 128 <-- 78
(d) P0,0: read 120
(e) P0,0: read 120; P1,0: read 120
(f) P0,0: read 120; P1,0: write 120 <-- 80
(g) P0,0: write 120 <-- 80; P1,0: read 120
(h) P0,0: write 120 <-- 80; P1,0: write 120 <-- 90
Note: In Figure 5.38, the processor P0,1 should be P1,0.
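As a reading aid for the operation notation above, here is a minimal Python sketch that splits a sequence such as "P0,0: write 120 <-- 80; P1,0: read 120" into its fields. The names CpuOp, parse_op, and parse_sequence are hypothetical and are not part of the assignment or the textbook. Addresses and values are kept as strings so that no number base is assumed.

```python
import re
from typing import List, NamedTuple, Optional


class CpuOp(NamedTuple):
    """One CPU operation in the homework notation (hypothetical helper type)."""
    cpu: str              # e.g. "P0,0"
    op: str               # "read" or "write"
    address: str          # kept as text; no radix is assumed
    value: Optional[str]  # only present for writes


# P#: <op> <address> [ <-- <value> ]
_OP_RE = re.compile(
    r"^\s*(P\d+,\d+):\s*(read|write)\s+(\w+)(?:\s*<--\s*(\w+))?\s*$"
)


def parse_op(text: str) -> CpuOp:
    """Parse a single operation such as 'P0,0: write 128 <-- 78'."""
    match = _OP_RE.match(text)
    if match is None:
        raise ValueError(f"not a valid operation: {text!r}")
    cpu, op, address, value = match.groups()
    # A write must carry a value; a read must not.
    if (op == "write") != (value is not None):
        raise ValueError(f"value does not match operation type: {text!r}")
    return CpuOp(cpu, op, address, value)


def parse_sequence(text: str) -> List[CpuOp]:
    """Parse operations separated by ';', in the order they are issued."""
    return [parse_op(part) for part in text.split(";")]


if __name__ == "__main__":
    for operation in parse_sequence("P0,0: write 120 <-- 80; P1,0: read 120"):
        print(operation)
```

The sketch only checks the syntax of a sequence; it does not model cache or directory state.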
3 Directory protocols are more scalable than snooping protocols because they send explicit request
and invalidate messages to those nodes that have copies of a block, while snooping protocols
broadcast all requests and invalidates to all nodes. Consider the eight-processor system illustrated
in Figure 5.37 and assume that all caches not shown have invalid blocks. For each of the sequences
below, identify which nodes (chip/processor) receive each request and invalidate. (10 pts each)
(a) P0,0: write 100 <-- 80
(b) P0,0: write 108 <-- 88
(c) P0,0: write 118 <-- 90
(d) P1,0: write 128 <-- 98
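The scalability claim above can be illustrated with a small Python sketch that compares which nodes receive an invalidate under each scheme. This is a simplified illustration under stated assumptions, not the protocol of Figure 5.37: the node names N0 to N7 and the sharer set are made up, the function names are hypothetical, and the request sent to the block's home directory is not modeled.

```python
from typing import Iterable, List


def snooping_recipients(all_nodes: Iterable[str], requester: str) -> List[str]:
    """Snooping: the invalidate is broadcast to every node except the requester."""
    return sorted(node for node in all_nodes if node != requester)


def directory_recipients(sharers: Iterable[str], requester: str) -> List[str]:
    """Directory: the invalidate goes only to nodes recorded as holding a copy."""
    return sorted(node for node in sharers if node != requester)


if __name__ == "__main__":
    # Hypothetical eight-node system and sharer set, for illustration only.
    nodes = [f"N{i}" for i in range(8)]
    sharers = {"N2", "N5"}
    requester = "N0"

    print("snooping :", snooping_recipients(nodes, requester))
    print("directory:", directory_recipients(sharers, requester))
```

With these made-up inputs the broadcast reaches seven nodes while the directory scheme contacts only the two recorded sharers, which is the difference in message traffic the problem statement describes.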
4 Show how the basic snooping protocol of Figure 5.7 can be changed for a write-through cache.
What is the major hardware functionality that is not needed with a write-through cache compared
with a write-back cache? (20 pts)
EEL 6764 Fall 2018 Homework
Case Studies and Exercises by Amr Zaky and David A. Wood
Figure 5.37 Multichip, multicore multiprocessor with DSM.
[Diagram: Chip0 and Chip1, each with processors P0 through P3 and an L2$; memories M0 and M1.]

Figure 5.38 Cache and memory states in the multichip, multicore multiprocessor.

P0,0
      Coherency state   Address tag   Data
B0    M                 100           00 10
B1    S                 108           00 08

P0,1
      Coherency state   Address tag   Data
B0    M                 130           00 68
B1    S                 118           00 18

. . .

P3,1
      Coherency state   Address tag   Data
B0    S                 120           00 20
B1    S                 108           00 08

L2$,0
      State   Owner/sharers   Address tag   Data
B0    DM      P0,1            100           00 10
B1    DS      P0,0; E         108           00 08
B2    DM      P1,0            130           00 68
B3    DS      P1,0            118           00 18

L2$,1
      State   Owner/sharers   Address tag   Data
B0    DS      P3,1            120           00 20
B1    DS      P3,1; E         108           00 08
B2    DI      -               -             00 10
B3    DI      -               -             00 20

M0
Address   State   Owner/sharers   Data
100       DM      C0              00 10
108       DS      C0, C1          00 08
110       DI      -               00 10
118       DS      C0              00 18

M1
Address   State   Owner/sharers   Data
120       DS      C1              00 20
128       DI      -               00 28
130       DM      C0              00 68
138       DI      -               00 96
Hennessy, John L., and David A. Patterson. Computer Architecture: A Quantitative Approach, Elsevier Science, 2011. ProQuest Ebook Central,
http://ebookcentral.proquest.com/lib/usf/detail.action?docID=787253.
Copyright © 2011. Elsevier Science. All rights reserved.
2 Requirements
• All homeworks should be done and submitted individually.
• Show all necessary steps to get full points.
• Writing and drawings, where needed, must be clear and readable. Otherwise, substantial loss of points
may occur.
• You must submit your solutions electronically via Canvas.
• The file for your solutions must be in PDF or MS-Word DOCX format.