EECS 492 – Problem Set 5
Overview
This assignment covers topics in probability and Bayesian networks. It may be helpful to read
and understand the material in the textbook (especially Chapter 13 and Sections 14.1-14.5 and 17.1-17.4)
and to look over notes from lecture and discussion. You should also read through each task
fully before beginning to solve it.
Submission Instructions
● Submit your PS5 solutions as a single PDF file on Gradescope and Canvas. Also submit
your code for the final task on Canvas.
Task 1: Probability (20 points)
You are watching your favorite contestant answer trivia questions on a TV game show. The
questions come from three categories: Science, History, and Entertainment. The show is not
perfectly balanced, so 35% of the questions are Science, 33% are History, and 32% are
Entertainment. Over the course of this year, your contestant has answered 74 out of 93
Science questions correctly, 31 out of 88 History questions correctly, and 55 out of 84
Entertainment questions correctly. Answer the following questions using the statistics above.
a. Derive the conditional probabilities that the contestant answers the next question
correctly, given that it is from a specific category. Do this for each of the three
categories.
b. Of course, as an observer, you have no idea what category the next question will be
from. What is the probability that the contestant will answer the next question correctly?
c. You get up to get more chips and dip and miss the last question. Your brainy friend
who knows nothing about entertainment (and would thus have a 0% chance of getting
an entertainment question right) knew the answer (another friend of yours, who is a
reliable source, confirms this). What is the probability that the contestant also answered
the question correctly?
d. You get up again to get your friends more chips and dip (you’re such a good friend) and
miss the next question too. From the shouts of dismay you hear from the other room,
you figure out that the contestant answered incorrectly. What is the probability that the
question was a history question?
e. You are deciding whether or not you should bet one of your friends $100 that the
contestant will answer the last question of the show correctly. That is, if the contestant
answers correctly, your friend will pay you $100; otherwise, you will have to pay your
friend $100. You know a show employee who can tell you for sure what the category of
the last question will be. Now you are considering whether you should call her up and
offer her a bribe to tell you. How much, if anything, should you be willing to pay for this
information? (Assume your interest is in making money.)
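If it helps to organize the arithmetic in parts a and b, the short Python sketch below shows one way the given counts and category frequencies combine under the law of total probability. It is purely illustrative (you are not required to submit code for this task), and the variable names are not prescribed.

# Sketch: empirical conditional accuracies and overall accuracy for parts a-b.
# The counts and category frequencies come from the problem statement;
# everything else (names, structure) is just one possible arrangement.
priors  = {"Science": 0.35, "History": 0.33, "Entertainment": 0.32}
correct = {"Science": 74, "History": 31, "Entertainment": 55}
asked   = {"Science": 93, "History": 88, "Entertainment": 84}

# P(correct | category), estimated as correct/asked for each category
p_correct_given_cat = {c: correct[c] / asked[c] for c in priors}

# P(correct), by the law of total probability:
# sum over categories of P(category) * P(correct | category)
p_correct = sum(priors[c] * p_correct_given_cat[c] for c in priors)

print(p_correct_given_cat)
print(p_correct)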
Task 2: Building a Bayes Net (20 points)
Jordy drives to work every weekday (Monday-Friday). You will build a Bayes net to describe
Jordy’s music choices in the car (whether, on any given day, it is pop music or not).
10% of the time, Jordy gets a coffee before work. Coffee makes Jordy feel slightly more hipster
and thus more likely to feel adventurous: Jordy feels adventurous 70% of the time after having
coffee, compared with only 40% of the time after not having coffee.
Three (random) days a week, Jordy’s roommate carpools to work with Jordy. Whenever Jordy’s
roommate rides in, she says she wants to listen to classical music 90% of the time. However,
she likes the Friday morning program (where the station switches over to jazz) a lot less, so on
a Friday she would only have a 30% chance of saying she wants to listen to that station.
If Jordy has a friend demanding to listen to another station and is feeling adventurous, Jordy will
always be a good friend (and also explore new music) and listen to whatever that friend wants
to listen to. If Jordy is not feeling adventurous but a friend demands to listen to another station,
however, half the time Jordy will invoke driver’s right to choose the music and listen to pop
music anyway. And if no one else is in the car or expresses a preference, Jordy will always
listen to pop music when not feeling adventurous. If Jordy happens to be feeling adventurous
and no one else expresses a musical preference, Jordy just hits a random radio preset button
and listens to whatever music that station is playing. There are five preset buttons, two of which
are set to the two pop music stations in town.
Represent this information as a Bayesian network. Use node titles that represent the action
(e.g. GetsCoffee for Jordy getting coffee) and include all CPTs.
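One way to sanity-check a CPT before drawing the network is to tabulate it explicitly. The Python sketch below encodes only the coffee-related probabilities stated above; the node names (GetsCoffee, Adventurous) follow the suggested naming convention, and the dictionary representation is just one possible choice, not a requirement.

# Sketch: two CPTs from the description above.
# P(GetsCoffee): Jordy gets coffee 10% of the time.
p_coffee = {True: 0.10, False: 0.90}

# P(Adventurous | GetsCoffee): 70% after coffee, 40% otherwise.
p_adventurous_given_coffee = {
    True:  {True: 0.70, False: 0.30},
    False: {True: 0.40, False: 0.60},
}

# Each row of a CPT must sum to 1 over the child's values.
for row in p_adventurous_given_coffee.values():
    assert abs(sum(row.values()) - 1.0) < 1e-9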
Task 3: Using Bayesian Networks (20 points)
Consider the following Bayesian network about the weather in a certain small town in Michigan.
Answer the following questions by hand, and show all your work.
a. Only knowing that it is currently 3pm, what is the probability that it’s raining — that is, that
there are more than 0 inches of precipitation?
b. Suppose you are only interested in using the time of day and the atmospheric pressure to
predict precipitation. Use variable elimination and node clustering to reduce the Bayes net
to one with just the A, G, and P nodes. (A generic sketch of the sum-out operation appears
after part c.)
c. Using the result from part b, if we know that it is currently 3pm and that it is snowing (i.e.,
there is precipitation), what is the probability that the pressure is 20 inHg?
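The Python sketch below illustrates only the generic sum-out step used in variable elimination; the factor and variable names here are hypothetical and do not correspond to the actual A, G, and P nodes in this task's network.

# Sketch: summing a variable out of a factor during variable elimination.
# A factor is represented as a dict mapping assignments (tuples of values,
# one per variable in `variables`) to probabilities. All names are hypothetical.
def sum_out(var, variables, factor):
    """Return (remaining_variables, new_factor) with `var` marginalized away."""
    idx = variables.index(var)
    keep = [v for v in variables if v != var]
    new_factor = {}
    for assignment, p in factor.items():
        reduced = tuple(val for i, val in enumerate(assignment) if i != idx)
        new_factor[reduced] = new_factor.get(reduced, 0.0) + p
    return keep, new_factor

# Tiny usage example with a made-up two-variable factor f(X, Y):
variables = ["X", "Y"]
factor = {(True, True): 0.2, (True, False): 0.1,
          (False, True): 0.3, (False, False): 0.4}
print(sum_out("X", variables, factor))
# -> (['Y'], {(True,): 0.5, (False,): 0.5})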
Task 4: JavaBayes (20 points)
You should use the JavaBayes software package for this task. It can be found at
http://www.cs.cmu.edu/~javabayes/Home/ and can be run as an applet from the link
http://www.cs.cmu.edu/~javabayes/Home/applet.html (no software to install, but note that
modern security protocols in your browser may not allow the Java applet to run). Documentation
can be found on the first webpage.
There are a few things to keep in mind:
- There will be two applet windows; make sure you can see both of them because the text
window gives useful feedback.
- To specify CPTs, click on the “Edit Function” button, and make sure that your specified
values sum to 1 where appropriate. Also make sure that you use the pull-down lists to
specify all values of parent nodes.
- When specifying evidence (click observe), the nodes will change color when observed.
Encode the following network in JavaBayes, and use it to answer the following questions. For
each question, include in your submission the output generated by JavaBayes for the
query/queries relevant to that question.
a. What is the unconditional probability that F = true?
b. What is the probability that D = true and E = true, given that A = false? (Note that this
cannot be computed directly by JavaBayes; you will have to put the query into an
appropriate form, performing multiple queries if necessary, and combining them
algebraically. A decomposition hint follows part c.)
c. The properties of conditional independence can be demonstrated through probability
calculations (using the equation on page 498 of R&N).
1. Are D and E conditionally independent given C?
2. Demonstrate your claim by selecting a set of truth assignments for the random
variables D, E, A, and C and showing whether or not the equation holds.
3. Does the equality/inequality for this single set of truth assignments prove your claim?
Why or why not?
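As a hint on the algebra needed for part b (referenced there): a joint conditional query can always be decomposed with the chain rule, for example (this is just one valid decomposition):

P(D = true, E = true | A = false) = P(D = true | E = true, A = false) * P(E = true | A = false)

Each factor on the right-hand side should be obtainable as a single-node query after observing the corresponding evidence.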
Task 5: Markov Decision Processes (20 points)
Consider a Markov Decision Process (MDP) that describes a (much simplified) environment
for an autonomous Mars rover designed to investigate Martian terrain. Because the landing
spot for the rover cannot be precisely determined ahead of time, it is necessary to know the
best course of action for the rover from any position within the grid shown below. That is, we
would like to find the optimal policy π* for the environment. Satellite imaging has already
determined the type of terrain present at each location (F - Flat, R - Rocky, C - Crater) as
shown in the grid below.
(0,3) F    (1,3) R    (2,3) F    (3,3) R
(0,2) F    (1,2) C    (2,2) F    (3,2) F
(0,1) R    (1,1) F    (2,1) R    (3,1) R
(0,0) R    (1,0) F    (2,0) F    (3,0) C
The rover may attempt to move in any direction (W - West, E - East, S - South, N - North) as
long as the move would not take it outside the grid. The rover may also choose to remain
(R) in its current location. The results of an action depend on the terrain at the rover’s
current location:
F - Flat: Moves from flat locations transition to the adjacent location in the chosen direction.
R - Rocky: Moves from a rocky location are dangerous and the rover may become damaged. They work correctly with probability Pr, but otherwise transition to BROKEN.
C - Crater: Only the Remain action is allowed; the rover is not able to climb out of a crater.
BROKEN: No actions allowed (terminal state).
Note that the success of a move action depends only on the location from which the agent is
moving and not on the location into which it is moving. There are 17 possible states the
rover can be in: each of the 16 locations and BROKEN. The only terminal state is BROKEN.
Because the ultimate goal is to be in craters in order to investigate them, states of type
Crater have a large positive reward of +5. As rocks are also somewhat interesting, each
Rocky state has a small positive reward of +1. The BROKEN state and Flat states have no
reward. To review:
Available actions: W, E, S, N, R
Rewards:
Type of State:  F-Flat  R-Rocky  C-Crater  BROKEN
Reward:         0       1        5         0
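Before turning to the questions, the Python sketch below shows one possible way to encode the transition behavior described above. The function name, state representation, and the assumptions noted in the comments (e.g. that Remain is always safe, that y increases to the north, and that the caller only passes moves that stay inside the grid) are illustrative, not prescribed.

# Sketch of the transition model described above. A state is either a grid
# location (x, y) or the string "BROKEN"; `terrain` maps locations to
# "F", "R", or "C"; Pr is the success probability for moves from rocky cells.
def successors(state, action, terrain, Pr):
    """Return a list of (probability, next_state) pairs."""
    if state == "BROKEN":
        return []                                  # terminal: no actions allowed
    if terrain[state] == "C" or action == "R":     # craters allow only Remain;
        return [(1.0, state)]                      # Remain assumed always safe
    x, y = state
    dx, dy = {"W": (-1, 0), "E": (1, 0),           # assumes y increases northward,
              "S": (0, -1), "N": (0, 1)}[action]   # matching the grid above
    nxt = (x + dx, y + dy)                         # caller ensures this stays in the grid
    if terrain[state] == "F":
        return [(1.0, nxt)]                        # flat: move always succeeds
    return [(Pr, nxt), (1.0 - Pr, "BROKEN")]       # rocky: may break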
a. Using a programming language of your choice, implement value iteration (R&N Figure
17.4) for this problem. You should design your implementation so that changing the
rewards for each type of state and changing the discount factor are easy. Your
implementation should update the values for U_i(s) in a batch fashion (i.e. do not use
values just calculated previously in the same iteration); a minimal sketch of this batch
update pattern appears after part d. You should initialize the expected utilities to 0 and
assume γ = 0.95 and Pr = 0.20. Please turn in your code and a readme.txt file
explaining how to run your code. If you were only able to implement part of the
algorithm, please specify in your readme exactly which parts you were able to implement.
b. Run value iteration until convergence, with an epsilon value of 0.0001. What are the
expected utilities for each location and the resulting policy? Write your answer in the
format shown below, where for each location (x,y), you report the expected utility for the
location (shown in the grid as U(x,y)) along with the action determined for that cell by the
policy (shown in the grid as [action]).
c. Our rover is pretty fragile, given the 80% chance of breaking in rocky locations. Let’s say
we replace our rover with a more robust rover with only a 1% chance of breaking in a
rocky location. Run your algorithm again with Pr = 0.99 and γ = 0.95. Describe in a few
sentences how the rover's behavior changes compared with part b as a result of this
change. (You don’t have to show the expected utilities and the complete policy for this
part).
d. Now run your code again with Pr = 0.99 and γ = 0.1. What change in the rover's behavior
do you notice compared to the previous parts as a result of this change? (Again, you don't
have to show the expected utilities and the complete policy for this part).
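The Python sketch below (referenced from part a) is only meant to illustrate the batch update requirement: every updated utility is computed from the previous iteration's values and the whole table is swapped once per iteration. The MDP accessors (states(), actions(s), successors(s, a), reward(s)) are placeholders you would supply yourself, not a prescribed interface.

# Illustration of the batch (synchronous) Bellman update from part a.
# `mdp` is assumed to expose: states(), actions(s), successors(s, a) returning
# (probability, next_state) pairs, and reward(s). These are placeholders.
def value_iteration(mdp, gamma=0.95, epsilon=1e-4):
    U = {s: 0.0 for s in mdp.states()}            # initialize utilities to 0
    while True:
        U_new = {}
        delta = 0.0
        for s in mdp.states():
            acts = mdp.actions(s)
            if not acts:                           # terminal state (e.g. BROKEN)
                U_new[s] = mdp.reward(s)
            else:
                U_new[s] = mdp.reward(s) + gamma * max(
                    sum(p * U[s2] for p, s2 in mdp.successors(s, a))
                    for a in acts
                )
            delta = max(delta, abs(U_new[s] - U[s]))
        U = U_new                                  # swap in the whole batch at once
        if delta < epsilon * (1 - gamma) / gamma:  # R&N Figure 17.4 stopping test
            return U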