
Homework 1
CS 4650/7650
Natural Language Understanding
Instructions
1. Submit your answers as a pdf file on Canvas.
2. We recommend that students type their answers in LaTeX or a word processor. A scanned handwritten copy will also be accepted. If writing by hand, write as clearly as possible; no credit will be given for unreadable handwriting.
3. Write out all steps required to find the solutions so that partial credit may be awarded.
4. The first question is meant for undergraduate students only, while the last question is meant for graduate students only; each of these two questions is worth 5 pts. Each of the other three questions is worth 10 pts. There is no extra credit for answering more questions than required.
5. We generally encourage collaboration with other students: you may discuss the questions and potential directions for solving them. However, you must write your solutions and code on your own, not as a group. Please list the students you collaborated with.
Deadline: Jan 15th, 3:00pm
Questions
1. [Undergraduate Students Only] Compute the following operations, or write “invalid” if the operation is not possible: [5 pts]
(a) $\begin{bmatrix} 3 & 5 \\ 4 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 7 \end{bmatrix}$

(b) $\begin{bmatrix} 3 & 5 \\ 4 & 1 \end{bmatrix} \begin{bmatrix} 2 & 7 \end{bmatrix}$

(c) $\begin{bmatrix} 3 & 5 & 4 \\ 4 & 1 & 1 \end{bmatrix} \begin{bmatrix} 4 & 3 & 2 \\ 1 & 1 & 2 \end{bmatrix}$
(d) $\begin{bmatrix} 3 & 5 & 4 \end{bmatrix} \begin{bmatrix} 4 \\ 3 \end{bmatrix}$

(e) $\begin{bmatrix} 3 & 5 & 4 \end{bmatrix} + \begin{bmatrix} 4 & 3 \end{bmatrix}$
Solution.
Solution goes here.
2. The entropy of a discrete random variable X is defined as (use base e for all log operations unless specified otherwise):
$$H(X) = -\sum_{x \in X} P(x) \log P(x)$$
(a) Compute the entropy of the distribution P(x) = Multinoulli([0.2, 0.3, 0.5]). [3 pts]
(b) Compute the entropy of the uniform distribution $P(x) = \frac{1}{m}\ \forall x \in [1, m]$. [3 pts]
(c) Consider the entropy of the joint distribution P(X, Y):
$$H(X, Y) = -\sum_{x \in X} \sum_{y \in Y} P(x, y) \log P(x, y)$$
How does this entropy relate to H(X) and H(Y) (i.e., the entropies of the marginal distributions) when X and Y are independent? [4 pts]
Solution.
Solution goes here.
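As a quick numerical sanity check for entropy computations like those above, the definition can be sketched directly in Python (natural log, matching the base-e convention; the distributions below are illustrative stand-ins, not the ones in the question):

```python
import math

def entropy(probs):
    """H(X) = -sum_x P(x) * ln P(x), skipping zero-probability outcomes."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# An illustrative multinoulli distribution (not the one in part (a)).
p = [0.1, 0.4, 0.5]
print(entropy(p))

# Uniform distribution over m outcomes: entropy equals ln(m).
m = 8
uniform = [1 / m] * m
print(entropy(uniform), math.log(m))

# Joint distribution of two independent variables: P(x, y) = P(x) P(y).
q = [0.25, 0.75]
joint = [px * qy for px in p for qy in q]
print(entropy(joint), entropy(p) + entropy(q))
```

The uniform case can be checked against the closed form ln m.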
3. You are investigating articles from the New York Times and from Buzzfeed. Some of
the articles contain fake news, while others contain real news (assume that there are
only two types of news).
Note: for the following questions, give your answers to at most 3 significant figures.
(a) Fake news only accounts for 5% of all articles in all newspapers. However, it
is known that 30% of all fake news comes from Buzzfeed. In addition, Buzzfeed
generates 25% of all news articles. What is the probability that a randomly chosen
Buzzfeed article is fake news? [3 pts]
(b) Suppose that 15% of all fake news comes from the New York Times (NYT).
Furthermore, suppose that 60% of all real news comes from the NYT. Under all
assumptions so far, what is the probability that a randomly chosen NYT article
is fake news? [3 pts]
(c) Mike is an active reader of the New York Times: Mike reads 80% of all NYT
articles. However, he also has a suspicion that the NYT is a bad publisher, and
he believes that 25% of all NYT articles are fake news. Furthermore, the NYT
generates 30% of all news articles. Under all assumptions so far, what is the
probability that a randomly chosen article (from all newspapers) will be from the
NYT, will be read by Mike and will be believed to be fake news? [4 pts]
Solution.
Solution goes here.
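Problems of this shape all reduce to Bayes' rule and the law of total probability. A minimal Python sketch, using made-up numbers rather than the ones in the question (the source S and its rates are hypothetical):

```python
def bayes(p_b_given_a, p_a, p_b):
    """Bayes' rule: P(A | B) = P(B | A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# Hypothetical numbers: 10% of articles are fake, 40% of fake news comes
# from source S, and S publishes 20% of all articles.
p_fake, p_s_given_fake, p_s = 0.10, 0.40, 0.20
print(bayes(p_s_given_fake, p_fake, p_s))  # P(fake | S)

# When P(S) is not given directly, build it with the law of total
# probability: P(S) = P(S | fake) P(fake) + P(S | real) P(real).
p_s_given_real = 0.25
p_s_total = p_s_given_fake * p_fake + p_s_given_real * (1 - p_fake)
print(bayes(p_s_given_fake, p_fake, p_s_total))
```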
4. Suppose we have a probability density function (pdf) defined as:
$$f(x, y) = \begin{cases} C(x^2 + 2y), & 0 < x < 1 \text{ and } 0 < y < 1, \\ 0, & \text{otherwise.} \end{cases}$$
(a) Find the value of C. [2 pts]
(b) Find the marginal distributions of X and Y. [4 pts]
(c) Find the joint cumulative distribution function (cdf) of X and Y. [4 pts]
Solution.
Solution goes here.
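Normalization constants like C can be sanity-checked numerically with a crude midpoint Riemann sum over the unit square; the density g(x, y) = C·x·y below is a hypothetical stand-in, not the pdf in the question:

```python
def riemann_2d(f, n=400):
    """Midpoint Riemann sum of f over the unit square (0,1) x (0,1)."""
    h = 1.0 / n
    return sum(
        f((i + 0.5) * h, (j + 0.5) * h)
        for i in range(n)
        for j in range(n)
    ) * h * h

# Hypothetical density g(x, y) = C * x * y on the unit square.
# The integral of x * y over the square is 1/4, so C should come out as 4.
total = riemann_2d(lambda x, y: x * y)
C = 1.0 / total
print(C)
```

The same sum with one variable's factor dropped gives a quick check on marginal computations as well.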
5. [Graduate Students Only] A 2-D Gaussian distribution is defined as:
$$G(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)$$
Compute the following integral:
$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} G(x, y)\,(5x^2y^2 + 3xy + 1)\, dx\, dy$$
Hint: Think in terms of the properties of probability distribution functions. [5 pts]
Solution.
Solution goes here.
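The hint suggests working through normalization and moments of the Gaussian rather than brute-force integration; Monte Carlo sampling gives a quick numerical check of such moment identities. The moments below (E[X²] and E[XY] for independent standard normals) are illustrative examples, not the question's integral:

```python
import random

random.seed(0)
n = 200_000

# E[X^2] under a standard normal is sigma^2 = 1; estimate it by averaging
# over samples, per the law of large numbers.
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
second_moment = sum(x * x for x in xs) / n
print(second_moment)

# E[X Y] for independent zero-mean X, Y factorizes to E[X] E[Y] = 0.
ys = [random.gauss(0.0, 1.0) for _ in range(n)]
cross_moment = sum(x * y for x, y in zip(xs, ys)) / n
print(cross_moment)
```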