Cmput 466 Assignment 5

Problem 1.
Consider the training objective $J = \|Xw - t\|^2$ subject to $\|w\|^2 \le C$ for some constant $C$.
How would the hypothesis-class capacity, overfitting/underfitting, and bias/variance vary according to $C$?

                                  Larger $C$                Smaller $C$
Model capacity (large/small?)     _____                     _____
Overfitting/underfitting?         __fitting                 __fitting
Bias/variance (high/low?)         __ bias / __ variance     __ bias / __ variance
Note: No proof is needed.
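For intuition (not required for the answer): the constraint $\|w\|^2 \le C$ can be traded for a Lagrange-multiplier penalty, giving ridge regression $\min_w \|Xw - t\|^2 + \lambda \|w\|^2$, where a larger $\lambda$ plays the role of a smaller $C$. A minimal NumPy sketch of this effect, with all data and parameter values made up for illustration:

```python
# Sketch: the constraint ||w||^2 <= C corresponds (via a Lagrange multiplier)
# to ridge regression  min_w ||Xw - t||^2 + lam * ||w||^2,
# where a smaller C corresponds to a larger lam.
import numpy as np

rng = np.random.default_rng(0)
M, D = 30, 10                               # few samples, many features
X = rng.normal(size=(M, D))
w_true = rng.normal(size=D)
t = X @ w_true + 0.5 * rng.normal(size=M)   # noisy training targets

X_test = rng.normal(size=(1000, D))
t_test = X_test @ w_true + 0.5 * rng.normal(size=1000)

for lam in [0.0, 1.0, 100.0]:
    # Closed-form ridge solution: (X^T X + lam I)^{-1} X^T t
    w = np.linalg.solve(X.T @ X + lam * np.eye(D), X.T @ t)
    train_mse = np.mean((X @ w - t) ** 2)
    test_mse = np.mean((X_test @ w - t_test) ** 2)
    print(f"lam={lam:6.1f}  ||w||^2={w @ w:6.2f}  "
          f"train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

As $\lambda$ grows (i.e., $C$ shrinks), $\|w\|^2$ is forced down and training error rises while the train/test gap narrows, which is the pattern the table above asks you to fill in.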
Problem 2.
Consider a one-dimensional linear regression model $t^{(m)} \sim N(w x^{(m)}, \sigma_\epsilon^2)$ with a Gaussian prior $w \sim N(0, \sigma^2)$. Show that the posterior of $w$ is also a Gaussian distribution, i.e., $w \mid x^{(1)}, t^{(1)}, \cdots, x^{(M)}, t^{(M)} \sim N(\mu_{\mathrm{post}}, \sigma_{\mathrm{post}}^2)$. Give the formulas for $\mu_{\mathrm{post}}$ and $\sigma_{\mathrm{post}}^2$.
Hint: Work with $P(w \mid D) \propto P(w)\,P(D \mid w)$. You do not need to handle the normalizing term.
Note: If a prior has the same functional form (but typically with different parameters) as the posterior, it is known as a conjugate prior. The above conjugacy also applies to the multi-dimensional Gaussian, but the formulas for the mean vector and the covariance matrix will be more complicated.
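As a sanity check on the derivation, one can compare the analytic posterior against $P(w)\,P(D \mid w)$ normalized numerically on a grid. The sketch below uses made-up data and parameter values, and hard-codes the standard conjugate-Gaussian formulas that the completing-the-square argument should recover:

```python
# Sketch: check that P(w)P(D|w), normalized on a grid, matches a Gaussian
# with the standard conjugate formulas (the result Problem 2 derives):
#   1/sigma_post^2 = 1/sigma^2 + sum_m (x^(m))^2 / sigma_eps^2
#   mu_post        = sigma_post^2 * sum_m x^(m) t^(m) / sigma_eps^2
import numpy as np

rng = np.random.default_rng(1)
sigma, sigma_eps, w_true = 2.0, 0.5, 1.3   # made-up parameters
M = 20
x = rng.normal(size=M)
t = w_true * x + sigma_eps * rng.normal(size=M)

# Unnormalized log posterior: log P(w) + log P(D|w), constants dropped
ws = np.linspace(-2.0, 4.0, 2001)
log_post = (-ws**2 / (2 * sigma**2)
            - np.sum((t[None, :] - ws[:, None] * x[None, :]) ** 2, axis=1)
            / (2 * sigma_eps**2))
post = np.exp(log_post - log_post.max())
dw = ws[1] - ws[0]
post /= post.sum() * dw                    # normalize numerically

# Analytic conjugate posterior
var_post = 1.0 / (1.0 / sigma**2 + np.sum(x**2) / sigma_eps**2)
mu_post = var_post * np.sum(x * t) / sigma_eps**2
gauss = (np.exp(-(ws - mu_post) ** 2 / (2 * var_post))
         / np.sqrt(2 * np.pi * var_post))

# Should be ~0 up to grid-discretization error
print("max |numeric - analytic| =", np.abs(post - gauss).max())
```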
Problem 3.
Give the prior distribution of $w$ for linear regression such that the maximum a posteriori (MAP) estimation is equivalent to $\ell_1$-penalized mean square loss.
Note: Such a prior is known as the Laplace distribution. Deriving the normalization factor of the distribution is not required.
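To see the correspondence concretely, here is a sketch (with made-up data and a hypothetical scale parameter $b$) checking that, under a Laplace prior $p(w) \propto \exp(-|w|/b)$ and the Gaussian-noise likelihood of Problem 2, the MAP objective and the $\ell_1$-penalized mean square loss have the same minimizer; the two objectives differ only by a positive scaling:

```python
# Sketch: with a Laplace prior p(w) ~ exp(-|w|/b), MAP estimation picks the
# same w as minimizing the l1-penalized mean square loss with
# lambda = 2 * sigma_eps^2 / (M * b).
import numpy as np

rng = np.random.default_rng(2)
sigma_eps, b, M = 0.5, 1.0, 20             # made-up parameters
x = rng.normal(size=M)
t = 1.3 * x + sigma_eps * rng.normal(size=M)

ws = np.linspace(-1.0, 3.0, 40001)
sq = np.sum((t[None, :] - ws[:, None] * x[None, :]) ** 2, axis=1)

# Negative log posterior, constants dropped:
#   sum_m (t^(m) - w x^(m))^2 / (2 sigma_eps^2) + |w| / b
neg_log_post = sq / (2 * sigma_eps**2) + np.abs(ws) / b

# l1-penalized mean square loss with the matching lambda
lam = 2 * sigma_eps**2 / (M * b)
penalized = sq / M + lam * np.abs(ws)

# penalized = (2 sigma_eps^2 / M) * neg_log_post, so the minimizers coincide
print("MAP w:          ", ws[np.argmin(neg_log_post)])
print("l1-penalized w: ", ws[np.argmin(penalized)])
```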