Computer Science Department Regression and Error 16:198:536
• A Small Regression Example. Consider regression in one dimension, with a data set {(xᵢ, yᵢ)}ᵢ₌₁,...,ₘ.
– Find a linear model that minimizes the training error, i.e., ŵ and b̂ to minimize

      ∑ᵢ₌₁ᵐ (ŵxᵢ + b̂ − yᵢ)².   (1)
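The minimization in (1) has a standard closed form; a minimal numerical sketch, assuming NumPy (the helper name fit_line is illustrative, not part of the assignment):

```python
import numpy as np

def fit_line(x, y):
    """Closed-form least-squares fit of y ~ w*x + b, minimizing (1)."""
    xbar, ybar = x.mean(), y.mean()
    w = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
    b = ybar - w * xbar
    return w, b

# Sanity check: noiseless data on the line y = 2x + 3 recovers (2, 3).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 3.0
w_hat, b_hat = fit_line(x, y)
```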
– Assume there is some true linear model, such that yᵢ = wxᵢ + b + εᵢ, where the noise variables εᵢ are i.i.d. with εᵢ ∼ N(0, σ²). Argue that the estimators are unbiased, i.e., E[ŵ] = w and E[b̂] = b. What are the variances of these estimators?
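As a hint toward the argument (a sketch of the starting point, not the full answer): the minimizers of (1) have the standard closed form, and substituting the true model yᵢ = wxᵢ + b + εᵢ isolates the noise contribution:

```latex
\hat{w} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},
\qquad
\hat{b} = \bar{y} - \hat{w}\,\bar{x}.
% Substituting y_i = w x_i + b + \varepsilon_i (conditioning on the x_i):
\hat{w} = w + \frac{\sum_i (x_i - \bar{x})\,\varepsilon_i}{\sum_i (x_i - \bar{x})^2},
\quad\text{so}\quad \mathbb{E}[\hat{w}] = w.
```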
– Assume that each x value was sampled from some underlying distribution with expectation E[x] and variance Var(x). Argue that in the limit, the errors on ŵ and b̂ are approximately

      Var(ŵ) ≈ (σ²/m) · (1/Var(x)),   Var(b̂) ≈ (σ²/m) · (E[x²]/Var(x)).   (2)
– Argue that re-centering the data (x′ᵢ = xᵢ − µ) and doing regression on the re-centered data produces the same error on ŵ but minimizes the error on b̂ when µ = E[x] (which we approximate with the sample mean).
– Verify this numerically in the following way, taking m = 200, w = 1, b = 5, σ² = 0.1:
∗ Repeatedly perform the following numerical experiment: generate x₁, . . . , xₘ ∼ Unif(100, 102), yᵢ = wxᵢ + b + εᵢ (with εᵢ normal, mean 0, variance σ²), and x′ᵢ = xᵢ − 101; compute ŵ, b̂ based on the {(xᵢ, yᵢ)} data, and ŵ′, b̂′ based on the {(x′ᵢ, yᵢ)} data.
∗ Do this 1000 times, and estimate the expected value and variance of ŵ, ŵ′, b̂, b̂′. Do these results make sense? Do these results agree with the above limiting expressions?
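One way to run this experiment (a sketch assuming NumPy; the seed, helper name, and print choices are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
m, w, b, sigma2 = 200, 1.0, 5.0, 0.1

def fit(x, y):
    # Closed-form least-squares minimizer of (1).
    xbar, ybar = x.mean(), y.mean()
    w_hat = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
    return w_hat, ybar - w_hat * xbar

est = []
for _ in range(1000):
    x = rng.uniform(100.0, 102.0, size=m)
    y = w * x + b + rng.normal(0.0, np.sqrt(sigma2), size=m)
    est.append(fit(x, y) + fit(x - 101.0, y))  # (w_hat, b_hat, w_hat', b_hat')
est = np.array(est)
means, variances = est.mean(axis=0), est.var(axis=0)

# Predictions from (2): Var(x) = 2**2 / 12 for Unif(100, 102).
var_x = 4.0 / 12.0
pred_var_w = sigma2 / m / var_x
# Note: on the shifted data the intercept estimates b + 101*w = 106 (the line
# re-expressed in x'), with variance ~ sigma2/m, since centering near E[x]
# makes E[(x')^2] ~ Var(x) so the ratio in (2) is ~ 1.
```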
– Intuitively, why is there no change in the estimate of the slope when the data is shifted?
– Consider augmenting the data in the usual way, going from one dimension to two dimensions, where the first coordinate of each x is just a constant 1. Argue that taking Σ = XᵀX in the usual way, we get in the limit that

      Σ → m [ 1      E[x]
              E[x]   E[x²] ]   (3)
Show that after re-centering the data (Σ′ = (X′)ᵀ(X′), taking x′ᵢ = xᵢ − µ), the condition number κ(Σ′) is minimized by taking µ = E[x].
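The claim can be checked numerically by sweeping µ over a grid and computing κ(Σ′) for each choice (a sketch assuming NumPy; the seed, grid, and helper name are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(100.0, 102.0, size=200)

def cond_sigma(x, mu):
    # Condition number of Sigma' = (X')^T (X') for the augmented,
    # re-centered design matrix X' = [1, x - mu].
    xc = x - mu
    X = np.column_stack([np.ones_like(xc), xc])
    return np.linalg.cond(X.T @ X)

mus = np.linspace(100.0, 102.0, 201)
kappas = [cond_sigma(x, mu) for mu in mus]
best_mu = mus[int(np.argmin(kappas))]
# best_mu should land at the grid point nearest the sample mean of x
# (which approximates E[x] = 101).
```

In this finite-sample version the minimizer is the sample mean: centering there zeroes the off-diagonal entries of Σ′, and any shift δ away from it both adds off-diagonal mass mδ and inflates the bottom-right entry by mδ².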