$30
Assignment Ten
ECE 4200
Provide credit to any sources other than the course staff that helped you solve the problems.
This includes all students you talked to regarding the problems.
You can look up definitions/basics online (e.g., wikipedia, stack-exchange, etc)
The due date is 11/29/2020, 23.59.59 eastern time.
Submission rules are the same as previous assignments.
Problem 1. (10 points). Suppose W is a k × d matrix, where each entry of W is picked independently from the set {− √
1
k
, √
1
k
}. In other words, for each i, j,
Pr
Wij = −
1
√
k
= Pr
Wij =
1
√
k
=
1
2
.
1. Let −→x ∈ R
d
. If we pick W with this distribution, show that
E
kW−→x k
2
2
= k
−→x k
2
2
.
2. Just like the Gaussian matrix we considered in the class, we might as well take a random
matrix W designed like this for JL transform. What is an advantage of this matrix over the
Gaussian matrix?
Problem 2. (10 points). Suppose d = 1. Come up with a set of n real numbers, and an initial
set of k distinct cluster centers such that the k-means algorithm does not converge to the best
solution of the k-means clustering problem. You can choose any value of n, and k that you want!
(Hint: small n, k are easier to think about.)
Problem 3. (15 points). Let C =
x¯1 · · · x¯|C|
be a cluster where ¯xi ∈ R
d
. Let
cav =
1
|C|
X
x¯i∈C
x¯i
Prove that for any c ∈ R
d
, X
x¯i∈C
kx¯i − ck
2
2 ≥
X
x¯i∈C
kx¯i − cavk
2
2
(Hint: ¯xi − c = ¯xi − cav + cav − c)
Problem 4. (30 points). Please see attached jupyter notebook.
1