Data Mining: 625.740
Homework for Module 9
1. Fisher’s linear discriminant is
   $$\hat{w}^* = \arg\max_{\hat{w}} J(\hat{w}) = \arg\max_{\hat{w}} \frac{\hat{w}^T S_b \hat{w}}{\hat{w}^T S_w \hat{w}},$$
   where $S_b = (m_1 - m_2)(m_1 - m_2)^T$ and $S_w = \sum_j \sum_\alpha (x_\alpha - m_j)(x_\alpha - m_j)^T$.
   (a) By writing $\partial J / \partial \hat{w} = 0$, show that $S_w^{-1} S_b \hat{w} = J(\hat{w})\, \hat{w}$.
   (b) Explain why $\hat{w}^*$ is the eigenvector for which $J(\hat{w})$ is the maximum eigenvalue of $S_w^{-1} S_b$.
   (c) Explain why $S_b \hat{w}$ is always in the direction of $m_1 - m_2$, and thus show that $\hat{w}^* = \text{const.} \cdot S_w^{-1} (m_1 - m_2)$.
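   Not part of the required derivation, but a minimal NumPy sketch on synthetic two-class data (all sample values hypothetical) can sanity-check parts (a) and (c): the closed-form direction $S_w^{-1}(m_1 - m_2)$ should satisfy the eigenvector relation numerically.

   ```python
   import numpy as np

   # Synthetic two-class data (hypothetical values, for checking only).
   rng = np.random.default_rng(0)
   X1 = rng.normal([0.0, 0.0], 1.0, size=(100, 2))  # class 1 samples
   X2 = rng.normal([3.0, 1.0], 1.0, size=(100, 2))  # class 2 samples

   m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
   Sb = np.outer(m1 - m2, m1 - m2)                          # between-class scatter
   Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)   # within-class scatter

   # Part (c): w* is proportional to Sw^{-1} (m1 - m2)
   w = np.linalg.solve(Sw, m1 - m2)
   w /= np.linalg.norm(w)

   # Part (a): at the optimum, Sw^{-1} Sb w = J(w) w
   J = (w @ Sb @ w) / (w @ Sw @ w)
   print(np.allclose(np.linalg.solve(Sw, Sb @ w), J * w))   # expect True
   ```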
2. Another way to optimize Fisher’s linear discriminant (suggested by Barry Fridling):
   (a) Show that for any two real vectors $x$ and $y$, $(x^T y)^2 \le (x^T x)(y^T y)$ (Cauchy–Schwarz).
   (b) Show that if the $\lambda_k$ are positive,
       $$\left( \sum_{k=1}^{N} x_k y_k \right)^2 \le \left( \sum_{k=1}^{N} \lambda_k x_k^2 \right) \left( \sum_{k=1}^{N} \frac{y_k^2}{\lambda_k} \right).$$
   (c) Thus, show that for $A$ positive definite, $(x^T y)^2 \le (x^T A x)(y^T A^{-1} y)$.
   (d) By letting $A = S_w$ in the expression above, and writing
       $$J(\hat{w}) = \frac{|\hat{w}^T (m_1 - m_2)|^2}{\hat{w}^T S_w \hat{w}},$$
       show again that $\hat{w}^* = \text{const.} \cdot S_w^{-1} (m_1 - m_2)$.
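   Again not required for the proof, but a short numerical check with random test vectors (hypothetical values) can confirm the generalized Cauchy–Schwarz inequality of part (c) before relying on it:

   ```python
   import numpy as np

   # Random test vectors and a random positive definite A (hypothetical values).
   rng = np.random.default_rng(1)
   N = 5
   x, y = rng.normal(size=N), rng.normal(size=N)
   B = rng.normal(size=(N, N))
   A = B @ B.T + np.eye(N)                          # positive definite by construction

   lhs = (x @ y) ** 2
   rhs = (x @ A @ x) * (y @ np.linalg.solve(A, y))  # y^T A^{-1} y
   print(lhs <= rhs)                                # expect True
   ```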
3. Using the Optdigits dataset from the UCI repository, implement PCA. Reconstruct the digit images and calculate the reconstruction error $E(n) = \sum_j \|\hat{x}_j - x_j\|^2$ for various values of $n$, the number of eigenvectors. Plot $E(n)$ versus $n$.
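   One possible starting point is sketched below with NumPy and matplotlib. It assumes a local copy of the UCI training file `optdigits.tra` (each row being 64 comma-separated pixel values followed by a class label); how you fetch and name the file is up to you.

   ```python
   import numpy as np
   import matplotlib.pyplot as plt

   # Assumes a local copy of the UCI file "optdigits.tra": each row is
   # 64 comma-separated pixel values followed by a class label.
   data = np.loadtxt("optdigits.tra", delimiter=",")
   X = data[:, :-1]                          # drop the label column

   # PCA via the eigendecomposition of the covariance matrix.
   mu = X.mean(axis=0)
   Xc = X - mu
   eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
   eigvecs = eigvecs[:, ::-1]                # columns in order of decreasing variance

   # E(n) = sum_j ||xhat_j - x_j||^2 for each number of eigenvectors n.
   E = []
   ns = range(1, X.shape[1] + 1)
   for n in ns:
       W = eigvecs[:, :n]                    # top-n principal directions
       Xhat = mu + (Xc @ W) @ W.T            # project onto span(W), then reconstruct
       E.append(np.sum((Xhat - X) ** 2))

   plt.plot(list(ns), E)
   plt.xlabel("n (number of eigenvectors)")
   plt.ylabel("E(n)")
   plt.title("PCA reconstruction error on Optdigits")
   plt.show()
   ```

   Equivalently, the principal directions can come from an SVD of the centered data matrix; the eigendecomposition is used here only because the $64 \times 64$ covariance matrix is small.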