CS/ECE/ME 532
Homework 8: SVM and kernels

1. Classification and the SVM. Revisit the iris data set from Homework 4. For this problem, we
will use the 3rd and 4th features to classify whether an iris is versicolor or virginica. Here is a plot of
the data set for this restricted set of features.
We will look for a linear classifier of the form $x_{i3} w_1 + x_{i4} w_2 + w_3 \approx y_i$. Here, $x_{ij}$ is the measurement of the $j$th feature of the $i$th iris, and $w_1, w_2, w_3$ are the weights we would like to find. The $y_i$ are the labels, e.g. $+1$ for versicolor and $-1$ for virginica.
a) Reproduce the plot above, and also plot the decision boundary for the least squares classifier.
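A minimal sketch of one way to set up part (a), assuming Python/NumPy and scikit-learn's bundled copy of the iris data (if you saved the data differently in Homework 4, substitute your own loading step):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris  # assumption: iris loaded via scikit-learn

iris = load_iris()
# keep only versicolor (class 1) and virginica (class 2), and features 3 and 4
mask = iris.target >= 1
X = iris.data[mask][:, [2, 3]]               # 3rd and 4th features
y = np.where(iris.target[mask] == 1, 1, -1)  # +1 versicolor, -1 virginica

# design matrix with a constant column for the offset w3
A = np.column_stack([X, np.ones(len(y))])

# least squares weights: w = argmin ||A w - y||^2
w_ls, *_ = np.linalg.lstsq(A, y, rcond=None)

# decision boundary: x3*w1 + x4*w2 + w3 = 0  =>  x4 = -(w1*x3 + w3)/w2
x3 = np.linspace(X[:, 0].min(), X[:, 0].max(), 100)
x4 = -(w_ls[0] * x3 + w_ls[2]) / w_ls[1]

plt.scatter(X[y == 1, 0], X[y == 1, 1], label="versicolor (+1)")
plt.scatter(X[y == -1, 0], X[y == -1, 1], label="virginica (-1)")
plt.plot(x3, x4, "k--", label="least squares boundary")
plt.xlabel("feature 3"); plt.ylabel("feature 4"); plt.legend(); plt.show()
```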
b) This time, we will use a regularized SVM classifier with the following loss function:
$$\underset{w}{\text{minimize}} \quad \sum_{i=1}^{m} \left(1 - y_i x_i^T w\right)_+ + \lambda \left(w_1^2 + w_2^2\right)$$
Here, we are using the standard hinge loss, but with an $\ell_2$ regularization that penalizes only $w_1$ and $w_2$ (we do not penalize the offset term $w_3$). Solve the problem by implementing gradient descent of the form $w_{t+1} = w_t - \gamma \nabla f(w_t)$. For your numerical simulation, use parameters $\lambda = 0.1$, $\gamma = 0.003$, $w_0 = 0$, and $T = 20{,}000$ iterations. Plot the decision boundary for this SVM classifier. How does it compare to the least squares classifier?
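Since the hinge loss is not differentiable at the kink, the update is really a subgradient step. A minimal sketch of one possible implementation, assuming the feature matrix A (with a constant column) and label vector y from part (a); the function name is made up:

```python
import numpy as np

def svm_subgradient_descent(A, y, lam=0.1, gamma=0.003, T=20000):
    """Minimize sum_i (1 - y_i a_i^T w)_+ + lam*(w1^2 + w2^2) by (sub)gradient descent."""
    m, n = A.shape               # here n = 3: two features plus the constant column
    w = np.zeros(n)              # w0 = 0
    history = np.zeros((T + 1, n))
    for t in range(T):
        margins = y * (A @ w)
        active = margins < 1                          # terms where the hinge is positive
        grad = -(A[active] * y[active, None]).sum(axis=0)
        grad[:2] += 2 * lam * w[:2]                   # regularize w1, w2 only (not the offset)
        w = w - gamma * grad
        history[t + 1] = w
    return w, history

# w_svm, w_history = svm_subgradient_descent(A, y)   # A, y as in part (a)
```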
c) Let’s take a closer look at the convergence properties of $w_t$. Plot the three components of $w_t$ on the same axes, as a function of the iteration number $t$. Do the three curves each appear to be converging? Now produce the same plots with a larger stepsize ($\gamma = 0.01$) and a smaller stepsize ($\gamma = 0.0001$). What do you observe?
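A possible way to produce these plots, assuming the history array returned by the sketch in part (b):

```python
import matplotlib.pyplot as plt

# w_history has shape (T+1, 3): one row per iteration, one column per weight
for j, name in enumerate(["w1", "w2", "w3"]):
    plt.plot(w_history[:, j], label=name)
plt.xlabel("iteration t")
plt.ylabel("weight value")
plt.legend()
plt.show()
```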
2. Classification with kernels. Consider the two-dimensional classification problem based on the
training data shown in the plot below. The data is provided in the file circledata.csv, where the
first two columns are the x and y coordinates and the third column is the label ±1.
a) Solve the standard least squares (linear) classification problem with $\ell_2$ regularization parameter $\lambda = 10^{-5}$. Also solve the dual formulation and verify that both produce the same solution.
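A sketch of one way to carry out both solves in NumPy; whether you include (and regularize) a constant offset column is a modeling choice, and the identity $(A^T A + \lambda I)^{-1} A^T b = A^T (A A^T + \lambda I)^{-1} b$ is what makes the two solutions agree:

```python
import numpy as np

data = np.loadtxt("circledata.csv", delimiter=",")
A = np.column_stack([data[:, 0], data[:, 1], np.ones(len(data))])  # x, y, offset
b = data[:, 2]                                                     # labels +/-1
lam = 1e-5
n, d = A.shape

# primal: w = (A^T A + lam I)^{-1} A^T b
w_primal = np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ b)

# dual: alpha = (A A^T + lam I)^{-1} b, then w = A^T alpha
alpha = np.linalg.solve(A @ A.T + lam * np.eye(n), b)
w_dual = A.T @ alpha

print(np.allclose(w_primal, w_dual, atol=1e-6))   # should agree
```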
b) Design a least squares classifier using the Gaussian kernel
$$k(a_i, a_j) = \exp\left(-\tfrac{1}{2}\|a_i - a_j\|^2\right)$$
and regularization parameter $\lambda = 10^{-5}$. Compare its classification of the training data to the least squares classification.
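One way to build the kernel least squares classifier, forming $K_{ij} = k(a_i, a_j)$ and solving $(K + \lambda I)\alpha = y$; the helper function name is made up, and the code assumes the circledata.csv layout described above:

```python
import numpy as np

def gaussian_kernel_matrix(X1, X2):
    """K[i, j] = exp(-0.5 * ||X1[i] - X2[j]||^2)."""
    sq_dists = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * sq_dists)

data = np.loadtxt("circledata.csv", delimiter=",")
X, b = data[:, :2], data[:, 2]
lam = 1e-5

K = gaussian_kernel_matrix(X, X)
alpha = np.linalg.solve(K + lam * np.eye(len(b)), b)   # kernel least squares weights

# classify the training points themselves
pred = np.sign(K @ alpha)
print("training accuracy:", (pred == b).mean())
```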
c) Design a least squares classifier using the polynomial kernel
$$k(a_i, a_j) = \left(a_i^T a_j + 1\right)^2$$
and regularization parameter $\lambda = 10^{-5}$. Compare its classification of the training data to the least squares classification and Gaussian kernel classifier.
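Only the kernel matrix changes relative to part (b); a brief sketch, reusing X, b, and lam from the previous block:

```python
import numpy as np

# X, b, lam as in the part (b) sketch
K_poly = (X @ X.T + 1) ** 2                              # (a_i^T a_j + 1)^2
alpha_poly = np.linalg.solve(K_poly + lam * np.eye(len(b)), b)
pred_poly = np.sign(K_poly @ alpha_poly)
print("training accuracy:", (pred_poly == b).mean())
```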
d) Now tackle the problem using hinge loss instead of squared error loss. You may do this using the
Matlab function svmtrain, another package, or by writing your own code (e.g., GD or SGD).
Design SVM classifiers using both Gaussian and polynomial kernels and compare your results
to those obtained using least squares.
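If you go the package route rather than Matlab's svmtrain, scikit-learn's SVC is one option. In the sketch below, the kernel parameters (gamma, coef0, degree) are chosen to mirror the kernels above, and the soft-margin constant C = 1.0 is an arbitrary starting point you will likely want to tune:

```python
import numpy as np
from sklearn.svm import SVC

data = np.loadtxt("circledata.csv", delimiter=",")
X, b = data[:, :2], data[:, 2]

# Gaussian (RBF) kernel SVM; gamma=0.5 matches exp(-0.5 ||a_i - a_j||^2)
svm_rbf = SVC(kernel="rbf", gamma=0.5, C=1.0).fit(X, b)

# polynomial kernel SVM; degree-2 with coef0=1 matches (a_i^T a_j + 1)^2
svm_poly = SVC(kernel="poly", degree=2, gamma=1.0, coef0=1.0, C=1.0).fit(X, b)

print("RBF training accuracy:", svm_rbf.score(X, b))
print("poly training accuracy:", svm_poly.score(X, b))
```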