
HOMEWORK 6
>>NAME HERE<<
>>ID HERE<<
Instructions: You can choose any programming language as long as you implement the algorithms from scratch. Use this LaTeX file as a template to develop your homework. Submit your homework on time as a single PDF file to Canvas. Please check Piazza for updates about the homework.
1 Kernel SVM [30 pts]
Consider the following kernel function defined over $z, z' \in Z$, where $Z$ is some set:
\[
\kappa(z, z') =
\begin{cases}
1 & \text{if } z = z', \\
0 & \text{otherwise.}
\end{cases}
\]
1. (10 pts) Prove that for any positive integer $m$ and any $z_1, \dots, z_m \in Z$, the $m \times m$ kernel matrix $K = [K_{ij}]$ is positive semi-definite, where $K_{ij} = \kappa(z_i, z_j)$ for $i, j = 1, \dots, m$. (Let us assume that $z_i \neq z_j$ for $i \neq j$.) Hint: an $m \times m$ matrix $K$ is positive semi-definite if $u^\top K u \geq 0$ for all $u \in \mathbb{R}^m$.
2. (10 pts) Given a training set $(z_1, y_1), \dots, (z_n, y_n)$ with binary labels, the dual SVM problem with the above kernel $\kappa$ will have parameters $a_1, \dots, a_n, b \in \mathbb{R}$. (Let us assume that $z_i \neq z_j$ for $i \neq j$.) The predictor for input $z$ takes the form
\[
f(z) = \sum_{i=1}^{n} a_i y_i \kappa(z_i, z) + b.
\]
Recall that the label prediction is $\mathrm{sgn}(f(z))$. Prove that there exist $a_1, \dots, a_n, b$ such that $f$ correctly separates the training set. In other words, $\kappa$ induces a feature space rich enough that any training set can be linearly separated in it.
3. (10 pts) How does this $f$ predict an input $z$ that is not in the training set?
Comment: One useful property of kernel functions is that the input space $Z$ does not need to be a vector space; in other words, $z$ does not need to be a feature vector. For all we know, $Z$ can be all the turkeys in the world. As long as we can compute $\kappa(z, z')$, kernel SVM works on turkeys.
2 Chow-Liu Algorithm [30 pts]
Suppose we wish to construct a directed graphical model for 3 features $X$, $Y$, and $Z$ using the Chow-Liu algorithm. We are given data from 100 independent experiments where each feature is binary and takes value T or F. Below is a table summarizing the observed counts:
X Y Z Count
T T T 36
T T F 4
T F T 2
T F F 8
F T T 9
F T F 1
F F T 8
F F F 32
1. Compute the mutual information $I(X, Y)$ based on the frequencies observed in the data; a computation sketch follows this list. (5 pts)
2. Compute the mutual information $I(X, Z)$ based on the frequencies observed in the data. (5 pts)
3. Compute the mutual information $I(Z, Y)$ based on the frequencies observed in the data. (5 pts)
4. Which undirected edges will be selected by the Chow-Liu algorithm as the maximum spanning tree? (5 pts)
5. Root your tree at node X, and assign directions to the selected edges. (10 pts)
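For parts 1–3, here is a minimal Python sketch (Python is an assumption; any language works) of computing pairwise mutual information from the count table, using base-2 logarithms (also an assumption; use whatever base your course convention dictates):

    import numpy as np

    # Joint counts indexed as counts[x][y][z], with T -> 0 and F -> 1,
    # in the same order as the table above.
    counts = np.array([[[36, 4], [2, 8]],
                       [[9, 1], [8, 32]]], dtype=float)
    joint = counts / counts.sum()  # empirical joint distribution P(X, Y, Z)

    def mutual_information(p_ab):
        # I(A, B) = sum_{a,b} P(a,b) * log2( P(a,b) / (P(a) P(b)) )
        pa = p_ab.sum(axis=1, keepdims=True)  # marginal P(A)
        pb = p_ab.sum(axis=0, keepdims=True)  # marginal P(B)
        nz = p_ab > 0  # skip zero cells; 0 log 0 = 0 by convention
        return (p_ab[nz] * np.log2(p_ab[nz] / (pa @ pb)[nz])).sum()

    print(mutual_information(joint.sum(axis=2)))  # I(X, Y): marginalize out Z
    print(mutual_information(joint.sum(axis=1)))  # I(X, Z): marginalize out Y
    print(mutual_information(joint.sum(axis=0)))  # I(Y, Z) = I(Z, Y)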
3 Game of Classifiers [60 pts]
3.1 Implementation
Implement the following models in the programming language of your choice. Include slack variables in your SVM implementation if needed. You can use the autograd features of PyTorch, TensorFlow, etc., or derive the gradients on your own (but do not use built-in SVM, Kernel SVM, or Logistic Regression models from libraries).
• Implement Linear SVM (without kernels).
• Implement Kernel SVM, with options for linear, rbf, and polynomial kernels. Keep the kernel parameters tunable (e.g., do not fix the degree of the polynomial kernel; keep it as a variable and experiment with different values; see the sketch after this list). Is Linear SVM a special case of Kernel SVM?
• Implement Logistic Regression with and without kernels (use the same kernels as above).
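To make the tunable-parameter requirement concrete, here is a minimal NumPy sketch of the three kernels (the parameter names gamma, degree, and coef0 are illustrative choices, not mandated by the assignment):

    import numpy as np

    def linear_kernel(X1, X2):
        return X1 @ X2.T

    def rbf_kernel(X1, X2, gamma=1.0):
        # ||x - x'||^2 = ||x||^2 + ||x'||^2 - 2 <x, x'>
        sq = (X1**2).sum(1)[:, None] + (X2**2).sum(1)[None, :] - 2 * X1 @ X2.T
        return np.exp(-gamma * np.maximum(sq, 0))  # clamp round-off negatives

    def polynomial_kernel(X1, X2, degree=3, coef0=1.0):
        return (X1 @ X2.T + coef0) ** degree

Keeping gamma, degree, and coef0 as arguments lets you tune them on the validation set.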
3.2 Synthetic Dataset-1 (20 pts)
Generate a 2-D dataset as follows: let $\mu = 2.5$ and let $I_2$ be the $2 \times 2$ identity matrix. Generate points for the positive and negative classes from $\mathcal{N}([\mu, 0], I_2)$ and $\mathcal{N}([-\mu, 0], I_2)$, respectively. For each class, generate 750 points (1500 in total). Randomly create train, validation, and test splits of 1000, 250, and 250 points, respectively.
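One possible way to generate and split this dataset, as a sketch (NumPy and the seed value are assumptions):

    import numpy as np

    rng = np.random.default_rng(0)  # arbitrary seed
    mu = 2.5
    pos = rng.multivariate_normal([mu, 0], np.eye(2), size=750)   # positive class
    neg = rng.multivariate_normal([-mu, 0], np.eye(2), size=750)  # negative class
    X = np.vstack([pos, neg])
    y = np.hstack([np.ones(750), -np.ones(750)])
    perm = rng.permutation(1500)  # shuffle before splitting
    X, y = X[perm], y[perm]
    X_train, y_train = X[:1000], y[:1000]
    X_val, y_val = X[1000:1250], y[1000:1250]
    X_test, y_test = X[1250:], y[1250:]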
Do the following with this dataset:
1. (5 pts) Train your Linear SVM and Logistic Regression models, and report their decision boundaries and test accuracies.
2. (5 pts) Show the decision boundaries with k-NN and Naive Bayes classifiers. (You can use library implementations or implement them from scratch. Figure out the hyperparameters using the validation set.)
3. (5 pts) Repeat the process, varying $\mu$ from 1.0 to 2.4 with a step size of 0.2; for each value of $\mu$, obtain the test accuracies of the models and plot the results ($\mu$ on the x-axis, test accuracy on the y-axis). You will have one curve for each of the four classifiers mentioned above; a sweep sketch follows this list.
4. (5 pts) What are your conclusions from this exercise?
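A minimal sketch of the $\mu$ sweep in part 3 (train_and_score is a hypothetical helper, assumed to retrain the named classifier on data generated with the given $\mu$ and return its test accuracy):

    import numpy as np
    import matplotlib.pyplot as plt

    mus = np.arange(1.0, 2.4 + 1e-9, 0.2)
    for name in ["linear_svm", "logistic_regression", "knn", "naive_bayes"]:
        accs = [train_and_score(name, mu) for mu in mus]  # hypothetical helper
        plt.plot(mus, accs, marker="o", label=name)
    plt.xlabel("mu")
    plt.ylabel("test accuracy")
    plt.legend()
    plt.show()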
3.3 Synthetic Dataset-2 (20 pts)
Generate 1500 data points from the 2-D circles dataset of sklearn:
sklearn.datasets.make_circles
Randomly create train, validation, and test splits of 1000, 250, and 250 points, respectively. Evaluate the above
classifiers in this setting.
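For concreteness, a generation sketch (the noise and factor values are assumptions; the assignment does not fix them):

    from sklearn.datasets import make_circles

    X, y = make_circles(n_samples=1500, noise=0.05, factor=0.5, random_state=0)
    # Split into 1000/250/250 train/validation/test as before, e.g. via a
    # random permutation of the indices.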
1. (5 pts) Show decision boundaries for the Linear SVM and Logistic Regression classifiers.
2. (5 pts) Show decision boundaries for Kernel SVM and Kernel Logistic Regression (use rbf and polynomial kernels). Try different values of the hyperparameters, and report results with whichever works best.
3. (5 pts) Train the Neural Network from HW4 and a k-NN classifier on this dataset, and show their decision boundaries. (You can use library implementations for these classifiers.)
4. (5 pts) What are your conclusions from this exercise?
3.4 Evaluation on Real Dataset (20 pts)
Let’s put all this to some real use. For this problem, use the Wisconsin Breast Cancer dataset. You can load it from the sklearn library:
sklearn.datasets.load_breast_cancer
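A loading-and-normalization sketch (standardizing to zero mean and unit variance is one reasonable choice of normalization; see the note in part 2 below):

    from sklearn.datasets import load_breast_cancer

    data = load_breast_cancer()
    X, y = data.data, data.target
    # Standardize each feature before training any classifier.
    X = (X - X.mean(axis=0)) / X.std(axis=0)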
1. (10 pts) Repeat all the parts of Section 3.3 on this dataset. Since the data are high-dimensional, you do not have to show decision boundaries. Report test accuracies for these classifiers and discuss your findings.
2. (10 pts) In addition, you want to identify the important features that determine the class. Which regularization will you use for this? Upgrade your SVM and Kernel SVM implementations to include this regularization. Discuss the important features that you obtain by running your regularized SVM on this dataset. (You might need to normalize this dataset before training any classifier.)