EECS 442 Computer Vision: Homework 4
Instructions
• The submission includes two parts:
1. To Gradescope: a pdf file as your write-up. Please read the grading checklist for each part
before you submit it.
You might need to combine several files to make a submission. Here is an example online tool
for combining multiple PDF files: https://combinepdf.com/.
Please mark where each question is located on Gradescope.
2. To Canvas: a zip file including all your code.
• The write-up must be electronic. No handwriting, including for plotting questions. LaTeX is
recommended but not mandatory.
1 Optimization and Fitting [20 pts]
For Problem 1, you will work on the same problem as you did in HW3 1.2.6. Instead of using RANSAC to
fit a homography, you need to fit an affine transformation
y = Sx + t
by gradient descent.
1. (15 pts) Implement the Linear layer and L2 loss in layers.py. In the forward pass, you'll
store all inputs in cache and save them for the backward pass. The functions are shown below;
a short numpy sketch of both layers appears after part 2 of this problem.
• Linear layer:
y = Wx + b
• L2 loss layer:
L(x, label) = (1/N) Σ_{i=1}^{N} (x_i − label_i)^2
2. (5 pts) Implement gradient descent in fitting.py; report the hyperparameters you choose and
the results you get, and include the figure in your report as well. A sketch of the update loop
is also given below.
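As a reference for part 1, here is a minimal numpy sketch of the two layers. The function names, argument shapes, and cache layout are assumptions; the provided layers.py skeleton may organize them differently.

```python
import numpy as np

def linear_forward(x, W, b):
    """Linear layer y = Wx + b for a batch of column vectors x of shape (D, N)."""
    y = W @ x + b[:, None]        # broadcast the bias over the batch
    cache = (x, W, b)             # keep the inputs for the backward pass
    return y, cache

def linear_backward(dy, cache):
    """Given the upstream gradient dL/dy, return dL/dx, dL/dW, dL/db."""
    x, W, b = cache
    dx = W.T @ dy                 # chain rule through y = Wx + b
    dW = dy @ x.T
    db = dy.sum(axis=1)
    return dx, dW, db

def l2_loss(x, label):
    """L = (1/N) * sum_i (x_i - label_i)^2 and its gradient dL/dx."""
    N = x.size
    diff = x - label
    loss = np.sum(diff ** 2) / N
    dx = 2.0 * diff / N           # gradient used in back-propagation
    return loss, dx
```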
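For part 2, the gradient-descent loop could look roughly like the sketch below. The (2, N) point layout, learning rate, and iteration count are illustrative assumptions, not requirements of fitting.py.

```python
import numpy as np

def fit_affine(X, Y, lr=1e-3, iters=2000):
    """Fit y = Sx + t by gradient descent on the mean squared error.

    X, Y: (2, N) arrays of matched 2-D points (assumed layout).
    For image-scale coordinates you may need a smaller learning rate
    or normalized points to keep the updates stable.
    """
    S = np.eye(2)                      # start near the identity transform
    t = np.zeros(2)
    N = X.shape[1]
    for _ in range(iters):
        pred = S @ X + t[:, None]      # forward: apply the current affine map
        diff = pred - Y
        dS = 2.0 * diff @ X.T / N      # gradients of the mean squared error
        dt = 2.0 * diff.sum(axis=1) / N
        S -= lr * dS                   # gradient-descent updates
        t -= lr * dt
    return S, t
```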
2 Softmax Classifier with One Layer Neural Network [40 pts]
For Problem 2 and Problem 3, you will implement a softmax classifier from scratch to classify images. You
cannot use any deep learning libraries such as PyTorch in this part.
Implement the ReLU layer and softmax layer in layers.py. These functions are shown below:
• ReLU layer.
y = x if x ≥ 0, and y = 0 otherwise.
• Softmax layer.
y_i = e^{x_i} / Σ_{j=1}^{N} e^{x_j}
Note: When you exponentiate even moderately large numbers in your softmax layer, the result can be quite large
and numpy will return inf. To prevent this, you can compute the softmax with max subtraction:
y_i = e^{x_i − max(x)} / Σ_{j=1}^{N} e^{x_j − max(x)}
It's also not hard to see that these two softmax equations are in fact equivalent, since
e^{x_i − max(x)} / Σ_{j=1}^{N} e^{x_j − max(x)} = (e^{x_i} / e^{max(x)}) / (Σ_{j=1}^{N} e^{x_j} / e^{max(x)}) = e^{x_i} / Σ_{j=1}^{N} e^{x_j}
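For reference, here is a minimal numpy sketch of these two layers using the max-subtraction trick described above; the exact signatures expected by layers.py may differ.

```python
import numpy as np

def relu(x):
    """y = x for x >= 0, and 0 otherwise."""
    return np.maximum(x, 0)

def softmax(x):
    """Numerically stable softmax for a 1-D score vector."""
    shifted = x - np.max(x)       # max subtraction avoids overflow in exp
    e = np.exp(shifted)
    return e / np.sum(e)

# softmax(np.array([1000.0, 1001.0])) works, whereas np.exp(1000.0) overflows to inf.
```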
In softmax.py, you need to implement your network. You're only allowed to use the layer functions you
implement. However, you're encouraged to implement additional features in layers.py and use them here.
Filling in all TODOs in the skeleton code will be sufficient.
After making sure your network works, you need to train it on the CIFAR-10 [1] dataset, which is available at
https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz. CIFAR-10 has 10 classes, 50000 training images,
and 10000 test images. You need to split the training images into training and validation sets yourself. After
decompressing the downloaded dataset, you can use the provided Python interface to read the CIFAR-10
dataset in train.py, although you're free to modify it. Preprocess your images and tune the model
hyperparameters in train.py.
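One possible way to carve a validation set out of the training images is sketched below; the split size, seed, and function name are arbitrary choices, not part of the provided skeleton.

```python
import numpy as np

def split_train_val(X, y, val_size=5000, seed=0):
    """Randomly split the CIFAR-10 training images into train and validation sets."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(X.shape[0])
    val_idx, train_idx = perm[:val_size], perm[val_size:]
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx]

# Usage: X_tr, y_tr, X_val, y_val = split_train_val(X_all, y_all)
# Remember to preprocess, e.g. scale pixels to [0, 1] or normalize to zero mean / unit std.
```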
Figure 1: Example images from the CIFAR-10 dataset [1]
This task is open-ended. However, here are some suggestions:
• When working on the backward functions in layers.py, you need to calculate the gradients of the
objective function. To check your implementation, you can make use of the definition
∂f(· · · , x_i, · · ·) / ∂x_i = lim_{δ→0} [f(· · · , x_i + δ, · · ·) − f(· · · , x_i, · · ·)] / δ
where f : [x_1, · · · , x_n] ∈ R^n ↦ R. So you can approximate the gradients by choosing some small
value of δ and compare the gradient your implementation computes with this numerical approximation
(a short gradient-check sketch is given after these suggestions).
• Image preprocessing is important: you can either normalize the images to zero mean and unit standard
deviation, or simply scale them to the range [0, 1].
• Training neural networks can be hard and slow. Start early!
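Here is a possible gradient checker based on the finite-difference formula above; a central difference is used because it is slightly more accurate than the one-sided version, and the function name is only a suggestion.

```python
import numpy as np

def numerical_gradient(f, x, delta=1e-6):
    """Finite-difference estimate of df/dx for a scalar-valued function f of array x."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        old = x[idx]
        x[idx] = old + delta
        f_plus = f(x)
        x[idx] = old - delta
        f_minus = f(x)
        x[idx] = old                   # restore the original entry
        grad[idx] = (f_plus - f_minus) / (2 * delta)
    return grad

# Compare against your analytic backward pass, e.g. with a relative error:
# rel_err = np.abs(g_num - g_ana) / np.maximum(1e-8, np.abs(g_num) + np.abs(g_ana))
```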
Grading checklist:
1. For all layers, we will have a series of tests that automatically test whether your functions are written
correctly. We won’t use edge cases. You do not need to include any discussion of your code for part
one unless you have decided to implement some extra features.
2. Your report should detail the architecture you used to train on CIFAR–10. Include information on
hyperparameters chosen for training and a plot showing both training and validation accuracy across
iterations.
3. Report the accuracy on the test set of CIFAR–10. You should only evaluate your model on the test set
once. All hyperparameter tuning should be done on the validation set. We expect you to achieve 40%
accuracy on the test set.
4. Include a discussion of the parameters you chose for experimentation, and include your results and
analysis. In terms of discussion, we expect to see plots and data instead of a general description.
3 Softmax Classifier with Hidden Layers [20 pts]
Continue to work on softmax.py, add a hidden layer with N dimension on your neural network if
hidden dim is set to a positive integer N, there’s no hidden layer if hidden dim=None. Use ReLU
as your activation function.
Use this model to do the classification as you did in Problem 2 again. We expect you to achieve 50%
accuracy on the test set. Also include the number of hidden dimension in your report. The grading checklist
is the same as the one for Problem 2.
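For intuition, the forward pass of the hidden-layer model amounts to the sketch below; the weight names and shapes are illustrative and not those of the skeleton code.

```python
import numpy as np

def two_layer_scores(x, W1, b1, W2, b2):
    """scores = W2 * ReLU(W1 x + b1) + b2 for a single input column vector x."""
    h = np.maximum(W1 @ x + b1, 0)    # hidden layer with ReLU activation
    return W2 @ h + b2                # class scores, fed into the softmax and loss
```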
Once you finish training, save the model with the highest test accuracy for the next problem; we've already
provided save/load functions for you in softmax.py.
4 Fooling Images [20 pts]
Fooling images is a good way to see where the model works and fails, it’s highly related to another important
field of machine learning: adversarial attack and robustness. In the following example, a few spots on the
image will make the model to misclassify this ship image as airplane although it still looks like a ship.
Figure 2: Original ship image (left) and the fooling image classified as airplane (right).
In this part, you will use gradient ascent to generate a fooling image from a correctly classified image from
CIFAR-10.
1. (5 pts) Finish the remaining code in softmax.py: return the gradient of the loss with respect to the input when
the return_dx flag is True.
2. (15 pts) Gradient ascent is similar to gradient descent: you fix the parameters of the model,
compute the gradient of the classification score with respect to the input image, and update the image
iteratively. In this problem, you will instead pick a fooling class and compute the gradient of the loss with
respect to the input image; this way you are doing gradient descent to minimize the loss of the fooling class,
which is gradient ascent on the classification score of that fooling class.
You will implement the following steps in fooling_image.py:
• load the trained model you get from the last problem.
• load an image that is correctly classified by your model, choose a different class as the fooling
class, fix the model parameters and compute the gradient of the loss with respect to your input
image.
• update your input image with the gradient, repeat this process until your model classifies the
input image as the fooling class.
Include the original image, the fooling image, and their difference in your report; you can magnify the
difference if it's too small. Comment on the robustness of your model.
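A sketch of one possible update loop is shown below. The model methods used here (forward, and a backward call that returns the gradient with respect to the input when return_dx is enabled) are hypothetical placeholders for whatever interface your softmax.py exposes.

```python
import numpy as np

def make_fooling_image(model, image, fooling_class, lr=1.0, max_iters=500):
    """Iteratively push a correctly classified image toward `fooling_class`."""
    x = image.copy()
    for _ in range(max_iters):
        scores = model.forward(x)              # hypothetical forward pass returning class scores
        if np.argmax(scores) == fooling_class:
            break                              # the model is fooled; stop early
        # hypothetical call returning dLoss/dx for the fooling label (return_dx=True)
        dx = model.backward(x, fooling_class, return_dx=True)
        x -= lr * dx                           # minimizing the fooling-class loss (ascent on its score)
    return x
```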
References
[1] Learning Multiple Layers of Features from Tiny Images, Alex Krizhevsky, 2009.