Homework 8 Train a regression CNN

1. Download the files cnntrain.zip and cnntest.zip from Canvas, containing
training and test patches of size 64 × 64 for the problem of predicting the resolution of
an image. The first two digits in each file name represent the resolution $y_i$ of the patch
(between 10% and 96%).
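As a minimal sketch of the labeling rule above, the resolution can be read from the first two characters of each file name. The file name `37_0001.png` and the helper `label_from_filename` are hypothetical; only the two leading digits are specified by the assignment, and the rest of the naming pattern in the archives may differ.

```python
from pathlib import Path


def label_from_filename(path: str) -> int:
    """Return the resolution (in percent) encoded in the first two digits of a file name."""
    name = Path(path).name  # drop any directory part
    prefix = name[:2]
    if not prefix.isdigit():
        raise ValueError(f"expected two leading digits in {name!r}")
    return int(prefix)


# Hypothetical file name, used only for illustration.
print(label_from_filename("cnntrain/37_0001.png"))  # prints 37
```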
The goal of this project is to train a regression CNN to predict the resolution. We
will use the square loss function on the training examples $(x_i, y_i)$, $i = 1, \dots, n$:
$$S(w) = \frac{1}{n} \sum_{i=1}^{n} (y_i - f_w(x_i))^2 + \lambda \|w\|^2 \qquad (1)$$
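To make the two terms of (1) concrete, the following sketch evaluates the loss on plain Python lists. Here `y_hat` stands for the network outputs $f_w(x_i)$ and `w` for the flattened weights; the function name and the toy numbers are illustrative assumptions, not part of the assignment.

```python
def square_loss(y, y_hat, w, lam):
    """Mean squared error plus lam * ||w||^2, as in equation (1)."""
    n = len(y)
    data_term = sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat)) / n
    penalty = lam * sum(wj ** 2 for wj in w)
    return data_term + penalty


# Toy values: squared errors 0 and 4 give a mean of 2; ||w||^2 = 9 + 16 = 25.
print(square_loss([1.0, 2.0], [1.0, 4.0], [3.0, 4.0], lam=0.5))  # prints 14.5
```

In practice the penalty term is usually handled through the optimizer's weight-decay setting rather than added to the loss by hand.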
Besides the loss function, we will measure the $R^2$, defined as:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
where $\hat{y}_i = f_w(x_i)$ and $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$.
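The $R^2$ definition above can be checked on small examples with a sketch like the following. The function name `r_squared` is an assumption for illustration. Perfect predictions give 1, always predicting the mean $\bar{y}$ gives 0, and a model that does worse than the mean gives a negative value.

```python
def r_squared(y, y_hat):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot."""
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


print(r_squared([10.0, 20.0, 30.0], [10.0, 20.0, 30.0]))  # prints 1.0
print(r_squared([10.0, 20.0, 30.0], [20.0, 20.0, 20.0]))  # prints 0.0
```

The sketch divides by the total sum of squares, so it raises `ZeroDivisionError` if all labels are identical.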
Experiment with different CNN architectures to obtain a good result. One example of a CNN you could use contains five convolutional layers with stride 1 and zero
padding, the first four with filters of size 5 × 5 with or without holes (atrous), and the
last of the appropriate size to obtain a 1 × 1 output. The first two convolutions have
16 filters, the next two have 32 filters, and the last has one filter. Each of the first three convolutions is followed by 2 × 2 max pooling with stride 2. The fourth
convolution layer is followed by ReLU.
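The feature-map sizes and parameter counts of the example architecture can be traced with a short sketch. It rests on assumptions that are not stated in the assignment: the first four convolutions are padded with 2 zeros on each side so that they preserve the spatial size (without any padding, the fourth 5 × 5 convolution would receive a 4 × 4 map), no holes are used, the last convolution is unpadded, and the patches have a single input channel. The helper names `conv_out` and `trace` are illustrative.

```python
def conv_out(size, kernel, padding=0, dilation=1, stride=1):
    """Output width of a convolution or pooling layer (floor division)."""
    effective = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective) // stride + 1


def trace(in_channels=1, size=64):
    """Return (description, output shape, parameter count) rows for the example CNN."""
    rows, c = [], in_channels
    for i, out_c in enumerate([16, 16, 32, 32], start=1):
        size = conv_out(size, kernel=5, padding=2)      # assumed padding of 2
        params = out_c * c * 5 * 5 + out_c              # weights + biases
        rows.append((f"conv{i} 5x5", (out_c, size, size), params))
        c = out_c
        if i <= 3:
            size = conv_out(size, kernel=2, stride=2)   # 2x2 max pooling, stride 2
            rows.append((f"maxpool{i} 2x2", (c, size, size), 0))
        else:
            rows.append(("ReLU", (c, size, size), 0))
    k = size                                            # kernel covering the whole map
    out = conv_out(size, kernel=k)                      # unpadded, so the output is 1x1
    rows.append((f"conv5 {k}x{k}", (1, out, out), c * k * k + 1))
    return rows


rows = trace()
for name, shape, params in rows:
    print(f"{name:14s} {str(shape):14s} {params:6d}")
print("total parameters:", sum(p for _, _, p in rows))
```

Under these assumptions the last convolution needs an 8 × 8 filter, and the rows have the same layout as the table requested in part c). A different padding, dilation, or channel count changes both the sizes and the totals.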
Try to use a GPU and CUDA for faster training.
a) Train a CNN for 100 epochs with momentum 0.9 using the square loss (1). Use
the SGD optimizer with an appropriate learning rate and λ = 0.0001 (weight
decay). Start with minibatch size 32 and double it every 20 epochs to obtain
a good training $R^2$ (at least 0.9). Show a plot of the loss function vs. epoch
number for the training set and the test set. Show another plot of the training and
test $R^2$ vs. epoch number. (4 points)
b) Repeat point a) with the same CNN architecture, but with a fixed minibatch size of
512. It's OK if you cannot get a training $R^2$ of at least 0.9 in this case. (3 points)
c) Report the CNN architecture in a table, where each row describes one layer,
including the layer description, size and number of parameters, and the last row
containing the total number of parameters. (2 points)
d) Plot with two different colors the train and test residuals $r_i = y_i - \hat{y}_i$ vs. $y_i$ for
the model obtained at a). (1 point)
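The minibatch schedule in part a) can be sketched as a function of the epoch index. This assumes epochs are counted from 0 and that the size doubles at epochs 20, 40, 60, and 80; the function name and its parameters are illustrative.

```python
def batch_size_for_epoch(epoch, start=32, double_every=20):
    """Minibatch size for a 0-based epoch index, doubling every `double_every` epochs."""
    return start * 2 ** (epoch // double_every)


schedule = [batch_size_for_epoch(e) for e in range(100)]
print(sorted(set(schedule)))  # prints [32, 64, 128, 256, 512]
```

With this reading, the last 20 epochs of part a) use the same minibatch size, 512, that part b) keeps fixed throughout.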