Assignment 6 - Object recognition in images with deep learning

Goals
In this assignment you will get to know the main ingredients of deep learning and get to use the GPUs available in the Big Data Lab.

You'll learn to use

tensors and automatic differentiation
layered models
pre-trained networks for image classification.
Check the GPU setup
When you are logged in to a lab machine, run nvidia-smi to see the available card and its memory usage.

$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.102.04   Driver Version: 450.102.04   CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  Off  | 00000000:01:00.0  On |                  N/A |
| 45%   24C    P8    N/A /  75W |   3087MiB /  4038MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      3627      G   /usr/lib/xorg/Xorg                           169MiB |
|    0     10843      C   ...d/CMPT/big-data/tmp_py/dlenv/bin/python  2897MiB |
+-----------------------------------------------------------------------------+
This shows that the machine has an NVIDIA GTX 1050 with 4 GB of RAM. You can also see that it is running a process (pid=10843) that currently takes up close to 3 GB of GPU memory. On our current blu9402 lab machines you will notice a difference, as they have 8 GB of RAM.

$ pstree -ls 10843
screen───bash───jupyter-noteboo───python─┬─4*[python]
                                         └─26*[{python}]
Inside a terminal window you may use who, ps aux | less, or pstree -ls <PID> as above to find out who is using the shared resources. In my case, it turns out that I'm running a jupyter notebook related to process 10843. Halting the notebook frees up the GPU memory.
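You can also query the same information from within Python; a minimal sketch, assuming a CUDA-enabled PyTorch installation (torch.cuda.mem_get_info is available in recent PyTorch versions):

import torch

if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
    free, total = torch.cuda.mem_get_info(0)  # free and total device memory in bytes
    print(f"{(total - free) / 2**20:.0f} MiB used of {total / 2**20:.0f} MiB")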

PyTorch setup in the lab
To build deep learning models in this assignment we are using PyTorch, a numpy-like library that provides accelerated computation on the GPU, automatic differentiation, and various utilities to train and deploy neural networks. Its popularity relative to TensorFlow has been steadily increasing, and it also offers a high-level API, the nn module, similar to tf.keras.
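As a taste of the first two ingredients, here is a minimal standalone sketch of tensors and automatic differentiation (nothing from the assignment is assumed):

import torch

# tensors behave much like numpy arrays, but can track gradients on request
x = torch.randn(3, requires_grad=True)
y = (x ** 2).sum()

# backward() computes dy/dx = 2x and stores it in x.grad
y.backward()
print(x.grad)
print(torch.allclose(x.grad, 2 * x))  # True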

In case you have trouble configuring a conda environment that has a CUDA version of PyTorch installed, you can use the one provided under the prefix
conda activate /usr/shared/CMPT/big-data/condaenv/gt

Save disk space in the lab: Use shared downloaded pre-built models
To save disk space in your home folder, we recommend that you let PyTorch use the pre-built models that we have already downloaded for you (about 1.9 GB):

mkdir -p ~/.cache/torch/hub/checkpoints
ln -s /usr/shared/CMPT/big-data/dot_torch_shared/checkpoints/* ~/.cache/torch/hub/checkpoints
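To verify where PyTorch will look for cached checkpoints, you can print the cache location (torch.hub.get_dir() is the public API for this):

import torch

# pretrained weights are looked up here before any download is attempted
print(torch.hub.get_dir())  # typically ~/.cache/torch/hub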
Learn about PyTorch usage
To familiarize yourself with PyTorch, have a look at the Examples on Tensors, or the NN module, or briefly skim the 60-minute blitz tutorial.

Task 1: Finding rectangles
A nice blog post by Johannes Rieke presents a simple setup from scratch that finds rectangles in a black & white image. To play with it, we just have to translate a few calls from Keras to PyTorch.

To familiarize yourself with using PyTorch, have a look at the Examples. The following code prepares our training setup.

Running Environment for this Assignment
I use Google Colab as the running environment, so some of the Colab-specific code below is commented out for the lab testing environment.

# when running on Google Colab, uncomment the following lines to mount the drive
# from google.colab import drive
# drive.mount('/content/drive')

# check GPU usage and memory
!nvidia-smi
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Fri Feb 17 22:14:06 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   58C    P0    28W /  70W |   4430MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2932      C                                    4427MiB |
+-----------------------------------------------------------------------------+
import numpy as np
from IPython.display import display, Markdown
import matplotlib.pyplot as plt
import matplotlib

# Create images with random rectangles and bounding boxes. 
num_imgs = 50000

img_size = 8
min_object_size = 1
max_object_size = 4
num_objects = 1

bboxes = np.zeros((num_imgs, num_objects, 4))
imgs = np.zeros((num_imgs, img_size, img_size))  # set background to 0

for i_img in range(num_imgs):
    for i_object in range(num_objects):
        w, h = np.random.randint(min_object_size, max_object_size, size=2)
        x = np.random.randint(0, img_size - w)
        y = np.random.randint(0, img_size - h)
        imgs[i_img, x:x+w, y:y+h] = 1.  # set rectangle to 1
        bboxes[i_img, i_object] = [x, y, w, h]

display(Markdown(f"The training data consists of ${num_imgs:,}$ example images.  \n"
                 f"`imgs` is an array of shape {imgs.shape}, giving an ${img_size}\\times{img_size}$ pixel image for each example.  \n"
                 f"`bboxes` is an array of shape {bboxes.shape}, giving a $1\\times4$ row vector [x, y, w, h] for each rectangle."
                ))
display(Markdown('**Here is an example of the training data:**'))
i = 0
plt.imshow(imgs[i].T, cmap='Greys', interpolation='none', origin='lower', extent=[0, img_size, 0, img_size])
for bbox in bboxes[i]:
    plt.gca().add_patch(matplotlib.patches.Rectangle((bbox[0], bbox[1]), bbox[2], bbox[3], ec='r', fc='none', lw=2))
The training data consists of 50,000 example images.
imgs is an array of shape (50000, 8, 8), giving an 8×8 pixel image for each example.
bboxes is an array of shape (50000, 1, 4), giving a 1×4 row vector [x, y, w, h] for each rectangle.

Here is an example of the training data:

[Output figure: an 8×8 example image with its bounding box drawn in red]

# Reshape and normalize the image data to mean 0 and std 1. 
X = (imgs.reshape(num_imgs, -1) - np.mean(imgs)) / np.std(imgs)
display(Markdown(f"New shape of `imgs`: {X.shape}, with normalized mean {np.mean(X):.2f} and stdev {np.std(X):.2f}"))

# Normalize x, y, w, h by img_size, so that all values are between 0 and 1.
# Important: Do not shift to negative values (e.g. by setting to mean 0), because the IOU calculation needs positive w and h.
y = bboxes.reshape(num_imgs, -1) / img_size
y.shape, np.mean(y), np.std(y)

# Split training and test.
i = int(0.8 * num_imgs)
train_X = X[:i]
test_X = X[i:]
train_y = y[:i]
test_y = y[i:]
test_imgs = imgs[i:]
test_bboxes = bboxes[i:]
New shape of imgs: (50000, 64), with normalized mean -0.00 and stdev 1.00

Task 1a
Construct a PyTorch model that resembles the Keras one in the original blog post, i.e. one with a fully connected hidden layer of 200 neurons, ReLU nonlinearity, and a dropout rate of 20%.

import torch
import torch.nn as nn

EPOCHS = 2000
USE_CUDA = torch.cuda.is_available()

class MyModel(nn.Module):
  def __init__(self, n_feature, n_hidden, n_output):
    super(MyModel,self).__init__()
    self.hidden = nn.Linear(n_feature, n_hidden)
    self.dropout = nn.Dropout(p=0.2)
    self.output = nn.Linear(n_hidden, n_output)

  def forward(self, input):
    h1 = torch.relu(self.hidden(input))
    h1 = self.dropout(h1)
    out = self.output(h1)
    return out

model = MyModel(n_feature = int(train_X.shape[-1]), n_hidden = 200, n_output = int(train_y.shape[-1]))
if USE_CUDA:
  model = model.cuda()
optimizer = torch.optim.Adam(model.parameters())
loss_fn = torch.nn.MSELoss(reduction='sum')  # sum (not mean) of squared errors over the batch
inputs = torch.Tensor(train_X)
labels = torch.Tensor(train_y)

if USE_CUDA:
  inputs = inputs.cuda()
  labels = labels.cuda()
phase = 'train'
model.train()
do_test_loss = False

loss_record = []
loss_test_record = []
for epoch in range(EPOCHS):
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = loss_fn(outputs, labels)
    loss_record.append(loss.item())

    if phase == 'train':
        loss.backward()
        optimizer.step()
    
    if do_test_loss:
        outputs_test = model(inputs_test)
        loss_test = loss_fn(outputs_test, labels_test)
        loss_test_record.append(loss_test.item())
plt.plot(loss_record)
if do_test_loss:
    plt.plot(loss_test_record)
    plt.legend(["train loss", "test loss"])
else:
    plt.legend(["train loss"])
plt.grid(True)

[Output figure: training loss curve]

Change the model from training to evaluation mode to improve testing performance.

phase = 'test'
model.eval()

# Predict bounding boxes on the test images.
test_X_t = torch.Tensor(test_X)
if USE_CUDA:
  test_X_t = test_X_t.cuda()
pred_y = model(test_X_t)
pred_bboxes = pred_y.data * img_size
if USE_CUDA:
  pred_bboxes = pred_bboxes.cpu()
pred_bboxes = pred_bboxes.numpy().reshape(len(pred_bboxes), num_objects, -1)
pred_bboxes.shape
(10000, 1, 4)
def IOU(bbox1, bbox2):
    '''Calculate the overlap between two bounding boxes [x, y, w, h] as the area of intersection over the area of union'''
    x1, y1, w1, h1 = bbox1[0], bbox1[1], bbox1[2], bbox1[3]
    x2, y2, w2, h2 = bbox2[0], bbox2[1], bbox2[2], bbox2[3]

    w_I = min(x1 + w1, x2 + w2) - max(x1, x2)
    h_I = min(y1 + h1, y2 + h2) - max(y1, y2)
    if w_I <= 0 or h_I <= 0:  # no overlap
        return 0.
    I = w_I * h_I
    U = w1 * h1 + w2 * h2 - I
    return I / U
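As a quick sanity check of the IOU function (a worked example, not part of the original notebook): two unit squares offset by 0.5 along x have an intersection of 0.5 and a union of 1 + 1 - 0.5 = 1.5, so the IOU should be 1/3.

print(IOU([0, 0, 1, 1], [0.5, 0, 1, 1]))  # 0.3333...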
# Show a few images and predicted bounding boxes from the test dataset. 
plt.figure(figsize=(12, 3))
for i_subplot in range(1, 5):
    plt.subplot(1, 4, i_subplot)
    i = np.random.randint(len(test_imgs))
    plt.imshow(test_imgs[i].T, cmap='Greys', interpolation='none', origin='lower', extent=[0, img_size, 0, img_size])
    for pred_bbox, exp_bbox in zip(pred_bboxes[i], test_bboxes[i]):
        plt.gca().add_patch(matplotlib.patches.Rectangle((pred_bbox[0], pred_bbox[1]), pred_bbox[2], pred_bbox[3], ec='r', fc='none'))
        plt.annotate('IOU: {:.2f}'.format(IOU(pred_bbox, exp_bbox)), (pred_bbox[0], pred_bbox[1]+pred_bbox[3]+0.2), color='r')
# Calculate the mean IOU (overlap) between the predicted and expected bounding boxes on the test dataset. 
summed_IOU = 0.
for pred_bbox, test_bbox in zip(pred_bboxes.reshape(-1, 4), test_bboxes.reshape(-1, 4)):
    summed_IOU += IOU(pred_bbox, test_bbox)
mean_IOU = summed_IOU / len(pred_bboxes)
mean_IOU
0.909628008588464

Task 1b:
Move the computation that is currently done on the CPU over to the GPU using CUDA and increase the number of epochs. Improve the training setup, possibly also changing model or optimizer, until you reach a test IOU above 0.9.

You can make the changes that move computation to the GPU directly in the cells above as part of 1a.
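For reference, the generic device-placement pattern looks like this (a standalone sketch with made-up shapes, not the assignment's model):

import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(64, 4).to(device)      # modules move with .to(device)
x = torch.randn(16, 64, device=device)   # tensors can be created on-device
y = model(x)                             # the forward pass runs on the GPU if available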

You may get stuck not achieving test IOU above 0.6. In that case, learn about switching the model to evaluation mode and apply the change above.

Question 1c:
Why does eval mode above have such a significant effect on test performance? Please give a short answer below.

Our network uses dropout, which randomly zeroes out a fraction of the neurons during each training step. If the model is left in training mode, dropout stays active at prediction time and keeps disabling neurons at random, degrading the predictions. Switching to eval mode turns dropout off, so all neurons contribute during prediction.
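A tiny standalone demonstration of the difference:

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(8)

drop.train()
print(drop(x))  # roughly half the entries zeroed, survivors scaled by 1/(1-p) = 2

drop.eval()
print(drop(x))  # identity: all ones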

Task 2: Use a pretrained model
As mentioned in class, deep learning systems are hardly ever developed from scratch, but usually work by refining existing solutions to similar problems. For the following task, we'll work through the Transfer Learning tutorial, which also provides a ready-made Jupyter notebook.

2.1. Download the notebook and get it to run in your environment. This also involves downloading the bees and ants dataset.
2.2. Perform your own training with the provided setup, fill out the answer to Task 2.2 below.
2.3. Change the currently chosen pretrained network (resnet) to a different one. At least try out VGG and one other type and use the "conv net as fixed feature extractor" approach, fill out the answer to Task 2.3 below.
2.4. Load a picture that you took yourself and classify it with an unmodified pretrained network (e.g. the original VGG network) that can detect one out of 1000 classes. Fill out the answer to Task 2.4 below.

Your solution for Task 2
Before you start, get the data from here and extract it into a subfolder data. The following import is going to attempt loading the image data from there.

Initialize much of the setup from the tutorial notebook located at https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html by importing this module:

# #Upload dataset and unzip it in Google drive

# from google.colab import drive
# drive.flush_and_unmount()
# drive.mount('/content/drive', force_remount=False)

# !unzip '/content/drive/MyDrive/cmpt733/assignment6-DL/hymenoptera_data.zip' -d '/content/drive/MyDrive/cmpt733/assignment6-DL/'

# #Set working directory to the current path
# import os
# path="/content/drive/MyDrive/cmpt733/assignment6-DL"
# os.chdir(path)
# os.listdir(path)
Mounted at /content/drive
from tfl_tut import *
Please study the original notebook and then continue to use its functions as imported from the tfl_tut module, for convenience and to minimize copy & paste of source code.

# Get a batch of training data
inputs, classes = next(iter(dataloaders['train']))

# Make a grid from batch
out = torchvision.utils.make_grid(inputs)

imshow(out, title=[class_names[x] for x in classes])
/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py:554: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(

[Output figure: a grid of training-batch images titled with their class names]

Answer for Task 2.2
# fine-tune a pretrained ResNet-18 on the ants/bees data, following the tutorial
model_ft = models.resnet18(pretrained=True)
num_ftrs = model_ft.fc.in_features
# Here the size of each output sample is set to 2.
# Alternatively, it can be generalized to nn.Linear(num_ftrs, len(class_names)).
model_ft.fc = nn.Linear(num_ftrs, 2)

model_ft = model_ft.to(device)

criterion = nn.CrossEntropyLoss()

# Observe that all parameters are being optimized
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)


model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler,
                       num_epochs=25)
/usr/local/lib/python3.8/dist-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet18_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet18_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
Epoch 0/24
----------
train Loss: 0.5424 Acc: 0.7119
val Loss: 0.2373 Acc: 0.9346

Epoch 1/24
----------
train Loss: 0.4267 Acc: 0.8477
val Loss: 0.2235 Acc: 0.9281

Epoch 2/24
----------
train Loss: 0.5008 Acc: 0.7819
val Loss: 0.5661 Acc: 0.8039

Epoch 3/24
----------
train Loss: 0.4033 Acc: 0.8436
val Loss: 0.3863 Acc: 0.8758

Epoch 4/24
----------
train Loss: 0.4427 Acc: 0.7984
val Loss: 0.4029 Acc: 0.8758

Epoch 5/24
----------
train Loss: 0.4931 Acc: 0.8107
val Loss: 0.4291 Acc: 0.8497

Epoch 6/24
----------
train Loss: 0.4386 Acc: 0.8148
val Loss: 0.5124 Acc: 0.8497

Epoch 7/24
----------
train Loss: 0.2829 Acc: 0.8971
val Loss: 0.2875 Acc: 0.9150

Epoch 8/24
----------
train Loss: 0.3737 Acc: 0.8642
val Loss: 0.2706 Acc: 0.9216

Epoch 9/24
----------
train Loss: 0.3727 Acc: 0.8724
val Loss: 0.2529 Acc: 0.9281

Epoch 10/24
----------
train Loss: 0.3978 Acc: 0.8436
val Loss: 0.2338 Acc: 0.9216

Epoch 11/24
----------
train Loss: 0.2236 Acc: 0.9053
val Loss: 0.2341 Acc: 0.9216

Epoch 12/24
----------
train Loss: 0.2559 Acc: 0.8930
val Loss: 0.2063 Acc: 0.9412

Epoch 13/24
----------
train Loss: 0.3466 Acc: 0.8395
val Loss: 0.2371 Acc: 0.9281

Epoch 14/24
----------
train Loss: 0.2652 Acc: 0.9012
val Loss: 0.2241 Acc: 0.9412

Epoch 15/24
----------
train Loss: 0.2140 Acc: 0.8930
val Loss: 0.2322 Acc: 0.9346

Epoch 16/24
----------
train Loss: 0.4510 Acc: 0.8230
val Loss: 0.2227 Acc: 0.9412

Epoch 17/24
----------
train Loss: 0.2177 Acc: 0.9177
val Loss: 0.2240 Acc: 0.9281

Epoch 18/24
----------
train Loss: 0.3583 Acc: 0.8477
val Loss: 0.2216 Acc: 0.9346

Epoch 19/24
----------
train Loss: 0.2548 Acc: 0.8848
val Loss: 0.2292 Acc: 0.9412

Epoch 20/24
----------
train Loss: 0.2647 Acc: 0.8807
val Loss: 0.2289 Acc: 0.9346

Epoch 21/24
----------
train Loss: 0.2768 Acc: 0.8601
val Loss: 0.2184 Acc: 0.9346

Epoch 22/24
----------
train Loss: 0.3165 Acc: 0.8724
val Loss: 0.2215 Acc: 0.9412

Epoch 23/24
----------
train Loss: 0.3683 Acc: 0.8560
val Loss: 0.2255 Acc: 0.9281

Epoch 24/24
----------
train Loss: 0.3441 Acc: 0.8889
val Loss: 0.2297 Acc: 0.9281

Training complete in 2m 18s
Best val Acc: 0.941176
visualize_model(model_ft)

[Output figures: validation images shown with the model's predicted labels]

Answer for Task 2.3
Hints for this task
Focus on the section Conv net as fixed feature extractor of the transfer learning tutorial. First, change the line

model_conv = models.resnet18(pretrained=True)
to load VGG16 instead. Set all its parameters to not require gradient computation, as shown in the tutorial.

Next, print out the new model_conv and identify the last step of the classification. This is not named the same way as the fc layer for resnet, but it works similarly. The last classification step of the VGG model determines the probabilities for each of the 1000 classes of the dataset. Change this layer to identify only 2 classes, distinguishing ants and bees as in the example.

To change the structure of some Sequential component called model_conv.module_name and to modify its last layer into a DifferentLayer type, you can use this syntax:

nn.Sequential(*list(model_conv.module_name.children())[:-1] +
              [nn.DifferentLayer(...)])
and replace the old model_conv.module_name with this differently structured version.
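For instance, a hypothetical sketch of this applied to VGG16, whose classifier is a Sequential ending in the final 1000-way Linear layer (the answer below uses direct index assignment instead, which has the same effect):

import torch.nn as nn
from torchvision import models

model_conv = models.vgg16(pretrained=True)
model_conv.classifier = nn.Sequential(*list(model_conv.classifier.children())[:-1] +
                                      [nn.Linear(4096, 2)])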

# perform fixed-feature training on VGG16

model_conv = models.vgg16(pretrained=True)
for param in model_conv.parameters():
    param.requires_grad = False

# alter the last fc layer to produce a 2-class output; created after the freeze,
# so it is the only part of the model that requires gradients
model_conv.classifier[6] = nn.Linear(4096, 2)

print(model_conv.classifier)

model_conv = model_conv.to(device)

criterion = nn.CrossEntropyLoss()

# Only the new final layer requires gradients, so only its parameters are optimized
optimizer_ft = optim.SGD(model_conv.classifier[6].parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)


model_conv = train_model(model_conv, criterion, optimizer_ft, exp_lr_scheduler,
                       num_epochs=25)

visualize_model(model_conv)
/usr/local/lib/python3.8/dist-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
Sequential(
  (0): Linear(in_features=25088, out_features=4096, bias=True)
  (1): ReLU(inplace=True)
  (2): Dropout(p=0.5, inplace=False)
  (3): Linear(in_features=4096, out_features=4096, bias=True)
  (4): ReLU(inplace=True)
  (5): Dropout(p=0.5, inplace=False)
  (6): Linear(in_features=4096, out_features=2, bias=True)
)
Epoch 0/24
----------
train Loss: 0.3096 Acc: 0.8848
val Loss: 0.0850 Acc: 0.9608

Epoch 1/24
----------
train Loss: 0.2117 Acc: 0.9383
val Loss: 0.1226 Acc: 0.9673

Epoch 2/24
----------
train Loss: 0.1934 Acc: 0.9342
val Loss: 0.0994 Acc: 0.9346

Epoch 3/24
----------
train Loss: 0.1686 Acc: 0.9259
val Loss: 0.0863 Acc: 0.9608

Epoch 4/24
----------
train Loss: 0.2152 Acc: 0.9259
val Loss: 0.0583 Acc: 0.9804

Epoch 5/24
----------
train Loss: 0.1135 Acc: 0.9588
val Loss: 0.1321 Acc: 0.9477

Epoch 6/24
----------
train Loss: 0.0807 Acc: 0.9753
val Loss: 0.1126 Acc: 0.9477

Epoch 7/24
----------
train Loss: 0.1755 Acc: 0.9424
val Loss: 0.0917 Acc: 0.9542

Epoch 8/24
----------
train Loss: 0.1411 Acc: 0.9588
val Loss: 0.0922 Acc: 0.9477

Epoch 9/24
----------
train Loss: 0.0906 Acc: 0.9630
val Loss: 0.0945 Acc: 0.9477

Epoch 10/24
----------
train Loss: 0.1342 Acc: 0.9465
val Loss: 0.0901 Acc: 0.9608

Epoch 11/24
----------
train Loss: 0.1343 Acc: 0.9424
val Loss: 0.0876 Acc: 0.9477

Epoch 12/24
----------
train Loss: 0.1245 Acc: 0.9506
val Loss: 0.0889 Acc: 0.9477

Epoch 13/24
----------
train Loss: 0.1379 Acc: 0.9547
val Loss: 0.0889 Acc: 0.9477

Epoch 14/24
----------
train Loss: 0.1212 Acc: 0.9465
val Loss: 0.0886 Acc: 0.9477

Epoch 15/24
----------
train Loss: 0.0799 Acc: 0.9753
val Loss: 0.0886 Acc: 0.9477

Epoch 16/24
----------
train Loss: 0.1128 Acc: 0.9465
val Loss: 0.0882 Acc: 0.9477

Epoch 17/24
----------
train Loss: 0.1164 Acc: 0.9547
val Loss: 0.0880 Acc: 0.9477

Epoch 18/24
----------
train Loss: 0.1269 Acc: 0.9506
val Loss: 0.0871 Acc: 0.9542

Epoch 19/24
----------
train Loss: 0.0957 Acc: 0.9465
val Loss: 0.0872 Acc: 0.9542

Epoch 20/24
----------
train Loss: 0.1145 Acc: 0.9506
val Loss: 0.0873 Acc: 0.9542

Epoch 21/24
----------
train Loss: 0.0584 Acc: 0.9835
val Loss: 0.0872 Acc: 0.9542

Epoch 22/24
----------
train Loss: 0.0748 Acc: 0.9671
val Loss: 0.0872 Acc: 0.9542

Epoch 23/24
----------
train Loss: 0.1281 Acc: 0.9547
val Loss: 0.0873 Acc: 0.9542

Epoch 24/24
----------
train Loss: 0.1350 Acc: 0.9588
val Loss: 0.0873 Acc: 0.9542

Training complete in 2m 20s
Best val Acc: 0.980392

[Output figures: validation images shown with the model's predicted labels]

# use AlexNet as a fixed feature extractor
model_conv = models.alexnet(pretrained=True)
for param in model_conv.parameters():
    param.requires_grad = False

# change the last fc layer to produce a 2-class output
model_conv.classifier[6] = nn.Linear(4096, 2)

print(model_conv.classifier)

model_conv = model_conv.to(device)

criterion = nn.CrossEntropyLoss()

# Only the new final layer requires gradients, so only its parameters are optimized
optimizer_ft = optim.SGD(model_conv.classifier[6].parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)


model_conv = train_model(model_conv, criterion, optimizer_ft, exp_lr_scheduler,
                       num_epochs=25)

visualize_model(model_conv)
Sequential(
  (0): Dropout(p=0.5, inplace=False)
  (1): Linear(in_features=9216, out_features=4096, bias=True)
  (2): ReLU(inplace=True)
  (3): Dropout(p=0.5, inplace=False)
  (4): Linear(in_features=4096, out_features=4096, bias=True)
  (5): ReLU(inplace=True)
  (6): Linear(in_features=4096, out_features=2, bias=True)
)
Epoch 0/24
----------
train Loss: 0.8216 Acc: 0.6584
val Loss: 0.6030 Acc: 0.7255

Epoch 1/24
----------
train Loss: 0.6448 Acc: 0.6626
val Loss: 0.6187 Acc: 0.6928

Epoch 2/24
----------
train Loss: 0.6228 Acc: 0.6872
val Loss: 0.5152 Acc: 0.7712

Epoch 3/24
----------
train Loss: 0.6126 Acc: 0.6296
val Loss: 0.5904 Acc: 0.7124

Epoch 4/24
----------
train Loss: 0.5708 Acc: 0.6955
val Loss: 0.5082 Acc: 0.6797

Epoch 5/24
----------
train Loss: 0.5363 Acc: 0.7449
val Loss: 0.5243 Acc: 0.8105

Epoch 6/24
----------
train Loss: 0.4811 Acc: 0.8066
val Loss: 0.4209 Acc: 0.7582

Epoch 7/24
----------
train Loss: 0.3995 Acc: 0.8272
val Loss: 0.3625 Acc: 0.8301

Epoch 8/24
----------
train Loss: 0.3217 Acc: 0.8848
val Loss: 0.3529 Acc: 0.8431

Epoch 9/24
----------
train Loss: 0.2856 Acc: 0.8683
val Loss: 0.3788 Acc: 0.8235

Epoch 10/24
----------
train Loss: 0.2557 Acc: 0.8807
val Loss: 0.3510 Acc: 0.8497

Epoch 11/24
----------
train Loss: 0.2586 Acc: 0.9012
val Loss: 0.3974 Acc: 0.8366

Epoch 12/24
----------
train Loss: 0.2143 Acc: 0.9012
val Loss: 0.3624 Acc: 0.8497

Epoch 13/24
----------
train Loss: 0.2310 Acc: 0.8889
val Loss: 0.3208 Acc: 0.8627

Epoch 14/24
----------
train Loss: 0.1930 Acc: 0.9259
val Loss: 0.3214 Acc: 0.8627

Epoch 15/24
----------
train Loss: 0.2165 Acc: 0.8971
val Loss: 0.3215 Acc: 0.8627

Epoch 16/24
----------
train Loss: 0.1821 Acc: 0.9136
val Loss: 0.3298 Acc: 0.8693

Epoch 17/24
----------
train Loss: 0.2037 Acc: 0.9383
val Loss: 0.3279 Acc: 0.8693

Epoch 18/24
----------
train Loss: 0.1924 Acc: 0.9177
val Loss: 0.3361 Acc: 0.8627

Epoch 19/24
----------
train Loss: 0.1976 Acc: 0.9053
val Loss: 0.3341 Acc: 0.8627

Epoch 20/24
----------
train Loss: 0.2055 Acc: 0.9136
val Loss: 0.3341 Acc: 0.8627

Epoch 21/24
----------
train Loss: 0.1716 Acc: 0.9218
val Loss: 0.3342 Acc: 0.8627

Epoch 22/24
----------
train Loss: 0.2026 Acc: 0.9259
val Loss: 0.3349 Acc: 0.8627

Epoch 23/24
----------
train Loss: 0.1744 Acc: 0.9259
val Loss: 0.3349 Acc: 0.8627

Epoch 24/24
----------
train Loss: 0.1813 Acc: 0.9218
val Loss: 0.3352 Acc: 0.8627

Training complete in 2m 7s
Best val Acc: 0.869281

[Output figures: validation images shown with the model's predicted labels]

Answer for Task 2.4
# load a picture of my pet dog (he is a mixed-breed German Dachshund, by the way)
from PIL import Image
from torchvision import transforms
img = Image.open("stickyrice.jpg")
display(img)

# transform the photo with the standard ImageNet preprocessing
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
])

img_t = transform(img)
batch_t = torch.unsqueeze(img_t, 0)

#load alexnet and set it to evaluation mode for classification
alexnet = models.alexnet(pretrained=True)
alexnet.eval()

out = alexnet(batch_t)
_, index = torch.max(out, 1)

print('The predicted label is: ', torchvision.models.AlexNet_Weights.IMAGENET1K_V1.value.meta["categories"][index])

The predicted label is:  Rhodesian ridgeback
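As an optional extension (reusing out from the cell above), one could also inspect the top-5 classes with their softmax confidences:

# top-5 ImageNet classes with softmax confidences
probs = torch.nn.functional.softmax(out, dim=1)[0]
top5 = torch.topk(probs, 5)
categories = torchvision.models.AlexNet_Weights.IMAGENET1K_V1.meta["categories"]
for p, idx in zip(top5.values, top5.indices):
    print(f"{categories[idx]}: {p.item():.3f}")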
Please include the picture so we can view it and its class label in the saved notebook. It's OK if we don't have the actual image file to reproduce the output.

Submission
Your submission should be based on a modified version of this notebook containing the answers for Task 1 and Task 2, saved with figures, and including the relevant portions of the transfer learning tutorial notebook in the sections for Tasks 2.1 - 2.4.
