BBM 103: Introduction to Programming Laboratory I
PROGRAMMING ASSIGNMENT 5
1 Introduction
In this assignment, you will implement a 2-layer neural network to solve a real-world machine
learning problem with real-world data. Along the way, you will become familiar with Python
data science libraries such as Pandas, NumPy, and Matplotlib. At the end, you will have built
a neural network that classifies cat vs. non-cat images.
2 Assignment
Packages
• numpy is the fundamental package for scientific computing with Python.
• matplotlib is a library to plot graphs in Python.
• h5py is a common package to interact with a dataset that is stored on an H5 file.
• PIL and scipy are used here to test your model with your own picture at the end.
• np.random.seed(1) is used to keep all the random function calls consistent. It will help
us grade your work.
In this assignment, you will build a 2-layer neural network to solve a real-world problem.
3 Part 1. Reading Data
You will use the "Cat vs non-Cat" dataset. You are given a dataset (data.h5) containing:
• a training set of m_train images labelled as cat (1) or non-cat (0)
• a test set of m_test images labelled as cat or non-cat
• each image is of shape (num_px, num_px, 3), where 3 is for the 3 channels (RGB)
Figure 1: Image to vector conversion.
You will reshape and standardize the images before feeding them to the network.
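
As a concrete starting point, here is a minimal loading sketch. The key names train_set_x,
train_set_y, test_set_x, and test_set_y are assumptions on my part; check the actual keys in
data.h5 with f.keys() before relying on them.

import numpy as np
import h5py

def load_data(path="data.h5"):
    # Key names below are assumed; inspect f.keys() if your file differs.
    with h5py.File(path, "r") as f:
        train_x = np.array(f["train_set_x"])                 # (m_train, num_px, num_px, 3)
        train_y = np.array(f["train_set_y"]).reshape(1, -1)  # (1, m_train)
        test_x = np.array(f["test_set_x"])
        test_y = np.array(f["test_set_y"]).reshape(1, -1)
    # Flatten each image into one column of the data matrix, then scale to [0, 1].
    train_x = train_x.reshape(train_x.shape[0], -1).T / 255.0  # (12288, m_train)
    test_x = test_x.reshape(test_x.shape[0], -1).T / 255.0
    return train_x, train_y, test_x, test_y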
4 Part 2. Implementing a 2-layer Neural Network
Neural networks are loosely modeled on the brain: just as the brain is made up of numerous
neurons, a neural network is made up of many activation units. In this assignment you will
build a 2-layer neural network. The architecture of the network is given in Figure 2.
Figure 2: 2-layer neural network.
The model can be summarized as: INPUT - LINEAR - RELU - LINEAR - SIGMOID - OUTPUT.
Detailed Architecture of Figure 2:
• The input is a (64, 64, 3) image, which is flattened to a vector of size (12288, 1).
• The corresponding vector [x_0, x_1, ..., x_{12287}]^T is then multiplied by the weight
matrix W^{[1]} of size (n^{[1]}, 12288).
• You then add a bias term and take its ReLU to get the vector
[a^{[1]}_0, a^{[1]}_1, ..., a^{[1]}_{n^{[1]}-1}]^T.
• You then repeat the same process: you multiply the resulting vector by W^{[2]} and add
your intercept (bias).
• Finally, you take the sigmoid of the result. If it is greater than 0.5, you classify the
image as a cat.
5 General Methodology
To build the model you will follow the steps given below:
1. Initialize parameters / Define hyperparameters
2. Loop for num_iterations:
• Forward propagation
• Compute cost function
• Backward propagation
• Update parameters (using parameters, and grads from backprop)
3. Use trained parameters to predict labels
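
For step 1, a minimal sketch of initialize_parameters is given below. The small random
scaling factor (0.01) and zero biases are common defaults, not requirements stated in the
assignment.

import numpy as np

def initialize_parameters(n_x, n_h, n_y):
    # n_x: input size, n_h: hidden layer size, n_y: output size.
    np.random.seed(1)  # keeps the random calls consistent for grading
    parameters = {
        "W1": np.random.randn(n_h, n_x) * 0.01,  # shape (n_h, n_x)
        "b1": np.zeros((n_h, 1)),
        "W2": np.random.randn(n_y, n_h) * 0.01,  # shape (n_y, n_h)
        "b2": np.zeros((n_y, 1)),
    }
    return parameters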
Forward propagation
After initializing your parameters, you will implement the forward propagation module. You
will apply LINEAR and ACTIVATION, where ACTIVATION is either ReLU or sigmoid.
The forward propagation module computes the following equations:

Z^{[1]} = W^{[1]} X + b^{[1]}
A = ReLU(Z^{[1]}) = max(0, Z^{[1]})
Z^{[2]} = W^{[2]} A + b^{[2]}
AL = σ(Z^{[2]}) = 1 / (1 + e^{-Z^{[2]}})

where X is the input, Z^{[1]} is the output of the first LINEAR, A is the output of the ReLU
ACTIVATION, Z^{[2]} is the output of the second LINEAR, and AL = σ(Z^{[2]}) is the output
of the neural network. The details of the activation functions are given below:

Sigmoid: σ(Z) = σ(WA + b) = 1 / (1 + e^{-(WA + b)})
ReLU: A = ReLU(Z) = max(0, Z)
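
A sketch of linear_activation_forward following these equations, assuming the parameters
dictionary layout from the initialization sketch above. Note that the given skeleton returns
only AL, so any intermediate values your backward pass needs must be cached separately.

import numpy as np

def linear_activation_forward(X, parameters):
    # First layer: LINEAR -> RELU
    Z1 = np.dot(parameters["W1"], X) + parameters["b1"]
    A1 = np.maximum(0, Z1)
    # Second layer: LINEAR -> SIGMOID
    Z2 = np.dot(parameters["W2"], A1) + parameters["b2"]
    AL = 1.0 / (1.0 + np.exp(-Z2))
    return AL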
Cost Function
You need to compute the cost function, because you want to check if your model is actually
learning.
Compute the cross-entropy cost J using the following formula:

J = -(1/m) Σ_{i=1}^{m} [ y^{(i)} log(AL^{(i)}) + (1 - y^{(i)}) log(1 - AL^{(i)}) ]

where m is the number of samples in your training set, y^{(i)} is the label of the ith sample,
and AL^{(i)} is the output of the neural network for the ith sample.
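
A direct translation of this formula, assuming AL and y both have shape (1, m):

import numpy as np

def compute_cost(AL, y):
    # Cross-entropy cost averaged over the m samples.
    m = y.shape[1]
    cost = -np.sum(y * np.log(AL) + (1 - y) * np.log(1 - AL)) / m
    return float(np.squeeze(cost))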
Backward Propagation
You have already calculated the derivative dZ^{[l]} = ∂L/∂Z^{[l]}. You want to get
(dW^{[l]}, db^{[l]}, dA^{[l-1]}); these three outputs are computed using the input dZ^{[l]}.
Here are the formulas you need:

dW^{[2]} = ∂J/∂W^{[2]} = (1/m) dZ^{[2]} A^T
db^{[2]} = ∂J/∂b^{[2]} = (1/m) Σ_{i=1}^{m} dZ^{[2](i)}
dW^{[1]} = ∂J/∂W^{[1]} = (1/m) dZ^{[1]} X^T
db^{[1]} = ∂J/∂b^{[1]} = (1/m) Σ_{i=1}^{m} dZ^{[1](i)}

Note that the input to the layer-1 linear step is X, so dW^{[1]} uses X^T: the generic
formula dW^{[l]} = (1/m) dZ^{[l]} A^{[l-1]T} reduces to this with A^{[0]} = X.
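
A sketch of the backward pass under the assumption that the forward quantities A1 and AL
have been cached. This signature differs from the given skeleton
(linear_activation_backward(X, cost)), so treat it as an illustration of the formulas rather
than a drop-in implementation. It uses dZ^{[2]} = AL - y, which follows from combining the
sigmoid output with the cross-entropy cost.

import numpy as np

def backward_pass(X, y, parameters, A1, AL):  # hypothetical signature
    m = X.shape[1]
    # Output layer: dZ2 = AL - y for sigmoid + cross-entropy.
    dZ2 = AL - y
    dW2 = np.dot(dZ2, A1.T) / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    # Hidden layer: backprop through W2, then through the ReLU.
    dA1 = np.dot(parameters["W2"].T, dZ2)
    dZ1 = dA1 * (A1 > 0)  # ReLU derivative: 1 where Z1 > 0, else 0
    dW1 = np.dot(dZ1, X.T) / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}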
Update Parameters
W^{[2]} = W^{[2]} - α dW^{[2]}
b^{[2]} = b^{[2]} - α db^{[2]}
W^{[1]} = W^{[1]} - α dW^{[1]}
b^{[1]} = b^{[1]} - α db^{[1]}

where α is the learning rate.
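
These four updates can be written compactly; the grads dictionary keys below match the
backward-pass sketch above:

def update_parameters(parameters, grads, learning_rate):
    # One gradient-descent step on each parameter.
    for key in ("W1", "b1", "W2", "b2"):
        parameters[key] = parameters[key] - learning_rate * grads["d" + key]
    return parameters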
Plot Loss
During training, in each iteration you compute the cost value and append it to a costs array.
You will plot the loss figure from this array.
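
A minimal plot_loss sketch; the axis labels and title are my choice, not prescribed by the
assignment:

import matplotlib.pyplot as plt

def plot_loss(costs):
    # costs: one cost value per training iteration.
    plt.plot(costs)
    plt.xlabel("Iteration")
    plt.ylabel("Cost")
    plt.title("Training loss")
    plt.show()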
Calculate Accuracy
On the test set, you will use the linear_activation_forward function to calculate the
accuracy. As mentioned above, if the output value is greater than 0.5, your prediction is 1;
otherwise it is 0.
Accuracy will be calculated as: accuracy = TruePositiveCount (TP) / TotalNumberOfSamples,
where a TP is an outcome in which the model predicts the true class correctly.
For instance, suppose the prediction and label matrices for 4 samples are:

predictions = [1, 0, 0, 1]^T
labels = [1, 0, 1, 0]^T

Then the TP count is 2 and the total number of samples is 4, so:
Accuracy = 2/4 = 50%
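
A sketch of predict following this definition, reusing the linear_activation_forward sketch
from above; it assumes test_y has shape (1, m):

import numpy as np

def predict(test_x, test_y, parameters):
    # Forward pass, then threshold at 0.5 to get hard 0/1 predictions.
    AL = linear_activation_forward(test_x, parameters)
    predictions = (AL > 0.5).astype(int)
    accuracy = float(np.mean(predictions == test_y))
    return accuracy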
You will fill in the functions, whose signatures are given below:

def initialize_parameters(n_x, n_h, n_y):
    ...
    return parameters

def linear_activation_forward(X, parameters):
    ...
    return AL

def compute_cost(AL, y):
    ...
    return cost

def linear_activation_backward(X, cost):
    ...
    return grads

def update_parameters(parameters, grads, learning_rate):
    ...
    return parameters

def predict(test_x, test_y, parameters):
    ...
    return accuracy

def plot_loss(costs):
    ...

def two_layer_model(X, Y, layers_dims, learning_rate, num_iterations):
    ...
    return parameters, costs

def main():
    ...
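
One plausible way two_layer_model could tie these functions together, kept to the given
signatures; how the gradients flow between linear_activation_forward and
linear_activation_backward is left to your design, so treat this as an outline rather than
the required implementation:

def two_layer_model(X, Y, layers_dims, learning_rate, num_iterations):
    n_x, n_h, n_y = layers_dims
    parameters = initialize_parameters(n_x, n_h, n_y)
    costs = []
    for i in range(num_iterations):
        AL = linear_activation_forward(X, parameters)  # forward propagation
        cost = compute_cost(AL, Y)                     # track learning progress
        grads = linear_activation_backward(X, cost)    # per the given skeleton
        parameters = update_parameters(parameters, grads, learning_rate)
        costs.append(cost)
    return parameters, costs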
The inputs to a unit are the features from the training set. The weights W^{[1]}, W^{[2]} and
biases b^{[1]}, b^{[2]} are the parameters that you will learn. Activation functions (ReLU
and sigmoid) modify the input data in order to learn complex relations between the input and
output data.
In neural networks, you forward propagate to predict the output, compare it with the real
value, and calculate the loss. The loss value tells us how good our model's predictions are,
which in turn tells us how to change our parameters to make the model predict more accurately.
In order to minimize the loss, you propagate backwards, finding the derivative of the error
with respect to each weight value. Thus, the weights are tuned. Neural networks are trained
iteratively: after each iteration, the error is calculated via the difference between
prediction and target.
You are expected to plot the loss at each iteration.
Activation Functions
Sigmoid function:

h_θ(x) = 1 / (1 + e^{-x})

The derivative of the sigmoid function is:

∂h_θ(x)/∂x = h_θ(x) (1 - h_θ(x))

ReLU function:

h_θ(x) = x if x > 0, 0 if x ≤ 0

The derivative of the ReLU function is:

∂h_θ(x)/∂x = 1 if x > 0, 0 if x ≤ 0 (taking the derivative at 0 to be 0 by convention)
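These functions and their derivatives translate directly to NumPy; the helper names are mine,
not part of the given skeleton:

import numpy as np

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def sigmoid_derivative(Z):
    s = sigmoid(Z)
    return s * (1.0 - s)

def relu(Z):
    return np.maximum(0, Z)

def relu_derivative(Z):
    return (Z > 0).astype(float)
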
Helper Links
• Transpose. https://chortle.ccsu.edu/VectorLessons/vmch13/vmch1314.html
• Transpose. https://numpy.org/doc/stable/reference/generated/numpy.transpose.html
• Dot product. https://www.mathsisfun.com/algebra/vectors-dot-product.html
• Dot product. https://numpy.org/doc/stable/reference/generated/numpy.dot.html
6 Grading
Task                           Points
initialize_parameters          2
linear_activation_forward      25
compute_cost                   15
linear_activation_backward     25
update_parameters              5
predict                        10
plot_loss                      5
two_layer_model                10
main                           3
7 Important Notes
• You will use the given code for the assignment. Please fill in the functions provided in
the code.
• Do not miss the submission deadline.
• Save all your work until the assignment is graded.
• The assignment must be original, individual work. Duplicate or very similar assignments
are both going to be considered as cheating. You can ask your questions via Piazza and
you are supposed to be aware of everything discussed on Piazza. You cannot share
algorithms or source code. All work must be individual! Assignments will be checked
for similarity, and there will be serious consequences if plagiarism is detected.
• You must submit your work with the file as stated below:
assignment5.ipynb