Starting from:

$30

Assignment 4 Segmentation

Introduction to Computer Vision (ECSE 415)
Assignment 4

 More details on
the format of the submission can be found below. Attempt all parts of this assignment. The assignment will be graded out of total of 85 points.
Submission Instructions
1. Submit two different jupyter notebooks (1 per person in a group) consisting of answers to 2 questions each from the assignment. All questions
should be answered within these two notebooks. This will allow us to
estimate the work done by each member of the group. Feel free to divide
work as per your wish between members of the group. Submit one single
zip file per group.
2. Comment your code appropriately.
3. Make sure that the submitted code is running without error. Add a
README file if required.
4. If external libraries were used in your code, please specify their name and
version in the README file.
5. Answers to reasoning questions should be comprehensive but concise.
6. Submissions that do not follow the format will be penalized 10%.
1 Segmentation (65 Points)
Use image ‘flower.jpg’ (Fig.1) for this question.
1
Figure 1: Image to be used for segmentation questions.
1.1 K-means clustering and Expectation Maximization
In this question, you will implement K-means and EM from scratch. The
algorithms should be implemented using only the numpy library. You
can use opencv and matplotlib libraries only to read and display images but not
for clustering.
1.1.1 K-means clustering
• Implement K-means clustering from scratch. (10 points)
• Apply your implementation of K-means to the provided image with K=2
and K=3. Use R, G, B color channels of the image as three features.
Display (a) the segmentation evolving during the first 5 iterations and (b)
the final segmentation. (5 points)
• Convert the given image into gray-scale. Apply K-means to the provided
image with K=2 and K=3. For every pixel, use gray-scale intensity as
feature. Display (a) the segmentation evolving during the first 5 iterations
and (b) the final segmentation. (4 points)
• Compare final segmentation maps of color and gray-scale images. Which
feature results in better segmentation? (1 points)
1.1.2 Expectation Maximization – Gaussian Mixture Model
• Implement EM from scratch. (15 points)
• Apply your implementation of EM to the provided image with 2 and 3
Gaussian components. Use R, G, B color channels of the image as three
features. Display (a) the segmentation evolving during the first 5 iterations
and (b) the final segmentation. (5 points)
2
• Convert the given image into gray-scale. Apply EM to the provided image
with 2 and 3 Gaussian components. For every pixel, use gray-scale intensity as feature. Display (a) the segmentation evolving during the first 5
iterations and (b) the final segmentation. (4 points)
• Compare final segmentation maps of color and gray-scale images. Which
feature results in better segmentation? (1 points)
1.2 Normalized Graph-cut and Mean-shift Segmentation
You can use functions from OpenCV or skimage libraries for this question.
• Segment the given image using normalized graph-cuts. Vary the following
parameters (try several values of each parameter): compactness and n segments (slic function), thresh (cut normalized function). Display segmentation results for several parameters and state their effect on the output
(10 Points).
• Segment the given image using mean-shift. Vary the following parameters
(try several values of each parameter): ratio, kernel size, max dist. Display
segmentation results for several parameters and state their effect on the
output. (10 points)
2 Image Classification with Convolutional Neural Network (20 Points)
In this part, you will classify MNIST digits into 10 categories using a CNN. You
may chose to run the code on GPU.
1. Use Pytorch class torchvision.datasets.MNIST to (down)load the dataset.
Use batch size of 32. (3 points)
2. Implement a CNN with the layers mentioned below. (6 points)
• A convolution layer with 32 kernels of size 3×3.
• A ReLU activation.
• A convolution layer with 64 kernels of size 3×3.
• A ReLU activation.
• A maxpool layer with kernels of size 2×2.
• A convolution layer with 64 kernels of size 3×3.
• A ReLU activation.
• A convolution layer with 64 kernels of size 3×3.
• A ReLU activation.
3
• A flattening layer. (This layer resizes 2D feature map to a feature
vector. The length of this feature vector should be 4096.)
• A Linear layer with output size of 10.
(Suggestion: You can start with the code from the Tutorial and adapt it
for the current problem.)
3. Create an instance of SGD optimizer with learning rate of 0.001. Use
the default setting for rest of the hyperparameters. Create an instance of
categorical cross entropy criterion. (2 point)
4. Train the CNN for 12 epochs. (6 points)
5. Predicts labels of the test images using the above trained CNN. Measure
and display classification accuracy. (3 points)
4

More products