HOMEWORK #1
ECBM E6040
INSTRUCTIONS: This homework contains two parts: programming (I) and theory (II). Please submit your homework via your Bitbucket repository. Your submission should consist of
1. a file called hw1 writeup.pdf providing the solution to the theory questions;
2. completed code in hw1a.py and hw1b.py ;
3. an output folder containing png files created by executing hw1a.py and hw1b.py.
Part I: Programming
For this homework, you will be using the JAFFE dataset.
Clone the repository created for you via Bitbucket. Message the instructors via Piazza
if a repository for Homework #1 has not been created for you by the end of Tuesday,
Feb. 2, 2016. The repository contains a script download.sh that will download and unzip
the dataset.
A public AMI E6040 hw1 ami has been created with all necessary packages and
patches preinstalled for solving this assignment.
If you would like to use your own Ubuntu machine/AMI, the following instructions
might be helpful:
• Install libjpeg by executing sudo apt-get install libjpeg-dev
• Install libx11 by executing sudo apt-get install libx11-dev
• Install PIL in your theano environment by executing pip install Pillow
• Apply changes from this PR to your theano installation.
An alternative to the last step above is to downgrade numpy by running the following
commands in your conda environment:
1. conda uninstall numpy
2. conda uninstall scipy
3. conda uninstall matplotlib
4. conda install numpy==1.9.3
5. conda install scipy==0.15.1
6. conda install matplotlib==1.4.2
7. pip uninstall theano
8. pip install theano
If you will be executing the code on a different OS, please search online for how to
perform the above steps for your configuration.
Use single precision for both problems in this assignment.
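A minimal sketch of what single precision means in practice, assuming the images are held as NumPy arrays. The array `img` below is a hypothetical stand-in for a loaded image, not data from the assignment. In Theano, the corresponding setting is the `floatX` configuration option (`theano.config.floatX`), which can be set to `float32`.

```python
import numpy as np

# Hypothetical stand-in for a loaded 256 x 256 grayscale image (integer pixels).
img = np.arange(256 * 256).reshape(256, 256) % 256

# Cast once to float32; later products of float32 arrays stay in float32.
X = np.asarray(img, dtype=np.float32) / np.float32(255.0)
d = np.ones(256, dtype=np.float32)
y = np.dot(X, d)

print(X.dtype, y.dtype)
```

Casting at load time avoids silent promotion to float64, which NumPy applies when a float32 array is combined with a float64 array.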
PROBLEM a: (35 points)
In this problem, you will divide the images into blocks of sizes (16, 16), (32, 32),
and (64, 64) and perform principal component analysis in each case. For each case you
will visualize reconstructions using different numbers of principal components and also
visualize the top components.
Skeleton code for this problem has been provided in hw1a.py in the repository.
Some useful links:-
1. PIL documentation
2. PIL convert documentation
3. theano.tensor.nnet.neighbours documentation
4. numpy.linalg.eigh documentation
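The block-wise procedure described for this problem can be sketched in plain NumPy as follows. This is an illustration under stated assumptions, not the structure of the provided skeleton: the helper names (`image_to_blocks`, `blocks_to_image`, `pca_components`, `reconstruct`) are hypothetical, the blocks are assumed to be non-overlapping, the data is mean-centered before the eigendecomposition, and a random array stands in for a JAFFE image.

```python
import numpy as np

def image_to_blocks(img, block):
    """Cut a 2-D image into non-overlapping block x block patches, one per row."""
    rows, cols = img.shape[0] // block, img.shape[1] // block
    return (img[:rows * block, :cols * block]
            .reshape(rows, block, cols, block)
            .swapaxes(1, 2)
            .reshape(rows * cols, block * block))

def blocks_to_image(patches, shape, block):
    """Inverse of image_to_blocks for an image whose sides are multiples of block."""
    rows, cols = shape[0] // block, shape[1] // block
    return (patches.reshape(rows, cols, block, block)
            .swapaxes(1, 2)
            .reshape(rows * block, cols * block))

def pca_components(X):
    """Return the mean and the eigenvectors of X^T X, sorted by decreasing eigenvalue."""
    mean = X.mean(axis=0)
    Xc = X - mean
    eigvals, eigvecs = np.linalg.eigh(np.dot(Xc.T, Xc))
    order = np.argsort(eigvals)[::-1]  # eigh returns eigenvalues in ascending order
    return mean, eigvecs[:, order]

def reconstruct(X, mean, D, k):
    """Project the centered data onto the top k components and map it back."""
    Dk = D[:, :k]
    return np.dot(np.dot(X - mean, Dk), Dk.T) + mean

# Usage on a random stand-in image (an assumption; not JAFFE data).
rng = np.random.RandomState(0)
img = rng.rand(64, 64).astype(np.float32)
X = image_to_blocks(img, 16)          # 16 patches, each flattened to 256 values
mean, D = pca_components(X)
approx = blocks_to_image(reconstruct(X, mean, D, 5), img.shape, 16)
```

For the block sizes in this problem, $X^T X$ is at most 4096 × 4096, which is why a direct call to `numpy.linalg.eigh` is feasible here.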
PROBLEM b: (35 points)
In this problem, you will perform essentially the same tasks as in PROBLEM
a, but on the whole image instead of blocks. Since the images are of size 256 × 256,
the matrix $X^T X$ will be of size 65,536 × 65,536. You most likely won't be able to
even load this matrix (unless you have an enormous amount of RAM available), let alone
perform eigenanalysis on it. Hence, you will solve this problem using gradient
descent.
Recall from class that the top principal component can be extracted by solving the
following optimization problem:
$$\arg\min_{d} \; -d^T X^T X d, \quad \text{subject to } d^T d = 1.$$
The above can be solved by using gradient descent with the cost function
$f(d) = -d^T X^T X d$ while normalizing $d$ after each update (descent step).
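As a hedged aside on the cost function above: because $X^T X$ is symmetric, its gradient has a closed form, and the resulting update needs only matrix-vector products. The symbol $\eta$ denotes the learning rate.

$$\nabla_d f(d) = \nabla_d \left( -d^T X^T X d \right) = -2\, X^T X d = -2\, X^T (X d)$$

$$y = d - \eta \nabla_d f(d) = d + 2 \eta\, X^T (X d), \qquad d \leftarrow \frac{y}{\|y\|}$$

Evaluating $X^T (X d)$ from right to left means the product $X^T X$ itself is never formed.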
Other principal components can be found by "taking out the contribution of the
already determined components". This can be done as in the following pseudocode.
Algorithm 1 Multiple principal components via gradient descent
Input: Data matrix $X$, number of components to extract $N$, learning rate $\eta$, max
steps $T$, stopping condition
Returns: Principal components $d_i$ for $i = 0, \cdots, N$
1: for $i = 0, \cdots, N$ do
2:     $A_i \leftarrow X^T X - \sum_{j=0}^{i-1} \lambda_j d_j d_j^T$
3:     Initialize $d_i$ randomly and let $t = 1$
4:     while ($t \le T$ and stopping condition is not true) do
5:         $y \leftarrow d_i - \eta \nabla_{d_i} \left( -d_i^T A_i d_i \right)$
6:         $d_i \leftarrow \frac{y}{\|y\|}$
7:         $t \leftarrow t + 1$
8:     $\lambda_i \leftarrow d_i^T X^T X d_i$
Gradient descent can be performed very easily in Theano, as it supports symbolic
differentiation. Please note that there should be no need to compute large matrices
of order 65,536 × 65,536 at any point. You should write your Theano expressions and
functions so that they avoid computing the large matrices. Choose an appropriate
learning rate and an appropriate stopping criterion (for example, when the change in
the cost is below some small $\epsilon$ or the change in $\|d_i\|$ is below some small $\epsilon$).
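The following is a minimal NumPy sketch of the gradient-descent procedure, written to avoid ever forming the large matrix. It is a stand-in for illustration only: it uses the closed-form gradient $-2 A_i d_i$ instead of Theano's symbolic differentiation, and the function name `top_components`, its default arguments, and the stopping rule (change in cost below `tol`) are assumptions rather than part of the provided skeleton.

```python
import numpy as np

def top_components(X, n_components, lr=0.1, max_steps=500, tol=1e-8, seed=0):
    """Extract leading principal directions of X by normalized gradient descent."""
    rng = np.random.RandomState(seed)
    n_features = X.shape[1]
    D = np.zeros((n_components, n_features), dtype=X.dtype)  # one component per row
    lam = np.zeros(n_components, dtype=X.dtype)
    for i in range(n_components):
        d = rng.randn(n_features).astype(X.dtype)
        d /= np.linalg.norm(d)
        prev_cost = np.inf
        for t in range(max_steps):
            # A_i d = X^T (X d) - sum_j lam_j d_j (d_j^T d); only matrix-vector products.
            Ad = np.dot(X.T, np.dot(X, d))
            Ad -= np.dot(D[:i].T, lam[:i] * np.dot(D[:i], d))
            cost = -np.dot(d, Ad)          # f(d) = -d^T A_i d
            y = d + 2.0 * lr * Ad          # d - lr * grad, with grad = -2 A_i d
            d = y / np.linalg.norm(y)      # project back onto the unit sphere
            if abs(prev_cost - cost) < tol:
                break
            prev_cost = cost
        D[i] = d
        Xd = np.dot(X, d)
        lam[i] = np.dot(Xd, Xd)            # lambda_i = d^T X^T X d
    return D, lam
```

A possible call is `D, lam = top_components(X, 16)`, where each row of `X` is one flattened image. The working memory is proportional to the size of `X` plus the stored components, and a suitable `lr` depends on the scale of the data.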
Skeleton code for this part is available in hw1b.py.
Some useful links:-
1. Theano Logistic Regression Example
Part II: Theory
PROBLEM c (15 points)
(i)
$$p_x(x) = \begin{cases} 1 & \text{if } 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \qquad y = -\frac{1}{\lambda} \ln(x)$$
Find $p_y(y)$.
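(ii)
$$p(\mathrm{x} = x, \mathrm{y} = y) = \begin{cases} 3(x y^2 + y x^2) & \text{for } x, y \in [0, 1] \\ 0 & \text{otherwise} \end{cases}$$
Find $p(\mathrm{x} = x)$, $p(\mathrm{y} = y)$, $E(\mathrm{x})$, $E(\mathrm{y})$, $E(\mathrm{x}\mathrm{y})$. Are $\mathrm{x}$ and $\mathrm{y}$ independent?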
PROBLEM d (15 points)
Let us assume that we model certain data $X = \{x^{(1)}, \cdots, x^{(m)}\}$, $x^{(i)} \in \mathbb{R}^n$, to have
been drawn (independently) from a multivariate Gaussian distribution $\mathcal{N}(\mu, \Sigma)$.
(i) Find maximum likelihood estimators for µ, Σ;
(ii) Are the estimators biased or unbiased?
Note: The multivariate normal distribution is given by
$$\mathcal{N}(\mu, \Sigma) \sim \frac{1}{(2\pi)^{n/2}} \frac{1}{|\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right),$$
where $\mu$ is the n-dimensional mean vector and $\Sigma$ is the n × n covariance matrix.
Some useful links:-
1. Matrix Cookbook
NEED HELP:
If you have any questions, you are advised to use the Piazza forum, which is accessible
through CourseWorks.
GOOD LUCK!