AMATH 563
Homework 1: Regression and Sparsity
Download the MNIST data set (both training and test sets and labels): http://yann.lecun.com/exdb/mnist/.
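The labels will tell you which digit it is: 1, 2, 3, 4, 5, 6, 7, 8, 9, 0. Let each output be denoted by the vector $y_j$:

$$
\text{“1”} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix}, \quad
\text{“2”} = \begin{bmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix}, \quad
\cdots, \quad
\text{“9”} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \\ 0 \end{bmatrix}, \quad
\text{“0”} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}
\tag{1}
$$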
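The encoding in (1) puts the digit "1" in the first row and the digit "0" in the last row. A minimal NumPy sketch of that mapping follows, assuming the labels arrive as integers 0 through 9; the function name `one_hot_labels` is illustrative and not part of the assignment.

```python
import numpy as np

def one_hot_labels(digits):
    """Map digit labels to one-hot columns ordered 1, 2, ..., 9, 0 as in (1)."""
    digits = np.asarray(digits, dtype=int)
    # "1" -> row 0, ..., "9" -> row 8, "0" -> row 9
    rows = (digits - 1) % 10
    B = np.zeros((10, digits.size))
    B[rows, np.arange(digits.size)] = 1.0
    return B

# Each column of B is the output vector y_j for one label.
B = one_hot_labels([1, 2, 9, 0])
```

Each column sums to one, and the position of the single nonzero entry identifies the digit.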
Now let B be the set of output vectors
$$B = \begin{bmatrix} y_1 & y_2 & y_3 & \cdots & y_n \end{bmatrix} \tag{2}$$
and let the matrix A be the corresponding reshaped (vectorized) MNIST images
$$A = \begin{bmatrix} x_1 & x_2 & x_3 & \cdots & x_n \end{bmatrix} \tag{3}$$
Thus each $x_j \in \mathbb{R}^{n^2}$ is a vector reshaped from an $n \times n$ image.
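As a rough illustration of the reshaping step, the sketch below stacks vectorized images as the columns of A. It assumes the images are already loaded into a NumPy array of shape (N, n, n); loading the MNIST files is not shown, and a random array stands in for the real data. The name `vectorize_images` is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for N = 5 MNIST images of size 28 x 28 (assumption: real data
# would be loaded from the downloaded files instead).
images = rng.random((5, 28, 28))

def vectorize_images(images):
    """Flatten each n x n image into a length n^2 column of A."""
    N = images.shape[0]
    return images.reshape(N, -1).T   # shape (n*n, N)

A = vectorize_images(images)
```

Reshaping a column back with `A[:, j].reshape(28, 28)` recovers image j, which is handy when plotting pixel rankings later.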
1. Using various AX = B solvers, determine a mapping from the image space to the label space.
2. By promoting sparsity, determine and rank which pixels in the MNIST set are most informative for
correctly labeling the digits. (You’ll have to come up with your own heuristics or empirical rules for
this. Use pcolor to help you visualize the results from X.)
3. Apply your most important pixels to the test data set to see how accurate you are with as few pixels as
possible.
4. Redo the analysis with each digit individually to find the most important pixels for each digit.
5. IMPORTANT: Think about the interpretation of what you are doing with this AX = B problem.
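One possible starting point for items 1 and 2 is sketched below with three AX = B solvers: least squares, ridge regression, and an L1-penalized (lasso) fit computed by iterative soft thresholding (ISTA). Several things here are assumptions rather than part of the assignment: the images are stored as rows of A (the transpose of the column layout above) so that X has one row per pixel and one column per digit, the data are synthetic stand-ins for MNIST, and the penalty values `lam_ridge` and `lam` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for MNIST: N samples, p "pixels", k output columns.
N, p, k = 200, 50, 10
X_true = np.zeros((p, k))
X_true[:5] = rng.normal(size=(5, k))      # only the first 5 pixels matter
A = rng.normal(size=(N, p))               # rows are vectorized images (assumption)
B = A @ X_true + 0.01 * rng.normal(size=(N, k))

# 1) Least squares: minimizes ||AX - B||_F.
X_ls, *_ = np.linalg.lstsq(A, B, rcond=None)

# 2) Ridge: adds an L2 penalty, solved through the normal equations.
lam_ridge = 1.0
X_ridge = np.linalg.solve(A.T @ A + lam_ridge * np.eye(p), A.T @ B)

# 3) Lasso via ISTA: minimizes 0.5 * ||AX - B||_F^2 + lam * sum(|X|).
def lasso_ista(A, B, lam, n_iter=500):
    step = 1.0 / np.linalg.norm(A, 2) ** 2    # 1 / Lipschitz constant of the gradient
    X = np.zeros((A.shape[1], B.shape[1]))
    for _ in range(n_iter):
        G = X - step * (A.T @ (A @ X - B))    # gradient step on the quadratic term
        X = np.sign(G) * np.maximum(np.abs(G) - step * lam, 0.0)  # soft threshold
    return X

X_lasso = lasso_ista(A, B, lam=5.0)

# Pixels whose row of X is entirely zero do not contribute to any output.
nonzero_rows = np.flatnonzero(np.abs(X_lasso).sum(axis=1) > 0)
print("pixels kept by the lasso fit:", nonzero_rows)
```

The L1 penalty drives whole entries of X to exactly zero, which is what makes a pixel ranking possible; the least-squares and ridge solutions are generally dense. On real images, a per-pixel score such as the row norms of X can be reshaped to n × n and passed to pcolor.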
This is an exploratory homework. So play around with the data and make sure to make lots of plots. Good
luck, and have fun.
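For items 2 and 3, one simple heuristic (an assumption, not a prescribed method) is to score each pixel by the norm of its row in X, keep the top-scoring pixels, refit on those pixels only, and measure accuracy on held-out data. The sketch below uses toy data in which each class lights up one dedicated pixel; the names `make_split` and `accuracy_with_top_pixels` are illustrative, and the one-hot columns here follow the class index rather than the 1, ..., 9, 0 ordering of the assignment.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 10 classes, 40 "pixels"; class c brightens pixel c only.
p, n_classes = 40, 10
templates = np.zeros((n_classes, p))
templates[np.arange(n_classes), np.arange(n_classes)] = 5.0

def make_split(n):
    labels = rng.integers(0, n_classes, size=n)
    A = templates[labels] + rng.normal(size=(n, p))   # rows are noisy images
    B = np.eye(n_classes)[labels]                     # one-hot rows
    return A, B, labels

A_train, B_train, y_train = make_split(2000)
A_test, B_test, y_test = make_split(500)

# Fit on all pixels, then score each pixel by the norm of its row in X.
X, *_ = np.linalg.lstsq(A_train, B_train, rcond=None)
scores = np.linalg.norm(X, axis=1)
ranking = np.argsort(scores)[::-1]        # most informative pixel first

def accuracy_with_top_pixels(n_keep):
    """Refit using only the n_keep best pixels and score on the test split."""
    keep = ranking[:n_keep]
    X_k, *_ = np.linalg.lstsq(A_train[:, keep], B_train, rcond=None)
    pred = np.argmax(A_test[:, keep] @ X_k, axis=1)
    return float(np.mean(pred == y_test))

for n_keep in (2, 5, 10, p):
    print(n_keep, accuracy_with_top_pixels(n_keep))
```

Plotting accuracy against the number of retained pixels gives a direct view of how few pixels are needed. Repeating the fit with a single column of B at a time is one way to approach the per-digit analysis in item 4.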