CSE4334/5334 Data Mining Assignment 3

What to turn in:
1. Your submission should include your complete code base in an archive file (zip, tar.gz), organized into
one folder per problem (q1/, q2/, and so on), and a very clear README describing how to run it.
2. A brief report (typed and submitted as a PDF file; NO handwritten scanned copies) describing what you
solved and implemented, along with any known failure cases. The report is important because grades will
be based mostly on it.
3. Submit your entire code and report to Blackboard.
Notes from instructor:
• You may ask the TA or instructor for suggestions, and discuss the problem with others (minimally).
But all parts of the submitted code must be your own.
• Use Python for your implementation.
• Make sure that the TA can easily run the code by plugging in our test data.
Problem 1
(Logistic Regression, 60pts) Implement logistic regression. For your training data, generate 1000 training
instances as two sets of random data points (500 in each) from multivariate normal distributions with
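$$\mu_1 = [1, 0], \quad \mu_2 = [0, 1.5], \quad \Sigma_1 = \begin{bmatrix} 1 & 0.75 \\ 0.75 & 1 \end{bmatrix}, \quad \Sigma_2 = \begin{bmatrix} 1 & 0.75 \\ 0.75 & 1 \end{bmatrix} \tag{1}$$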
and label them 0 and 1. Generate testing data in the same manner, again with 500 instances for each class,
i.e., 1000 in total. You will implement logistic regression from scratch. Use the sigmoid function as your
activation function and cross entropy as your objective function. Stop training when the L1-norm of
your gradient is less than 0.001 (i.e., close to 0) or the number of iterations reaches 100000. Don’t forget to
include a bias term!
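The setup described above can be sketched as follows. This is a minimal illustration that assumes NumPy; the function names (`make_two_class_data`, `sigmoid`, `cross_entropy`), the fixed random seed, and the choice to fold the bias into the weight vector via a constant column are illustrative assumptions, not requirements of the assignment.

```python
import numpy as np

def make_two_class_data(n_per_class, rng):
    """Draw n_per_class points per class from the two Gaussians in Eq. (1)."""
    mu0 = np.array([1.0, 0.0])   # mean of class 0
    mu1 = np.array([0.0, 1.5])   # mean of class 1
    cov = np.array([[1.0, 0.75], [0.75, 1.0]])  # Sigma_1 = Sigma_2
    x0 = rng.multivariate_normal(mu0, cov, size=n_per_class)
    x1 = rng.multivariate_normal(mu1, cov, size=n_per_class)
    X = np.vstack([x0, x1])
    y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])
    # Assumption: prepend a constant column of ones so that w[0] acts as the bias.
    X = np.hstack([np.ones((X.shape[0], 1)), X])
    return X, y

def sigmoid(z):
    # Numerically stable form of 1 / (1 + exp(-z)).
    return np.exp(-np.logaddexp(0.0, -z))

def cross_entropy(w, X, y, eps=1e-12):
    """Mean binary cross entropy of the predictions sigmoid(X @ w)."""
    p = np.clip(sigmoid(X @ w), eps, 1.0 - eps)  # avoid log(0)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
X_train, y_train = make_two_class_data(500, rng)
X_test, y_test = make_two_class_data(500, rng)
```

With all-zero weights every prediction is 0.5, so `cross_entropy(np.zeros(3), X_train, y_train)` equals log 2, which is a convenient sanity check before training.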
1. (30pt) Perform batch training using gradient descent. At each iteration, divide the gradient by the total
number of training instances (you will very likely get NaN if you don’t do this). Try learning rates
η = {1, 0.1, 0.01, 0.001}. Your report should include: 1) a scatter plot of the testing data and the
trained decision boundary, 2) a figure showing the training error (cross entropy) w.r.t. iteration,
3) a figure showing the norm of the gradient w.r.t. iteration. Also, report the number of iterations
training took and the accuracy you obtained.
2. (30pt) Perform online training using gradient descent. Here, you do not need to normalize the gradient
by the training dataset size since each iteration uses one training sample at a time. Try learning
rates η = {1, 0.1, 0.01, 0.001}. Set your maximum number of iterations to 100000. Your report should
include: 1) a scatter plot of the testing data and the trained decision boundary, 2) a figure showing
the training error (cross entropy) w.r.t. iteration, 3) a figure showing the norm of the gradient
w.r.t. iteration. Also, report the number of iterations training took and the accuracy you obtained.
Write a brief observation comparing this result with the result from batch training.
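The difference between the two update rules in parts 1 and 2 can be sketched as below. This is a hedged illustration, not a reference solution: the function names, the zero initialization, sampling one random instance per online step, and applying the stopping rule to the per-sample gradient in the online case are all assumptions (checking the full-data gradient instead is an equally plausible reading). The demo uses a smaller dataset and iteration cap than the assignment so that it runs quickly.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable form of 1 / (1 + exp(-z)).
    return np.exp(-np.logaddexp(0.0, -z))

def train_batch(X, y, lr, tol=1e-3, max_iter=100_000):
    """Batch gradient descent; the gradient is averaged over all rows of X."""
    w = np.zeros(X.shape[1])
    n_updates = 0
    while n_updates < max_iter:
        grad = X.T @ (sigmoid(X @ w) - y) / X.shape[0]
        if np.abs(grad).sum() < tol:  # L1-norm stopping rule
            break
        w -= lr * grad
        n_updates += 1
    return w, n_updates

def train_online(X, y, lr, tol=1e-3, max_iter=100_000, seed=0):
    """Online gradient descent; each update uses one randomly drawn sample."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    n_updates = 0
    while n_updates < max_iter:
        i = rng.integers(X.shape[0])               # assumption: sample with replacement
        grad = (sigmoid(X[i] @ w) - y[i]) * X[i]   # no division by dataset size
        if np.abs(grad).sum() < tol:               # assumption: per-sample stopping check
            break
        w -= lr * grad
        n_updates += 1
    return w, n_updates

def accuracy(w, X, y):
    """Fraction of rows whose thresholded prediction matches the label."""
    return np.mean((sigmoid(X @ w) >= 0.5) == (y == 1))

# Small demo on data drawn as in Eq. (1), with a leading column of ones for the bias.
rng = np.random.default_rng(0)
cov = [[1.0, 0.75], [0.75, 1.0]]
pts = np.vstack([rng.multivariate_normal([1.0, 0.0], cov, 200),
                 rng.multivariate_normal([0.0, 1.5], cov, 200)])
X = np.hstack([np.ones((400, 1)), pts])
y = np.repeat([0.0, 1.0], 200)

w_batch, n_batch = train_batch(X, y, lr=0.1, max_iter=5000)
w_online, n_online = train_online(X, y, lr=0.1, max_iter=5000)
```

Recording the cross entropy and the gradient norm inside each loop would give the curves requested for the report; `accuracy(w_batch, X, y)` and `accuracy(w_online, X, y)` give a quick comparison of the two runs.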

Problem 2
(TensorFlow and Keras, 40pts) Try out the tutorial for Deep Learning using TensorFlow at
https://www.tensorflow.org/tutorials. It has the following lines of code.
import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
1. (10pt) In the report, write comments for each line of code given above and explain what this framework
is doing.
2. (10pt) Change the number of hidden nodes to 5, 10, 128 and 512. Report the testing accuracy for each
setting, together with your observations on how it changes.
3. (20pt) Now, remove the hidden layer in the code and train the model. The trained model contains
the weights that it has learned from training. Plot the “new representation” that it has learned for
each digit from training and include the plots in the report. That is, reshape each learned weight
vector to the image dimensions (2D, i.e., 28x28) and display it. You will see some number-like
features.
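The reshaping step in part 3 can be sketched with NumPy alone. The helper name `weights_to_images` is hypothetical, and the random matrix below is only a stand-in for trained weights; it assumes the weight matrix has one row per input pixel and one column per output class, which is how a Keras `Dense` layer stores its kernel.

```python
import numpy as np

def weights_to_images(kernel, image_shape=(28, 28)):
    """Turn a (n_pixels, n_classes) weight matrix into n_classes 2D images."""
    n_pixels, n_classes = kernel.shape
    if n_pixels != image_shape[0] * image_shape[1]:
        raise ValueError("kernel rows must match the number of image pixels")
    # Column k holds the weights feeding output unit k. Transpose so each row
    # is one class, then reshape each row back onto the image grid (row-major,
    # matching how the images were flattened).
    return kernel.T.reshape(n_classes, *image_shape)

# Stand-in for learned weights (random values, NOT a trained model).
kernel = np.random.default_rng(0).normal(size=(784, 10))
images = weights_to_images(kernel)
```

With a trained model that has no hidden layer, the kernel could presumably be taken from `model.layers[-1].get_weights()[0]`, and each `images[k]` could be displayed with `matplotlib.pyplot.imshow`.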
