1 Aim
Please estimate a trend in a time series regression model, especially with the deep neural
network methods of multilayer perceptrons and the LSTM RNN. The DNN models should
predict the price curve within a finite number of steps. You may apply new methods or use new
packages to improve the quality of the prediction, but if you do so, you have to give a brief
introduction of the key concepts and provide the necessary citations, instead of just directly
copying, pasting, or importing. In this assignment, you must have models utilizing multilayer
perceptrons (MLP) and LSTM via keras (ver. 2.2)[1] or tensorflow (ver. 1.8) tf.contrib.keras.
In order to save your keras model, please install h5py (ver. 2.8) with pip. Once an algorithm
package is merged or imported into your code, please list the package link in your references
and describe its mathematical concepts in your report, followed by the reason for adoption.
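For reference, a minimal sketch of saving and reloading a keras model in the HDF5 format (the file name and architecture here are placeholders, not part of the required deliverables; h5py must be installed):

from keras.models import Sequential, load_model
from keras.layers import Dense

model = Sequential()                    # placeholder model
model.add(Dense(1, input_dim=10))
model.compile(optimizer='adam', loss='mean_squared_error')
model.save('example_model.h5')          # writes HDF5; requires h5py
model = load_model('example_model.h5')  # restores architecture, weights, optimizer state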
2 Dataset Description
The dataset ‘apple_price_close.csv’ contains the minute-level Apple (AAPL) stock close price downloaded
from www.google.com/finance. All the records in the dataset are ordered by time stamp,
covering May 21 to Jun 9. The headers in the dataset are time_Taiwan, index_min, and
close_price. Note that, based on the trading time window, there are only 391 values of index_min, from 0 to
390, within a trading day. The dataset covers the minute-level trading history for 14 days.
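A quick way to sanity-check the file (the column names follow the description above and may need adjusting to the actual header):

import pandas as pd

df = pd.read_csv('apple_price_close.csv')
print(df.columns.tolist())    # expected: ['time_Taiwan', 'index_min', 'close_price']
print(df['index_min'].max())  # expected: 390 (391 minutes per trading day)
print(len(df))                # expected: 14 * 391 rows for 14 trading days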
3 Submission Format
You have to submit a compressed file hw4_studentID.zip which contains the following
files:
1. hw4_studentID.ipynb: detailed report, Python code, results, discussion, and mathematical descriptions;
2. hw4_studentID.tplx: extra LaTeX-related settings, including the bibliography;
3. hw4_studentID.bib: citations in the "bibtex" format;
4. hw4_studentID.pdf: the pdf version of your report, exported from your ipynb
with
(a) %% jupyter nbconvert --to latex --template hw4_studentID.tplx hw4_studentID.ipynb
(b) %% pdflatex hw4_studentID.tex
(c) %% bibtex hw4_studentID
(d) %% pdflatex hw4_studentID.tex
(e) %% pdflatex hw4_studentID.tex
5. Other files or folders in a workable path hierarchy to your jupyter notebook (ipynb).
4 Coding Guidelines
For the purpose of an individual demonstration with the TA, you are required to create a function in your jupyter notebook, as specified below, to preprocess the data,
learn a prediction model, and evaluate the performance of the learned model.
• hw4_studentID_demo(in_x=None, in_y=None, mode=None, model_type=None,
header=True)
– in_x: [string] CSV file for ‘data’.
– in_y: [string] CSV file for ‘labels’. Assign a file when mode=‘train’.
– mode: [string]
mode=‘preprocessing’: Please set in_y=None. If you have any
preprocessing, please do it in this mode. After preprocessing, save
the processed data as ‘HW4_studentID_MLP_data.csv’ and ‘HW4_studentID_MLP_labels.csv’
(‘HW4_studentID_LSTM_data.csv’ and ‘HW4_studentID_LSTM_labels.csv’) with a
header if header=True, or without one if header=False. You can design
your own ‘*_data.csv’ format. For ‘*_labels.csv’, there should be only one column,
for the prices. Assume timesteps = n and denote the sequence of prices as $\{s_r\}_{r=0}^{m-1}$;
we want to predict $s_r$ under mode=‘train’. For example, when header=True and
data_dim=1, in_x=‘HW4_studentID_MLP_data.csv’ could be

    x_1        x_2          ...   x_n
    ...
    s_{r-n}    s_{r-n+1}    ...   s_{r-1}
    ...

and, with (batch_size=β, timesteps=n, data_dim=k),
in_x=‘HW4_studentID_LSTM_data.csv’ could be

    batch_idx   x
    0           x_0^{(0)}   x_1^{(0)}   ...  x_{k-1}^{(0)}    x_0^{(1)}  ...  x_{k-1}^{(1)}  ...  x_0^{(n-1)}    ...  x_{k-1}^{(n-1)}
    1           x_0^{(1)}   x_1^{(1)}   ...  x_{k-1}^{(1)}    x_0^{(2)}  ...  x_{k-1}^{(2)}  ...  x_0^{(n)}      ...  x_{k-1}^{(n)}
    ...
    β-1         x_0^{(β-1)} x_1^{(β-1)} ...  x_{k-1}^{(β-1)}  x_0^{(β)}  ...  x_{k-1}^{(β)}  ...  x_0^{(β+n-2)}  ...  x_{k-1}^{(β+n-2)}
    ...

and in_y=‘HW4_studentID_MLP_labels.csv’ is set as

    y
    ...
    s_r
    ...

and in_y=‘HW4_studentID_LSTM_labels.csv’ as

    batch_idx   y
    0           y^{(n-1)}
    1           y^{(n)}
    ...
    β-1         y^{(β+n-2)}
    ...

You can design your ‘HW4_studentID_MLP_data.csv’ and
‘HW4_studentID_LSTM_data.csv’ based on the timesteps and the data dimension in your models, with data_dim ≥ 1, including e.g. the price difference, an indicator for increase or decrease, and so on. Just remember to
reshape the data to match your deployment of the model when mode=‘train’. (A minimal sketch of this mode is given after this parameter list.)
mode=‘train’: in_x is assigned ‘HW4_studentID_MLP_data.csv’
(or ‘HW4_studentID_LSTM_data.csv’), and in_y is assigned ‘HW4_studentID_MLP_labels.csv’
(or ‘HW4_studentID_LSTM_labels.csv’). Please train the MLP and LSTM in this mode
and output the corresponding files, which are specified below.
– model_type: [string] Used when mode=‘train’. Use model_type=‘MLP’ or model_type=‘LSTM’
to select the training model.
– header: [bool] If header=True, there is a header in ‘HW4_studentID_MLP_data.csv’
(or ‘HW4_studentID_LSTM_data.csv’); if header=False, there is no header in
‘HW4_studentID_MLP_data.csv’ (or ‘HW4_studentID_LSTM_data.csv’).
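As promised above, a minimal sliding-window sketch for mode=‘preprocessing’ (assuming data_dim=1 and raw close prices; the column name and the timesteps value of 10 are placeholders):

import numpy as np
import pandas as pd

def make_windows(prices, n):
    # Row r holds s_{r-n}, ..., s_{r-1}; the label is s_r.
    x = [prices[r - n:r] for r in range(n, len(prices))]
    y = [prices[r] for r in range(n, len(prices))]
    return np.array(x), np.array(y)

prices = pd.read_csv('apple_price_close.csv')['close_price'].values  # assumed column name
x, y = make_windows(prices, n=10)
pd.DataFrame(x).to_csv('HW4_studentID_MLP_data.csv', index=False)
pd.DataFrame(y, columns=['y']).to_csv('HW4_studentID_MLP_labels.csv', index=False)
# For the LSTM files, the same windows can be reshaped to
# (batch_size, timesteps, data_dim) before training:
x_lstm = x.reshape(x.shape[0], 10, 1)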
In mode=‘train’, please output the following ‘CSV’ files with headers, and save your model
to ‘h5’ files. Also set your models as global when mode=‘train’, model_type=‘MLP’ or
mode=‘train’, model_type=‘LSTM’. Please set the loss function as loss=‘mean_squared_error’.
• MLP:
file 1 ‘HW4_studentID_MLP.csv’ with header
optimizer, loss, batch_size, timesteps, data_dim, avg_loss, last_loss
∗ optimizer: type of optimizer[2];
∗ loss: type of loss[3];
∗ batch_size: how many instances in one batch;
∗ timesteps: time steps to look back;
∗ data_dim: data dimension for each instance;
∗ avg_loss: the average loss over all batches;
∗ last_loss: the loss for the last training batch.
file 2 ‘HW4_studentID_MLP_model.h5’: save your model with keras;
∗ if your keras MLP model is named ‘model’:
model.save('HW4_studentID_MLP_model.h5')
• LSTM:
file 1 ‘HW4_studentID_LSTM.csv’ with header
optimizer, loss, batch_size, timesteps, data_dim, avg_loss, last_loss
∗ optimizer: type of optimizer[2];
∗ loss: type of loss[3];
∗ batch_size: how many instances in one batch;
∗ timesteps: time steps to look back;
∗ data_dim: data dimension for each instance;
∗ avg_loss: the average loss over all batches;
∗ last_loss: the loss for the last training batch.
file 2 ‘HW4_studentID_LSTM_model.h5’: save your model with keras;
∗ if your keras LSTM model is named ‘model’:
model.save('HW4_studentID_LSTM_model.h5')
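For mode=‘train’, a minimal sketch of producing the required outputs for the LSTM case (the architecture, optimizer, and hyperparameters below are placeholders, not a prescribed design; the MLP case is analogous with Dense layers only):

import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense, LSTM
from keras.callbacks import Callback

class BatchLossLogger(Callback):
    # Records the loss of every training batch, for avg_loss / last_loss.
    def __init__(self):
        super(BatchLossLogger, self).__init__()
        self.losses = []
    def on_batch_end(self, batch, logs=None):
        self.losses.append(float(logs.get('loss')))

batch_size, timesteps, data_dim = 32, 10, 1    # placeholder hyperparameters
x = np.random.rand(400, timesteps, data_dim)   # stand-in for the preprocessed windows
y = np.random.rand(400)                        # stand-in for the labels

model = Sequential()                           # placeholder architecture
model.add(LSTM(32, input_shape=(timesteps, data_dim)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')

logger = BatchLossLogger()
model.fit(x, y, batch_size=batch_size, epochs=10, callbacks=[logger])

cols = ['optimizer', 'loss', 'batch_size', 'timesteps', 'data_dim',
        'avg_loss', 'last_loss']
row = {'optimizer': 'adam', 'loss': 'mean_squared_error',
       'batch_size': batch_size, 'timesteps': timesteps, 'data_dim': data_dim,
       'avg_loss': np.mean(logger.losses), 'last_loss': logger.losses[-1]}
pd.DataFrame([row], columns=cols).to_csv('HW4_studentID_LSTM.csv', index=False)
model.save('HW4_studentID_LSTM_model.h5')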
In the demonstration, the training dataset is the same as in the homework assignment. The score of
the demonstration has two parts, MLP and LSTM. For each part, the score will be graded by
the ranking of the average batch loss value on the test dataset via your model. The test dataset
will not be offered; it is tested only on the given webpage and goes through your preprocessing
before evaluation. In the script file ‘hw4_studentID.py’, please add the following code to your
program before you import keras.
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
set_session(tf.Session(config=config))
Note: the script you submit in the demonstration will be referred to when the TA grades your
report and jupyter notebook. Make sure the TA knows how to modify your code in case the TA cannot
execute your jupyter notebook properly.
5 Report Requirement
• List the names of the packages used in your program.
• A flowchart for the preprocessing, with an explanation of your method.
• A diagram of your MLP model.
• A diagram of your LSTM model.
• Compare the results of the two methods and draw conclusions.
5.1 Basic Requirement
• Implement both methods (MLP and LSTM) after the preprocessing is finished.
• Decide the architecture of your models based on the average loss value. The grading
will also refer to this value.
• Please make sure hw4_studentID_demo is functional and can output the required
files in both mode=‘preprocessing’ and mode=‘train’.
• If you apply new methods or use new packages to improve the prediction performance, you have to give a brief introduction of the key concepts and provide the necessary
citations/links, instead of just directly copying, pasting, or importing.
• Please submit your ‘report’ in English. Be aware that a ‘report’ is much more than a
‘program.’
References
[1] Guideline for keras. https://keras.io/getting-started/sequential-model-guide/. Accessed: 2018-06-09.
[2] Optimizer functions in keras. https://keras.io/optimizers/. Accessed: 2018-06-09.
[3] Loss functions in keras. https://keras.io/losses/. Accessed: 2018-06-09.