ENGR 421 / DASC 521
Homework 06: Modeling Cash Withdrawals from ATMs
In this homework, you will develop a machine learning solution in R, Matlab, or Python for a
real-life regression problem from the finance industry. Your machine learning algorithm needs to
predict the number of cash withdrawals from 47 different ATMs of a bank using the information
given about each ATM and the withdrawal date. Here are the steps you need to follow:
1. You are given two input data files, namely, training_data.csv and test_data.csv. The
training set contains 42,958 labeled data instances (47 ATMs x 457 days x 2 transaction
types), where each training data instance has 7 columns. IDENTITY column gives you
the unique identifier assigned to each ATM. REGION column shows the geographical
region of each ATM. DAY, MONTH, and YEAR columns give the transaction date.
TRX_TYPE column shows the transaction type (1: card present, 2: card not present).
TRX_COUNT is the number of cash withdrawals performed on the specified date. You
are also given a very simple solution strategy using a decision tree classifier in the file
named quick_and_dirty_solution.R.
2. Develop your own machine learning solution for this problem. You are free to use any
publicly available packages in R, Matlab, or Python. The predictive quality of your
solution will be evaluated in terms of its MAE (mean absolute error) and RMSE (root
mean squared error) values on the test set.
3. Use the trained algorithm from the previous step to perform predictions for the test data
set, which contains 940 data instances (47 ATMs x 10 days x 2 transaction types). You
are not given the numbers of cash withdrawals for the test instances. You need to predict
these numbers and write your estimates into a file. For example, the decision tree strategy
implemented in quick_and_dirty_solution.R generates the estimates for the test set and
writes these values into a file named test_predictions.csv.
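As a rough illustration of the workflow in steps 1 to 3, the sketch below fits a per-ATM, per-transaction-type average on a few toy rows and scores it with MAE and RMSE. It is only an assumed baseline, not the decision tree in quick_and_dirty_solution.R. The helper names (fit_group_means, predict) and the toy values are hypothetical; only the column names come from the data description above.

```python
import math
from collections import defaultdict


def mae(y_true, y_pred):
    """Mean absolute error: average of |y - y_hat|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the average of (y - y_hat)^2."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_group_means(rows):
    """Average TRX_COUNT for each (IDENTITY, TRX_TYPE) pair (hypothetical baseline)."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum, count]
    for r in rows:
        key = (r["IDENTITY"], r["TRX_TYPE"])
        totals[key][0] += r["TRX_COUNT"]
        totals[key][1] += 1
    return {key: s / n for key, (s, n) in totals.items()}


def predict(means, rows):
    """Look up the stored average; raises KeyError for a pair not seen in training."""
    return [means[(r["IDENTITY"], r["TRX_TYPE"])] for r in rows]


# Toy rows for illustration only; these values are NOT from training_data.csv.
train = [
    {"IDENTITY": 1, "TRX_TYPE": 1, "TRX_COUNT": 10},
    {"IDENTITY": 1, "TRX_TYPE": 1, "TRX_COUNT": 14},
    {"IDENTITY": 1, "TRX_TYPE": 2, "TRX_COUNT": 3},
    {"IDENTITY": 1, "TRX_TYPE": 2, "TRX_COUNT": 5},
]
valid = [
    {"IDENTITY": 1, "TRX_TYPE": 1, "TRX_COUNT": 15},
    {"IDENTITY": 1, "TRX_TYPE": 2, "TRX_COUNT": 2},
]

means = fit_group_means(train)
y_pred = predict(means, valid)
y_true = [r["TRX_COUNT"] for r in valid]

print("MAE :", mae(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
```

Because the test labels are not provided, one way to estimate these metrics yourself is to hold out part of the training data (for instance, its most recent days) and score your predictions on that part.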
What to submit: You need to submit your source code in a single file (.R file if you are using R,
.m file if you are using Matlab, or .py file if you are using Python), the estimated numbers of
cash withdrawals that you calculated for the test set (test_predictions.csv), and a detailed report
explaining your approach (.doc, .docx, or .pdf file). Put these three files in a single zip
file named STUDENTID.zip, where STUDENTID is replaced with your 7-digit
student number.
How to submit: Submit the zip file you created to Blackboard. Please follow the naming
convention exactly: replace STUDENTID with your own student number, and do not submit a
zip file that is literally named STUDENTID.zip.
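The packaging step can also be scripted. The following is a minimal sketch, assuming Python's standard zipfile module; the function name build_submission and the file names in the usage comment are hypothetical placeholders. It rejects anything that is not a 7-digit number, which guards against accidentally producing a file literally named STUDENTID.zip.

```python
import zipfile
from pathlib import Path


def build_submission(student_id, files, out_dir="."):
    """Bundle the given files into <student_id>.zip (hypothetical helper)."""
    # The handout asks for a 7-digit student number in place of STUDENTID.
    if not (student_id.isascii() and student_id.isdigit() and len(student_id) == 7):
        raise ValueError("student_id must be a 7-digit number")
    zip_path = Path(out_dir) / f"{student_id}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            f = Path(f)
            # Store each file at the top level of the archive.
            zf.write(f, arcname=f.name)
    return zip_path


# Example call (placeholder id and file names, adjust to your own):
# build_submission("1234567", ["solution.py", "test_predictions.csv", "report.pdf"])
```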