CS 489/689
INTRODUCTION TO MACHINE LEARNING
Assignment #2
Description:
Ø For this assignment, you will implement two linear regression algorithms in
Python, Julia, or MATLAB to solve a regression problem. You may use any and all
built-in functions including linear regression functions.
Ø Ordinary least squares (OLS) solution: Implement the ordinary least squares
solution as discussed in class.
Ø Linear regression with gradient descent: Implement gradient descent for linear
regression as discussed in class.
Ø For both algorithms, don’t forget to include the bias term in the parameter vector.
Ø Train and test your model with a dataset of your choosing that meets the following
criteria:
o Number of features: 3+
o Feature characteristics: Real-valued
o Output characteristics: Real-valued
Ø Use 80% of the dataset for training and 20% for testing your model.
Ø If you’d like, you may use one of the following sources to find a dataset:
o University of California, Irvine Machine Learning Repository
https://archive.ics.uci.edu/ml/index.php
o Kaggle https://www.kaggle.com/
o Awesome Public Datasets https://github.com/awesomedata/awesome-public-datasets
o Google Dataset Search Engine https://datasetsearch.research.google.com/
o Microsoft Research Open Data https://msropendata.com/
o U.S. Government’s Open Data https://www.data.gov/
o Registry of Research Data Repositories https://www.re3data.org/
o CMU Libraries https://guides.library.cmu.edu/machine-learning/datasets
Ø Summarize your approach and results in a report that includes at least the following:
o The dataset you used, its source and characteristics.
o The data preprocessing steps you took (if any).
o The solution (the learned parameter vector) for both algorithms.
o The learning rate(s) you used for gradient descent and how many iterations it
took for gradient descent to converge.
o Relevant evaluation metrics for the training dataset for both algorithms.
o Relevant evaluation metrics for the test dataset for both algorithms.
o Any additional details you would like to include.
Ø Submit your report along with your dataset and source code. Feel free to include your
code in the report, but you also need to submit your source code files (.py, .jl, or .m)
and your dataset separately, so that your results can be replicated for scoring.
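As a rough orientation only, the pieces listed above (bias term, 80/20 split, OLS, gradient descent, evaluation metrics) could fit together along the lines of the Python/NumPy sketch below. It is not the formulation discussed in class. All function names are hypothetical, the data are synthetic placeholders standing in for a real dataset, and the gradient-norm stopping rule, the learning rate, and the choice of MSE and R^2 as metrics are assumptions made for illustration.

```python
import numpy as np

def add_bias(X):
    """Prepend a column of ones so the first parameter acts as the bias."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def train_test_split(X, y, train_frac=0.8, seed=0):
    """Shuffle the rows, then keep the first train_frac of them for training."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[0])
    n_train = int(train_frac * X.shape[0])
    tr, te = idx[:n_train], idx[n_train:]
    return X[tr], X[te], y[tr], y[te]

def fit_ols(X, y):
    """Least-squares fit; lstsq avoids forming an explicit matrix inverse."""
    theta, *_ = np.linalg.lstsq(add_bias(X), y, rcond=None)
    return theta

def fit_gd(X, y, lr=0.1, tol=1e-10, max_iter=100_000):
    """Batch gradient descent on the mean squared error.

    Assumed stopping rule: gradient norm below tol.
    Returns the parameters and the number of updates performed.
    """
    Xb = add_bias(X)
    theta = np.zeros(Xb.shape[1])
    for it in range(max_iter):
        grad = (2.0 / len(y)) * Xb.T @ (Xb @ theta - y)
        if np.linalg.norm(grad) < tol:
            return theta, it
        theta = theta - lr * grad
    return theta, max_iter

def mse(X, y, theta):
    """Mean squared error of the predictions."""
    return float(np.mean((add_bias(X) @ theta - y) ** 2))

def r2(X, y, theta):
    """Coefficient of determination."""
    resid = y - add_bias(X) @ theta
    return float(1.0 - resid @ resid / np.sum((y - y.mean()) ** 2))

# Synthetic placeholder data (3 real-valued features, real-valued output).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 1.0 + X @ np.array([2.0, -3.0, 0.5]) + 0.1 * rng.normal(size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y)

# Standardize with training statistics only, so the test set stays unseen.
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
X_train, X_test = (X_train - mu) / sigma, (X_test - mu) / sigma

theta_ols = fit_ols(X_train, y_train)
theta_gd, n_iter = fit_gd(X_train, y_train)

print("OLS parameters:", theta_ols)
print("GD parameters: ", theta_gd, "after", n_iter, "iterations")
for name, theta in [("OLS", theta_ols), ("GD", theta_gd)]:
    print(name, "train MSE:", mse(X_train, y_train, theta),
          "test MSE:", mse(X_test, y_test, theta),
          "test R^2:", r2(X_test, y_test, theta))
```

With a real dataset, only the two lines that build X and y would change. Standardizing the features usually lets gradient descent converge with a moderate learning rate, and the two parameter vectors should then agree closely. A large gap between them typically points to a learning rate that is too high or a stopping rule that triggers too early.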
Submission Instructions:
Compress all the files and name the submission file <YourLastName>_Assignment2:
Ø If you are completing the assignment individually, your last name is Smith, and you
are submitting a .zip file, the file should be named Smith_Assignment2.zip.
Ø If you are completing the assignment as a team of two, your last names are Rogers
and Smith, and you are submitting a .zip file, the file should be named
Rogers_Smith_Assignment2.zip. Only one of the team members needs to submit the
assignment.