EE466000 Introduction to Reinforcement Learning
Homework 3: Gridworld
Goal
• Use dynamic programming to find an optimal policy for the environment in HW2.2.
Todo
Implement an algorithm:
Use the Bellman optimality equation for q∗(𝑠, 𝑎).
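One dynamic-programming approach built on this equation is q-value iteration, sketched below. This is a minimal illustration and not the required solution: it assumes the environment can be written as a transition array P[s, a, s'] and an expected-reward array R[s, a]. Those names, the discount gamma, and the two-state example are hypothetical and are not taken from HW3.ipynb.

```python
import numpy as np


def q_value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10000):
    """Iterate the Bellman optimality backup for q until it stops changing.

    P: array of shape (S, A, S), P[s, a, s2] = probability of s -> s2 under a.
    R: array of shape (S, A), expected immediate reward for (s, a).
    (Both arrays are assumed inputs; the homework notebook may differ.)
    """
    n_states, n_actions, _ = P.shape
    q = np.zeros((n_states, n_actions))
    for _ in range(max_iters):
        # q(s, a) = R(s, a) + gamma * sum_s2 P(s2 | s, a) * max_a2 q(s2, a2)
        q_new = R + gamma * (P @ q.max(axis=1))
        if np.max(np.abs(q_new - q)) < tol:
            return q_new
        q = q_new
    return q


# Hypothetical two-state example: in state 0, action 0 stays (reward 0) and
# action 1 moves to state 1 (reward 1); state 1 is absorbing with reward 0.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
P[1, :, 1] = 1.0
R = np.array([[0.0, 1.0],
              [0.0, 0.0]])

q_star = q_value_iteration(P, R, gamma=0.9)
greedy_policy = q_star.argmax(axis=1)  # one greedy action per state
print(q_star)         # [[0.9 1. ] [0.  0. ]]
print(greedy_policy)  # [1 0]
```

A greedy policy with respect to the converged q-values is optimal, which is why the sketch ends with an argmax over actions.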
Details
File description
o HW3.ipynb: the notebook in which you will implement the algorithm.
Gridworld environment
(Figure: example table of an algorithm.)
Requirements and Installation
Python version: 3.6
pip install matplotlib
pip install numpy
Report
Title, name, student ID
Implementation
Briefly describe your implementation.
Experiments and Analysis
Plot the tables produced by your algorithm (as in the example above).
Are the q-values reasonable?
Compare your table with the table from HW2.2.
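For the plotting step, one possible way to draw such a table with matplotlib is sketched here. It is only an illustration: the function name plot_q_table, the 4x4 grid shape, the action order (up, down, left, right), and the demo q-values are all assumptions, so adapt them to the conventions used in your notebook.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical action order; change it to match your environment.
ACTION_LABELS = ("U", "D", "L", "R")


def plot_q_table(q, grid_shape, labels=ACTION_LABELS):
    """Draw max_a q(s, a) per cell, annotated with the value and a greedy action."""
    n_rows, n_cols = grid_shape
    q = np.asarray(q, dtype=float)
    v = q.max(axis=1).reshape(n_rows, n_cols)          # state values implied by q
    greedy = q.argmax(axis=1).reshape(n_rows, n_cols)  # first maximizing action
    fig, ax = plt.subplots(figsize=(n_cols, n_rows))
    ax.imshow(v, cmap="viridis")
    for r in range(n_rows):
        for c in range(n_cols):
            cell_text = "{:.2f}\n{}".format(v[r, c], labels[greedy[r, c]])
            ax.text(c, r, cell_text, ha="center", va="center", color="white")
    ax.set_xticks([])
    ax.set_yticks([])
    ax.set_title("max_a q(s, a) and greedy action")
    return fig


# Placeholder q-values for a 16-state, 4-action problem (not real results).
q_demo = np.arange(16 * 4, dtype=float).reshape(16, 4)
fig = plot_q_table(q_demo, (4, 4))
```

Calling fig.savefig("q_table.png") writes the figure to a file for the report. Showing max over actions of q(s, a) per cell also gives a state-value view of the result, which may make a side-by-side comparison with a state-value table easier.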
Reminder
Please upload your code main.py and report.pdf to iLMS before 4/25 (Sat.) 23:59. No late
submissions are allowed.
DO NOT zip your code into a single file.
Please do not copy and paste code from your classmates.
Please write a README file explaining how to run your code if you implement extra
functions.