$29.99
CS B551 - Assignment 1: Searching
This assignment will give you practice with posing AI problems as search, and with un-informed and informed
search algorithms. This is also an opportunity to dust off your coding skills. Please read the instructions
below carefully; we cannot accept any submissions that do not follow the instructions given here. Most
importantly: please start early, and ask questions on Q&A Community or in office hours.
Guidelines for this assignment
Coding requirements. For fairness and efficiency, we use an automatic program to grade your submissions.
This means you must write your code carefully so that our program can run your code and understand its
output properly. In particular:
1. You must code this assignment in Python 3, not Python 2.
2. Make sure to use the program file name we specify.
3. Use the skeleton code we provide, and follow the instructions in the skeleton code (e.g., to not change
the parameters of some functions.
4. You may import standard Python modules for routines not related to AI, such as basic sorting algorithms and data structures like queues, as long as they are already installed on silo.sice.indiana.edu.
5. IMPORTANT: In addition to testing your programs on your own, test your code on silo.sice.indiana.edu
using test scripts that are provided in the Github repo. To run these test scripts on your code, type:
python3 -m pytest -v
This will automatically run your programs on some sample test cases, and indicate whether your
programs passed or failed. These test cases are just samples; we will run this on more cases while
grading, so make sure to test your code thoroughly.
For each of these problems, you will face some design decisions along the way. Your primary goal is to
write clear code that finds the correct solution in a reasonable amount of time. To do this, you should
give careful thought to the search abstractions, data structures, algorithms, heuristic functions, etc. To
encourage innovation, we will conduct a competition to see which program can solve the hardest problems
in the shortest time, and a small amount of your grade (or extra credit) may be based on your program’s
performance in this competition.
Groups. You’ll work in a group of 1-3 people for this assignment; we’ve already assigned you to a group
according to your preferences. You should only submit one copy of the assignment for your team, through
GitHub, as described below. All the people on the team will receive the same grade, except in unusual
circumstances; we will collect feedback about how well your team functioned in order to detect these circumstances. The requirements for the assignment are the same no matter how many teammates you have,
but we expect that teams with more people will submit answers that are significantly more “polished” —
e.g., better documented code, faster running times, more thorough answers to questions, etc.
Coding style and documentation. We will not explicitly grade based on coding style, but it’s important
that you write your code in a way that we can easily understand it. Please use descriptive variable and
function names, and use comments when needed to help us understand code that is not obvious.
Report. Please put a report describing your assignment in the Readme.md file in your Github repository.
For each problem, please include: (1) a description of how you formulated the search problem, including
1
precisely defining the state space, the successor function, the edge weights, the goal state, and (if applicable)
the heuristic function(s) you designed, including an argument for why they are admissible; (2) a brief description of how your search algorithm works; (3) and discussion of any problems you faced, any assumptions,
simplifications, and/or design decisions you made. These comments are especially important if your code
does not work as well as you would like, since it is a chance to document the energy and thought you put into
your solution. For example, if you tried several different heuristic functions before finding one that worked,
feel free to describe this in the report so that we appreciate the work that you did.
Academic integrity. We take academic integrity very seriously. To maintain fairness to all students in
the class and integrity of our grading system, we will prosecute any academic integrity violations that we
discover. Before beginning this assignment, make sure you are familiar with the Academic Integrity policy
of the course, as stated in the Syllabus, and ask us about any doubts or questions you may have. To briefly
summarize, you may discuss the assignment with other people at a high level, e.g. discussing general strategies
to solve the problem, talking about Python syntax and features, etc. You may also consult printed and/or
online references, including books, tutorials, etc., but you must cite these materials (e.g. in source code
comments). We expect that you’ll write your own code and not copy anything from anyone else, including
online resources. However, if you do copy something (e.g., a small bit of code that you think is particularly
clever), you have to make it explicitly clear which parts were copied and which parts were your own. You can
do this by putting a very detailed comment in your code, marking the line above which the copying began,
and the line below which the copying ended, and a reference to the source. Any code that is not marked in
this way must be your own, which you personally designed and wrote. You may not share written answers
or code with any other students, nor may you possess code written by another student, either in whole or in
part, regardless of format.
Part 0: Getting started
For this project, we are assigning you to a team. You can find your assigned teammate(s) by logging
into IU Github, at http://github.iu.edu/. In the upper left hand corner of the screen, you should see
a pull-down menu. Select cs-b551-fa2021. Then in the box below, you should see a repository called
userid1 -a1, userid1-userid2 -a1, or userid1-userid2-userid3 -a1, where the other user ID(s) correspond to
your teammate(s). Now that you know their userid(s), you can write them an email at userid@iu.edu.
To get started, clone the github repository:
git clone git@github.iu.edu:cs-b551-fa2021/your-repo-name -a1
If that doesn’t work, instead try:
git clone https://github.iu.edu/cs-b551-fa2021/your-repo-name-a1
where your-repo-name is the one you found on the GitHub website above. (If neither command works, you
probably need to set up IU GitHub ssh keys. See Canvas for help.)
Part 1: The 2021 Puzzle
Consider the 2021 puzzle, which is a lot like the 15-puzzle we talked about in class, but: (1) it has 25 tiles, so
there are no empty spots on the board (2) instead of moving a single tile into an open space, a move in this
puzzle consists of either (a) sliding an entire row of tiles left or right one space, with the left- or right-most
tile ‘wrapping around’ to the other side of the board, (b) sliding an entire column of tiles up or down one
space, with the top- or bottom-most tile ‘wrapping around’ to the other side of the board, (c) rotating the
outer ‘ring’ of tiles either clockwise or counterclockwise, or (d) rotating the inner ring either clockwise or
counterclockwise.
2
For example, here is a sequence of three moves on such a puzzle:
1 2 3 4 5 1 2 3 4 5 1 2 23 4 5 2 23 4 5 10
6 7 8 9 10 L3→ 6 7 8 9 10 D3→ 6 7 3 9 10 Occ → 1 7 3 9 11
11 12 13 14 15 12 13 14 15 11 12 13 8 15 11 6 13 8 15 20
16 17 18 19 20 16 17 18 19 20 16 17 14 19 20 12 17 14 19 25
21 22 23 24 25 21 22 23 24 25 21 22 18 24 25 16 21 22 18 24
2 23 4 5 10
Ic→ 1 13 7 3 11
6 17 8 9 20
12 14 19 15 25
16 21 22 18 24
The goal of the puzzle is to find a short sequence of moves that restores the canonical configuration (on the
left above) given an initial board configuration. We’ve provided skeleton code to get you started. You can
run the skeleton code on the command line:
python3 solver2021.py [input-board-filename]
where input-board-filename is a text file containing a board configuration (we have provided an example).
You’ll need to complete the function called solve(), which should return a list of valid moves. The moves
should be encoded as strings in the following way:
• For sliding rows, R (right) or L (left), followed by the row number indicating the row to move left or
right. The row numbers range from 1-5.
• For sliding columns, U (up) or D (down), followed by the column number indicating the column to move
up or down. The column numbers range from 1-5.
• For rotations, I (inner) or O (outer), followed by whether the rotation is clockwise (c) or counterclockwise (cc).
For example, the above diagram performs the moves L3 (slide row 3 left), D3 (slide column 3 down), Occ
(outer counterclockwise), and Ic (inner clockwise).
The initial code does not work correctly. Using this code as a starting point, implement a fast version, using
A* search with a suitable heuristic function that guarantees finding a solution in as few moves as possible.
Try to make your code as fast as possible even for difficult boards, although it is not necessarily possible to
quickly solve all puzzles. For example, board1.txt can be solved in 11 moves. You will need to be creative
with your heuristic function in order to find this solution in less than 15 minutes.
In your report, answer the following questions:
1. In this problem, what is the branching factor of the search tree?
2. If the solution can be reached in 7 moves, about how many states would we need to explore before we
found it if we used BFS instead of A* search? A rough answer is fine.
In addition to doing your own testing, it is important that you test your program on silo.sice.indiana.edu
using our test script to ensure that we will be able to run it and grade it accurately.
3
Part 2: Road trip!
It’s not too early to start planning a post-pandemic road trip! If you stop and think about it, finding the
shortest driving route between two distant places — say, one on the east coast and one on the west coast of
the U.S. — is extremely complicated. There are over 4 million miles of roads in the U.S. alone, and trying
all possible paths between two places would be nearly impossible. So how can mapping software like Google
Maps find routes nearly instantly? The answer is A* search!
We’ve prepared a dataset of major highway segments of the United States (and parts of southern Canada
and northern Mexico), including highway names, distances, and speed limits; you can visualize this as a
graph with nodes as towns and highway segments as edges. We’ve also prepared a dataset of cities and
towns with corresponding latitude-longitude positions. Your job is to find good driving directions between
pairs of cities given by the user.
The skeleton code can be run on the command line like this:
python3 ./route.py [start-city] [end-city] [cost-function]
where:
• start-city and end-city are the cities we need a route between.
• cost-function is one of:
– segments tries to find a route with the fewest number of road segments (i.e. edges of the graph).
– distance tries to find a route with the shortest total distance.
– time finds the fastest route, assuming one drives the speed limit.
– delivery finds the fastest route, in expectation, for a certain delivery driver. Whenever this
driver drives on a road with a speed limit ≥ 50 mph, there is a chance that a package will fall out
of their truck and be destroyed. They will have to drive to the end of that road, turn around,
return to the start city to get a replacement, then drive all the way back to where they were (they
won’t make the same mistake the second time they drive on that road).
Consequently, this mistake will add an extra 2 ·(troad + ttrip) hours to their trip, where ttrip is the
time it took to get from the start city to the beginning of the road, and troad is the time it takes
to drive the length of the road segment.
For a road of length ` miles, the probability p of this mistake happening is equal to tanh
`
1000
if the speed limit is ≥ 50 mph, and 0 otherwise.1 This means that, in expectation, it will take
troad + p · 2(troad + ttrip) hours to drive on this road.
For example:
python3 ./route.py Bloomington,_Indiana Indianapolis,_Indiana segments
You’ll need to complete the get route() function, which returns the best route according to the specified cost
function, as well as the number of segments, number of miles, number of hours for a car driver, and expected
number of hours for the delivery driver. See skeleton code for details.
Like any real-world dataset, our road network has mistakes and inconsistencies; in the example above, for
example, the third city visited is a highway intersection instead of the name of a town. Some of these “towns”
1This formula is not incredibly special, but what’s important is that it increases as the length of the road l increases, and it
will always be between 0 and 1, which means it can be interpreted as a probability. You can access the tanh function in Python
by using the math module: from math import tanh
will not have latitude-longitude coordinates in the cities dataset; you should design your code to still work
well in the face of these problems.
In addition to doing your own testing, it is important that you test your program on silo.sice.indiana.edu
using our test script to ensure that we will be able to run it and grade it accurately.
Extra credit. Implement an additional cost-function: statetour should find the shortest route from the
start city to the end city, but that passes through at least one city in each of the 48 contiguous U.S. states.
Part 3: Choosing teams
In a certain Computer Science course, students are assigned to groups according to preferences that they
specify. Each student is sent an electronic survey and asked to give answers to three questions:
1. What is your IU email username?
2. Please choose one of the options below and follow the instructions.
(a) You would like to work alone. In this case, just enter your userid in the box and nothing else.
(b) You would like to work in a group of 2 or 3 people and already have teammates in mind. In
this case, enter all of your userids (including your own!) in the box below, in a format like
userid1-userid2 for a team of 2, or userid1-userid2-userid3 for a team of 3.
(c) You would like to work in a group of 2 or 3 people but do not have any particular teammates in
mind. In this case, please enter your user ID followed by one “zzz” per missing teammate (e.g.
djcran-zzz where djcran is your user ID to request a team of 2, or djcran-zzz-zzz for a team of 3).
(d) You would like to work in a group of 3 people and have some teammates in mind but not all.
Enter all of your ids, with zzz’s to mark missing teammates (e.g. if I only have one teammate
(vkvats) in mind so far, I’d enter djcran-vkvats-zzz).
3. If there are any people you DO NOT want to work with, please enter their userids here (separated by
commas, e.g. userid1,userid2,userid3).
Unfortunately — and as we already discovered while assigning teams for Assignment 1 — the student
preferences may not be compatible with each other: student A may request working with student B, but B
may request not to work with A, for example. Students are going to complain, so the course staff decides to
minimize their own work. They estimate that:
• It will take 5 minutes to grade each assignment, so total grading time is 5 times the number of teams.
• Each student who requested a specific group size and was assigned to a different group size will send
a complaint email to an instructor, and it will take the instructor 2 minutes to read this email.
• If a student is not assigned to someone they requested, there is a 5% probability that the two students
will still share code, and if this happens it will take 60 minutes for the instructor to walk through the
Academic Integrity Policy with them. If a student requested to work with multiple people, then this
will happen independently for each person they were not assigned to. If both students requested each
other, there will be two meetings.
• Each student who is assigned to someone they requested not to work with (in question 3 above)
complains to the Dean, who meets with the instructor for 10 minutes. If a student is assigned to a
group with multiple people they did not want to work with, a separate meeting will be needed for each.
5
The total time spent by the course staff is equal to the sum of these four components. You can assume that
each student fills out the survey exactly once, and fills it out according to the instructions. Your goal is to
write a program to find an assignment of students to teams that minimizes the total amount of work the staff
needs to do, subject to the constraint that no team may have more than 3 students. Your program should
take as input a text file that contains each student’s response to these questions on a single line, separated
by spaces. For example, a sample file might look like:
djcran djcran-vkvats-nthakurd sahmaini
sahmaini sahmaini _
sulagaop sulagaop-xxx-xxx _
fanjun fanjun-xxx nthakurd
nthakurd nthakurd djcran,fanjun
vkvats vkvats-sahmaini _
where the underscore character ( ) indicates an empty value.
We have provided skeleton code to get you started, which can be run like:
python3 ./assign.py [input-file]
Your job is to complete the solver() function. The function should return the final groups (each named
according to the students in the group, separated by hyphens), and the total cost (time spent by instructors
in minutes). For example, one assignment for the above file could be:
["djcran-vkvats-nthakurd", "sahmaini", "sulagaop-fanjun"]
which has a cost of 34, computed by the sum of: 1. There are three groups’ assignments to grade (3 ×5 = 15
minutes) 2. Three people (sulagaop, nthakurd, and vkvats) didn’t get the requested number of teammates
(3 × 2 = 6 minutes) 3. One person (nthakurd) had to work with someone they requested not to work with
(djcran) (10 minutes) 4. One person (vkvats) didn’t get to work with a person they requested (sahmaini)
(0.05 × 60 = 3 minutes)
Hint: It may not always be possible to find the actual best solution in a reasonable amount of time, and
“reasonable amount of time” may differ from one problem to the next. Our grading program will eventually
exit your program if it takes too long. Your program is thus allowed to generate multiple solutions, which
may be useful if your approach can quickly produce an estimate of the solution, and then as it performs
more computation, finds better and betters solutions. You’ll call yield() each time you have found an answer
— see skeleton code for details.
In addition to doing your own testing, it is important that you test your program on silo.sice.indiana.edu
using our test script to ensure that we will be able to run it and grade it accurately.
What to turn in
Turn in the three programs on GitHub (remember to add, commit, push) — we’ll grade whatever version
you’ve put there as of 11:59PM on the due date. To make sure that the latest version of your work has
been accepted by GitHub, you can log into the github.iu.edu website and browse the code online. Your
programs must obey the input and output formats we specify above so that we can run them,
and your code must work on the SICE Linux computers.
Tip: These three problems are very different, but they can all be posed as search problems. This means that
if you design your code well, you can reuse or share a lot of it across the three problems, instead of having
to write each one from scratch.
6