$30
Machine Perception
Homework 2
Instructions
• This is an individual homework and worth 100 points
• You must submit your solutions on Gradescope, the entry code is MB8ZJP. We recommend that you use LATEX, but we will accept scanned solutions as well.
• Please complete this homework with Python 3. To run this homework successfully, you
will need the following python packages:
– numpy
– matplotlib
– opencv-python
• Start early! Please post your questions on Piazza or come to office hours!
Submission
• You will submit a concise report to Gradescope that includes a discussion on the algorithms implemented as well as results from multiple frames.
• You will also submit all python files you completed as .py files to Gradescope
1 Barcelona - 50 pts
1.1 Introduction
In this programming assignment, we will use the concepts of projective geometry and homographies to allow us to project an image onto a scene in a natural way that respects
perspective. To demonstrate this, we will project our logo onto the goal during a football
match. For this assignment, we have provided images from a video sequence of a football
match, as well as the corners of the goal in each image and an image of the Penn Engineering
logo. Your task is, for each image in the video sequence, compute the homography between
the Penn logo and the goal, and then warp the goal points onto the ones in the Penn logo to
generate a projection of the logo onto the video frame (e.g. Figure 1).
1
(a) Original image from video (b) Image with Penn logo projected
Figure 1: Example of logo projection
1.2 Technical Details
The Python script project logo.py will be the main script to run this assignment. In this
script, we provide you the following:
• A sequence of images onto which you will project a logo image.
• The corners in each video frame that the logo should project onto.
• The logo image itself.
Your goal will be to complete the functions est homography and warp pts in their respective .py files. est homography estimates the homography that maps the video image
points onto the logo points (i.e. xlogo ∼ xvideo), and warp pts then warps the sample points
according to this homography, returning the warped positions of the points (that is, the corresponding points in the logo image). We then use these correspondences to copy the points
from the logo into the video image. Once you finish these two functions, change your current
directory to the directory of “Part 1”, and run “python project logo.py”. It will use the two
functions you completed to project logo, visualize them in a video, and save sampled results.
1.3 Homography Estimation
To project one image patch onto another, we need, for each point inside the goal in the video
frame, to find the corresponding point from the logo image to copy over. In other words, we
need to calculate the homography between the two image patches. This homography is a 3x3
matrix that satisfies the following:
xlogo ∼Hxvideo (1)
Or, equivalently:
λxlogo =Hxvideo (2)
Where xlogo and xvideo are homogeneous image coordinates from each patch and λ is some
scaling constant. To calculate the homography needed for this projection, we provide, for each
2
image, the corners of the patches that we would like you to warp between in each image. You
must calculate the homography using the provided corner points and the technique covered
in the lectures and Appendix A. You can then warp each image point using H to find its
corresponding point in the logo (note that the homography equation is estimated up to a
scalar, so you will need to divide Hximage by the third term, which is λ), and then return the
set of corresponding points as a matrix.
1.4 Inverse Warping
You may be wondering why we are calculating the projection from the video frames to the logo
image, when we want to project the logo image onto the video frames. We do this because, if
we compute the inverse homography, and project all the logo points into the video frame, we
will most likely have the case where multiple logo points project to one video frame pixel (due
to rounding of the pixels), while other pixels may have no logo points at all. This results in
’holes’ in the video frame where no logo points are mapped. To avoid this, we calculate the
projection from video frame points to logo points to guarantee that every video frame gets a
point from the logo.
We can then replace every point in the video frame (xvideo) with the corresponding point
in the logo (xlogo) using the correspondences (ximage, xlogo). This is already done for you in
inverse warping, and you should not need to change it.
1.5 Files to complete and submit
1. est homography.py
This function is responsible for computing the homography given correspondence
2. warp pts.py
This function is responsible for computing the warped points given correspondence and
a set of sample points
3
1.6 Visualizing Results
To play the projected images as a video, run the project logo.py script. First open a
terminal window and change the current directory to the directory of “Part 1”. Then run
command:
python project logo.py
You can also use an IDE to run the script. If you would like to play with your own data, you
can edit project logo and generate your own video with a set of points.
1.7 Appendix A: Calculating Homographies
As we learned in the lectures, a homography H maps a set of points x =
x y 1
?T
to
another set of points x
0 =
x
0 y
0 1
?
up to a scalar:
x
0 ∼Hx (3)
λ
x
0
y
0
1
=
h11 h12 h13
h21 h22 h23
h31 h32 h33
x
y
1
(4)
λx0 =h11x + h12y + h13 (5)
λy0 =h21x + h22y + h23 (6)
λ =h31x + h32y + h33 (7)
In order to recover x
0 and y
0
, we can divide equations (5) and (6) by (7):
x
0 =
h11x + h12y + h13
h31x + h32y + h33
(8)
y
0 =
h21x + h22y + h23
h31x + h32y + h33
(9)
Rearranging the terms above, we can get a set of equations that is linear in the terms of H:
−h11x − h12y − h13 + h31xx0 + h32yx0 + h33x
0 =0 (10)
−h21x − h22y − h23 + h31xy0 + h32yy0 + h33y
0 =0 (11)
Finally, we can write the above as a matrix equation:
?
ax
ay
?
h =0 (12)
Where:
ax =
−x −y −1 0 0 0 xx0 yx0 x
0
?
(13)
ay =
0 0 0 −x −y −1 xy0 yy0 y
0
?
(14)
h =
h11 h12 h13 h21 h22 h23 h31 h32 h33 ?T
(1
Our matrix H has 8 degrees of freedom, and so, as each point gives 2 sets of equations, we will
need 4 points to solve for h uniquely. So, given four points (such as the corners provided for
this assignment), we can generate vectors ax and ay for each, and concatenate them together:
A =
ax,1
ay1
.
.
.
ax,n
ay,n
(16)
Our problem is now:
Ah = 0 (17)
As A is a 8x9 matrix, there is a unique null space. Normally, we can use MATLAB’s null
function, however, due to noise in our measurements, there may not be an h such that Ah is
exactly 0. Instead, we have, for some small ~?:
Ah =~? (18)
To resolve this issue, we can find the vector h that minimizes the norm of this ~?. To do this,
we must use the SVD, which we will cover in week 3. For this project, all you need to know
is that you need to run the command:
[U, S , V] = svd (A ) ;
The vector h will then be the last column of V , and you can then construct the 3x3 homography
matrix by reshaping the 9x1 h vector.
5
2 Invictus vs Harleysville - 50 pts
2.1 Introduction
In this programming assignment, we will use the concepts of projective geometry and vanishing
points to project a virtual line that would be parallel as seen from above. We have provided
a video sequence, as well as the relevant annotations. The task is to compute the vanishing
point then find the line segment in the image that represents the referee’s location on the
field. (e.g. Figure 2)
Figure 2: Example of the referee’s line (blue) projected onto the field.
2.2 Technical Details
The Python script visualize ref line.py will be the main script for this question. We provide
the following:
• A sequence of images onto which you will draw the referee’s line
• All annotated points (referee and 4 field points)
• Drawing utilities for all points
2.3 Data given
Figure 3 shows the layout of the provided keypoints as seen from a top down perspective.
The mappings from figure to code are as follows:
A lower left keypoint
B lower right keypoint
6
C upper right keypoint
D upper left keypoint
E referee keypoint
A
D
B
C
E
Figure 3: Supplied keypoints for each frame.
2.4 Files to complete and submit
1. line from pts.py
This function is responsible for computing the line that passes through two points in P
2
2. line intersection.py
This function is responsible for computing the intersection between two lines in P
2
3. compute ref line segment.py
This function is responsible for computing the referee’s position on the field, represented
as two end points (on either side of the field).
2.5 Visualizing Results
To visualize the results as a video, run the visualize ref line.py script. First open a terminal
window and change the current directory to the directory of “Part 2”. Then run command:
python visualize ref line.py
You can also use an IDE to run the script.
7