$30
CSC2503—Foundations of Computer Vision
Assignment 1: Robust Estimation
The goal of this part of the assignment is to gain experience using robust estimation to extract simple
image models from noisy image measurements.
To begin: Download the starter code A1handout.zip from the course homepage.
What to submit. Write a short report addressing each of the itemized questions below. Pack your
PDF report and the completed Matlab files for each question into a file called A1.zip, and submit
the zip file electronically.
How to submit. Create a tarfile called assign1.tar.gz and use the following command on a
Teaching Labs (a.k.a CDF) computer to submit it electronically:
> submit -c csc2503h -a assign1 assign1.tar.gz
Late policy. There will be a 10% marks deduction for each day late, for up to three days. Deductions
begin 24 hours after the due date and time (ie. 2pm on October 12th). Assignments submitted more
than 72 hours late will not be accepted.
Robustly Fitting Circles. A geneticist wishes to automatically count cells in microscope images,
such as the one depicted in Figure 1 (left). The geneticist is especially interested in cells at different
stages of cell division. Here we consider an application of robust estimation that will get us started
on a solution for the geneticist. Cells are actually elliptical but we’ll begin by fitting circles.
Figure 1: Cell image (left), foreground pixels (middle), and fitted cells (right, in red).
The main script in the handout code, called findCellScript.m, reads in a microscope image of
cells (Figure 1 (left)). We have already preprocessed the cell images to detect foreground cell pixels
(Figure 1 (middle)). We have also labelled the connected sets of foreground pixels, which are read
in as a second image. Finally, the script computes Canny edgels from the foreground mask (we will
briefly discuss this edge detection method later in the course; knowing and/or understanding it is
not necessary for this assignment). The code then considers each connected component individually,
1
looking for circular cells (see Figure 1 (right)).
Your job is to fit circles to the edgels obtained from the boundaries of the connected components.
Each fitted circle should correspond to one “cell” including, perhaps, a small circle for a cell bud
just being born, by splitting off from another cell (see Figure 1 (right)). This is a vague definition
of what constitutes a “cell”. However, if you study Figure 1 (right) (e.g., by enlarging the electronic
copy), it should be intuitively clear what is meant by a “cell”. Part of your job in this assignment is
to decide on how to operationally define our intuitive notion of one cell.
We aim to fit circles using edgel measurements. Let the k
th edgel be denoted by (~xk, ~nk), where ~xk
is the measured image position and ~nk is the measured edgel normal. The edgel normal points in
the direction of increasing brightness of the foreground regions Figure 1 (middle); i.e., they point
towards the interior of the cells.
Because cells occur in close proximity, a single connected foreground component will often contain
more than one cell. It is therefore natural to formulate circle fitting as a robust estimation problem.
That is, a set of edgel measurements {~xk, ~nk}
N
k=1 from a single connected component will often
correspond to multiple cells. When estimating the parameters of one circle (cell) we can view the
measurements from any other circle (cells) as outliers. To this end we will use the Geman-McLure
(GM) estimator,
ρ(e, σg) = e
2
σ
2
g + e
2
. (1)
In what follows we are going to take a greedy approach to this problem, finding one circle at a time.
The approach comprises four main steps: 1) quickly generating a set of plausible circle hypotheses,
2) choosing one among them as an initial guess for optimization, 3) robust estimation to optimize
the fit, and 4) the model update, for which we decide whether or not the optimized fit is sufficient
and which edgel measurements are owned by the circle.
Below there are several questions, including some derivations and some programming. To simplify
the programming, the assignment handout includes a partial implementation of the general strategy
outlined above, i.e., findCellScript.m. For the programming parts of the assignment you need to
complete several Matlab m-functions, as described below. You may also make small changes to the
handout code.
1. Noise model: To estimate the parameters of a circle, we’ll focus on the use of the edgel position
measurements ~xk. The key constraint on position measurements is that the distance from the
center to any position on the circle must equal the radius. For measurement ~xk, and a circle with
centre ~xc and radius r, we can write the error as
ek = ||~xk − ~xc||2 − r
2 = ~a T
~xk + b + ~x T
k
~xk . (2)
where ~a ≡ −2~xc and b ≡ ~x T
c ~xc − r
2
. The change in parameterization, from ~xc and r to ~a and b
is convenient because ek is linear in ~a and b. Of course it is straightforward to find the circle’s
centre ~xc and radius r from ~a and b.
To properly formulate an estimator we should understand the nature of the expected errors, ek. To
that end, let’s assume that the observed positions (measurements) are equal to the true positions,
denoted ˆ~xk, plus independent, isotropic Gaussian noise ~m ∼ N (0, σ2
I); i.e.,
~xk = ˆ~xk + ~m , where p(~m) = 1
2πσ2
e
−~mT ~m/2σ
2
. (3)
2
With this measurement noise model, what is the distribution of the errors ek? How does it depend
on the circle parameters ~xc and r?
2. LS estimator: Suppose we approximate the distribution of the errors, ek, as a mean-zero Gaussian (Normal) distribution. In other words, assume that the only source of error on the RHS of
Eqn. (2) is additive, mean-zero Gaussian noise. What is a reasonable choice for the variance of
the Gaussian noise?
Assuming that you are given a set of independent measurements, {~xk}
N
k=1, all of which arise from
the same circle, a natural way to derive an estimator is to formulate the negative log likelihood
of the measurements (eg. see Sections 4.1 and 4.4 of Prince’s book). The maximum likelihood
estimator is found by minimizing the negative log likelihood with respect to the unknowns ~a and
b. Beginning with this likelihood, derive a least-squares estimator.
3. Circle proposals. The first step of our greedy strategy is to generate a set of circle proposals
for a given set of edgel measurements {(~xk, ~nk)}
N
k=1. For this step you may find it helpful to use
both the position measurements and the normal measurements.
This requires that you complete the m-function getProposals.m (in the circleFit subdirectory).
When finished this function should return a P × 3 array named circles. Each row of circles
corresponds to the parameters (xc, yc, r) of a circle, where ~xc = (xc, yc)
T denotes the image
position of the center of the circle, and r denotes the radius of the circle. Your implementation of
getProposals should attempt to efficiently produce up to numGuesses proposals. The idea is to
quickly find initial guesses from which you can select one to be subsequently refined with the robust
estimator. Since this will be run numerous times, it is essential that this step require minimal
computational effort. In your write up, clearly explain the algorithm you used to automatically
generate circle proposals, along with the reasons why you chose it over other possibilities.
If you have trouble with this step, you can begin by writing getProposals so that it simply
generates proposals from moused-in points (say a center point plus one point on the circle). You
won’t get any marks for this, but it will help you get started on other parts of the problem.
4. Circle selection. The next step in findCellScript.m is to select the best circle from the
proposed circles in the first step (Question 3 above). For example, a good circle might be close
to many edgels in the data. Implement your selection process in the function bestProposal, so
that it chooses a promising circle from the list of proposals. In your write up, clearly describe
how you select the best circle (i.e., be specific about what you mean by “best”), and explain your
reasons for this choice.
5. Robust fitting. Beginning with the best circle from the second step (Question 4) as an initial
guess, the next step is to optimize the fit. To this end, derive an IRLS algorithm for estimating
the circle parameters (~a, b) which (locally) minimize the objective function
O(~a, b) = X
k
ρ(ek(~a, b), σg) . (4)
Here ek is as defined in (2). In your write-up, include the derivation of the these equations for ~a
and b, and clearly describe your IRLS algorithm.
Complete the function fitCircleRobust to implement this IRLS algorithm for robustly fitting
a circle to your edgel data. In order to test your code, you can initially set demoRobustConv to
true. This displays how your code behaves for each initial guess provided by getProposals in the
second step above. Experiment with different values of the robust estimator’s scale parameter σg
(sigmaGM in the code). Ideally, we would choose σ
2
g
to be equal to the variance of the errors ek for
3
inlier measurements. In practice, describe is a suitable way to choose σg for your implementation.
What goes wrong when σg is much too large or much to small?
Before continuing on, set demoRobustConv back to false.
6. Model update. Finally, given the robustly fit circle, decide if it should be kept in the model
(see the function isGoodCircle). If it is decided that it should be kept, then the handout code
greedily removes the edgels with sufficiently high weights from the data. In your write up, explain
how you would choose a good threshold to be applied to the weights. Also, describe on what basis
your isGoodCircle function decides to keep a new circle, and justify your choice.
These four steps (Questions 3-6) are then repeated to fit additional circles. The cell finder is now
complete.
7. Brief evaluation. Your completed code should now provide a very reasonable first cut at solving
our geneticist’s problem. It should be expected to miss cells that have been significantly cropped
due to the side of the images. Also, small buds on some cells might be missed (see Figure 1).
Other than these two “failure modes”, your algorithm should work well. What other situations
are observed to cause your algorithm to fail to find a plausible set of circles? Briefly describe
what you might do to try to fix these remaining problems.
Academic Honesty. Cheating on assignments has very serious repercussions for the students
involved, far beyond simply getting a zero on their assignment. This is especially so for graduate
students.
This assignment is strictly individual work. It should not be discussed with anyone, even at the
level of ideas. The TAs are very experienced in detecting signs of collaboration between students
and there is a zero tolerance on cheating. Even minor infractions will be identified and reported
to the department and the Dean’s office.
The course’s assignments cover widely-used vision techniques, some of which have been used in prior
instalments of this course. Answers to some of the written questions—and even code for some of the
assignments—may be just a mouse click (or google search) away. You must resist the temptation to
search for, download and/or adapt someone else’s solutions and submit them as your own without
proper attribution. Accidentally stumbling upon a solution is no excuse either: the minute you
suspect a webpage describes even a partial solution to this assignment, either stop reading it or cite
it in the report you submit. Simply put: if an existing solution is easy for you to find, it is just as
easy for us to find it as well. In fact, you can be sure we already know about it!!
Clearly, web searches related to generic Matlab commands and/or generic concepts from calculus or
probability theory are not out of bounds. However, if you have any doubt about whether or not you
should cite material that you read online (or somewhere else) as you were working on the assignment,
the safest thing to do is to just cite it. In such a case, your grade will depend on how much of the
solution is your own work and how much of it came from the cited sources. But regardless of the
grade you get, you will not be in violation of the University’s Academic Integrity policies.
4