Starting from:

$29.99

CSE 252B Assignment 5

Assignment 5
Problems
1. Point on line closest to the origin (5 points) Given a line l = (a,b,c), show that the point on l that is closest to the origin is the point x = (−ac,−bc,a2+b2) (hint: this calculation is needed in the two-view optimal triangulation method used below).
2. Programming: Automatic estimation of the fundamental matrix (150 points)
(a) Feature detection (20 points) Download input data from the course website. The file IMG_5030.JPG contains image 1 and the file IMG_5031.JPG contains image 2. In your report, include a figure containing the pair of input images. For each input image, calculate an image where each pixel value is the minor eigenvalue of the gradient matrix N = P w I2 x P w IxIy P w IxIy P w I2 y   where w is the window about the pixel, and Ix and Iy are the gradient images in the x and y direction, respectively. Calculate the gradient images using the five-point central difference operator. Set resulting minor eigenvalues that are below a specified threshold value to zero (hint: calculating the mean instead of the sum in N allows for adjusting the size of the window without changing the threshold value). Apply an operation that suppresses (sets to 0) local (i.e., about a window) nonmaximum pixel values in the minor eigenvalue image. Vary these parameters such that around 1350–1400 features are detected in each image. For resulting nonzero pixel values, determine the subpixel feature coordinate using the F¨orstner corner point operator. In your report, state the size of the feature detection window (i.e., the size of the window used to calculate the elements in the gradient matrix N and corner points), the minor eigenvalue threshold value, the size of the local nonmaximum suppression window, and the resulting number of features detected in each image. Additionally, include a figure containing the pair of images, where the detected features (i.e., corner points) in each of the images are indicated by a square about the feature, where the size of the square is the size of the detection window. (b) Feature matching (15 points) Determine the set of one-to-one putative feature correspondences by performing a brute-force search for the greatest correlation coefficient value (in the range [-1, 1]) between windows centered about the detected features in image 1 and windows centered about the detected features in image 2. Only allow matches that are above a specified correlation coefficient threshold value (note that calculating the correlation coefficient allows for adjusting the size of the matching window without changing the threshold value). Further, only allow matches that are above
2
a specified distance ratio threshold value, where distance is measured to the next best match for a given feature. Vary these parameters such that around 300 putative feature correspondences are established. Optional: constrain the search to coordinates in image 2 that are within a proximity of the detected feature coordinates in image 1. In your report, state the size of the proximity window (if used), the correlation coefficient threshold value, the distance ratio threshold value, and the resulting number of putative feature correspondences (i.e., matched features). Additionally, include a figure containing the pair of images, where the matched features in each of the images are indicated by a square about the feature, where the size of the square is the size of the matching window, and a line segment is drawn from the feature to the coordinates of the corresponding feature in the other image (see Fig. 11.4(e) in the Hartley & Zisserman book as an example). (c) Outlier rejection (20 points) The resulting set of putative point correspondences should contain both inlier and outlier correspondences (i.e., false matches). Determine the set of inlier point correspondences using the M-estimator Sample Consensus (MSAC) algorithm, where the maximum number of attempts to find a consensus set is determined adaptively. For each trial, you must use the 7-point algorithm (as described in lecture) to estimate the fundamental matrix, resulting in 1 or 3 solutions. Calculate the (squared) Sampson error as a first order approximation to the geometric error. In your report, describe any assumptions, including the probability p that at least one of the random samples does not contain any outliers (used to determine the number of attempts to find a consensus set), and the probability α that a given data point is an inlier and the variance σ2 of the measurement error (both used to determine the distance threshold; hint: this problem has codimension 1). State the resulting number of inliers and the number of attempts to find the consensus set. Additionally, include a figure containing the pair of images, where the inlier features in each of the images are indicated by a square about the feature, where the size of the square is the size of the matching window, and a line segment is drawn from the feature to the coordinates of the corresponding feature in the other image (see Fig. 11.4(g) in the Hartley & Zisserman book as an example). (d) Linear estimation (15 points) Estimate the fundamental matrix FDLT from the resulting set of inlier correspondences using the direct linear transformation (DLT) algorithm (with data normalization). Include the numerical values of the resulting FDLT, scaled such that kFDLTkFro = 1, in your report with sufficient precision such that it can be evaluated (hint: use format longg in MATLAB prior to displaying your results). (e) Nonlinear estimation (70 points) As described in lecture, parameterize the fundamental matrix as (ω U ,ω V ,σ), where kσk = 1, and calculate the camera projection matrices P = [I|0] and P0 =
3
[M|e0], where M = UZdiag(σ1,σ2,(σ1 + σ2)/2)V, U = exp([ωU]×) = [u1|u2|u3], σ = (σ1,σ2), V = exp([ωV]×), Z =  0 −1 0 1 0 0 0 0 1   and e0 = −u3. Use FDLT and the triangulated 3D points as an initial estimate to an iterative estimation method, specifically the sparse Levenberg-Marquardt algorithm, to determine the Maximum Likelihood estimate of the fundamental matrix F = exp([ωU]×)diag(σ,0)exp([ωV]×) that minimizes the reprojection error. The initial estimate of the 3D points must be determined using the two-view optimal triangulation method described in lecture (algorithm 12.1 in the Hartley & Zisserman book, but use the ray-plane intersection method for the final step instead of the homogeneous method). Additionally, you must parameterize the homogeneous 3D scene points that are being adjusted using the parameterization of homogeneous vectors (see section A6.9.2 (page 624) of the textbook, and the corrections and errata). In your report, show the initial cost (i.e., the cost at iteration 0) and the cost at the end of each successive iteration. Show the numerical values for the final estimate of the fundamental matrix FLM, scaled such that kFLMkFro = 1, in your report with sufficient precision such that it can be evaluated. (f) Point to line mapping (10 points) Qualitatively determine the accuracy of FLM by mapping points in image 1 to epipolar lines in image 2. Choose three distinct points x{1,2,3} distributed in image 1 that are not in the set of inlier correspondences and map them to epipolar lines l0 {1,2,3} = FLMx{1,2,3} in the second image under the fundamental matrix FLM. In your report, include a figure containing the pair of images, where the three points in image 1 are indicated by a square (or circle) about the feature and the corresponding epipolar lines are drawn in image 2. Comment on the qualitative accuracy of the mapping (hint: each line l0 i should pass through the point x0 i in image 2 that corresponds to the point xi in image 1).

More products