Project 1 Color Images, Edge Detection, and Run-Length Encodings

Your shopping cart is empty.

CS 61B Project 1
Color Images, Edge Detection, and Run-Length Encodings

Warning: This project is time-consuming. Start early.
This is an individual assignment; you may not share code with other students.
Getting started: You will find the code for this assignment in ˜cs61b/hw/pj1/.
Start by copying it into your own pj1 directory.
In this project you will implement two simple image processing operations on
color images: blurring and edge detection. You will use libraries to read and
write files in the TIFF image format. One option in TIFF files is that they
can be compressed if there are many adjacent pixels of the same color; the
compressed form is called a run-length encoding. You will write code to
convert an image into a run-length encoding and back.
Each image is a rectangular matrix of color pixels, which are indexed as
follows (for a 4x3 image):
------ x
| -----------------------------
| | 0, 0 | 1, 0 | 2, 0 | 3, 0 |
y | -----------------------------
| | 0, 1 | 1, 1 | 2, 1 | 3, 1 |
v -----------------------------
| 0, 2 | 1, 2 | 2, 2 | 3, 2 |
-----------------------------
Note that the origin is in the upper left; the x-coordinate increases as you
move right, and the y-coordinate increases as you go down. (This conforms to
Java’s graphics commands, though you won’t need to use them directly in this
project.) We use the notation (i, j) to denote the pixel whose x-coordinate is
i and whose y-coordinate is j.
Each pixel has three numbers in the range 0...255 representing the red, green,
and blue intensities of the pixel. These three bytes are known as the RGB
values of the image. A pixel in which all three values are zero is pure black,
and a pixel in which all three values are 255 is bright white. Although Java
has a "byte" integer type, its range is -128...127, so we will usually use
Java’s "short" type for methods that take RGB parameters or return RGB values.
Part I: Image Blurring and Edge Detection
==========================================
This part is worth 40% of your total score. (8 points out of 20).
Implement a class called PixImage that stores a color image. The PixImage
class will include methods for reading or changing the image’s pixels. It will
also include a method for blurring an image and a method for detecting edges in
an image. We have provided a skeleton file named PixImage.java that includes
prototypes for the public methods the class offers. You are required to
provide implementations of all these methods.
A PixImage is described by its size and the RGB values of each pixel, but it is
up to you to decide _how_ a PixImage stores a color image. You should
certainly use one or more arrays; otherwise, you have some freedom to choose
the details.
The size of a PixImage is determined when it is constructed, and does not
change afterwards. There is one PixImage constructor, which takes two integers
as input, representing the width and height of the image, and returns an image
of the specified size. For example, the statement
PixImage image = new PixImage(w, h);
should create a w x h Image object. In your implementation, you may define any
fields, additional methods, additional classes, or other .java files you wish,
but you cannot change the prototypes in PixImage.java. We will test your code
by calling your public methods directly, so it is important that you follow
this rule. Please read PixImage.java carefully for an explanation of what
methods you must write. The most important of these are boxBlur(), a simple
image blurring algorithm, and sobelEdges(), an edge detection algorithm.
The pixels of a PixImage can be changed with the method setPixel(). However,
the methods boxBlur() and sobelEdges() should NEVER change "this" original
PixImage; they should construct a new PixImage and update it to show the
results. The pixels in the new, output PixImage should depend only on the
pixels in "this" original PixImage. To obtain correct behavior, you will be
working with two PixImages simultaneously, reading pixels from one and writing
(modifying) pixels in the other.
In an image, a pixel not on the boundary has nine "neighbors": the pixel
itself and the eight pixels immediately surrounding it (to the north, south,
east, and west, and the four diagonal neighbors). For the sake of exposition,
we consider each pixel to be its own "neighbor". A pixel on the boundary has
six neighbors if it is not a corner pixel; only four neighbors if it is a
corner pixel. In both boxBlur() and sobelEdges(), the contents of any
particular output pixel depend only on the contents of its neighbors in the
input image.
boxBlur() simply sets each output pixel to be the average of its neighbors
(including itself) in "this" input PixImage. This means summing up the
neighbors and dividing by the number of neighbors (4, 6, or 9). The sum might
not be divisible by the number of neighbors, but the output pixel values must
be integers, so we will allow Java to round non-integer quotients toward zero
(as it always does when it divides one integer by another).
Each color (red, green, blue) is blurred separately. The red input should have
NO effect on the green or blue outputs, etc.
boxBlur() takes a parameter "numIterations" that specifies a number of repeated
iterations of box blurring to perform. If numIterations is zero or negative,
boxBlur() should return "this" PixImage (rather than construct a new PixImage).
If numIterations is positive, the return value is a newly constructed PixImage
showing what "this" PixImage would become after being blurred "numIterations"
times. IMPORTANT: each iteration of blurring should be writing to a
_different_ PixImage than the one produced by the previous iteration. You
should NEVER be reading and writing pixels in the same image simultaneously,
and you will get the wrong answer if you try.
sobelEdges() implements the Sobel edge detection algorithm. The output
PixImage will be a grayscale image (i.e. red == green == blue for every pixel)
with light pixels where edges appear in "this" input PixImage (i.e. where
regions of contrasting colors meets) and dark pixels everywhere else. If you
imagine that the image is a continuous field of color, the Sobel algorithm
computes an approximate gradient of the color intensities at each pixel.
For each pixel (x, y), you will compute an approximate gradient (gx, gy) for
each of the three colors. As with blurring, the intensity of pixel (x, y) in
the output PixImage depends only on the neighbors of (x, y) in "this" input
PixImage, and the red, green, and blue intensities are treated separately at
first. The red gradient (gx, gy) is a 2D vector that locally approximates the
direction of greatest increase of the red pixel intensities (and depends ONLY
on the red intensities of the pixels.) If two regions of very different red
intensities meet at the pixel (x, y), then (gx, gy) is a long vector that is
roughly perpendicular to the boundary where the constrasting regions meet.
02/10/14
21:34:24 readme 2
We compute the red gradient (gx, gy) with the following _convolutions_.
-------------- ----------------------------------------------
| 1 | 0 | -1 | | x - 1, y - 1 | x , y - 1 | x + 1, y - 1 |
-------------- ----------------------------------------------
gx = | 2 | 0 | -2 | * | x - 1, y | x , y | x + 1, y |
-------------- ----------------------------------------------
| 1 | 0 | -1 | | x - 1, y + 1 | x , y + 1 | x + 1, y + 1 |
-------------- ----------------------------------------------
---------------- ----------------------------------------------
| 1 | 2 | 1 | | x - 1, y - 1 | x , y - 1 | x + 1, y - 1 |
---------------- ----------------------------------------------
gy = | 0 | 0 | 0 | * | x - 1, y | x , y | x + 1, y |
---------------- ----------------------------------------------
| -1 | -2 | -1 | | x - 1, y + 1 | x , y + 1 | x + 1, y + 1 |
---------------- ----------------------------------------------
The boxes on the right store the red pixel intensities for the neighbors of (x,
y). The convolution operation "*" simply means that we multiply each box on
the left with the corresponding box on the right, then sum the nine products.
(It’s like an inner product, aka dot product, of two vectors of length 9.
Note that this is NOT matrix multiplication!) The green and blue gradients are
defined likewise.
(If you are interested in further details, see the Wikipedia page
http://en.wikipedia.org/wiki/Sobel_operator .)
This gives us three gradient vectors for each pixel (red, green, and blue).
Define the _energy_ of a gradient vector (gx, gy) to be the square of its
length; by Pythagoras’ Theorem, the energy is gx * gx + gy * gy. Define the
_energy_ of a pixel to be the sum of its red, green, and blue energies.
(If you think of a pixel’s three gradients together as being a vector in
a six-dimensional space, the pixel’s energy is the square of the length of that
six-dimensional vector.)
energy(x, y) = gx(red)^2 + gy(red)^2 + gx(green)^2 + gy(green)^2 +
gx(blue)^2 + gy(blue)^2.
IMPORTANT: You must compute the energy EXACTLY. The pixel intensities are of
type "short", but the energy is usually too large to fit in a "short". Thus
you must cast all the gradient vectors to type "long" or "int" BEFORE you
compute any squares, and you must keep the results as "long" or "int" for the
rest of the computation.
The maximum possible value of a Sobel gradient (for one color) is (1020, 1020),
so each pixel’s energy (combining all three colors) is a number in the range
0...6,242,400. We have provided a method mag2gray() in the PixImage class
that takes a pixel energy of type "long" and flattens it down to a grayscale
intensity in the range 0...255. The map is logarithmic, so that images over a
wide range of intensities will reveal their edges. However, energies of
roughly 5,080 and below map to intensity zero, so very-low contrast edges do
not appear in the output PixImage. Don’t worry if you don’t understand
mag2gray(); JUST DON’T CHANGE mag2gray().
In your output PixImage, set the red, green, and blue intensities of the pixel
(x, y) to be the value mag2gray(energy(x, y)).
Pixels on the boundary of the output image require special treatment, because
they do not have nine neighbors. We treat them by _reflecting_ the image
across each image boundary. (Imagine the image is sitting right on the shore
of a lake, so an upside-down copy of the image is reflected below it.) Thus,
we treat the pixel (-1, 2) as if it had the same RGB intensities as (0, 2), and
the pixel (1, height) as if it had the same RGB intensities as (1, height - 1).
Then we compute the Sobel convolutions as usual. The reflections prevent
spurious "edges" from appearing on the boundaries of the image.
(Hint: programming will be a lot easier if you write helper functions that do
the reflection for you, and use them for every pixel access in your edge
detector, so you don’t have to think about it again.)
We have provided Java classes to help you see your output images and debug your
implementation of Part I, in these files:
Blur.java
Sobel.java
The main() methods in these classes read an image in TIFF format, use your code
to perform blurring and/or edge detection, write the modified image in TIFF
format, and display the input and output images. You will need to compile them
against the JAI image libraries in the .jar files we have included, which
may require you to add the .jar files to your "classpath". In Unix:
javac -cp "jai_core.jar:jai_codec.jar" *.java
Both programs take one or two command-line arguments. The first argument
specifies the name of an input image file in TIFF format. (If you specify no
arguments, the programs will remind you how to use them.) The optional second
argument specifies the number of iterations of your box blurring filter to
perform. For example, if you run
java -cp ".:jai_core.jar:jai_codec.jar" Blur image.tiff 3
then Blur will load the image from image.tiff, perform three iterations of
blurring, write the blurred image to a file named blur_image.tiff, and display
the input and output images. If you omit the second command-line argument, the
default number of iterations is 1.
The Sobel program does Sobel edge detection. Optionally, it will also perform
iterations of blurring prior to edge detection. A small amount of blurring
tends to make edge detection more robust in images whose lines of contrast are
not very sharp. If you run
java Sobel image.tiff 5
then Sobel will load the image from image.tiff, perform five iterations of
blurring, perform Sobel edge detection on the blurred image, write the blurred
image to a file named blur_image.tiff, write the grayscale-edge image to a file
named edge_image.tiff, and display all three images (1 input, 2 output). If
you omit the second command-line argument, no blurring is performed, and no
blurred image is written nor displayed. (Sobel also has an optional third
command-line argument that you shouldn’t try until you complete Part III.)
We have included some .tiff files for you to play with. There is also a bit of
test code for the boxBlur() and sobelEdges() methods in the main() method of
PixImage.java.
You might find it useful to check out the methods in ImageUtils.java that read
and write the TIFF files. We have also included a copy of the TIFF standard
(the file TIFF6.pdf) for those who are curious. Neither of these things are
necessary to complete the project, but if you want more control over writing
images to files for your own entertainment, it is easy to modify Blur.java or
Sobel.java for that purpose.
02/10/14
21:34:24 readme 3
Part II: Converting a Run-Length Encoding to an Image
======================================================
This part is worth 25% of your total score. (5 points out of 20).
A large number of large image files can consume a lot of disk space. Some
PixImages can be stored more compactly if we represent them as "run-length
encodings." Imagine taking all the rows of pixels in the image, and connecting
them into one long strip. Think of the pixels as being numbered thusly:
-----------------------------
| 0 | 1 | 2 | 3 |
-----------------------------
| 4 | 5 | 6 | 7 |
-----------------------------
| 8 | 9 | 10 | 11 |
-----------------------------
Some images have long strips of pixels of the same color (RGB intensities).
In particular, the grayscale images produced by sobelEdges() can have large
uniform regions, especially where no edges are detected. Run-length encoding
is a technique in which a strip of identical consecutive pixels (possibly
spanning several rows of the image) are represented as a single record or
object. For instance, the following strip of intensities:
------------------------------------------------------
| 7 | 7 | 7 | 88 | 88 | 88 | 88 | 88 | 0 | 0 | 0 | 0 |
------------------------------------------------------
0 1 2 3 4 5 6 7 8 9 10 11
could be represented with just three records, each representing one "run":
--------------------
| 7,3 | 88,5 | 0,4 |
--------------------
"7,3" means that there are three consecutive 7’s, followed by "88,5" to signify
five consecutive 88’s, and then "0,4" for four jet black pixels. With this
encoding, a huge image whose pixels are mostly one color (like daily comic
strips, which are mostly white) can be stored in a small amount of memory. The
TIFF image format has run-length encoding built in, so in these cases it can
produce shorter image files. (Note that the TIFF encoder we’ll be using only
gives us a savings when there are runs of pixels for which all three colors are
identical. However, it is also possible to write TIFF files in which the three
colors are separated, so one can exploit runs within a single color.)
Your task is to implement a RunLengthEncoding class, which represents a
run-length encoding as a linked list of "run" objects. It is up to you whether
to use a singly- or doubly-linked list, but a doubly-linked list might make
Part IV easier. (You are not permitted to use large arrays for this purpose.)
Because this is a data structures course, please use your own list class(es) or
ones you have learned in class. In future courses, it will sometimes make more
sense for you to use a linked list class written by somebody else, such as
java.util.LinkedList. However, in CS 61B this is forbidden, because I want you
to always understand every detail of how your data structures work. Likewise,
you may not use java.util.Vector or other data structures libraries.
Part II(a): Implement two simple constructors for RunLengthEncodings. One
constructs a run-length encoding of a jet black image. The other constructs
a run-length encoding based on four arrays provided as parameters to the
constructor. These arrays tell you exactly what runs your run-length encoding
should contain, so you are simply converting arrays to a linked list. (Read
the prototype in RunLengthEncoding.java.)
Part II(b): Your run-length encodings will be useful only if other classes
are able to read your encodings after you create them. Therefore, implement
the iterator() method in the RunLengthEncoding class and the RunIterator()
constructor, the hasNext() method, and the next() method in the RunIterator
class.
These methods work together to provide an interface by which other classes can
read the runs in your run-length encoding, one by one. A calling application
begins by using RunLengthEncoding.iterator() to create a new RunIterator i that
points to the first run in the run-length encoding--the run that contains pixel
(0, 0). (Outside classes should never call the RunIterator() constructor
directly; only RunLengthEncoding.iterator() should do that.) Each time
i.next() is invoked, it returns the next run--represented as an array of four
ints--until every run has been returned. The returned array encodes a run by
storing the length of the run (the number of pixels) at index zero, the red
pixel intensity at index one, the green pixel intensity at index two, and the
blue pixel intensity at index three. (This four-int array is constructed in
next(), and can be discarded by the calling method after use. This array
should not be part of your RunLengthEncoding data structure! It must be
freshly constructed for the sole purpose of returning four ints.)
Given an iterator i, i.hasNext() returns true if it has more runs to return.
It returns false if i has already been used to return every run, letting the
calling program know that there are no more runs in the encoding. Calling
programs should always check i.hasNext() before they call i.next(); if they
call i.next() at the end of the encoding, i.next() may throw an exception and
crash your program. (If a calling application wants to reset i so it points
back to the first run again, it has to construct a brand new RunIterator.)
Please read the file RunIterator.java carefully for more information. You
might also find it helpful to look up the Iterator class in the Java API.
Part II(c): Implement a toPixImage() method in the RunLengthEncoding class,
which converts a run-length encoding to a PixImage object.
Read RunLengthEncoding.java carefully for an explanation of what methods
you must write. The fields of the PixImage class MUST be private, and the
RunLengthEncoding class cannot manipulate these fields directly. Hence,
the toPixImage() method will rely upon the PixImage() constructor and the
setPixel() method.
You cannot change any of the prototypes in RunLengthEncoding.java or
RunIterator.java--except the RunIterator() constructor, which you can give any
signature you like, as it is called only from RunLengthEncoding.iterator().
We will test your code by calling your public methods directly.
There is a bit of test code for Parts II, III, and IV in the main() method of
RunLengthEncoding.java.
02/10/14
21:34:24 readme 4
Part III: Converting an Image to a Run-Length Encoding
=======================================================
This part is worth 25% of your total score. (5 points out of 20).
Write a RunLengthEncoding constructor that takes a PixImage object as its sole
parameter and converts it into a run-length encoding of the PixImage.
The fields of the PixImage class MUST be private, so the RunLengthEncoding
constructor will rely upon the getWidth(), getHeight(), getRed(), getGreen(),
and getBlue() methods.
Testing ------- The following is worth 1 point out of the 5, but should probably be done as
soon as you can during Part III.
Your RunLengthEncoding implementation is required to have a check() method,
which walks through your run-length encoding and checks its validity.
Specifically, it should print a warning message if any of the following
problems are found:
- If two consecutive runs have exactly the same type of contents.
For instance, a "99,12" run followed by an "99,8" run is illegal, because
they should have been consolidated into a single run of twenty 99’s.
- If the sum of all the run lengths doesn’t equal the size (in pixels) of
the PixImage; i.e. its width times its height.
- If a run has a length less than 1.
You may find that the check() method is very useful in helping to debug your
RunLengthEncoding() constructors and setPixel() in Part IV. I also recommend
implementing a toString() method for your RunLengthEncoding so you can print
and examine it to help debugging.
If the Sobel program is given three (or more) command-line arguments,
regardless of what the third argument is, it will write the grayscale-edge
image twice, once as an uncompressed TIFF file and once as a run-length encoded
TIFF file. The two TIFF files will be different and have different lengths,
but the images should look identical. If they don’t, there is probably a bug
in your run-length encoding method or your RunIterator.
You might find it interesting to compare the sizes of the two TIFF files. (Not
surprisingly, the disparity is greatest for the input file black.tiff, which is
all black pixels.)
Part IV: Changing a Pixel in a Run-Length Encoding
===================================================
The last part is the hardest, but it is only worth 10% of the total score
(2 points out of 20), so don’t panic if you can’t finish it.
Implement the setPixel() method of the RunLengthEncoding class, which is
similar to the setPixel() method of the PixImage class. However, this code is
much trickier to write. Observe that setPixel() can lengthen, or even shorten,
an existing run-length encoding. To change a pixel in a run-length encoded
image, you will need to find the right run in the linked list, and sometimes
break it apart into two or three runs. If the changed pixel is adjacent to
other pixels of identical color, you should consolidate runs to keep memory use
down. (Your check() method ensures that your encoding is as compact as
possible.)
IMPORTANT: For full points, your setPixel() method must run in time
proportional to the number of runs in the encoding. You MAY NOT convert the
run-length encoding to a PixImage object, change the pixel in the PixImage, and
then convert back to a run-length encoding; not only is that much too slow, it
will be considered CHEATING and punished accordingly.
Test Code --------- We are still working on an autograder for the project, and will provide it when
it is ready. Until then, there is some test code in the main() methods of both
PixImage and RunLengthEncoding, and the programs Blur and Sobel can also help.
The autograder will assign some, but not all, of the points for the project.
Additional points will be assigned by a human reader for partly finished code
and for your check() method, which is not autogradeable. Points may be
subtracted if you break some of the rules stated above, especially the rules on
where you must use arrays and where you must use linked lists.
Submitting your Solution ------------------------ Make sure that your program compiles and runs on the _lab_ machines with the
autograding program before you submit it. Change (cd) to your pj1 directory,
which should contain PixImage.java, RunLengthEncoding.java, RunIterator.java,
and any other .java files you wish to submit. If your implementation uses
.java files in addition to those we have specified, have no fear: the "submit"
program will ask you which .java files in your pj1 directory you want to
submit. From your pj1 directory, type "submit pj1".
After submitting, if you realize your solution is flawed, you may fix it and
submit again. You may submit as often as you like. Only the last version you
submit will be graded, unless you inform your reader promptly that you would
prefer to have an earlier submission graded instead.
If your submission is late, you will lose 1% of your earned score for every two
hours (rounded up) your project is late.
Afterthought ------------ Our own John Canny, a professor right here in the Berkeley CS department,
invented an "optimal" edge detector, which is a more sophisticated version of
what we’ve done in this project. If you’re interested, see the Wikipedia page
http://en.wikipedia.org/wiki/Canny_edge_detector .

Shopping cart

US$0

Project 1 Color Images, Edge Detection, and Run-Length Encodings

More products