$30
ECSE 420 – Parallel Computing
Lab 1 – Simple CUDA Processing
In this lab, you will write code for simple signal processing, parallelize it using CUDA,
and write a report summarizing your experimental results. We will use PNG images as
the test signals.
Each group should submit a single zip file with the filename Group <your group
number Lab 1.zip.
1. Image Rectification
Rectification produces an output image by repeating the following operation on each
element of an input image:
������ � � = ����� i j if ����t[i][j] ≥ 0
0 if �����[i][j] < 0
Figure 1 shows rectification performed on a small array.
Figure 1: Rectification Example
Figure 1: Rectification
Of course, since pixel values are already in the range [0,255], rectifying them will not
change their values, so you should “center” the image by subtracting 127 from each pixel,
then rectify, then add back 127 to each pixel (or equivalently, compare the pixel values to
127 and set equal to 127 if lower).
Write code that performs rectification. Parallelize the computation using CUDA kernels.
Measure the runtime when the number of threads used is equal
to {1,2,4,8,16,32,64,128,256}, and plot the speedup as a function of the number of
threads. Discuss your parallelization scheme and your speedup plots and make reference
to the architecture on which you run your experiments. Include in your discussion an
image of your choice (not the provided test image) and the output of rectifying that
image.
To check that your code runs properly, the TA will run the following command:
./rectify <name of input png <name of output png < # threads
When the input test image is “test.png”, the output of your code should be identical to
“test rectify.png”. You can use the “test equality” code provided by TAs to check if two
images are identical. The grader should be able to run your code with different numbers
of threads, and the output of your code should be correct for any of those thread counts.
2. Pooling
Pooling compresses an image by performing some operation on regularly spaced sections
of the image. For example, Figure 2 shows max-pooling on 2x2 squares in an image. In
this case, the operation is taking the maximum value in the section, and the sections are
the disjoint 2x2 squares.
Figure 2: Max Pooling
Write code which performs 2x2 max-pooling. Analyze, discuss, and show an example
image as described in the first section.
The grader will run the following command:
./pool <name of input png <name of output png < # threads
When the input test image is “test.png”, the output of your code should be identical to
“test pool.png”. Note that if the input image is of size m by n, then the output of 2x2
max- pooling will be of size m/2 by n/2. Do not worry about the corner case in which the
image cannot be divided perfectly into 2x2 squares– you may assume that the input test
image will have equal width and height.