CMPS 297S/396AA: GPU COMPUTING
ASSIGNMENT 6
In this assignment, you will implement an exclusive scan kernel using the Brent-Kung method.
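To make the target semantics concrete, here is a minimal sequential sketch of an exclusive scan in host C++. The function name exclusiveScanReference and the float element type are assumptions for illustration; the reference code provided in main.cu may differ.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sequential reference (not the provided main.cu code).
// Exclusive scan: out[i] = in[0] + ... + in[i-1], so out[0] is always 0.
std::vector<float> exclusiveScanReference(const std::vector<float>& in) {
    std::vector<float> out(in.size());
    float running = 0.0f;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = running;  // sum of all elements strictly before position i
        running += in[i];  // fold element i in for the next position
    }
    return out;
}
```

For the input 3, 1, 7, 0, 4 this yields 0, 3, 4, 11, 11, whereas an inclusive scan of the same input would yield 3, 4, 11, 11, 15.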
Instructions
1. Place the files provided with this assignment in a single directory. The files are:
    - main.cu: contains setup and sequential code
    - kernel.cu: where you will implement your code (you should only modify this file)
    - common.h: for shared declarations across main.cu and kernel.cu
    - timer.h: to assist with timing
    - Makefile: used for compilation
2. Edit kernel.cu where TODO is indicated to implement the scan and add kernels. Please take
note of the following:
    - You must implement the Brent-Kung exclusive scan. No credit will be given for
      implementing Kogge-Stone, or inclusive Brent-Kung with shifted inputs.
    - Your kernel is expected to work for any input size, so make sure to handle
      boundary conditions correctly.
    - Your code should be optimized by using shared memory and re-indexing threads to
      minimize control divergence.
    - You do not need to apply thread coarsening.
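The notes above can be illustrated with a host-side simulation of one common formulation of the Brent-Kung exclusive scan. This is a sketch under stated assumptions, not the expected kernel.cu: SEGMENT_SIZE, scanSegment, exclusiveScan, and the float element type are hypothetical. The inner loops over t stand in for the threads of a block, and a real kernel would need a barrier such as __syncthreads() between tree levels. The structure mirrors a scan step per segment followed by an add step across segments.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical number of elements scanned per thread block (power of two).
constexpr unsigned int SEGMENT_SIZE = 8;

// Simulates the work of one block; buf stands in for a __shared__ array.
void scanSegment(const float* in, float* out, std::size_t n,
                 std::size_t segStart, float* total) {
    float buf[SEGMENT_SIZE];
    // Boundary condition: elements past the end of the input load as 0.
    for (unsigned int i = 0; i < SEGMENT_SIZE; ++i) {
        std::size_t g = segStart + i;
        buf[i] = (g < n) ? in[g] : 0.0f;
    }
    // Reduction tree. Re-indexing: thread t handles element (t+1)*2*stride-1,
    // so the active threads are always the lowest-numbered, contiguous ones.
    for (unsigned int stride = 1; stride < SEGMENT_SIZE; stride *= 2) {
        for (unsigned int t = 0; t < SEGMENT_SIZE / 2; ++t) {
            unsigned int i = (t + 1) * 2 * stride - 1;
            if (i < SEGMENT_SIZE) buf[i] += buf[i - stride];
        }
    }
    *total = buf[SEGMENT_SIZE - 1];  // segment sum, needed by the add step
    buf[SEGMENT_SIZE - 1] = 0.0f;    // identity value makes the scan exclusive
    // Post-reduction tree: pass the parent's value left, add the old left value right.
    for (unsigned int stride = SEGMENT_SIZE / 2; stride >= 1; stride /= 2) {
        for (unsigned int t = 0; t < SEGMENT_SIZE / 2; ++t) {
            unsigned int i = (t + 1) * 2 * stride - 1;
            if (i < SEGMENT_SIZE) {
                float left = buf[i - stride];
                buf[i - stride] = buf[i];
                buf[i] += left;
            }
        }
    }
    // Boundary condition: only write back elements that exist in the output.
    for (unsigned int i = 0; i < SEGMENT_SIZE; ++i) {
        std::size_t g = segStart + i;
        if (g < n) out[g] = buf[i];
    }
}

// Scans every segment, then adds each segment's offset (the "add" step).
std::vector<float> exclusiveScan(const std::vector<float>& in) {
    std::size_t n = in.size();
    std::size_t numSegments = (n + SEGMENT_SIZE - 1) / SEGMENT_SIZE;  // ceiling division
    std::vector<float> out(n), totals(numSegments);
    for (std::size_t s = 0; s < numSegments; ++s)
        scanSegment(in.data(), out.data(), n, s * SEGMENT_SIZE, &totals[s]);
    float offset = 0.0f;  // running exclusive scan of the segment totals
    for (std::size_t s = 0; s < numSegments; ++s) {
        for (unsigned int i = 0; i < SEGMENT_SIZE; ++i) {
            std::size_t g = s * SEGMENT_SIZE + i;
            if (g < n) out[g] += offset;
        }
        offset += totals[s];
    }
    return out;
}
```

With the inputs 1 through 11 (a size that is not a multiple of SEGMENT_SIZE), exclusiveScan returns 0, 1, 3, 6, 10, 15, 21, 28, 36, 45, 55. The sketch scans the segment totals sequentially; a GPU version would typically scan them with the same kernel.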
3. Compile your code by running: make
4. Test your code by running: ./scan
    - If you are using the HPC cluster, do not forget to use the submission system. Do not run
      on the head node!
    - For testing on different input sizes, you can provide your own values for input size as
      follows: ./scan <N>
Submission
Submit your modified kernel.cu file via Moodle by the due date. Do not submit any other files or
compressed folders.