Starting from:

$35

Assignment 4 – Discrete Cosine Transformation


CSCI 3280 Introduction to Multimedia Systems 
Assignment 4 – Discrete Cosine Transformation

Introduction
As the key to the JPEG baseline compression process, the Discrete Cosine Transform (DCT)
is a mathematical transformation that transforms a signal from spatial representation into
frequency representation. In an image, most of the energy will be concentrated in the lower
frequencies. So, once an image is transformed into its frequency components, we can treat
them selectively, e.g. retaining lower frequency coefficients with high accuracy while
squeezing the size of high frequency coefficients, so as to reduce data amount needed to
describe the image without sacrificing too much image quality. In practice, a specially
configured quantization table will be used to realize such selective operation. In this
assignment, you are required to implement a program that performs DCT on a 2D image and
quantize its frequency coefficients.
Implementation Guideline
1. Discrete Cosine Transformation
Given an image 𝐒 in the spatial domain, the pixel at coordinates (𝑥, 𝑦) is denoted as 𝐒(𝑥, 𝑦).
To transform 𝐒 into an image in the frequency domain 𝐅, the most straightforward formula is:
𝐅(𝑢, 𝑣) =
𝑐(𝑢)𝑐(𝑣)
√2𝑁
∑ ∑ 𝐒(𝑥, 𝑦)
𝑀−1
𝑦=0
𝑁−1
𝑥=0
cos (𝑢𝜋
2𝑥+1
2𝑁
)cos (𝑣𝜋
2𝑦+1
2𝑁
),
𝑤𝑖𝑡ℎ 𝑐(𝑥) = {
1
√2
𝑖𝑓 𝑥 = 0
1 𝑒𝑙𝑠𝑒
.
However, the implementation above uses four nested loops and has complexity 𝑂(𝑛
4
) for a
2D DCT of size 𝑁 × 𝑀. Here, you are required to implement a more efficient version. First,
the input image 𝐒 (with pixel values ranging from 0~255) is preprocessed by shifting pixel
value by 128 and get the shifted image 𝐒̅ (with pixel values ranging from -128~127). Then, it
2
is cut up into blocks of 8 × 8 pixels (e.g. an image of resolution 256 × 256 will be divided into
32 × 32 blocks), and the DCT is applied to each block individually by using row-column
decomposition: build a 2D DCT by running a 1D DCT over every row and then every column.
Specifically, for each block 𝐒̅
𝑃 with 8 × 8 pixels, the 2D DCT is performed through the following
two steps:
• Apply 1D DCT to row 𝑣 of 𝐒̅
𝑃 : 𝐑𝑃
(𝑢, 𝑣) =
𝑐(𝑢)
2
∑ 𝐒̅
𝑃
(𝑥, 𝑣)
7
𝑥=0
cos (𝑢𝜋
2𝑥+1
16 ) , ∀𝑢 ∈
{0,1, … ,7}, and repeat it over every row of 𝐒̅
𝑃.
• Apply 1D DCT to column 𝑢 of 𝐑𝑃 : 𝐅𝑃
(u, v) =
𝑐(𝑣)
2
∑ 𝐑𝑃
(𝑢, 𝑦)
7
𝑦=0
cos (𝑣𝜋
2𝑦+1
16 ) , ∀𝑣 ∈
{0,1, … ,7}, and repeat it over every column of 𝐑𝑃.
Then, the 8 × 8 frequency coefficient array 𝐅𝑃 is obtained. By computing the frequency
coefficients for every block, you can get the final DCT result, i.e. a 𝑁 × 𝑀 coefficient array 𝐅.
2. Coefficient Quantization
The frequency coefficients generated by DCT are float numbers and many of them have very
tiny values. To convert these coefficients into integers while cause less information loss, you
are required to quantize the frequency coefficients of every blocks by using the
provided quantization matrix. Specifically, for every element in the coefficient array 𝐅𝑃, the
actual formula for quantization is: 𝐅̂
𝑃
(𝑢, 𝑣) = ⟦
𝐅𝑃(𝑢,𝑣)
𝐐(𝑢,𝑣)
⟧ , where ⟦⋅⟧ means rounding a float
number to the nearest integer, and the 8 × 8 quantization matrix 𝐐 is defined as:
3 5 7 9 11 13 15 17
5 7 9 11 13 15 17 19
7 9 11 13 15 17 19 21
9 11 13 15 17 19 21 23
11 13 15 17 19 21 23 25
13 15 17 19 21 23 25 27
15 17 19 21 23 25 27 29
17 19 21 23 25 27 29 31
Finally, a 𝑁 × 𝑀 quantized coefficient array 𝐅̂ will be achieved.
Basic Requirements: No quantization (80 point)
1. You are required to implement the Discrete Cosine Transform (DCT) as described in
Section 1 and apply it to the provided testing image.
3
2. Program must be coded in ANSI C/C++ and uses standard libraries only.
3. The compiled program must run in Windows 10 command prompt as a console program
and accepts source bitmap (.bmp format) with the following syntax and save generated
images to the current directory.
C:\> dct <img_path> <apply_quantization>
dct is the executable file of your program, e.g. the command: dct linda.bmp 0 is to
transform the input image from spatial domain into frequency domain and output the
frequency coefficient array without quantization applied.
<img_path> is the path where the input image is located.
<apply_quantization> specifies whether the obtained frequency coefficients need to be
quantized, where 1 means TRUE while 0 means FALSE. Note that basic requirement
does NOT require the implementation of the coefficient quantization part.
4. You are required to submit source code only. We will use Visual Studio 2015 C++ compiler
and have your program compiled via Visual Studio command prompt (Tools Command
Prompt) with the command line: C:\> cl dct.cpp bmp.cpp (Please ensure your
source code gets compiled well with it).
Enhanced Part: With quantization (20 point)
You are further required to implement the coefficient quantization as described in Section 2,
which will make the output coefficients be compression-efficient integers.
Submission
We expect the following files zipped into a file named by your SID (e.g. s1234567890.zip)
and have it uploaded to the Blackboard by due date: Apr. 29, 2020 (11:59pm)
- README.txt (Tell us anything that we should pay attention to)
- bmp.h & bmp.cpp (No need to change)
- dct.cpp (write your code in this file only)

More products