Deep Learning in Hardware
ECE 498/598 (Fall 2020)
Prof. Naresh Shanbhag
Homework 1
Assigned: 09/03 - Due: 09/18
This homework contains questions and problems marked with a star. These are intended for
graduate students enrolled in the 598NSG section; undergraduates enrolled in the 498NSU section
are welcome to solve them for extra credit. The topics covered in this homework include general
machine learning, the ADALINE algorithm, quantization, DNN complexity estimation, deep
learning in Python, theoretically optimal linear prediction, and online prediction. This homework
must be submitted through Gradescope before 5pm CT on the due date. No extensions will be
provided, so make sure to get started as soon as possible.
Problem 1: Learning a Boolean Function using ADALINE and MADALINE
In this problem we will test out the ADALINE and MADALINE algorithms in the context of
implementing Boolean functions. Consider the following function:
f(v, w, x, y, z) = (((v ⊕ w) + x) ⊕ y) + (y ⊕ z)
where v, w, x, y, z are binary variables in {−1, +1}, ⊕ is the XOR operation, defined as
a ⊕ b = 1{a = b} − 1{a = −b}, and + denotes the OR operation.
1. Draw the truth table of the function f.
2. Code the ADALINE algorithm to learn this Boolean function. Plot the evolution of the five
weights and the bias. Upon convergence, how many of the 32 input combinations result in
the correct output? (A starter sketch is provided after this problem.)
3. * Code the MADALINE algorithm to learn this Boolean function. Use two layers and five
units in the intermediate layer. Report the converged values of all weights and biases. How
many of the 32 input combinations result in the correct output?
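Below is a minimal NumPy sketch of the ADALINE (LMS) training loop, intended only as a
starting point: the learning rate eta, the epoch count, and the target vector d (which you must
fill in from your Part 1 truth table) are all assumptions to tune, not given values.

    import numpy as np
    from itertools import product

    # All 32 input combinations over {-1, +1}^5.
    X = np.array(list(product([-1, 1], repeat=5)))
    # d = ... target outputs in {-1, +1} from your truth table (placeholder).

    def adaline(X, d, eta=0.01, epochs=200):
        # ADALINE / LMS (delta rule): the error is taken on the linear output
        # BEFORE thresholding, which distinguishes it from the perceptron.
        w = np.zeros(X.shape[1])      # five weights
        b = 0.0                       # bias
        history = []                  # (w, b) per epoch, for the evolution plots
        for _ in range(epochs):
            for x, target in zip(X, d):
                y = np.dot(w, x) + b  # analog (linear) output
                e = target - y        # LMS error
                w = w + eta * e * x   # weight update
                b = b + eta * e       # bias update
            history.append((np.copy(w), b))
        return w, b, history

    # Correct outputs are counted on the thresholded decision:
    # n_correct = np.sum(np.sign(X @ w + b) == d)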
Problem 2: Optimal Quantization of a Triangular Distribution

Figure 1: The triangular distribution considered in Problem 2.
In this problem, we will test out the Lloyd-Max (LM) algorithm on the triangular distribution
depicted in Figure 1. Note: the figure is NOT drawn to scale.
1. Recall the quantization level update equation in the Lloyd-Max iteration:
$$r_q = \frac{\int_{t_q}^{t_{q+1}} x f_X(x)\,dx}{\int_{t_q}^{t_{q+1}} f_X(x)\,dx}$$
Derive the closed-form expression of this update (i.e., evaluate the above expression).
2. Code the corresponding LM algorithm to determine the optimal quantization levels for this
distribution. Plot the quantization levels and thresholds, and compare them with those of a
uniform quantizer. Do this for quantizers with 4 and 16 levels. (A sample-based starter
sketch follows this problem.)
3. Empirically evaluate and report the SQNR of the LM and uniform quantizers.
4. * Derive an expression for the SQNR of the LM and uniform quantizers, i.e., compute the
ratio of the data variance $E[X^2]$ to the quantizer MSE $E[(X - \hat{X}_q)^2]$. Compare
your answer to the empirical SQNR obtained in the previous question.
5. The quantization we have discussed so far does not assume clipping. Now assume a signed
clipping scheme where $x_q = c$ if $x \ge c$ and $x_q = -c$ if $x \le -c$, with c a positive
constant. Derive an expression for the SQNR as a function of $B_x$ and c.
Validate your SQNR expression by plotting the empirical SQNR vs. c for the 4-level and
16-level LM quantizers, with c ranging from 0.5 to 1.9. Repeat this for the 4-level
and 16-level uniform quantizers as well. Use 1000 samples for each empirical SQNR value.
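As a starting point for Parts 2 and 3, here is a minimal sample-based Lloyd-Max sketch. It
estimates centroids empirically from drawn samples rather than from the closed-form integrals of
Part 1, and the triangular-pdf parameters in the usage comment are placeholders to be replaced
with those of Figure 1.

    import numpy as np

    def lloyd_max(samples, n_levels, n_iters=200):
        # Sample-based Lloyd-Max: alternate threshold and level updates.
        r = np.linspace(samples.min(), samples.max(), n_levels)  # initial levels
        t = 0.5 * (r[:-1] + r[1:])
        for _ in range(n_iters):
            t = 0.5 * (r[:-1] + r[1:])     # thresholds: midpoints of adjacent levels
            idx = np.digitize(samples, t)  # cell index of every sample
            for q in range(n_levels):
                cell = samples[idx == q]
                if cell.size:              # level update: empirical conditional mean
                    r[q] = cell.mean()
        return r, t

    def sqnr_db(samples, r, t):
        # Empirical SQNR = E[X^2] / E[(X - Xq)^2], reported in dB.
        xq = r[np.digitize(samples, t)]
        return 10 * np.log10(np.mean(samples**2) / np.mean((samples - xq)**2))

    # Usage with PLACEHOLDER triangular parameters (replace with Figure 1's pdf):
    # samples = np.random.triangular(-1.0, 0.0, 1.0, size=100_000)
    # r, t = lloyd_max(samples, 4)
    # print(sqnr_db(samples, r, t))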
Problem 3: Profiling Deep Net Complexity
We have seen in class that an important first step in the hardware implementation of DNNs is an
estimation of their hardware complexity. For instance, it is very useful to understand the storage
and computational costs associated with a given DNN. In this problem, we start by familiarizing
ourselves with DNN topologies and how to extract complexity measures from them. You are
asked to consider the following two networks: ResNet-18 and VGG-11. Please refer to
https://github.com/pytorch/vision/tree/master/torchvision/models for the exact topological
description of each of these networks, and answer the following questions for both networks. You
are encouraged to write Python scripts using the PyTorch package to solve these problems (a
starter profiling sketch is provided after the questions below).
1. Plot the total number of activations and the data reuse factor per layer as a function of layer
index. What is the total number of activations in each network?
2. Plot the total number of weights and the weight reuse factor per layer as a function of layer
index. What is the total number of weights in each network?
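As a starting point, the sketch below uses PyTorch forward hooks to record per-layer weight and
activation counts; restricting the counts to Conv2d and Linear layers and using a 224x224 input
are assumptions you may need to revisit. Reuse factors can then be derived from these counts.

    import torch
    import torch.nn as nn
    from torchvision.models import resnet18, vgg11

    def profile(model, input_shape=(1, 3, 224, 224)):
        # Collect (layer name, #weights, #output activations) per compute layer.
        stats, handles = [], []

        def make_hook(name):
            def hook(module, inputs, output):
                n_w = sum(p.numel() for p in module.parameters(recurse=False))
                stats.append((name, n_w, output.numel()))
            return hook

        for name, m in model.named_modules():
            if isinstance(m, (nn.Conv2d, nn.Linear)):  # compute layers only (an assumption)
                handles.append(m.register_forward_hook(make_hook(name)))

        with torch.no_grad():
            model(torch.zeros(input_shape))            # one dummy forward pass
        for h in handles:
            h.remove()
        return stats

    for net in (resnet18(), vgg11()):
        s = profile(net.eval())
        print(type(net).__name__,
              "total weights:", sum(w for _, w, _ in s),
              "total activations:", sum(a for _, _, a in s))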