Homework 03
ECE 449/590

1. (30 points) Let $\{x^{(1)}, \ldots, x^{(m)}\}$ be sampled i.i.d. from a Gaussian distribution with mean $\mu$ and variance $\sigma^2$. Let $\hat{\mu}_m = \frac{1}{m} \sum_{i=1}^{m} x^{(i)}$ be an estimator of the Gaussian mean $\mu$. Show that $\mathrm{Var}(\hat{\mu}_m) = \frac{\sigma^2}{m}$.
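
Before deriving this, the claim can be sanity-checked empirically. A minimal Monte Carlo sketch in NumPy (the values of $\mu$, $\sigma$, $m$, and the trial count are arbitrary illustrative choices, not part of the assignment):

    import numpy as np

    # Empirically estimate Var(mu_hat_m) and compare with sigma^2 / m.
    # mu, sigma, and m are arbitrary illustrative values.
    rng = np.random.default_rng(0)
    mu, sigma, m, trials = 1.0, 2.0, 50, 200_000

    # Each row is one i.i.d. sample of size m; each row mean is one draw of mu_hat_m.
    samples = rng.normal(mu, sigma, size=(trials, m))
    mu_hat = samples.mean(axis=1)

    print("empirical Var(mu_hat_m):", mu_hat.var())
    print("theoretical sigma^2 / m:", sigma**2 / m)   # 0.08 here

The two printed values should agree to within Monte Carlo noise, which shrinks as the trial count grows.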
2. (70 points) Consider the following neural network with inputs $(x_1, x_2)$, outputs $(z_1, z_2, z_3)$, and parameters $\theta = (a, b, c, d, e, f, i, j, k, l, m, n, o, p, q)$:

$$
\begin{pmatrix} g_1 \\ g_2 \end{pmatrix}
= \begin{pmatrix} a & b \\ c & d \end{pmatrix}
  \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
+ \begin{pmatrix} e \\ f \end{pmatrix},
\qquad
\begin{pmatrix} h_1 \\ h_2 \end{pmatrix}
= \begin{pmatrix} \mathrm{ReLU}(g_1) \\ \mathrm{ReLU}(g_2) \end{pmatrix},
$$

$$
\begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}
= \begin{pmatrix} i & j \\ k & l \\ m & n \end{pmatrix}
  \begin{pmatrix} h_1 \\ h_2 \end{pmatrix}
+ \begin{pmatrix} o \\ p \\ q \end{pmatrix}.
$$
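
It can help to have this forward pass in executable form before starting the derivations. A minimal NumPy sketch of the definitions above (the function name `forward` and the placeholder parameter values are mine, not part of the assignment):

    import numpy as np

    def forward(theta, x):
        # theta = (W1, b1, W2, b2), where W1 = [[a, b], [c, d]], b1 = [e, f],
        # W2 = [[i, j], [k, l], [m, n]], and b2 = [o, p, q].
        W1, b1, W2, b2 = theta
        g = W1 @ x + b1           # pre-activations (g1, g2)
        h = np.maximum(g, 0.0)    # elementwise ReLU: (h1, h2)
        z = W2 @ h + b2           # outputs (z1, z2, z3)
        return z

    # Placeholder parameter values chosen only for illustration.
    theta = (
        np.array([[0.1, -0.2], [0.3, 0.4]]),              # [[a, b], [c, d]]
        np.array([0.0, 0.1]),                             # [e, f]
        np.array([[0.5, -0.1], [0.2, 0.3], [-0.4, 0.6]]), # [[i, j], [k, l], [m, n]]
        np.array([0.0, 0.0, 0.1]),                        # [o, p, q]
    )
    print(forward(theta, np.array([1.0, 2.0])))           # (x1, x2) = (1, 2)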
A. (15 points) For a minibatch containing a single training sample $(x_1, x_2, y = 2)$, apply softmax and write down the cross-entropy loss function $J(\theta)$ as a function of $(z_1, z_2, z_3)$. Compute $\frac{\partial J}{\partial z_1}$, $\frac{\partial J}{\partial z_2}$, $\frac{\partial J}{\partial z_3}$ as functions of $(z_1, z_2, z_3)$.
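
Once you have the analytic gradients for A., they can be verified by central differences. A sketch, assuming the standard softmax cross-entropy and treating $y = 2$ as the second of the three classes (0-based index 1 in code):

    import numpy as np

    def cross_entropy(z, y):
        # J = -log softmax(z)[y]; subtracting max(z) is the usual
        # numerical stabilization and does not change the value.
        z = z - z.max()
        return -(z[y] - np.log(np.exp(z).sum()))

    # Central-difference estimates of dJ/dz_k to compare with your formulas.
    z, y, eps = np.array([0.5, -1.0, 2.0]), 1, 1e-6   # y = 2 -> 0-based index 1
    for k in range(3):
        dz = np.zeros(3)
        dz[k] = eps
        grad_k = (cross_entropy(z + dz, y) - cross_entropy(z - dz, y)) / (2 * eps)
        print(f"dJ/dz{k+1} ~ {grad_k:.6f}")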
B. (20 points) Based on A., apply backprop to compute $\frac{\partial J}{\partial i}$, $\frac{\partial J}{\partial j}$, $\frac{\partial J}{\partial k}$, $\frac{\partial J}{\partial l}$, $\frac{\partial J}{\partial m}$, $\frac{\partial J}{\partial n}$, $\frac{\partial J}{\partial o}$, $\frac{\partial J}{\partial p}$, $\frac{\partial J}{\partial q}$, $\frac{\partial J}{\partial h_1}$, $\frac{\partial J}{\partial h_2}$.
C. (20 points) Based on B., apply backprop to compute $\frac{\partial J}{\partial a}$, $\frac{\partial J}{\partial b}$, $\frac{\partial J}{\partial c}$, $\frac{\partial J}{\partial d}$, $\frac{\partial J}{\partial e}$, $\frac{\partial J}{\partial f}$. Explain why you don't need to compute $\frac{\partial J}{\partial x_1}$ and $\frac{\partial J}{\partial x_2}$.
(Hint: use the step function $u(x)$ as the derivative of $\mathrm{ReLU}(x)$.)
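
The backprop results of B. and C. can be checked numerically in the same way, perturbing each parameter in turn. A sketch that reuses `forward` and `cross_entropy` from the earlier snippets (the helper name `grad_check` is mine):

    import numpy as np

    def grad_check(theta, x, y, eps=1e-6):
        # Central-difference estimate of dJ/dp for every entry p of every
        # parameter array in theta; compare against your backprop expressions.
        grads = []
        for p in theta:
            g = np.zeros_like(p)
            for idx in np.ndindex(p.shape):
                old = p[idx]
                p[idx] = old + eps
                J_plus = cross_entropy(forward(theta, x), y)
                p[idx] = old - eps
                J_minus = cross_entropy(forward(theta, x), y)
                p[idx] = old
                g[idx] = (J_plus - J_minus) / (2 * eps)
            grads.append(g)
        return grads

    grads = grad_check(theta, np.array([1.0, 2.0]), y=1)  # y = 2 -> index 1

One caveat: if some $g_k$ lands exactly at (or very near) $0$, the numerical estimate can disagree with the analytic one, since $\mathrm{ReLU}$ is not differentiable there; the hint's step function $u(x)$ is the usual convention for that case.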
D. (15 points) For the learning rate $\epsilon$, write down the equation for applying the simple SGD algorithm to update $\theta$ for this minibatch.
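
As a reference point, simple SGD updates every parameter in the direction of its negative gradient, scaled by $\epsilon$. A one-step sketch using the finite-difference gradients from the snippet above (the value of epsilon is illustrative, not specified by the assignment):

    # One simple SGD step, applied parameter array by parameter array.
    epsilon = 0.01   # illustrative learning rate
    grads = grad_check(theta, np.array([1.0, 2.0]), y=1)
    theta = tuple(p - epsilon * g for p, g in zip(theta, grads))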