$29
Homework 03: Transition-based Parsing
CS 6501-005 Natural Language Processing
Updated on: October 19, 2019
1. Extracting Parsing Actions. One way to obtain a transition-based parser is to treat parsing as
a classification problem. At time step t, the parsing action is predicted by considering the status of
the input buffer and the stack. Specifically, the decision function used by a parser can be defined
as
yˆt = argmax
y∈Y
θ
>f(xt, y) (1)
where θ is the classification parameter, Y is the set of candidate parsing actions, and ˆyt is the
predicted parsing action. xt is the numeric representation of the input buffer/stack status, usually
extracted from the top two words on the stack and the head word in the input buffer.
To train a classifier as in Equation 1, we need the training example as a collection of pairs {(xt, yt)}.
It means we have convert a parsing tree as following to a sequence of parsing actions. For simplicity,
in this problem, we can ignore the dependency relations and only focus on three basic parsing
actions. Please refer to lecture 11 slides 19 for the list of parsing actions related to this example.
Book me the morning flight
root
iobj
dobj
det
nmod
(a) (3 points) Please design an algorithm that can simulate the parsing procedure based on a given
dependency tree and implement it. Your answer should include (i) a brief description of the
algorithm and (ii) an implementation with the file name [ComputingID]-alg.py.
Note that, there are some special cases during parsing, for example, the first two actions must
be Shift; when the input buffer is empty, Shift action is illegal, etc.
(b) (3 points) In the attached data.conll file, you will find 505 sentences with annotated dependency tree in the CoNLL format. In each section of the data, please ignore the lines starting
with #. To get the dependency tree, you need following three columns
– Column 1: word index
– Column 2: word
– Column 7: head word index
Your answer should include a file of parsing actions extracted from the 505 example sentences, with
file name [ComputingID]-parsing-actions.txt.
2. Constituent Parsing. Transition-based parsing can also be used to build a parse tree based on
the CFG grammar, as discussed in lecture 9 and lecture 10.
(a) (2 points) Design a set of parsing actions that can build any parse tree from the following
grammar.
1
In your answer, explain carefully the meaning of each action and how it can be done with basic
operations associated with input buffer and stack. As an example of defining parsing actions,
please refer to the slides of pages 13 - 15 in lecture 11.
(b) (2 points) Use the parsing actions defined in (2) to reconstruct the parsing action sequence of
the following tree
S
NP
DT
the
NN
man
VP
Vi
sleeps
2