Starting from:

$30

Memoization of Binomial Coefficients with Splay Trees

Memoization of Binomial Coefficients with Splay Trees
CS 223

1 Introduction
For this project you will implement the symbol table ADT using splay trees with parent links as
described in Sections 3. You will test your implementation by computing binomial coefficients and
caching the results in a symbol table to speed up future computations; this technique is called
memoization and is explained in Section 2. How to test your implementation is described in
Section 4. What you are to submit is described in Section 5.
2 Computing binomial coefficients
If n is sufficiently small, we can compute the binomial coefficient
n
k
?
directly as
?
n
k
?
=
n!
(n − k)! k!
. (1)
We can avoid evaluating factorials by using the recurrent formula
?
n
k
?
=
?
1 if k = 0 or n = k

n−1
k−1
?
+

n−1
k
?
otherwise (2)
which can be tabulated yielding Pascal’s Triangle (shown here for n = 0 . . . 6) :
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
0 1 2 3 4 5 6
0
1
2
3
4
5
6
n
k
The following na¨ıve implementation of Equation 2 is unbearably slow:
long binomial(int n, int k) {
if (k == 0 || n == k)
return 1;
return binomial(n-1,k-1) + binomial(n-1,k);
}
n-1, k-1 n-1, k
n, k
n-2, k-2 n-2, k-1 n-2, k
n-3, k-3 n-3, k-2 n-3, k-1 n-3, k
n-4, k-4 n-4, k-3 n-4, k-2 n-4, k-1 n-4, k
Figure 1: Call-graph for recursive binomial function for input (n, k). The values for the shaded
nodes are computed multiple times.
Figure 1 shows the resulting function call graph for input (n, k) and demonstrates that most values
are computed multiple times. For example, binomial(n-2,k-1) will be invoked twice which means
that binomial(n-3,k-2) and binomial(n-3,k-1) will each be invoked three times. Therefore,
binomial(n-4,k-2) will be invoked six times, etc. . .
2.1 Memoization
To avoid the multiple evaluations in our simple recursive implementation, we can cache previous
results in a table to avoid re-computations as follows:
long binomial(int n, int k) {
if (k == 0 || n == k)
return 1;
if (k == 1 || k == n - 1)
return n;
if (k n/2)
k = n - k;
Long v = table.search(n,k); // search for previously cached value
if (v != null) // if found....
return v.longValue(); // ...then we are done
long b = binomial(n-1,k-1) + binomial(n-1,k); // ...else compute value
table.insert(n,k,b); // and cache the result
return b;
}
This technique is called memoization and has the further advantage of speeding up future invocations as long as the table persists between calls. Note that we attempt to reduce the size of the
table by noting the following:
?
n
1
?
=
?
n
n − 1
?
= n, (3)
2
?
n
k
?
=
?
n
n − k
?
. (4)
We could build a customized table for this problem that would yield constant time search and
insertion (e.g., use a dynamically growable 2D array with varying row size); We would like to
construct a more general solution that we could easily re-adapt for any function and without
predicting what the ranges of the input values will be. One possible solution is to use a Binary
Search Tree (BST) for the table; We will implement our memoization table using a Splay Tree as
described in Section 3.
3 Splay Trees with Parent References
Splay Tree insertion and search is performed in O(log n) amortized time so this should give us
reasonable performance. In our iterative implementation in class, we recorded the path from the
root to the target node on a stack so we could perform the necessary splay operations from the
node back up the root. An alternate technique is to store a parent reference in each node. Internal
node operations become a little messier with an added parent pointer (particularly rotations, see
Section 3.3), but allow us to easily traverse back up the tree towards the root.
3.1 Interface
For this project you only need to implement the insert and search operations (no duplicate keys):
public interface SymbolTable<K extends Comparable<K,V {
public void insert(K key, V val);
public V search(K key);
}
SymbolTable represents a generic symbol table collection with type parameters K and V for the key
and value respectively. Whatever actual type is used for K is required to implement Java’s generic
Comparable<T interface:
public interface Comparable<T {
int compareTo(T other);
}
3.2 Internal node representation
The nodes of your Splay Tree should also record a parent reference to allow for iterative splay
operations from node back up to the root:
public class SplaySymbolTable<K extends Comparable<K,V implements SymbolTable<K,V {
public class Node {
public K key;
public V val;
public Node left, right;
public Node parent; // null for root
public Node(K k, V v) {...}
}
...
3
Since the nested class Node is not a static class, K and T represent the same types as the outer
SplaySymbolTable class.
3.3 Rotations with parent links
P
X
A B
C P
X
A
B C
rotate right
Figure 2: Right rotation with parent links (gray arrows). Note that subtrees A and C keep their
same parent, but B (which may be empty!) switches parents. After the rotation, X needs to be
properly attached to subtree’s original parent (which may be the root!).
Figure 2 shows how the child and parent link need to be modified for a right rotation. The
parent links for X, P, and B (if its not empty) need to be properly adjusted. In addition, the
rotation method should now be burdened with reattaching the rotated sub-tree with the parent;
There are three separate cases based on how P was originally attached:
1. P was the root of the entire tree (P.parent was null),
2. P was the right child of its parent (P.parent.right == P), or
3. P was the left child of its parent.
It is not necessary to return a value from the rotation methods since the caller is no longer responsible for reattaching the rotated subtree. Note that these methods can no longer be static since
they may have to modify the table’s root reference.
3.4 Splaying
Splaying from a node x back up to the root is easier given the parent links:
private void splay(Node x) {
while x has a parent p and grandparent g {
case zig-zig: rotR(g); rotR(p);
case zig-zag: rotL(p); rotR(g);
case zag-zag: rotL(g); rotL(p);
case zag-zig: rotR(p); rotL(g);
}
if (x != root) {
case zig: rotR(root);
case zag: rotL(root);
}
}
4
3.5 Iterative insert and search
The insert and search methods both alter the tree by splaying the inserted or found node back
up to the root. These methods should be performed iteratively.
3.6 Printing your trees
Before tackling the problem of computing binomial coefficients, you should test your implementation
and see if it is constructing the proper Splay trees. I will provide you with a TreePrinter class
which visualizes your tree by writing it to a Scalable Vector Graphics (SVG) file which can be
displayed with (most) any modern web browser. The TreePrinter class requires you to serialize
your tree by performing a pre-order traversal of the tree and storing a string representing each
node in a java.util.Vector; null references are also recorded to mark where the leaves are
thus allowing the TreePrinter instance to reconstruct the tree’s structure. Make sure your key
implements the toString() method and add the following methods to your class:
private void serializeAux(Node tree, Vector<String vec) {
if (tree == null)
vec.addElement(null);
else {
vec.addElement(tree.key.toString() + ":black");
serializeAux(tree.left, vec);
serializeAux(tree.right, vec);
}
}
public Vector<String serialize() {
Vector<String vec = new Vector<String();
serializeAux(root, vec);
return vec;
}
I’ll provide you with a test program you can tinker with that creates an SVG file showing the tree
in Figure 3:
SplaySymbolTable<Integer,Integer table = new SplaySymbolTable<Integer,Integer();
...insert 1, 2, 3, 4, 9, 8, 7, 6
...search for 3, 9, 7
Vector<String serializedTable = table.serialize();
TreePrinter treePrinter = new TreePrinter(serializedTable);
FileOutputStream out = new FileOutputStream("tree.svg");
PrintStream ps = new PrintStream(out);
treePrinter.printSVG(ps);
5
7
3
2
1
6
4
9
8
Figure 3: Splay tree after inserting 1, 2, 3, 4, 9, 8, 7, 6 into an empty tree and searching for 3, 9, and
7.
4 Experiments
4.1 Computing
n
k
?
for multiple k
Binomial coefficients are useful in many applications. For example, we can compute the probability
of exactly k successes in a sequence of n independent “yes/no” experiments where each trial yields
success with probability p:
f(k, n, p) = ?
n
k
?
p
k
(1 − p)
n−k
. (5)
For example, if we flip a coin 10 times, then f(k = 5, n = 10, p = 0.5) yields the probability of
getting “heads” exactly 5 times. We can compute the probability at k or less successes as
F(k, n, p) = X
k
i=0
?
n
k
?
p
k
(1 − p)
n−k
. (6)
This, and other application like it, demonstrate the need for computing a sequence of related values
of the form
n
k
?
; This makes our memoization scheme all the more useful. Figure 4 shows the splay
trees that result from computing
10
5
?
,

10
k
?
for k = 5, 4, 3, 2 and
10
k
?
for k = 2, 3, 4, 5. This suggests
our splay tree scheme is more efficient for increasing k (generates fewer comparisons and rotations).
The resulting trees are not necessarily balanced (they are “left heavy”) due do the search order of
the coefficients; this is evident in Figure 5 which shows the splay tree for
20
k
?
for k = 2, . . . , 20.
4.2 Memoized vs non-memoized binomial
I will provide you with the test harness class Binomial that you can use to compare a memoized
versus a non-memoized version of the binomial function using your Splay Tree implementation.
You are encouraged to modify this code during development. Your SplaySymbolTable classes
should record the number of rotations and comparisons performed in public instance variables so
that the test harness can reset and examine the tree’s efficiency:
public int rotations;
public int comparisons;
The Binomial class can be compiled with or without memoization by setting or clearing the static
memoize flag. The test harness computes
35
k
?
for k = 2, . . . , 17 and records the system-time u
10,5
9,4
8,4
8,3
7,3
7,2
6,3
5,2
4,2 6,2
10,2
9,3
9,2
8,2
7,2
6,3
5,2
4,2 6,2
7,3
8,4
8,3
9,4
10,3
10,4
10,5
10,5
10,4
9,4
9,2
8,4
7,3
6,3
5,2
4,2 6,2
7,2
8,3
8,2
9,3
10,3
10,2
Figure 4: Splay tree after computing
10
5
?
(left) requiring 27 comparisons and 18 rotations. Splay
tree after computing
10
k
?
for k = 5, 4, 3, 2 (center) requiring a total of 77 comparisons and 56
rotations. Splay tree after computing
10
k
?
for k = 2, 3, 4, 5 (right) requiring a total of 76 comparisons
and 55 rotations.
20,10
19,9
19,2
18,9
17,8
17,2
16,8
15,7
15,2
14,7
13,6
13,2
12,6
11,5
11,2
10,5
9,4
9,2
8,4
7,3
6,3
5,2
4,2 6,2
7,2
8,3
8,2
9,3
10,4
10,2
10,3
11,4
11,3
12,5
12,2
12,4
12,3
13,5
13,4
13,3
14,6
14,2
14,5
14,4
14,3
15,6
15,5
15,4
15,3
16,7
16,2
16,6
16,5
16,4
16,3
17,7
17,6
17,5
17,4
17,3
18,8
18,2
18,7
18,6
18,5
18,4
18,3
19,8
19,7
19,6
19,5
19,4
19,3
20,9
20,2
20,8
20,7
20,6
20,5
20,4
20,3
Figure 5: Splay tree for
20
k
?
for k = 2, . . . , 10 which required 639 comparisons and 503 rotations
public class Binomial {
private static SplaySymbolTable<IntPair,Long table =
new SplaySymbolTable<IntPair,Long();
...
final static boolean memoize = true;
public static long binomial(int n, int k) {
...uses table if memoize is true...
}
public static void main(String[] args) {
long startTime = System.nanoTime(); // start clock
int n = 35;
int imax = n/2;
for (int i = 2; i <= imax; i++) {
long v = binomial(n, i);
System.out.println("binomial(" + n + ", " + i + ") = " + v);
}
long endTime = System.nanoTime(); // stop clock
double seconds = (endTime - startTime)*1e-9;
System.out.println("time = " + seconds + " seconds");
System.out.println("comparisons = " + table.comparisons);
System.out.println("rotations = " + table.rotations);
}
}
5 What to Submit
You will archive all your source code, a README text file, and any other supporting files into a jar
file and submit your solution electronically via the class web site. The README file must contain
the following:
• A brief description of the project.
• The name and email of all authors (you should be the sole author of SplaySymbolTable.java).
• A brief summary of the speed comparison of the memoized and non-memoized computation
of
n
k
?
for k = 2, . . . , n/2 for moderately large n ≥ 35. How many comparisons and rotations
were necessary?
• A list of all files in the archive.
• Explain how to run your main test harness.

More products