$30
Project 2. A Spreadsheet Application with Static Type Inference
In this project, we’ll continue building the spreadsheet software. We will work in Racket, and build a static type
inferencer and checker for the spreadsheet language to make sure that the spreadsheet values are appropriately typed.
Moreover, we will use miniKanren to infer the types of columns.
The project is a chance for you to demonstrate that you can:
1. Use racket to solve type checking problems.
2. Use miniKanren and constraint logic programming to solve problems like type inference and type inhabitation.
3. Use good programming style while writing functional code in Racket.
You may work with one partner to write the code for this project. However, each student must independently write a
description of how the code works, and their contribution to the project.
As in Project 1 you may share some test cases with other students. However, tests must be shared on Piazza, and
each team can only share up to 3 test cases.
Starter code
Download this file from Quercus files:
• Projects/p2/p2-base.rkt
• Projects/p2/p2-spreadsheet.rkt
• Projects/p2/p2-starter-tests.rkt
• Projects/p2/mk.rkt
Submission
There will be three Markus assignments set up:
• “p2”: Submit the file p2-base.rkt and p2-spreadsheet.rkt Do not submit mk.rkt, since we will be supplying
our own. Only one person in each team needs to submit this file.
• “p2-writeup”: Submit a PDF file p2-writeup.pdf containing your written explanation. Each team member
must submit this file separately.
If you choose to work with a partner for an assignment, you must form a group for “p2” on MarkUs. One student
needs to invite the other to a group. You should declare a partnership well before the deadline (there is no downside
of doing so). Ask your instructor for help if you’re having trouble forming a group.
Note that “p2-writeup” deadline is set up two days later than “p2”, to avoid double-deducting grace tokens. You may
submit p2-writeup up 48 hours past the p2 deadline, but we will not accept writeups submitted any later than that
(i.e., you cannot use any grace tokens to further extend the writeup deadline).
A statically typed spreadsheet application
In project 1, we wrote a spreadsheet application that fills in the values of computed columns in a spreadsheet like this
one:
Column: id name age voter name2 year-until-100
Formula: (>= age 18) (++ name name) (- 100 age)
Values: 1 “adam” 12 #f “adamadam” 88
2 “betty” 15 #f “bettybetty” 85
3 “clare” 18 #t “clareclare” 82
4 “eric” 49 #t “ericeric” 51
5 “sam” 17 #f “samsam” 83
In our Haskell version of the spreadsheet, each spreadsheet value had an implicit associated type represented using
1
value constructors: number, string, boolean, and error.
In this part of the project, we will use Racket instead. At the same time, we will add some static type checking to our
spreadsheet language DeerLang. In other words, we should be able to see that the expression (>= x "hello") will
generate a type error without evaluating any code.
Task 1. A Simple Type Inferencer
Recall the spreadsheet language DeerLang definition from project 1; we’ll use DeerLang again in this project:
<expr> = VALUE
| ID
| '(' BUILTIN <expr> ... ')' ; builtin function call
| '(' 'lambda' '(' ID ... ')' <expr> ')' ; function definition
| '(' <expr> ... ')' ; function expression
VALUE = ... numbers, strings, booleans ...
BUILTINS = ... described below ...
ID = ... valid identifiers ...
For task 1 of the project, we’ll write a type inferencer that takes an expression in DeerLang and determines the
type of the result of evaluating that expression. Our language will support three data types: numbers, strings, and
booleans. We’ll use the Racket symbols 'num, 'str and 'bool to represent these types.
We will be implementing the function typeof in the file p2-base.rkt. This function takes an expression and a type
environment (to be explained soon), and returns the type of the expression. Here are some examples of what our
function should return:
> (typeof 3 '()) ; ignore the second argument for now
'num
> (typeof "hello" '())
'str
> (typeof #t '())
'bool
> (typeof '(= 3 3) '())
'bool
The type environment will be an association list that stores the types of identifiers. In other words, the type
environment is a list of pairs, where the car of each pair is the identifier name, and the cdr is its type. Using this
type environment, we can identify the types of expressions involving identifiers:
> (typeof a '((a . num) (b. str)))
'num
> (typeof b '((a . num) (b. str)))
'str
> (typeof '(+ a a) '((a . num) (b. str)))
'num
(You might be wondering why we are using an association list with O(n) lookup time, rather than a hash map with
O(1) lookup time. The reason is that we will be turning the function typeof into a miniKanren relation typeo in
the next part of the project. Since miniKanren only works with simple Racket list structures, we’ll consistently use
association lists for our type environment.)
Like before, the spreadsheet application has several builtin operations. We added two more operations to the list
from project A. The builtin definitions and types are shown in the table below:
Racket Symbol Racket Function Arguments Return Type
'+ + 2 numbers number
'- - 2 numbers number
'* * 2 numbers number
2
Racket Symbol Racket Function Arguments Return Type
'/ / 2 numbers number
'> > 2 numbers boolean
'= equal? 2 numbers boolean
'>= >= 2 numbers boolean
'++ string-append 2 strings string
'! not 1 boolean boolean
'num->str number->string 1 number string
'len string-length 1 string number
Types of builtin operations and functions will be represented as a list, with the first element being a list of argument
types, and the second element being the return type. Here’s how we will represent function types:
; type of the builtin =
'((num num) bool) ; takes 2 numbers and returns a boolean
; type of the builtin !
'((bool) bool) ; takes 1 boolean and returns a boolean
Identifiers may also represent functions, so you can have function types in the type environment:
> (typeof 'f '((f . ((num) num)) (g . ((num) str))))
'((num) num)
> (typeof '(g (f 3)) '((f . ((num) num)) (g . ((num) str))))
'str
The typeof function should also be able to recognize type errors, and return the symbol 'error.
> (typeof '(+ 3 "hello") '())
'error
> (typeof '(f (g 3)) '((f . ((num) num)) (g . ((num) str))))
'error
Your task is to complete the function typeof that can return the type of a DeerLang expression that does not not
involve function definitions. You also can assume that we won’t test with free identifiers (identifiers whose types
are not in the type environment). Here are some example expressions that we don’t expect you to be able to handle
for task 1:
; NOT TESTED: contains lambda expressions
(typeof '(lambda (x) (+ x x)) '())
(typeof '((lambda (x) x) 3) '())
; NOT TESTED: contains free variabels
(typeof '(+ x 3) '())
(typeof '(+ x y) '((x . 3)))
; NOT TESTED: builtins are not identifiers
(typeof '+ '())
We recommend starting by writing typeof to handle literal values, then identifiers. Then, handle builtin functions,
either one at a time or all at once. Finally, handle function calls. Alternatively, you can treat builtin functions as
regular function calls. This second approach requires more thinking time, but less coding and debugging time.
Task 2. A Type Inferencer Relation in miniKanren
Handling function definitions is difficult in a functional setting. Consider the expression ((lambda (x) x) 3). The
function definition does not actually constrain the type of the parameter x! However, a function like (lambda (x)
(+ x x)) already fully constrains its parameter and output type. In other words, the type of a function parameter
could be constrained by the function body and by the argument type.
3
Problems like type inferencing where constraints can come from multiple places and in somewhat complicated ways is
much more easily solved using logic programming. To that end, we’ll write a relation typeo instead of a function.
The typeo relation will have three arguments: the DeerLang expression (including function definitions!) the type
environment (association list), and the output type ('num, 'str, or 'bool). The relation will succeed if the output
type matches the input.
Here are some examples of what running the typeo relation could look like. We use a logic variable out to represent
the output type of an expression, and query the value of out:
> (run 1 (out) (typeo '(> 4 5) '() out))
'(bool)
> (run 1 (out) (typeo '(> 4 "hi") '() out))
'()
> (run 1 (out) (typeo '((lambda (x) x) 3) '() out))
'(num)
Step 1 Handle constants, identifiers, and built-in expressions. Start by converting the typeof function into the
typeo relation. For example, you might have had these lines of code in your typeof function:
(define (typeof expr env)
(cond
[(number? expr) 'num]
...))
The relational form typeo of the function explicitly describes the constraint on the output type.
(define (typeo expr env type)
(conde
((number? expr) (== type 'num))
...))
You may find the relations numbero, stringo, boolo, and symbolo useful. You may also need to create fresh logic
variables when handling built-ins.
Step 2 Handle function calls. After step 2, your typeo relation should be the relational form of your typeof function.
This step (handling function calls) should be similar to handling builtins, except that you won’t know the length of
your argument list.
You may find it useful to implement a helper relation type-listo that takes a list of expressions, a type environment,
and a list of types corresponding to those expressions.
Step 3 Handle function definitions. This step will be the most challenging. You will need to manipulate the type
environment here, and possibly add logic variables in the type environment. You might also find it useful to write
helper relations, including relations like lookupo and appendo that we wrote during lecture.
Task 3. Type Checking a Spreadsheet
With typeof and typeo available to us, we can type check an actual spreadsheet that includes type annotations. The
input to the type checker will be similar to project 1, but each column will have a type annotation:
<spreadsheet> = '(' 'spreadsheet'
'(' 'def' (ID <expr> ) ... ')'
'(' 'columns' (ID <type> <column>) ... ')'
')'
<column> = '(' 'values' <type> VALUE ... ')'
| '(' 'computed' <type> <expr> ')'
<type> = 'num' | 'str' | 'bool'
Here is an example spreadsheet:
(define sample-spreadsheet
'(spreadsheet
4
(def (voting-age 18)
(concat (lambda (x y) (++ x y)))
(canvote (lambda (x) (>= x voting-age))))
(columns
(id num (values 1 2 3 4 5))
(name str (values "adam" "betty" "clare" "eric" "sam"))
(age num (values 12 15 18 49 17))
(voter bool (computed (canvote age)))
(name2 str (computed (concat name name))
(year-until-100 num (computed (- 100 age)))))))
Complete the function type-check-spreadsheet in the file p2-spreadsheet.rkt. This function takes a spreadsheet,
and returns a list of booleans representing whether the annotated type of each column is correct. Here is one example:
> (type-check-spreadsheet
'(spreadsheet
(def (voting-age 18)
(canvote (lambda (x) (>= x voting-age))))
(columns
(name str (values "adam" "betty" "clare" "eric" "sam"))
(age num (values 12 15 18 49 17))
(voter bool (computed (canvote age)))
(voter2 bool (computed (>= age name))))))
'(#t #t #t #f)
This output shows that type checking succeeded for the first three columns, but not for the last one. The last column
uses the formula (>= age name) where name is a string rather than a number.
Note: If you wish for us to test this function with our own correct implementation of typeo and typeof, change the
import in p2-spreadsheet.rkt. The instructions are at the top of that file.
As before, you can assume that:
• computed columns will only depend on columns appearing before it
• there will always be at least 1 column with raw data initially and that definitions+identifiers names will never
be the same.
• definitions will not be recursive or mutally recursive
• all value columns have the same number of elements, consistent with the number of rows in the spreadsheet.
You may not assume that all value columns have the correct types. For example, the “name” field below has a number
in the first row rather than the string, so the type checking for that column should fail. Any other computed column
that depends on “name” will also fail the type check.
> (type-check-spreadsheet
'(spreadsheet
(def (voting-age 18)
(canvote (lambda (x) (>= x voting-age))))
(columns
(name str (values 4 "betty" "clare" "eric" "sam"))
(age num (values 12 15 18 49 17))
(voter bool (computed (canvote age)))
(voter2 bool (computed (>= age name))))))
'(#f #t #t #f)
Task 4. Synthesizing Programs (Type Inhabitor)
One of the neat features of miniKanren is that logic variables can go anywhere! In p2-spreadsheet.rkt, write a
macro that takes an expression with the logic variable BLANK in the abstract syntax tree expression structure, the
type you would like the expression to be, and the maximum number of results n, and returns a list of possible ways to
replace BLANK in the expression so that it would satisfy the type checker. Here are some examples.
5
It is okay if your list of examples are different, or if they do not appear in the same order!
> (fill-in BLANK `(+ 3 ,BLANK) 'num 5)
'((_.0 (num _.0)) ; BLANK can be any number
((+ _.0 _.1) (num _.0 _.1)) ; BLANK can be the sum of two numbers
((- _.0 _.1) (num _.0 _.1)) ; BLANK can be the difference of two numbers
(((lambda () _.0)) (num _.0)) ; BLANK can be a call to a thunk that returns a number
((+ _.0 (+ _.1 _.2)) (num _.0 _.1 _.2))) ; BLANK can be the sume of three numbers...
In this example, we are filling in the blank in the expression (+ 3 BLANK) so that the resulting expression is of type
'num. We are returning 5 results.
Note: If you wish for us to test this operation with our own correct implementation of typeo and typeof, change
the import in p2-spreadsheet.rkt. The instructions are at the top of that file.
Grading
• 25 points: Task 1
• 30 points: Task 2
• 20 points: Task 3
• 5 points: Task 4
• 10 points: Style
• 10 points: Quality of the Written Explanation
Working with a partner You are encouraged to work with a partner. We expect both partners to contribute
equally to the project, and to understand all the code that you submit.
Style Rubric (10 points) We’ll be grading for the following:
1. (3 pts) Are you indenting your code appropriately? How many code indentation issues can the TA find in a 5
minute period?
2. (3 pts) Are you using let* to effectively reduce duplicate code? How many code duplications can the TA find
in a 5 minute period?
3. (4 pts) Are you documenting your code effectively? Are you choosing good variable names? Can the TA choose
a handful of variable names, and understand what the variables store?
Style: Indentation Appropriately indenting Racket code makes the code easier to read. If you are using Dr. Racket,
you can use Dr. Racket to automatically indent your code for you. Here’s an example where having the wrong
indentation can make your code more difficult to read:
(define (foo x y z)
(cond [(equal? x 1) y]
[else (let* ([m (+ x y)]
[n (+ y z)]) ; <-- looks like a branch of the "cond"!
(+ m n))]))
Properly align your code so that sibling sub-expressions have the same indentation level. That way, your code becomes
much easier to scan and understand.:
(define (foo x y z)
(cond [(equal? x 1) y]
[else (let* ([m (+ x y)]
[n (+ y z)]) ; better!
(+ m n))]))
6
Style: Repeated Code Good Racket code liberally use let* to define local variables to prevent repetition. In
fact, expert Racket code have very little indentation, and a lot of define and let* expressions!
You should not be afraid to define new helper functions, and to use many, many local variables! It is better to use a
local a variable called defs and assign it to a very simple expression like (rest (second spreadsheet)), than to
have that expression repeat in many places.
Written Explanations (10 points) Submit a one-page written (max 500 words, Times New Roman, font size
12) explanation answering the below questions. The explanations should be written and submitted independently,
even if you worked with a partner.
1. (3 pt) If working in a team, describe your contribution to the project. Otherwise, please declare that you wrote
all the code.
2. (3 pt) Describe the strategy behind your typeo function. Why can’t we use the same strategy in the typeof
function to handle function definitions?
3. (4 pt) You have two options for this question.
Option 1:
We used macros to implement the logic programming language features “-<” and “?-”, and we wrote an interpreter
to implement DeerLang in project 1. Explain the advantages and limitations in using these approaches to build
language features. You will be graded for correctness and insight.
Option 2:
Many people in the programming languages research community strive to make their talks accessible to people who
are new to the field. These are some of the talk recordings to give you a sense of what people are working excited
about. Choose one of the videos, watch it, and write a paragraph describing what the talk is about. You will be
graded on how convinced TA is that you watched the talk, based purely on your write up.
1. On the Expressive Power of Programming Languages (2019) https://www.youtube.com/watch?v=
43XaZEn2aLc We discussed the Church-Turing thesis in the first lecture. But if all Turing-complete language
have the same computational power, then how can we compare the expressiveness of different languages?
2. Four Languages from Forty Years Ago (2018) https://www.youtube.com/watch?v=0fpDlAEQio4 This
talk is a gentle introduction to the “big ideas” in programming languages, including logic programming, algebraic
data types, and others.
3. “I See What You Mean” (2015) https://www.youtube.com/watch?v=R2Aa4PivG0g This talk is about the
connection between logic programming and Structured Query Language, but continues the discussion by adding
the notion of time.
4. Barliman: trying the halting problem backwards, blindfolded (2016) https://www.youtube.com/
watch?v=er_lLvkklsk&feature=youtu.be&t=95 This talk is about using miniKanren to build an IDE that takes
test-driven development one step further.
5. Boundaries (2012) https://www.destroyallsoftware.com/talks/boundaries This video is a little old, but the
ideas are still relevant even if the technology isn’t. This talk is the most “industry focused” out of the list of
talks.
6. Propositions as Types (2015) https://www.youtube.com/watch?v=IOiZatlZtGU How the simply typed
lambda calculus corresponds to logic.
7. A Little Taste of Dependent Types (2018) https://www.youtube.com/watch?v=VxINoKFm-S4
Here’s our recommendations on how to choose the talks:
• If you enjoyed the week 9 materials on miniKanren, watch talk #4.
• If you want to see a broad view of programming languages, watch talk #1 or #2
• If you want to see a different view of logic programming, watch talk #3
• If you want to understand what type theory is and why there is so much research in this area, watch talk #6 or
#7
• If you want a talk that is more about software engineering, watch talk #5.
7