$30
Computer Engineering 175
Phase V: Storage Allocation
“This goodly frame, the earth, seems to me a sterile promontory . . . ”
Shakespeare, Hamlet, Act II
1 Overview
In this assignment, you will extend your compiler to generate code for a 64-bit Intel processor running the Linux
operating system. This assignment is worth 10% of your project grade. Your program is due at 11:59 pm, Sunday,
March 5th.
2 Storage Allocation
Any variable declared outside of a function has static storage. Such variables and all functions are global symbols.
Any variable declared within a function has automatic (i.e., stack-based) storage. The type int is a signed type and
requires four bytes of storage. The type long is a signed type and requires eight bytes of storage. The type char is
a signed type and required one byte of storage.
The type “pointer to T” requires eight bytes of storage regardless of the type T. The type “array of T” is stored
as a consecutive sequence of objects of type T. The total number of bytes of storage is therefore equal to the length
of the array times the size of the type T. The first object in an array has index zero, with lower indices having lower
addresses than higher indices.
3 Expressions
3.1 Overview
For this assignment, your compiler will see only function calls in which all arguments are either 32-bit integer
literals or scalar variables of type int.
3.2 Semantic Rules
3.2.1 Primary expressions
primary-expression → id ( expression-list )
| id ( )
For function calls, arguments are passed by value. A function may therefore change the values of its parameters
without affecting the values of the arguments. The order of evaluation of arguments is unspecified. Recursive calls
to any function are permitted. For this assignment only, each call will have maximum of six arguments and each
argument will either be a 32-bit integer literal or a scalar variable of type int.
4 Function Definitions and Statements
4.1 Overview
A function definition contains a sequence of statements, which are executed in sequence. For this assignment, the
statements will be only simple assignment statements (Sec. 4.2.1) or simple function call expressions (Sec. 3.2.1).
1
4.2 Semantic Rules
4.2.1 Assignment statements
statement → expression = expression
The value of the right operand replaces that of the object referred to by the lvalue of the left operand. For this
assignment only, the right operand will always be an integer literal of type int, and the left operand will always be
a scalar variable of type int.
4.2.2 Parameter lists
parameter-list → parameter
| parameter , parameter-list
parameter → type-specifier pointers id
For this assignment only, all parameters will be scalar variables of type int, and no more than six parameters
will be specified.
5 Registers and Stack Frame
The 32-bit Intel processor has eight registers: %eax, %ebx, %ecx, %edx, %esi, %edi, %esp, and %ebp. The 64-bit Intel
processor extends these registers to 64-bit registers (now called %rax, %rbx, etc.) and provides eight additional 64-
bit registers %r8–%r15. The 32-bit register names may still be used to access the lower half of the 64-bit registers,
while names such as %r8d, %r9d, etc. are now used to access the lower half of the newer 64-bit registers.
The caller places the first six arguments into the %rdi, %rsi, %rdx, %rcx, %r8, and %r9 registers. Any remaining
arguments (though not allowed for this assignment only) are pushed on the stack from right to left. The call instruction then pushes the return address on the stack before transferring control. The first task the callee usually
does is to use the enter instruction (or equivalent instructions) to allocate its stack frame, which pushes the old
base pointer (%rbp) on the stack, copies the stack pointer (%rsp) to the base pointer, and allocates space by adjusting the stack pointer. Thus, the seventh argument would be found at sixteen bytes from the base pointer. The
stack must be kept aligned on a sixteen-byte boundary.
Arguments pushed on the stack are always eight bytes apart regardless of their underlying size. Consequently,
if an odd number of arguments are pushed on the stack, then it would not be properly aligned. In such a case, the
stack pointer is adjusted down by eight bytes before pushing the arguments, in effect pushing a fictitious argument
on the stack to make the number of arguments pushed be an even number.
The callee saves the %rbx, %rbp, and %r12–%r15 registers. Other registers are caller-saved. Any return value is
stored in the %rax register. The callee then uses the leave instruction (or equivalent instructions) to deallocate its
space and restore the old base pointer, but does not pop its arguments from the stack. The caller is responsible for
popping the arguments by adding the appropriate number of bytes to the stack pointer.
6 Assignment
You will write a code generator for Simple C by augmenting your parser, using the given rules as a guide. Your
compiler will only be given legal programs as input. Your compiler should write valid Intel assembly code to the
standard output. In the previous assignment, error messages were written to the standard error. Therefore, you
do not need to change or remove the semantic checks already in place. Your generated assembly code can be
assembled and linked using the native compiler along with any additional C files:
$ scc < file.c > file.s 2> /dev/null
$ gcc file.s [additional-source-files]
$ ./a.out
2
Your assembly code will be inspected to check that you are allocating a realistic amount of memory for the
stack frames.
7 Hints
You can use the native compiler to determine how to perform most operations. Using the -S flag will generate
assembly code rather than object code from a source program. You should be aware that the native compiler
sometimes uses sophisticated instructions and addressing modes for faster code. However, the correctness of
your code is what is important, not its speed, and using simpler instructions may result in code that is easier to
understand and debug.
3