Starting from:

$29

Assignment 1 Basic array manipulation functions


// ~ Overview ~ //

In this assignment we will be writing some basic array manipulation
functions as well as reimplementing a number of standard C-library
string functions. The purpose is to develop a deeper understanding of C
arrays and strings as well as to familiarize yourself with the C standard
library and its idioms.

You can use the "man" program to see the official documentation of the
assignment's functions. "Man" stands for "manual"; however, if it
helps you remember, pretend it stands for "mansplain".

So for example, to see the official documentation for the "strlen"
function just type in the Terminal:

man strlen

Voila! Read and you know everything you'll ever need to know about the
strlen function.


// ~ IMPORTANT ~ //

You do not need to use malloc/free in this assignment. Furthermore,
you must not include the <string.h header file in your submission.

Additionally, make sure you follow the submission guidelines!


// ~ Learning Goals ~ //

(1) How to compile, run, and test your answer for correctness.
(2) How to run your code under valgrind, in order to find memory
errors.
(3) How to run your code under gdb.
(4) How to "iterate" over arrays in C.
(5) How to use Version Control.
(6) Demonstrate that you ran your code under gdb. See further down in
this README for more details.
(7) Learn about C arrays, strings, and ASCII.


// ~ Getting Started ~ //

The PA01 folder contains five files:
(1) answer01.c: this is the file that you hand in. It has the skeleton
of this assignment's functions in it, and you must "fill in the blanks".
(2) answer01.h: this is a "header" file and it contains a complete
explanation of the functions you will be writing for this assignment.
(3) pa01.c: in computer science, you cannot know if a function is
correct without running it. You can use this file to write code that
runs the functions in answer01.c, in order to ensure their
correctness. pa01.c comes with some example testing code to help you
get started. Modify or delete this code: it is yours to do with what
you will.
(4) README: this file.
(5) tester: see the discussion below on testing your assignment.

Follow the discussions below on how to compile and run your code, as
well as test and submit it.


// ~ Submitting Your Assignment ~ //

You must submit one zip file to blackboard. This zip file must contain
precisely three files:
(1) answer01.c, your solution
(2) memcheck.log, a Valgrind log that was produced by running your
program.
(3) gdb.txt, a logfile produced through running gdb.

Notes on creating each of these files are included in this readme.
You create the zip file using the following command.

zip pa01.zip answer01.c memcheck.log gdb.txt

If your zip file does not meet the above specification, then you may
get zero for the assignment. In particular, right-clicking and
"archiving" the files in a window manager such as Windows, OSX, or KDE,
may produce the wrong directory structure, resulting in a zero
mark. YOU HAVE BEEN WARNED. Following the instructions is *part* of
getting the assignment correct. So please follow the instructions.

Submit "pa01.zip" to blackboard.


// ~ Determining Your Mark ~ //

IMPORTANT!!! -- You will probably feel inclined to use the tester to
test your code without actually making your own test cases in your
main function. DO NOT DO THAT. It is EXTREMELY beneficial to sit down
and write your own testcases so that you actually understand what it is
that you're trying to accomplish as well as where things go wrong, since
99.99% of code is not bug free the first time through. The tester is
slow for a reason, please do not waste your time sitting there waiting
for it to run, as it most likely will not tell you what is wrong.

Additionally, do NOT write all of this code at once and then try testing
it for correctness. It is almost a guarantee that it will NOT work if you
do this. Test your code as you go, break it down into smaller pieces.
Even smaller than functions. When you write something in your code to do
any given task, test it and make sure it works before moving on. You will
save yourself a lot of time this way. It only takes seconds to recompile
and run a program of this size. For more information on how to test your code,
see the Testing Your Program section below.

This assignment is pass or fail. You pass if everything works, and
fail if there is one or more things wrong. You can determine your mark
before submitting your assignment. To test your program, you must run
the "tester" program. You run the tester program as follows:

./tester answer01.c

You can write your assignments on any computer, using any development
tools; however, they will be marked on the lab computers, and so you
should ensure that your program works on these computer before
submitting your assignment. This applies to *all* assignments in this
course.

Getting a passing grade from the tester only ensures a passing grade
for the assignment when you submit the assignment correctly, including
all auxiliary files listed above.


// ~ (1a) Compiling Your Program ~ //

The remainder of this document is to help you figure out how to meet
the learning goals of this assignment. Firstly, you must compile, run
and test your program.

In this course we use the compiler "gcc", which is one of the world's
best and most important C compilers. You can compile your program by
typing the following into the terminal (make sure you are in your PA01
directory):

gcc -Wall -Wshadow -g pa01.c answer01.c -o pa01

Gcc takes a wide variety of arguments, and these are the most
important arguments that we will use, and re-use in this course. The
arguments mean:
(1) -Wall, turn on all *common* compilation warnings. You will fail
the assignment if your solution has 1 or more compilation warnings.
(2) -Wshadow, in addition to common warnings, warn if a variable
declaration "shadows" the declaration of another variable.
(3) -g, turn on debugging symbols. This produces a larger executable
program; however, you can use this program in a debugger. You will
need to learn how to use the debugger "gdb" in order to pass this
course.
(4) pa01.c answer01.c. These are the files that you are compiling and
linking into a computer program. You only ever compile ".c" files. You
never compile ".h" files. Each ".c" file is compiled separately (gcc
does this internally), and multiple ".c" files are linked together
into a single computer program. Compiling and linking are two
different steps, but here we are doing both steps with a single
instruction. The "int main(...)" function must appear in precisely one
of those ".c" files (pa01.c). It cannot appear in answer01.c. If you
put "int main(...)" in answer01.c, then the tester will fail and you
will fail the assignment. (Aside: there are tricks that allow you to
maintain multiple main functions, using conditions; however, only one
main is present in an executable.)
(5) -o pa01, create an "output file" pa01. By default gcc will produce
a file called "a.out", and we are just telling gcc to name that file
"pa01" instead.

// ~ (1b) Running Your Program ~ //

To run your program, simply type into the terminal:

./pa01

Note that this should print:

Welcome to ECE264, we are working on PA01.

This statement is printed by a "printf" statement in pa01.c. You can
(and should) modify pa01.c, and edit its behavior, in order to test
the behavior of the functions you are writing in "answer01.c".

The file (technically: compilation unit) pa01.c "knows" about the
functions in answer01.c, because it "includes" the declarations for
those functions, which are written in the file "answer01.h".
Declarations and are not compiled into code, but instead, they merely
describe the existence of some compilable function in some compilation
unit somewhere. In this case, the functions declared in answer01.h are
defined in the compilation unit "answer01.c".

In future assignments, you will only be provided with an "answerXY.h"
header file, with declarations. You will need to write your answerXY.c
file from scratch, and write your own test-cases and main function in
a separate compilation unit.


// ~ (1c) Testing Your Program ~ //

It is your responsibility to test your program and ensure that it
works. A program that is 99% correct is still 100% wrong. Many
students seem to have trouble with this concept, but it is extremely
important, and straight-forward.

Engineers are people who get things right. If an engineer builds a
wheel, it is round. If an engineer makes a door, it is not missing a
hinge or a handle. Same goes for computer programs. It is right or
wrong. Computer programs are very difficult to write correctly, and
large programs almost always involve subtle bugs. For this reason,
software engineers need to adopt a zero-tolerance policy for software
defects. If you know a bug exists in your code, and you do not fix it,
then you will receive a fail.

A "tester" program exists, which will tell you the mark you will
receive when you submit your assignment. (Assuming you submit it
correctly with all auxiliary files.) Receiving a pass does not mean
your assignment is perfect, but it does mean that your assignment
passes the tests we thought to put it through.

This tester program is provided as a courtesy -- it is up to you to
think about your program, and test it. This is a fundamental skill for
any computer programming task. If you can't get it right, then no-one
else will get it right for you. For this reason, the tester program is
unrealistically slow for repeated use. Please do not run the tester
program until *after* you are *sure* you have your assignment 100%
correct.

So how do you write and test your code?

The most important thing to understand is that you should test your
code as you go. Because you are just starting, pa01.c includes a
little bit of testing code to start you off. However, in future
assignments, you will have to generate all of the testing code
yourself.

To generate testing code, you have to *think* about the function your
developing, and then you have to write code that *demonstrates* that
it *always* works. This is a core skill to develop if you are
interested in being a competent computer programmer.


// ~ (2a) Running ./pa01 Under Valgrind ~ //

Valgrind is an extremely useful tool for finding problems in C
programs. It is free-software, and (for C) as good as any software
commercially available. Comparable commercial software is usually
worth a few thousand dollars per user. (!).

It is a core goal of this course that you learn how to use Valgrind in
order to ensure that your assignments do not have memory errors. If
you do not learn how to use Valgrind, and you pass the course, then we
have failed as educators. Valgrind is an invaluable tool in an
engineers toolbox to do what most people in society don't know how to
do: ensure that something is correct.

Valgrind has many functions; however, in this course, we will concern
ourselves with the "memcheck" function. The memcheck function runs
your code in an emulated environment, and checks whether you access
any bits of memory that you shouldn't. Memcheck also tests to see if
you have allocated (asked for) memory that you never freed (gave
back). Together, these two types of memory problems are some of the
most pernicious programming problems in the world, and cost literally
billions of dollars each year. Valgrind will help you find these
problems.

To run your code under Valgrind, you must first compile your code, and
then type the following into the terminal:

valgrind --tool=memcheck --leak-check=full --verbose --log-file=memcheck.log ./pa01

This *runs* your program, but to check for errors, you must check the
log file that you just generated: "memcheck.log". This file
contains a lot of information. If your solution has no memory errors,
then you will see at the end of the file something like:

ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

Don't worry about suppressed errors: they are not errors. When your
program runs, the C runtime (almost always) performs a few memory
tricks in order to execute more efficiently. Memcheck picks up these
advanced programming tricks, and then they are suppressed because it
is well-known that these tricks are not errors. Valgrind offers these
error suppression facilities because they are required by advanced
developers who work on important "glue" code that holds modern
operating systems together. In short: don't worry about suppressed
errors. Google is your friend if you are interested in more
information; however, it will not be tested in this course.


// ~ (2b) Debugging Valgrind Errors ~ //

Look for the *first* error, and debug that error. Sometimes you will
get hundreds (or even hundreds of thousands) or errors; however, they
may all caused by the same line of code. So just debug the first
problem you encounter in the "memcheck.log" file.

Most memory errors fall into one of four categories:
(2b.i) "Invalid read of size X", where X is a number. This means that
you are reading X bytes of memory that you do not have access to.
(2b.ii) "Invalid write of size X", as above, except that you are
*writing* memory. Writing means, that the memory access is on the
left-hand side of an "=" sign. (i.e., assignment operation, e.g. x = 5.)
(2b.iii) "X bytes in 1 blocks definitely lost in loss record U of V",
which means that you have asked for X bytes of memory, but never freed
them.
(2b.iv) "Conditional jump or move depends on uninitialized value(s)" A
conditional jump or move is what happens in an if statement or a for
loop, where some condition (e.g., if(x 5) {...}) determines the next
line of code that is executed. If the memory is uninitialized (e.g.,
"x" has not been set), then the program behavior is random. When you
see this error, look at the involved statement and ask yourself how
any involved variables could be uninitialized.

These memory errors can be difficult to solve, but it is much easier
if you think precisely about your code. This is necessary skill, and
expect to spend much time over this semester learning to do just that.

// ~ (2c) Submitting a Valgrind Log ~ //

In order to pass this assignment, you must submit a log file produced
by valgrind. See the section on submitting your assignment (above) for
information on how to do that.


// ~ (3) Using Gdb ~ //

Gdb is easily one of the world's most important debuggers. A debugger
is a virtual environment that you run your code within. This
environment allows you to control the execution of the code, and watch
what happens to the variables. Make sure that you *read* the course
notes on using gdb. Using gdb is a course requirement. If you ask for
help, you may in turn be asked to run your code under gdb.

To pass this assignment, you must submit a gdb log file. To produce a
logfile you must turn logging on. My preferred way of doing this is to
use command-line switches to tell gdb to set logging on. The command
looks as follows:

gdb -ex "set logging overwrite on" -ex "set logging on" ./pa01
...

You must then set a breakpoint (read the documentation to learn how to
do this), step through your code, print the values of some variables,
and then quit gdb. You submit the file "gdb.txt". See the submission
guidelines for more information.


// ~ Summary ~ //

# Compile
gcc -Wall -Wshadow -g pa01.c answer01.c -o pa01

# Run -- you must write your own tests
./pa01

# Run under valgrind
valgrind --tool=memcheck --leak-check=full --verbose --log-file=memcheck.log ./pa01

# Don't forget to *LOOK* at the log-file "memcheck.log"

# Run under gdb -- this is an *example*. You will need to adjust these
# commands to your situation.
gdb -ex "set logging overwrite on" -ex "set logging on" ./pa01
(gdb) b pa01.c:59
(gdb) run
(gdb) p argc
(gdb) n
(gdb) n
(gdb) c
(gdb) quit

# See what your grade should be (providing you submit everything
# correctly):
./tester answer01.c

# Create a zip file of your solution:
zip pa01.zip answer01.c memcheck.log gdb.txt

# Upload pa01.zip to blackboard.

# Please read all instructions before asking for help. We are perfectly
# willing to help anyone who is willing to help themselves.


// ~ FAQ ~ //

(*) What the pack is #include?

The #include statements tell gcc to copy and paste the files <stdio.h
and "answer01.h" into the top of the document. (Literally, copy and
paste.) The angle brackets on <stdio.h tells gcc that stdio.h is a
system header file. The quotes tell gcc to look in the current
directory, so "answer01.h" must be in the same directory as the .c
file.

Once included, a .c file can use any functions that are declared in the
header file. Header files must only contain declarations.

// A function declaration... does not compile into anything
size_t my_strlen(const char * str);

// A function definition... compiles into code... never put a
// definition in a .h file. (They belong in .c files.)
size_t my_strlen(const char * str)
{
return 0; // not very useful though.
}

The last step of building a computer program is called "linking". gcc
locates all of the definitions in a collection of compilation units
(.c files), and glues everything together.

Never, NEVER, include a .c file:

// Evil
#include "answer02.c"

----------------------------------------------------------------------

(*) What is "size_t"?

Read as "size-type", this is just some form of unsigned integer. (An
unsigned integer cannot be negative.) The C library uses "size_t" to
denote the size of various things, and it is the return type of the
"sizeof(...)" operator.

----------------------------------------------------------------------

(*) Undefined behavior?

Assumptions are always made whenever a function is written, in any
computer language. Programming gets extremely complex surprisingly
quickly, and for this reason, good programmers always specify what
their assumptions are. These assumptions are often called
"preconditions". If the preconditions are met, then the function will
execute perfectly. (Because you're a good programmer, right?) If the
preconditions are not met, then something will happen, but no
guarantees are made. This is called "undefined behavior", and is
always absent from good computer programs. Thus, it is always a bug to
call a function without meeting its preconditions, even if the
function seems to otherwise work.

A precondition of the strlen(const char * str) function is that "str"
is a pointer to a valid C string. If you pass a pointer to an integer,
an invalid pointer, or a NULL pointer, then the behavior is
undefined. It is up to the function caller to ensure that "str" is a C
string, and the strlen(...) function itself is under no obligation to
check this. Thus in most implementations, calling strlen(NULL) will
crash your program.

----------------------------------------------------------------------

(*) What is the deal with "const"?

"Const" is a type "qualifier". That means that you can take any type
and add const to it. (i.e., qualify the type with a const.) When so
qualified, the compiler will try to stop you from modifying the
value. So:

const int x = 4;
x += 4; // ERROR: assignment of read-only variable 'x'

const char * str = "Hello World!";
str[0] = 'h'; // ERROR: assignment to read-only location *str

If you try to return a const pointer as a non-const pointer, then gcc
will give you what seems at first like a cryptic warning message:

char * my_strchr(const char * str, int ch)
{
return str; // warning: return discards 'const' qualifier
// from pointer target type
}

Can you see that the 'const' "qualifier" has been 'discarded'? It is
really very straight-forward once you are familiar with the jargon. To
circumvent this warning, you use a type-cast, which tells the compiler
that you really want to do this. See below for why this is done.

char * my_strchr(const char * str, int ch)
{
return (char *) str; // no warning.
}

You can always treat a non-const type as const.

----------------------------------------------------------------------

(*) What is the deal with my_strchr, my_strrchr, my_strstr, and const?

The C programming language has a long history, and a number of
programming conventions, called "idioms". Returning non-const pointers
is a widely used idiom in the C language, that ensures that the same
function can be used for const and non-const purposes. In C, *you*,
the all-powerful programmer, are expected to get things right, and as
such, const is merely a guide.

const char * s1 = my_strchr("Hello World!", 'o'); // fine
char * s2 = my_strchr("Hello World!", '!'); // also fine
s2[0] = '\0'; // compiles without warning, but will fail miserably

In short, do not expect C to hold your hand. This is the greatest
strength and also the greatest weakness of programming in C.

----------------------------------------------------------------------

(*) What are form-feeds, carriage returns and vertical tabs?

The ASCII character set includes special control characters which can
be used to control the format of text output. Some of these control
characters are rarely used. For example, if you terminal is correctly
set up, then you can produce an audible beep by printing a "bell"
character: '\a'. Most modern terminals have beeps disabled by default
(mercifully). Some control characters are still very important, such
as '\n', which specifies a newline character.

It is important to understand that text data is just a sequence of 1s
and 0s, and it is *interpreted* as text according to an encoding
("code-book") which says things like "this binary value means that
character". By default, C uses ASCII encoding, which has a number of
different white-space capabilities. Historically, all of these
characters were used so that a programmer could position the head of a
printer at specific locations on a page. This capability has long been
superseded; however, it is important to appreciate it in order to use
ASCII correctly, and it just may come up in a future job -- updating
some of the millions of lines of legacy code still used by corporate
America.

The C standard specifies white-space to be:
(1) '\t', (ASCII value 9). Interpreted as a fixed number of space
characters. Historically this was typically 8 spaces.
(2) '\n', (ASCII value 10). So called 'line-feed', historically goes
to the next line but at the same horizontal position. The '\n' is now
always interpreted as a new line character, which implicitly includes
a 'carriage-return' (see below).
(3) '\v', (ASCII value 11). Interpreted as a fixed number of
new-lines. Historically this was typically 6 new lines.
(4) '\f', (ASCII value 12). A 'form-feed', historically asks the
printer to eject the current page, and start a new one.
(5) '\r', (ASCII value 13). A 'carriage-return', historically moves
the cursor to the beginning of the line. (Thus, a 'new line'
character is conceptually a line-feed and a carriage-return in one.)
(6) ' ', (ASCII value 32). The space character.

More products