CSCI 2040 A/B: Introduction to Python
Lab Assignment 4
Notes
1. You are allowed to form a group of two to do this lab assignment.
2. You are strongly recommended to bring your own laptop to the lab with Anaconda¹ and
PyCharm² installed. You don’t even have to attend the lab session if you know what you
are required to do by reading this assignment.
3. Only Python 3.x is acceptable. You need to specify your Python version in the first
line of your script. For example, if your scripts are required to run in Python 3.6, the
first line of each script should be:
#python_version == '3.6'
4. For those of you using the Windows PC in SHB 924A (NOT recommended) with your
CSDOMAIN account³, please log in and open “Computer” on the desktop to check whether an
“S:” drive is there. If not, click “Map network drive”, use “S:” for the
drive letter, fill in the path \\ntsvr1\userapps and click “Finish”. Then open the “S:”
drive, open the Python3 folder, and click the “IDLE (Python 3.7 64-bit)” shortcut to
start doing the lab exercises. You will also receive a paper document; if anything differs
from this assignment, the paper document takes precedence.
5. Your code should contain only the specified functions. Please delete all debug statements
(e.g. print) before submission.
Exercise 1 (20 marks)
Let a be the list of values produced by range(1, 11).
1. Using the function map with a lambda argument, write an expression that will produce
each of the following: (5 marks)
• A list of the square roots, rounded down, of the corresponding values in the original
list. The expected output should be a list as follows:
[1, 1, 1, 2, 2, 2, 2, 2, 3, 3]
• A list where each element is larger by one than the corresponding element in the
original list. The expected output should be a list as follows: (5 marks)
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
¹ An open data science platform powered by Python. https://www.continuum.io/downloads
² A powerful Python IDE. https://www.jetbrains.com/pycharm/download/
³ A non-CSE student should ask the TA for a CSDOMAIN account.
Note that you should use lambda arguments in this part.
2. Write a list comprehension that will produce each of the following:
• A list containing the values in the original list that are less than or equal to 7. The
expected output should be a list as follows: (5 marks)
[1, 2, 3, 4, 5, 6, 7]
• A list containing the squares of the odd values in the original list. The
expected output should be a list as follows: (5 marks)
[1, 9, 25, 49, 81]
Note that in this exercise, you are required to write only one line of code for each expression.
We will manually check the correctness of your answer.
Save your script for this exercise in p1.py
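As a reminder of the two constructs used in this exercise, here is a minimal sketch on a different list. The name b and the operations shown are chosen for illustration only and are not part of the assignment. In Python 3, map returns an iterator, so it is wrapped in list() to obtain a list.

```python
# Illustrative list; b is a hypothetical name, not the list a from the exercise.
b = list(range(1, 6))  # [1, 2, 3, 4, 5]

# map with a lambda argument: apply the lambda to every element.
doubled = list(map(lambda v: v * 2, b))

# List comprehension with a condition: keep only the even values.
evens = [v for v in b if v % 2 == 0]

# List comprehension that filters and transforms in one expression.
cubes_of_evens = [v ** 3 for v in b if v % 2 == 0]

print(doubled)         # [2, 4, 6, 8, 10]
print(evens)           # [2, 4]
print(cubes_of_evens)  # [8, 64]
```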
Exercise 2 (20 marks)
x is a list of strings. For example,
x = ['python is cool',
'pythom is a large heavy-bodied snake',
'The python course is worse taking',
'python python python python']
In this example, with n = 20 and m = 4, there is 1 low-frequency occurrence of the word 'python'
(case sensitive) in total: only the strings in x that are longer than 20 characters and contain
the word fewer than 4 times are counted.
To count the number of low-frequency occurrences of a certain word str in the strings that
are long enough, use filter and reduce to write a function low_freq_word_count(x,
str, n, m) in the functional programming paradigm.
This function takes a list of strings x, the string we want to find str, and two numbers
n and m as inputs. The returned value is the total number of occurrences of the string str
in those strings in the list x whose length is more than n (> n) and in which str appears
fewer than m times (< m). The function should be in the
following format:
def low_freq_word_count(x, str, n, m):
    # your code here
Hint:
• You could use filter to remove the strings that are not long enough or in which
the word appears too often, and use reduce to do the summation.
• str.count(sub) returns the number of occurrences of the substring sub in the string str.
Note that you are required to use filter and reduce in your code. Failing to use them will
result in a deduction of marks for this exercise.
Save your script for this exercise in p2.py
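The general filter-then-reduce pattern from the hint can be sketched as follows. This is only an illustration on made-up data (the list lines, the word 'spam' and the length threshold 4 are all hypothetical), not the required function. Note that in Python 3 reduce has to be imported from functools.

```python
from functools import reduce  # reduce is not a built-in in Python 3

# Hypothetical data, unrelated to the list x in the exercise.
lines = ['spam', 'eggs and ham', 'spam spam spam', 'ham']

# filter keeps only the items for which the lambda returns True:
# here, the strings longer than 4 characters.
long_lines = filter(lambda s: len(s) > 4, lines)

# reduce folds the remaining items into one value, starting from 0:
# here, the running total of occurrences of 'spam'.
total = reduce(lambda acc, s: acc + s.count('spam'), long_lines, 0)

print(total)  # 3
```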
Exercise 3 (40 marks)
Visualization is widely used for data analysis. In this exercise, you will use Python scripts to
draw 5 kinds of figures. All the scripts for this exercise should be in a single file p3.py. The
package matplotlib is useful for this exercise, and can be imported as follows⁴:
import matplotlib.pyplot as plt
Note: your grade for this exercise depends on the readability of your figures. All the plots
should be clear to read, and all the plots must contain an x-label and a y-label.
Histogram (10 marks)
A histogram⁵ is a way to visualize the distribution of a continuous variable.
• Write a script in p3.py to plot a histogram of the random numbers in random_numbers
generated by the following script. The x-axis is the value of the random numbers, and the
y-axis is the probability density. Hint: You could use hist() in matplotlib.
• Save your figure as histogram.png. Hint: You could use savefig() in matplotlib.
import numpy as np
random_numbers = np.random.normal(0, 1, 1000)
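A minimal sketch of how hist() and savefig() fit together is shown below. The number of bins (30) and the axis label texts are assumptions made for illustration; the Agg backend is selected only so that the script also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

random_numbers = np.random.normal(0, 1, 1000)

fig, ax = plt.subplots()
# density=True rescales the bar heights so that the total bar area is 1,
# which makes the y-axis a probability density instead of a raw count.
# bins=30 is an arbitrary choice for this sketch.
counts, bin_edges, _ = ax.hist(random_numbers, bins=30, density=True)
ax.set_xlabel("Value of random number")
ax.set_ylabel("Probability density")
fig.savefig("histogram.png")
plt.close(fig)
```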
Pie chart (10 marks)
A pie chart⁶ is a way to visualize the distribution of categorical data.
• Write a script in p3.py to plot a pie chart of the number of students in 3 selected
colleges of CUHK in 2019. You can select any 3 colleges of CUHK. The data for
the number of students can be found at
https://www.iso.cuhk.edu.hk/images/publication/facts-and-figures/2019/html5/english/10/
The categories in the pie chart are the colleges, and there should be label text for each
category on the figure. Hint: You may use pie() in matplotlib.
• Save your figure as pie.png. Hint: You could use savefig() in matplotlib.
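The way pie() attaches label text to each category can be sketched as follows. The college names and student counts here are placeholders invented for illustration, NOT the real CUHK figures, which have to be taken from the page linked above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Placeholder categories and counts, NOT real CUHK data.
labels = ["College A", "College B", "College C"]
student_counts = [300, 200, 500]

fig, ax = plt.subplots()
# labels= puts the category name next to each wedge;
# autopct= additionally prints each wedge's share as a percentage.
wedges, texts, autotexts = ax.pie(student_counts, labels=labels, autopct="%1.1f%%")
ax.set_title("Number of students by college (placeholder data)")
fig.savefig("pie.png")
plt.close(fig)
```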
⁴ https://matplotlib.org/
⁵ https://en.wikipedia.org/wiki/Histogram
⁶ https://en.wikipedia.org/wiki/Pie_chart
Bar chart (10 marks)
A bar chart is also suitable for visualizing categorical variables.
• Write a script in p3.py to plot a bar chart of the number of students in all 9
colleges of CUHK in 2019. There should be label text for each college on the figure.
Hint: You could use bar() or barh() in matplotlib.
• Save your figure as bar.png. Hint: You could use savefig() in matplotlib.
Scatter plot and line chart (10 marks)
Scatter plots⁷ and line charts⁸ are common ways to visualize the relationship between two
continuous variables. Suppose we have two lists of numbers x_list and y_list generated
by the following script.
import numpy as np
x_list = np.linspace(0, 1, 100)
y_list = x_list + np.random.rand(100)
• Write a script in p3.py to draw a scatter plot of x_list (x-axis) and y_list (y-axis).
Draw a line chart of the function y = x in the same figure.
Hint: You may use plot() and scatter() in matplotlib, and you could set the alpha
blending value to make the dots more transparent.
• You are recommended to try the marker='*' and color='red' options in scatter() and
the linestyle='dashed' option in plot().
• Save your figure as scatter_line.png. Hint: You could use savefig() in matplotlib.
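One possible way to combine scatter() and plot() in the same figure is sketched below. The alpha value 0.5, the legend and the axis label texts are assumptions chosen for illustration; the Agg backend is selected only so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x_list = np.linspace(0, 1, 100)
y_list = x_list + np.random.rand(100)

fig, ax = plt.subplots()
# alpha < 1 makes the markers partly transparent, so overlapping dots stay visible.
ax.scatter(x_list, y_list, marker="*", color="red", alpha=0.5, label="samples")
# Plotting x_list against itself draws the line y = x on the same axes.
ax.plot(x_list, x_list, linestyle="dashed", label="y = x")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
fig.savefig("scatter_line.png")
plt.close(fig)
```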
Exercise 4 (20 marks)
Mastering a programming language is not only about the syntax, but also requires one to
know the programming style. In this exercise, you will get a sense of the Pythonic way of
programming. In a nutshell, a Pythonic way of programming is to utilize Python’s features
that are designed to make a programmer’s life easier. Here are some examples:
1. Creating list of lists (using list comprehension).
Suppose you want a 2-dimensional array that is a list of 4 empty lists. Since Python
does not have a declaration for a 2-dimensional array, you need to construct it from lists.
The wrong way is to append the same list 4 times (why it is wrong⁹).
# wrong code: all 4 entries refer to the same list object
list = []
list_of_lists = []
for i in range(4):
    list_of_lists.append(list)
⁷ https://en.wikipedia.org/wiki/Scatter_plot
⁸ https://en.wikipedia.org/wiki/Line_chart
⁹ http://cryptroix.com/2016/10/25/python-call-object/
The ugly code runs an explicit for-loop.
# correct but ``ugly'' code
list_of_lists = []
for i in range(4):
    list_of_lists.append([])
The Pythonic code has only one line that utilizes list comprehension.
# Pythonic code
list_of_lists = [[] for _ in range(4)]
2. Opening and reading a file
Suppose you need to process the contents of a file, line by line. The following is the
ugly code, in which you may forget to read a new line in the while-loop or forget to close
the file.
# ``ugly'' code
file = open('some_file_name')
line = file.readline()
while line:
    # do something with the line
    line = file.readline() # you may forget this
file.close() # you may forget this
In a Pythonic way, we use with, which automatically closes the file after use, and we
do a for-loop directly over the file.
# Pythonic code
with open('some_file_name') as file:
    for line in file:
        # do something with the line
        pass # placeholder so that the loop body is valid
3. Chained comparison
# ``ugly'' code
if 0 <= x and x <= 100:
    x = x + 1
# Pythonic code
if 0 <= x <= 100:
    x += 1
4. Conditional operator
# ``ugly'' code
if 0 <= x and x <= 100:
    y = x + 1
else:
    y = x - 1
# Pythonic code
y = x+1 if 0 <= x <= 100 else x-1
5. Multiple assignment
# ``ugly'' code
x = 1
y = 2
# Pythonic code
x, y = 1, 2
More examples can be found in many online posts by searching for “Pythonic”¹⁰.
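Regarding example 1 above, the reason why appending the same list several times is wrong can be seen directly in a short sketch. The names inner, aliased and independent are hypothetical and used only for this illustration.

```python
# Appending the same list object four times (the "wrong code" pattern).
inner = []
aliased = []
for i in range(4):
    aliased.append(inner)

# All four entries are the same object, so one append shows up everywhere.
aliased[0].append('a')
print(aliased)  # [['a'], ['a'], ['a'], ['a']]

# The list comprehension creates a new empty list on every iteration.
independent = [[] for _ in range(4)]
independent[0].append('a')
print(independent)  # [['a'], [], [], []]
```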
In this exercise, you need to write a function named get_average_grades in p4.py that
takes the name of the grading file as the input, and returns a list of the average grades for
each lab assignment of the Python course. A prototype of your function can be
def get_average_grades(filename='grades.csv'):
    # your code here
    return average_grades_list
By default, the grades are recorded in an input file named grades.csv, in the same folder as
your scripts. Each line in this file records the grades of one student in the past lab
assignments, separated by commas (this is called a “CSV” file). For example, if we have
3 students and 4 lab assignments, grades.csv has the following contents:
60,61,62.5,-1
-1,70,75,73
80,-1,87.5,-1
¹⁰ https://medium.com/the-andela-way/idiomatic-python-coding-the-smart-way-cc560fa5f1d6
Here, if a student does not submit a lab assignment, the grade is recorded as -1. For
example, the student in the first row has grades 60, 61 and 62.5 for the first three lab
assignments respectively, and the “-1” indicates that this student did not submit the fourth
lab assignment.
The average grade for a lab assignment is the average grade of all the students who submitted
this lab assignment. For the above example, the average grades for lab assignments 1, 2, 3 and 4
are 70, 65.5, 75 and 73. The output is a list of the average grades for each lab assignment
(each number should be a float, which will be compared with the sample answer).
The return value of get_average_grades for the above example should be a Python list:
[70, 65.5, 75, 73]
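The arithmetic behind this example can be checked with the following sketch, which works on an in-memory copy of the sample contents instead of a real file. It is only one possible approach, not a reference solution; it assumes that every assignment has at least one submission (otherwise the division would fail), and the names sample, rows and averages are hypothetical.

```python
import csv
import io

# In-memory stand-in for grades.csv with the same contents as the example above.
sample = io.StringIO("60,61,62.5,-1\n-1,70,75,73\n80,-1,87.5,-1\n")

# Each row is one student; convert every cell to a float.
rows = [[float(cell) for cell in row] for row in csv.reader(sample)]

# zip(*rows) regroups the values by column, i.e. by lab assignment.
averages = []
for column in zip(*rows):
    submitted = [grade for grade in column if grade != -1]  # skip missing submissions
    averages.append(sum(submitted) / len(submitted))

print(averages)  # [70.0, 65.5, 75.0, 73.0]
```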
Your scripts should not contain any of the 5 kinds of “ugly” code mentioned above.
4 marks will be deducted for each kind of “ugly” code in
your scripts. Your scripts can be in any style that does not contain the
“ugly” code mentioned above; you are NOT necessarily required to use the Pythonic
code.
Submission rules
1. Please name the functions and script files with the exact names specified in this assignment and test all your scripts. Any script that has any wrong name or syntax error will
not be marked.
2. For each group, please pack all your script files into a single archive named
<student-id1>_<student-id2>_lab4.zip
For example, 1155012345_1155054321_lab4.zip, i.e., just replace <student-id1> and
<student-id2> with your own student IDs. If you are doing the assignment alone, just
omit <student-id2>, e.g., 1155012345_lab4.zip.
3. Upload the zip file to Blackboard (https://blackboard.cuhk.edu.hk):
• Only one member of each group needs to upload the archive file.
• Subject of your file should be <student-id1>_<student-id2>_lab4 if you are in a
two-person group or <student-id1>_lab4 if not.