$30
Total points: 80 (+0 bonus)
Assignment 3
Written assignment
1. [12 points] When a client C accesses server S through Tor, she usually builds a circuit
of three nodes: N1, N2, and N3. A connection is established as follows:
C → N1 → N2 → N3 → S
N1 is also known as the entry node or the guard node, and N3 is also known as the exit
node. Visit metrics.torproject.org to answer the following questions:
(a) [4 points] Give the total amount of advertised bandwidth of relays with the “Guard”
flag (but not the “Exit” flag) and relays with the “Exit” flag (but not the “Guard”
flag) on 2020-02-01. Which is more? Give one reason to explain this phenomenon.
(b) [4 points] Give the median download rate of a file (in bits per second) for a 50 KiB
file and a 5 MiB file to the op-hk onion server on 2020-02-01. Can you explain the
difference?
(c) [4 points] What is a disadvantage of using three nodes in a Tor circuit instead of
one node? What is an advantage of doing so?
1 CMPT479 A3
2. We have four data privacy techniques:
1. k-anonymity
2. Differential privacy
3. Secure multiparty computation
4. Private information retrieval
For each scenario below, discuss whether each of the four data privacy techniques would
be suitable to resolve the challenge, and explain why if they are not. It is possible to
have multiple suitable ones or no suitable ones. This means the answers to a, b, c,
d should have 4 parts each.
(a) [4 points] People in country A are anxious to know if they have been in recent
contact (20m) with someone confirmed to have caught an infectious disease called
CROW, but the identity and exact location of CROW patients is private and there
are so many cases that the government of country A cannot manually warn all
potential contacts. Two big tech companies are trying to create an app to satisfy
this need in a privacy-preserving manner with Bluetooth beaconing.
(b) [4 points] Lacking exercise due to CROW quarantine, you want to buy a new smart
device to encourage a healthy lifestyle by measuring your calorie intake. Specifically,
you want to be able to look up the nutritional information of any food using your
device at any time. However, you are reluctant to reveal your personal eating habits
to the company that owns the device. A new company making these smart devices
is willing to use a data privacy technique to protect your privacy.
(c) [4 points] With more free time, you want to start a business that can be run from
home, and you would like to purchase its web domain as the company’s website.
However, you are aware of the practice of cybersquatting; people may purchase
the domain first if they know it is in demand, and sell it to you at an elevated
price. You want to know if the domain is still available, but you are worried that
attempting a DNS query for the domain will lead to some DNS servers purchasing
it immediately for cybersquatting. Some helpful DNS servers recognize your need
and will cooperate with you if you can identify a useful data privacy technique to
use with them.
(d) [4 points] A social media company knows that people are feeling lonely, so it is
offering a monetary reward for a challenge: design the best algorithm that can
identify which people should be recommended to talk to each other to become
online friends, based on some data about their real-life habits, online usage patterns,
interests, and other personal data. However, they are aware that simply releasing
everyone’s raw data for this challenge is a violation of their privacy.
2 CMPT479 A3
Programming assignment
Error Correcting Codes [52 points]
Error Correcting Codes (ECCs) can be used in unsafe transmission or storage devices to
improve reliability by providing redundancy. Unlike checksums, ECCs can automatically
correct errors up to a certain number of bits. In this question, you will be asked to
implement the general Hamming code, which can correct any 1-bit error in a general
string. (Do not use a Hamming code library — it should be your own code.) The
specifications of the Hamming code are as follows.
1. Write the input data string as a bitstring according to ASCII.
2. Choose enough parity bits and intersperse the data bits with parity bits, so that
the i-th parity bit pi
is followed by exactly 2i−1 − 1 data bits. Denote the i-th bit
of H as Hi
; the first bit is H1. Denote the length of H as |H|.
3. Let B(x) be the bitstring representation of the integer x, for example, B(5) = 00101.
4. Define each Mi to be a set of specific bit positions in H, such that Mi
is the set of
integers 1 ≤ x ≤ |H| where the i-th least significant bit of B(x) is 1.
5. Set the parity bit pi to be the XOR of all the bits of H in the positions indicated
by Mi
, except pi
itself, which is always located at H2
i−1 :
pi =
M
x∈Mi\{2
i−1}
Hx
Here is a more detailed explanation of Mi
. Consider the integer x = 10, so B(10) is
1010. Then the 1st least significant bit is 0, the 2nd is 1, the 3rd is 0, and the 4th is 1
(reverse the bitstring). So x belongs to M2 and M4, not to M1 and M3.
To work out an example:
1. The input data string is “ab”, which becomes 0110000101100010 (16 data bits).
2. We need five parity bits, p1, p2, p3, p4, p5, as follows:
H = p1p20p3110p40001011p500010
p5 can be followed by up to 15 data bits, but we ran out of data bits, so we stopped
there. H3 = 0 and H8 = p4. |H| = 21.
3. For example, B(20) = 10100.
4.
M1 = {1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21}
M2 = {2, 3, 6, 7, 10, 11, 14, 15, 18, 19}
M3 = {4, 5, 6, 7, 12, 13, 14, 15, 20, 21}
M4 = {8, 9, 10, 11, 12, 13, 14, 15}
M5 = {16, 17, 18, 19, 20, 21}
3 CMPT479 A3
5.
p1 = H3 ⊕ H5 ⊕ H7 ⊕ H9 ⊕ H11 ⊕ H13 ⊕ H15 ⊕ H17 ⊕ H19 ⊕ H21 = 0
p2 = H3 ⊕ H6 ⊕ H7 ⊕ . . . = 1
p3 = H5 ⊕ H6 ⊕ H7 ⊕ . . . = 0
p4 = . . . = 1
p5 = . . . = 1
(a) [18 points] Write a program that will automatically generate the correct parity bits
for any input string. The input string will be the first argument of the program.
Your program, called a3a, should write the parity bits in the correct order to stdout
(e.g. cout or print) as characters, ”0” or ”1”. Continuing the above example, if
the program is called with:
./a3a ab
The program should display:
01011
(b) [24 points] Write a program that will automatically correct any one-bit error in
either the parity or the data string. The parity will be the first argument of the
program, and the data string will be the second argument of the program. Your
program, called a3b, should output the corrected parity bits to the first line of
stdout, and the corrected data string to the second line of stdout. If there is no
correction to be made, you should still output the original parity bits and the data
string. For example, if I run your program with:
./a3b 01011 ac
Your output should be:
01011
ab
The program will always be tested with the correct number of parity bits, and there
will either be a one-bit error in the parity or the data string, or no error. Either
type of error should be fixed. Also, note the meaning of an ASCII one-bit error:
“a” (01100001) can become “i” (01101001), but “a” cannot become “b” (01100010)
because that would be two bit flips.
(c) [10 points] Submit two sentences in files sentence1 and sentence2, such that they
have the same Hamming code (as output by part a). They should all be logical and
grammatically correct sentences with no typos, and they should all include your
@connect.ust.hk username. Any ASCII characters that are not letters, numbers, or
basic punctuation will be removed.
Hint: It may be easier to construct these sentences if you use numbers somewhere
in your sentence. Keep in mind that you should not look at anyone else’s sentence.
4 CMPT479 A3
Submission instructions
All submissions should be done through CourSys. Submit the following programs:
a3.pdf, containing all your written answers.
Files for the programming assignment: sentence1, sentence2.
Code for the programming assignment, detailed below:
For the programming assignment, submit your code; do not submit any compiled files.
C++: Submit a3a.cpp, a3b.cpp. I will compile them and call ./a3a <message>.
Python: Submit a3a.py, a3b.py. I will call python a3a.py <message>.
Java: Submit a3a.java, a3b.java. I will compile with javac a3a.java and then call java
a3a <message>.
If there is a Makefile in your folder, the Makefile will override all of the above. This
implies if you are not writing in C++, Python, or Java, you must include a Makefile.
Keep in mind that plagiarism is a serious academic offense; you may discuss the assignment, but write your assignment alone and do not show or send anyone your answers and
code.
The submission system will be closed exactly 48 hours after the due date of the assignment. Submissions after then will not be accepted unless you have requested an extension
before the due date of the assignment. You will receive no marks if there is no submission
within 48 hours after the due date.
5 CMPT479 A3