2-Friends I
2-Friends II
Huffman Tree Decompression I
Huffman Tree Decompression II
Huffman Compression
100

3. What is the value for the key 1?

[0, 2]

100

4. What is the value for the key 4?    

[3]

100

1. What is the encoding for "C" and for "M"?

C = 100

M = 111

200

1. Explain why person 4 has 2 two friends (explain who is direct and indirect)

Direct friend: person 3

Indirect friend: person 2 (since person 3 is direct friends with person 2)

200

2. Using the encodings that come from this tree, what is the bit-encoding for "moonroof"?

111 01 01 00 101 01 01 110

200

3. What string does this bit-stream encode?  11001101111?

FORM [110 01 101 111]

200

3. Explain why codings['A'] represents a string of zeros and ones like "0101" representing the encoding for 'A'  created from the tree based on lines 66.

The values in codings give the path from the root of the tree to a given character, with ‘0’ representing a left and ‘1’ representing a right.

300

2. Explain why persons 1 and 3 have 3 two-friends (explain who is direct and indirect)

Person 1

Direct: person 0 and person 2

Indirect: person 3 (direct friend of person 2)

Person 3

Direct: person 2 and person 4

Indirect: person 1 (direct friend of person 2)

300

5. What is the purpose of line 19 in terms of direct or indirect friends?

Line 19 appends all the direct friends of person[index] to the set of all the twoFriends.

300

(Pseudocode section)

2. Why are two values of null used in the HuffNode returned in the else clause?

We have found a leaf so it has no children, the null are the left and right of the leaf node

300

(Pseudocode section)

3. Why does it matter that left is assigned the value returned by the first recursive call rather than the second (the calls can't be interchanged).

The pre order traversal is current, left, right, so the next significant bits needs to be the left node

300

1. Explain why counts['A'] represents the number of occurrences of 'A' in the array returned on line 64 and how the array is like a map from character to integer.

In Java, all characters also have a numerical equivalent, so you can index into an array with a character and it will convert it into a number. As the code goes through the file, for each character it indexes into the array with that character and increases the count there by one, so counts[‘A’] will represent the number of occurrences of ‘A’. 

Because the index is associated with the count, this is like if the index (a character) was the key in a map, where the value is the count (an integer).

400

6. Why is index removed from the set on line 23? Why is that necessary?

Each of the friends of person[index] will have person[index] as a friend, and we want to find the twoFriends of person[index]. We don't want to count person[index] as a twoFriend of themselves, so we should remove them from the set.

400

7. The loop that isn't shown on line 20 adds all the friends-of-friends (indirect)  to set. What is the body of that loop? Base the code on what's shown here.

set.addAll(myGraph.get(friend))

400

(Bit encodings section)

1. Draw the three-leaf  tree that corresponds to the bits below: note that internal nodes are represented by  a single 0, leaf-nodes by a single 1 followed by 9 bits for the character in the leaf. Pre-order traversal. There are five nodes and three leaves, that's a total of 32 bits (5 + 3*9).

0 0 1 0 0 1 0 0 0 0 1 0 1 0 0 1 0 0 0 1 0 1 1 0 0 1 0 0 0 0 0 1

         ()

       /    \

     ()     (A)

   /    \

(B)    (E)

[0 0 1B 1E 1A]

400

(Bit encodings section)

2. What is the bit encoding for this tree since there are 7 nodes and four leaves there should be a total of 7 + 4*9 = 43 bits to encode this tree.


0 0 1001000100 1001000001 0 1001000101 1001000011 [0 0 1D 1A 0 1E 1C]

400

4. Explain why codings is passed to the method to write the compressed file.

In order to write the compressed file, the code needs to know the correct representation of each character. When we write the compressed file, we’re basically saying, “instead of the normal binary representation of  ‘A’ (which is ‘01000001’), we’re going to use ‘0101’. Every time there was an ‘A’ in the file, instead of writing out ‘01000001’, it’s now ‘0101’”

500

(Pseudocode section)

1. What do you think it means that reading a single bit that returns a -1 results in throwing an exception as shown above?

The file is empty and there were no more bits to read. This means something has gone wrong (likely with your compression) because the tree should end naturally when all of the children have been filled in. There shouldn’t be a case where you expect there to be another node but the file doesn’t have any more info.

500

2. Explain at a high-level (based on class and reading) how a greedy algorithm is used to create the tree from the counts on line 65.


A greedy algorithm is one that takes the best possible option at every step, without seeing the bigger picture. For example, consider this (conceptual) example where a greedy algorithm wouldn’t work:

              B

          /      \

       5          -3

     /                 \

A   –––   4   –––  C


If we were trying to find the least expensive path from A to C, a greedy algorithm would say, at each node pick the smallest attached path. Then go to the next node and pick the smallest one from there. But in this example the smallest path from A is the 4, but going A->B->C actually is cheaper, with a cost of 2. So a greedy algorithm is bad here.

In Huffman, we make the tree by picking the value with the smallest count every time to add to the tree. Unlike in the example above, in Huffman this works and successfully creates the right tree.

M
e
n
u