Big Data and Privacy
Binary and Data Representation
Data Compression
Extracting Information
Visualization and Communication
100


Waze uses data from thousands of drivers to identify traffic jams. This is an example of:

Crowdsourcing

100

What is the maximum decimal value that can be represented using 4 bits?

15
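
As a quick check on this answer, here is a minimal sketch of the rule that n bits can hold unsigned values from 0 up to 2^n - 1. The function name `max_unsigned_value` is an assumption made for illustration, not part of the quiz.

```python
def max_unsigned_value(bits: int) -> int:
    """Largest unsigned integer that fits in the given number of bits."""
    # n bits give 2**n distinct patterns, numbered 0 through 2**n - 1.
    return 2 ** bits - 1


print(max_unsigned_value(4))  # 15
print(format(15, "04b"))      # 1111 (all four bits set)
```

The same rule gives 255 as the largest value in 8 bits.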

100


Which compression type is required for archiving a legal contract where every word must be preserved?

Lossless

100

You are looking at a spreadsheet of 10,000 students and want to see only those from 'New York'. This is called:

Filtering
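
Filtering can be sketched in a few lines. The student records below are invented sample data standing in for the 10,000-row spreadsheet, and the field names `name` and `state` are assumptions.

```python
# Hypothetical sample rows standing in for the full spreadsheet.
students = [
    {"name": "Student A", "state": "New York"},
    {"name": "Student B", "state": "Ohio"},
    {"name": "Student C", "state": "New York"},
]

# Filtering keeps only the rows that match a condition; no row is altered.
new_yorkers = [s for s in students if s["state"] == "New York"]

print(len(new_yorkers))                  # 2
print([s["name"] for s in new_yorkers])  # ['Student A', 'Student C']
```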

100


Which visualization is most effective for showing a trend in data over a continuous period of time?

Line Graph

200


What does PII stand for?



Personally identifiable information.

200


Which error occurs when a program tries to store the number 257 in an 8-bit register?

Overflow error
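
Python integers do not overflow on their own, so the sketch below only simulates an 8-bit register. The helper names are hypothetical, and wrapping around is just one possible behavior: real systems may instead raise an error or clamp the value.

```python
REGISTER_BITS = 8
MAX_VALUE = 2 ** REGISTER_BITS - 1  # 255, the largest unsigned 8-bit value


def fits_in_register(value: int) -> bool:
    """True if the value can be stored in 8 unsigned bits."""
    return 0 <= value <= MAX_VALUE


def wrap_to_register(value: int) -> int:
    """Keep only the lowest 8 bits, as a wrapping register would."""
    return value & MAX_VALUE


print(fits_in_register(257))  # False: 257 needs 9 bits
print(wrap_to_register(257))  # 1: the ninth bit is lost
```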

200


What is the main trade-off when choosing Lossy compression over Lossless?

A smaller file size in exchange for lower quality.

200


What is the process of fixing incomplete or incorrectly formatted data in a dataset?

Data Cleaning

200


Why is a bar chart with a Y-axis starting at 100 instead of 0 often considered misleading?

It makes small differences between values look much larger

300


Why is 'Big Data' typically processed using parallel systems?

Because the data is too large for a single computer to handle efficiently.

300

Why might $0.1 + 0.2$ result in $0.30000000000000004$ in many programming languages?

Round-off error: 0.1 and 0.2 cannot be stored exactly in binary floating point.
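
The effect is easy to reproduce. This short Python sketch shows the sum, the value actually stored for 0.1, and one common way to compare floats with a tolerance instead of `==`.

```python
import math
from decimal import Decimal

total = 0.1 + 0.2

print(total)         # 0.30000000000000004
print(total == 0.3)  # False

# Decimal(0.1) reveals the binary value that is really stored for 0.1.
print(Decimal(0.1))

# Comparing with a tolerance avoids the surprise.
print(math.isclose(total, 0.3))  # True
```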

300

If you compress a 10 MB photo using Lossless compression and then decompress it, what happens to the size?

It stays at 10 MB
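
A lossless round trip can be sketched with Python's standard `zlib` module. The repeated byte string here is an invented stand-in for the photo; the point is only that the restored data matches the original byte for byte.

```python
import zlib

# Hypothetical stand-in for the file contents.
original = b"Every bit of this file must be preserved. " * 200

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(compressed) < len(original))  # True: the compressed copy is smaller
print(len(restored) == len(original))   # True: the size returns to the original
print(restored == original)             # True: no data was lost
```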

300

Which of the following is an example of metadata for a digital photo?

Date and time the photo was taken.

300


What is the primary reason for creating a data visualization like a scatter plot?

It helps humans identify patterns or correlations that are hard to see in raw numbers.

400


What is 'Re-identification' in the context of data privacy?

Combining anonymous data with other datasets to uncover a person's identity.
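
The idea can be illustrated with a small sketch. Both datasets below are invented, and the choice of `zip` and `birth_year` as the linking fields is an assumption; real re-identification attacks vary widely.

```python
# Hypothetical "anonymous" records: no names, but some identifying details.
anonymous_records = [
    {"zip": "10001", "birth_year": 1990, "diagnosis": "asthma"},
    {"zip": "10002", "birth_year": 1985, "diagnosis": "flu"},
]

# Hypothetical public records that do include names.
public_records = [
    {"name": "Person A", "zip": "10001", "birth_year": 1990},
    {"name": "Person B", "zip": "10003", "birth_year": 1985},
]


def reidentify(anonymous, public):
    """Link records whose zip and birth year match exactly one named person."""
    matches = []
    for record in anonymous:
        candidates = [
            person for person in public
            if person["zip"] == record["zip"]
            and person["birth_year"] == record["birth_year"]
        ]
        if len(candidates) == 1:  # a unique match exposes an identity
            matches.append((candidates[0]["name"], record["diagnosis"]))
    return matches


print(reidentify(anonymous_records, public_records))  # [('Person A', 'asthma')]
```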

400


How many distinct values can be represented by a single Byte?

256

400

Which of these file formats is a common example of Lossy compression?

JPEG

400


True or False: Metadata can be used to track a person's location even if the image content is hidden.

True
400


Which chart type is best for showing how a single budget is divided into different departmental percentages?

Pie chart.

500


How does collecting data from an unrepresentative group of people affect a dataset?

It leads to data being skewed because certain groups are underrepresented.

500


In the hexadecimal color code #FF0000, what does the 'FF' represent?

The maximum red value (255)
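
A hex color code packs three two-digit hexadecimal values for red, green, and blue. The sketch below splits a code into those parts; the function name `parse_hex_color` is hypothetical and it assumes a six-digit code.

```python
def parse_hex_color(code):
    """Split a six-digit code like '#FF0000' into (red, green, blue) integers."""
    digits = code.lstrip("#")
    red = int(digits[0:2], 16)    # first pair of hex digits
    green = int(digits[2:4], 16)  # second pair
    blue = int(digits[4:6], 16)   # third pair
    return red, green, blue


print(int("FF", 16))               # 255, the largest two-digit hex value
print(parse_hex_color("#FF0000"))  # (255, 0, 0)
```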

500


Why is Lossy compression unsuitable for executable software (.exe files)?

Software requires every bit to be exact to function correctly.

500


When you group a dataset of 5,000 individual sales into a single 'Total Monthly Revenue' figure, you are:

Aggregating.
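
Aggregation reduces many rows to one summary value. In this sketch the sale amounts are invented and stored as whole cents (an assumption made to avoid floating-point round-off); four values stand in for the 5,000 sales.

```python
# Hypothetical individual sales for one month, in cents.
sales_in_cents = [1999, 500, 4250, 1225]

# Aggregating collapses the individual records into a single figure.
total_monthly_revenue_cents = sum(sales_in_cents)

print(total_monthly_revenue_cents)  # 7974
```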

500


If a survey about local internet quality is only posted on a high-speed fiber-optic forum, the results will likely have:

Sampling Bias