Data Science Basics
Programming & Tools
Ethics & Data Privacy
Math & Stat Concepts
Data Insights
100

What is the process of extracting knowledge and insights from data called?

What is data science?

100

Name two programming languages commonly used in data science and analytics.

What are any two of R, Python, C, C++, MATLAB, SQL, or Java?

100

What is an incident where sensitive, protected, or confidential data is accessed or disclosed without authorization?

What is a data breach?

100

What is the common significance level (alpha) used when interpreting p-values?

What is 0.05 (5%)?

100

What is the term for extremely large volumes of data that require specialized tools to store, manage, and analyze?

What is big data?

200

What is the first step in the data science process, where data is collected and prepared for analysis?

What is data collection or data preprocessing?

200

What type of software is freely available and used for collaboration and data analysis?

What are open source tools?


200

What is a potential ethical concern associated with the use of data science in social media?

What is invasion of privacy or misuse of personal data?

200

What percentage of data in a normal distribution falls within one standard deviation of the mean?

What is 68%?
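
(Explanation: The 68% figure can be checked with a short Python sketch. It assumes the standard identity that a normal distribution holds erf(k / sqrt(2)) of its values within k standard deviations of the mean; the function name fraction_within is made up for illustration.)

```python
import math


def fraction_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean.

    Hypothetical helper for illustration; it applies the identity
    P(|Z| <= k) = erf(k / sqrt(2)) for a standard normal variable Z.
    """
    return math.erf(k / math.sqrt(2))


# One standard deviation on either side of the mean.
print(round(fraction_within(1) * 100, 1))  # 68.3
```

The same function gives roughly 95% for k = 2 and 99.7% for k = 3, which is why this pattern is often called the 68-95-99.7 rule.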

200

What type of data refers to information that is placed in groups?

What is categorical data?

300

In data science, what do we call a predictive model that is trained on historical data to make future predictions?

What is a machine learning model?

300

What Python library is commonly used for data manipulation and analysis using DataFrames?

What is pandas?

300

What is data that can be used to identify an individual, such as name, social security number, or email address?

What is personally identifiable information (PII)?

300

For a line with equation y = 2x - 3, if point (4, y) lies on the line, what is the value of y?

What is 5? (Explanation: Plug x = 4 into the equation to get y = 2(4) - 3 = 5)

300

What is the term for the technique of teaching machines to mimic human cognition, such as learning and problem-solving?

What is artificial intelligence (AI)?

400

Blank is systematic error leading to inaccurate results; addressing it improves fairness and accuracy. What is blank?

What is bias?

400

What does SQL stand for?

What is Structured Query Language?

400

What is the practice of collecting only the data that is strictly necessary for a specific purpose?

What is data minimization?

400

What measures how many standard deviations a data point is from the mean of a dataset?

What is the z-score?
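
(Explanation: A z-score is computed as (value - mean) / standard deviation. The sketch below is one possible way to do this in Python; the function name z_score and the sample numbers are made up for illustration, and it assumes the population standard deviation is the one wanted.)

```python
import statistics


def z_score(value, data):
    """Return how many standard deviations `value` lies from the mean of `data`.

    Hypothetical helper for illustration. It uses the population standard
    deviation (statistics.pstdev); use statistics.stdev instead if the data
    is a sample from a larger population.
    """
    mean = statistics.mean(data)
    std = statistics.pstdev(data)
    return (value - mean) / std


# Made-up dataset with mean 5 and population standard deviation 2.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_score(9, scores))  # 2.0
```

A positive z-score means the point is above the mean, a negative one means it is below, and 0 means it equals the mean.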

400

In machine learning, what term describes the process of teaching a model to make predictions based on labeled data?

What is training (supervised learning)?

500

When a model performs well on training data but poorly on unseen data, what is that called?

What is overfitting?

500

Which Git command is used to create a new Git repository?

What is git init?

500

What is the process of removing personally identifiable information from datasets to protect individual privacy?

What is data anonymization?

500

What does the “gradient” represent in gradient descent?

What is the vector of partial derivatives showing the direction and rate of fastest increase of the loss function?
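
(Explanation: Because the gradient points toward the fastest increase of the loss, gradient descent steps in the opposite direction. The sketch below shows this on a made-up two-variable loss; the loss function, the starting point, the learning rate of 0.1, and all function names are assumptions chosen for illustration.)

```python
def loss(x, y):
    """Made-up loss with its minimum at (3, -1)."""
    return (x - 3) ** 2 + (y + 1) ** 2


def gradient(x, y):
    """Vector of partial derivatives of the loss: (d/dx, d/dy)."""
    return (2 * (x - 3), 2 * (y + 1))


def gradient_descent(x, y, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to reduce the loss."""
    for _ in range(steps):
        gx, gy = gradient(x, y)
        x -= learning_rate * gx  # move opposite to the direction of increase
        y -= learning_rate * gy
    return x, y


x_min, y_min = gradient_descent(0.0, 0.0)
print(round(x_min, 3), round(y_min, 3))  # 3.0 -1.0
```

Each step shrinks the distance to the minimum by a constant factor here, so the result lands very close to (3, -1), where the gradient is zero.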

500

What is the difference between correlation and causation?

What is that correlation means two variables tend to move together, while causation means a change in one variable actually produces a change in the other?