Data Science Basics
The process of extracting knowledge and insights from data
What is data science?
This is the brain of the computer, responsible for processing data.
What is the Central Processing Unit (CPU)?
They use data to understand what posts people like, how long they watch videos, and what keeps them coming back on apps like TikTok and Instagram.
What is a Social Media Analyst?
(OR: Data Analyst?)
You find it by adding all the numbers in a dataset and dividing by how many there are.
What is the mean?
OR: average
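The clue above can be checked with a few lines of Python. This is only a small sketch: the function name `mean` and the step counts are made up for illustration.

```python
def mean(values):
    """Add up all the numbers, then divide by how many there are."""
    if not values:
        raise ValueError("mean() needs at least one number")
    return sum(values) / len(values)

steps = [4000, 6000, 8000]  # made-up daily step counts
print(mean(steps))  # 6000.0
```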
This term describes huge amounts of data that are too big or complex for regular computers to handle easily
What is big data?
The first step in the data science process, where data is collected and prepared for analysis
What is data collection?
A programming language that is commonly used in data science and analytics
What is Python / R / MATLAB / SQL?
(C, C++, Java)
They help detect fake news and harmful content by using machine learning to spot patterns in social media posts
What is a Content Moderator?
(OR: What is a Data Scientist?)
This statistical term tells us how spread out or bunched together numbers are in a dataset.
What is standard deviation?
variation OR variability
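To see "spread out" versus "bunched together" in numbers, here is a short sketch using `pstdev` from Python's built-in `statistics` module (the population standard deviation). Both lists are made-up examples.

```python
import statistics

spread_out = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values that vary a lot
bunched = [5, 5, 5, 5, 5, 5, 5, 5]     # made-up values that are all the same

# A bigger result means the numbers sit farther from their mean.
print(statistics.pstdev(spread_out))  # 2.0
print(statistics.pstdev(bunched))     # 0.0
```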
This kind of data is sorted into categories like "yes/no" or "apples/bananas."
What is categorical data?
This is a predictive model trained on historical data to make future predictions.
What is a machine learning model?
This is a cloud-based spreadsheet tool similar to Microsoft Excel.
What is Google Sheets?
They use climate data to study weather patterns, predict extreme weather, and understand climate change.
What is a Climate Data Scientist?
(OR Environmental Data Scientist)
This branch of math helps us understand data using numbers, charts, and probabilities.
What is statistics?
This type of event gathers people to work together on solving coding or data challenges, sometimes overnight!
What is a Datathon / Hackathon?
The representation of data using charts, graphs, maps, and other visual tools.
What is data visualization?
This type of software is freely available and used for collaboration and data analysis.
What are open-source tools?
Using too much personal data or tracking users without permission is an example of this ethical concern in data science.
What is data privacy?
OR invasion of privacy OR misuse of personal data?
This kind of graph shows how data changes over time, like tracking your steps each day.
What is a line graph?
This is what we get when we teach computers to "think" like people and solve problems.
What is artificial intelligence (AI)?
This type of knowledge helps make sense of data within a specific field or industry.
What is domain knowledge?
This tool lets users write and run code, visualize data, explain findings, and share their work, all in one place.
What is Jupyter Notebook / Google Colab?
TikTok's "For You" page is powered by this kind of system, which uses your likes and views to suggest new videos.
What is a recommendation algorithm?
(OR: recommender system)
If a line follows the rule y = 2x + 3 and x = 4, then this is the value of y.
What is 11? (Explanation: Plug x = 4 into the equation to get y = 2(4) + 3 = 11)
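The same plug-in step can be written as a tiny Python function. The name `line_value` and its parameter names are made up for this sketch.

```python
def line_value(x, slope=2, intercept=3):
    """Return y for the rule y = slope * x + intercept."""
    return slope * x + intercept

print(line_value(4))  # 11
```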
This machine learning approach involves showing a model examples with the answers so it can learn.
What is supervised learning?
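One very small way to picture this is a "nearest neighbor" guesser: it stores examples that already have answers, then labels a new item by copying the answer of the closest example. This is only a sketch of one simple technique. The fruit lengths and the name `predict` are made up for illustration.

```python
# Labeled examples: (length in cm, answer). The numbers are made up.
training_data = [(7, "apple"), (8, "apple"), (18, "banana"), (20, "banana")]

def predict(length_cm):
    """Return the answer of the training example closest in length."""
    closest = min(training_data, key=lambda example: abs(example[0] - length_cm))
    return closest[1]

print(predict(9))   # apple
print(predict(17))  # banana
```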