What do data scientists actually do?
Data scientists tackle business questions and strategy; they program and use math, statistics, and machine learning algorithms... and they write, communicate, and create pretty graphs.
Question: Why the emphasis on communication?
To be a good data scientist what must you have?
A love of learning!
Question: None. This speaks for itself.
What background(s) does the speaker have?
He calls himself a "Data Journalist." He has a journalism background, but he started out as a programmer and only "recently" started designing.
Question: Why does this matter?
Why do UPS trucks never make left turns... ever!?
Because they calculated that it is faster to just turn right.
Note: They also use dynamic re-routing, combining maps with the destination of every package in the truck to find the best path from A to B.
What is the "purpose of computing," according to Richard Hamming (1961)?
It's all about business "insight," not numbers for numbers' sake.
Question: What does this saying mean to you?
How does Hilary Mason spell "AWESOME" (i.e., what is the "process" of data science)?
OSEMN: Obtain, Scrub, Explore, Model, iNterpret
Question: Why is this cool? What we see now in data mining and data analytics is essentially a more robust version of this same model, which was developed in 2010.
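To make the five OSEMN steps concrete, here is a minimal Python sketch with one small function per step. It is only an illustration, not Ms. Mason's own code: the sleep/mood numbers, the column names, and the function names are all made up for this example, and the "model" is just a straight-line fit.

```python
import csv
import io
import statistics

# Hypothetical raw data, invented for illustration only.
RAW_CSV = """hours_slept,mood_score
7.5,8
6.0,5
,6
8.0,9
5.5,4
abc,7
"""


def obtain(text):
    """Obtain: read the raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def scrub(rows):
    """Scrub: keep only rows where both values parse as numbers."""
    clean = []
    for row in rows:
        try:
            clean.append((float(row["hours_slept"]), float(row["mood_score"])))
        except (TypeError, ValueError):
            continue  # skip missing or non-numeric entries
    return clean


def explore(pairs):
    """Explore: compute a few summary statistics."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    return {
        "n": len(pairs),
        "mean_hours": statistics.mean(xs),
        "mean_mood": statistics.mean(ys),
    }


def model(pairs):
    """Model: fit a least-squares line, mood = slope * hours + intercept."""
    mean_x = statistics.mean(x for x, _ in pairs)
    mean_y = statistics.mean(y for _, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def interpret(slope, intercept):
    """iNterpret: turn the fitted numbers into a plain-language statement."""
    if slope > 0:
        direction = "rises"
    elif slope < 0:
        direction = "falls"
    else:
        direction = "does not change"
    return f"In this toy sample, mood {direction} as sleep increases (slope {slope:.2f})."


if __name__ == "__main__":
    pairs = scrub(obtain(RAW_CSV))
    print(explore(pairs))
    print(interpret(*model(pairs)))
```

Notice that the last step produces a sentence, not a number: the pipeline ends in communication.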
What makes someone a good data scientist?
A mixture of analytical, technical and business knowledge.
Question: What classes are you taking or planning to take that will help you cultivate these skills here at Berry?
Data is the new ______ ?
"Oil" or "Soil." Visualizations are flowers blooming from the data soil.
Question: Is this an appropriate metaphor?
What was a positive side effect of the Jawbone app (the one that records your sleep patterns)?
Two things! 1) It was able to identify the epicenter of an earthquake from people waking up to it! 2) It now provides real-world data for discussing the impact of daylight saving time on sleep (and the loss of it) in the US.
What is the DANGEROUS definition of Big Data?
Big data will tell you what to do... and you do not need to think (no need for doctors etc.).
Question: Why is this notion a fallacy? "Data will tell you whether A or B is correct, but it will not tell you what A or B should be in the first place."
What is BIG data?
1. "Too big for Excel?"...
2. Data too large to fit on one node... data that requires special infrastructure in order to analyze it.
3. Big data is "useful data."
Question: What does Ms. Mason mean when she says: "Useful data is data that we can ask questions of and get answers before we forget why we asked..."?
Today, "data scientists" are a blend of which disciplines?
Mathematicians, Statisticians, Engineers (coders) and Communicators (i.e., storytelling and visualization)...
Question: What does this blend of backgrounds mean for you... as a potential data scientist?
"Information is beautiful, data is beautiful... coming across a lovely data visualization is like a..."
"It's like a relief..." we are all demanding a visual aspect to our information.
Question: Does this resonate with you?
What is the most remarkable thing about Google Maps?
It's a data visualization product that is intuitive and unassuming. It gets its data from phones, from people on foot, from Waze (crowdsourcing), etc. Everyone is using it, and it's boring!
What two fields (historically) does Data Science build off of?
The two main fields are: Engineering (astrophysics) & Business (finance).
What's new? It's all been done before. Now it is just accessible to more people (by that we mean cheaper and open source). The modeling that a data scientist does is similar to the modeling done in these fields!