Basics of Data Science
Preprocessing
Modeling
Libraries
100

What does NDL stand for?

Nittany Data Labs

100

Give one way of dealing with null data

Can vary
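
Two common answers are dropping the affected rows and imputing a replacement value. The sketch below shows both with pandas; the toy DataFrame and its column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one missing value in each column.
df = pd.DataFrame({"age": [25.0, np.nan, 40.0], "city": ["NYC", "LA", None]})

# Option 1: drop every row that contains at least one null.
dropped = df.dropna()

# Option 2: impute. Here the numeric column gets its mean and the
# text column gets a placeholder label.
filled = df.fillna({"age": df["age"].mean(), "city": "unknown"})

print(dropped)
print(filled)
```

Dropping is simplest but throws data away; imputing keeps every row at the cost of inventing values.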

100

What's the most common type of regression?

Linear Regression

100

What's the most common Python data manipulation library?

Pandas
200

Name one step in the data science process

Gathering Data, Data modeling, Actionable insights
200

Explain data integration

Combining data from different sources
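
As a rough illustration, combining two sources often comes down to joining tables on a shared key. The tables and the customer_id key below are hypothetical.

```python
import pandas as pd

# Two hypothetical sources that share a customer_id key.
customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ann", "Bo", "Cy"]})
orders = pd.DataFrame({"customer_id": [1, 1, 3], "amount": [20.0, 35.0, 10.0]})

# A left join keeps every customer, including those with no orders
# (their amount becomes NaN).
combined = customers.merge(orders, on="customer_id", how="left")

print(combined)
```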

200

Provide a method to get the coefficients of a linear regression equation

Ordinary least squares, gradient descent
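
Both methods can be sketched in a few lines of NumPy. The data below is a made-up exact line (y = 2x + 1), and the learning rate and iteration count are assumptions picked to converge on this toy problem, not general recommendations.

```python
import numpy as np

# Hypothetical data that lies exactly on y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones_like(x), x])

# Ordinary least squares: solve min ||Xw - y||^2 directly.
ols_coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Gradient descent on the mean squared error.
gd_coef = np.zeros(2)
learning_rate = 0.05
for _ in range(5000):
    gradient = 2.0 / len(y) * X.T @ (X @ gd_coef - y)
    gd_coef -= learning_rate * gradient

print("OLS [intercept, slope]:", ols_coef)
print("GD  [intercept, slope]:", gd_coef)
```

Ordinary least squares gives the answer in one step; gradient descent reaches the same coefficients iteratively and scales better when the data is too large to solve directly.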

200

What's the main advantage of NumPy arrays over lists?

Improved performance

300

What form of supervised learning is used to predict quantitative variables?

Regression
300

What does ETL stand for?

Extract, Transform, Load

300

What advantage does the error function MAE provide over MSE?

More interpretable (it is in the same units as the target)
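
A small worked example may make the difference concrete. The numbers are invented; think of them as prices in dollars.

```python
import numpy as np

# Hypothetical true and predicted prices, in dollars.
y_true = np.array([100.0, 150.0, 200.0])
y_pred = np.array([110.0, 140.0, 230.0])

errors = y_pred - y_true          # 10, -10, 30

# MAE stays in dollars: the average size of a miss.
mae = np.mean(np.abs(errors))     # (10 + 10 + 30) / 3

# MSE is in squared dollars and is dominated by the largest miss.
mse = np.mean(errors ** 2)        # (100 + 100 + 900) / 3

print(f"MAE: {mae:.2f}  MSE: {mse:.2f}")
```

An MAE of about 16.67 reads directly as "off by roughly 17 dollars on average", while the MSE of about 366.67 has no such direct reading.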

300

What's the most popular traditional ML Python library?

Scikit-learn

400

What three things make up data science?

Computer Science, Statistics, Subject Matter Expertise

400

What is this an example of? 

One-Hot Encoding
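
For reference, this encoding replaces a categorical column with one 0/1 column per category. A minimal sketch with pandas, using a hypothetical color column:

```python
import pandas as pd

# Hypothetical categorical column.
df = pd.DataFrame({"color": ["red", "green", "red", "blue"]})

# One new 0/1 column per distinct category; each row has exactly one 1.
encoded = pd.get_dummies(df, columns=["color"], dtype=int)

print(encoded)
```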

400

What is the purpose of the validation split in the train, test, and validation split of data?

Optimize the model (for example, tune its hyperparameters) without touching the test set
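
One way to picture the three splits is as disjoint slices of a shuffled index. The 70/15/15 proportions and the dataset size below are assumptions for illustration only.

```python
import numpy as np

# Hypothetical dataset of 100 rows, shuffled with a fixed seed.
rng = np.random.default_rng(0)
n_rows = 100
indices = rng.permutation(n_rows)

# Assumed 70/15/15 split.
train_idx = indices[:70]    # fit the model
val_idx = indices[70:85]    # tune settings and compare candidate models
test_idx = indices[85:]     # final, one-time performance estimate

print(len(train_idx), len(val_idx), len(test_idx))
```

Keeping the test slice out of every tuning decision is what makes its score an honest estimate.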

400

What popular data visualization library can be used to create interactive plots?

Plotly

500

Give another application of reinforcement learning

Can vary

500

What is the Standard Scaler Equation?

New Value = (Old Value - Mean) / Standard Deviation
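
The equation can be applied by hand with NumPy. The values are made up, and the population standard deviation (NumPy's default) is used here, which is also what scikit-learn's StandardScaler uses.

```python
import numpy as np

# Hypothetical feature values.
values = np.array([2.0, 4.0, 6.0, 8.0])

mean = values.mean()
std = values.std()  # population standard deviation (ddof=0)

# New Value = (Old Value - Mean) / Standard Deviation
scaled = (values - mean) / std

# After scaling, the feature has mean 0 and standard deviation 1.
print(scaled, scaled.mean(), scaled.std())
```
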
500

Explain overfitting

The model fits the training data too closely, including its noise, which hurts its results on the testing data
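
A rough way to see this is to fit a simple and a very flexible model to the same small, noisy sample. Everything below (the underlying line, noise level, sample sizes, and polynomial degrees) is an assumption chosen for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical process: a straight line plus noise.
    x = rng.uniform(0.0, 1.0, n)
    y = 2.0 * x + rng.normal(0.0, 0.2, n)
    return x, y

def mse(model, x, y):
    return float(np.mean((model(x) - y) ** 2))

x_train, y_train = make_data(10)
x_test, y_test = make_data(100)

# Degree 1 matches the true shape; degree 9 can pass through all 10 points.
simple = Polynomial.fit(x_train, y_train, deg=1)
flexible = Polynomial.fit(x_train, y_train, deg=9)

train_simple = mse(simple, x_train, y_train)
train_flexible = mse(flexible, x_train, y_train)

print("train MSE:", train_simple, train_flexible)
print("test MSE: ", mse(simple, x_test, y_test), mse(flexible, x_test, y_test))
```

The flexible model's training error is close to zero because it memorizes the sample; on fresh data its error is typically far larger than the simple model's.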

500

What package does this line of code come from: plt.show()

Matplotlib
