Basics of Data Science
Preprocessing
Modeling
Libraries
100

What does NDL stand for?

Nittany Data Labs

100

Give one way of dealing with null data

Can vary
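
Two common answers are dropping the affected rows and imputing a replacement value. The sketch below shows both with pandas; the column names and values are hypothetical, and the right choice depends on how much data is missing and why.

```python
import pandas as pd

# Hypothetical toy data: one missing age and one missing city.
df = pd.DataFrame({
    "age": [25.0, None, 31.0, 40.0],
    "city": ["Erie", "Altoona", None, "Erie"],
})

# Option 1: drop every row that contains at least one null value.
dropped = df.dropna()

# Option 2: impute. The numeric column gets its mean,
# the text column gets a placeholder label.
filled = df.assign(
    age=df["age"].fillna(df["age"].mean()),
    city=df["city"].fillna("unknown"),
)

print(len(dropped))            # rows that survive dropping
print(filled["age"].tolist())  # ages after mean imputation
```

Dropping is simplest but discards information; imputation keeps every row at the cost of introducing estimated values.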

100

What's the most common type of regression?

Linear Regression

100

What's the most common Python data manipulation library?

Pandas
200

Name one step in the data science process

Gathering Data, Data modeling, Actionable insights
200

Explain data integration

Combining data from different sources

200

Provide a method to get the coefficients of a linear regression equation

Ordinary least squares, gradient descent
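
As a rough illustration, both methods can be written in plain Python for a single feature. The data, learning rate, and step count are assumptions chosen so the example converges; this is a sketch, not a production fitting routine.

```python
# Hypothetical data generated from y = 2x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def ols(xs, ys):
    """Closed-form ordinary least squares for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def gradient_descent(xs, ys, lr=0.05, steps=5000):
    """Minimize mean squared error by repeated gradient steps."""
    n = len(xs)
    slope, intercept = 0.0, 0.0
    for _ in range(steps):
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        slope -= lr * grad_slope
        intercept -= lr * grad_intercept
    return slope, intercept


ols_slope, ols_intercept = ols(xs, ys)
gd_slope, gd_intercept = gradient_descent(xs, ys)
print(ols_slope, ols_intercept)
print(round(gd_slope, 4), round(gd_intercept, 4))
```

On this data both approaches should land on the same line. The closed form solves it in one pass, while gradient descent approaches it iteratively and extends more easily to models without a closed-form solution.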

200

What's the most popular traditional ML Python library?

Scikit-learn

300

What form of supervised learning is used to predict quantitative variables?

Regression
300

What does ETL stand for?

Extract Transform Load

300

What advantage does the error function MAE provide over MSE?

More interpretable: the error is in the same units as the target, and large misses are penalized less heavily
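
A small worked comparison may help here. The values below are hypothetical; the last prediction is deliberately off by 10 to show how each metric reacts to one large miss.

```python
# Hypothetical targets and predictions; the final prediction misses by 10.
y_true = [10.0, 12.0, 11.0, 13.0]
y_pred = [11.0, 11.0, 12.0, 23.0]


def mae(y_true, y_pred):
    """Mean absolute error: average size of the miss, in target units."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error: average squared miss, in squared target units."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


print(mae(y_true, y_pred))  # 3.25
print(mse(y_true, y_pred))  # 25.75
```

The MAE reads directly as "off by about 3.25 units on average", while the MSE is in squared units and is dominated by the single large error.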

300

What popular data visualization library can be used to create interactive plots?

Plotly

400

What three things make up data science?

Computer Science, Statistics, Subject Matter Expertise

400

What is this an example of? 

One Hot Encoding
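
For readers without the original image, here is a minimal sketch of one-hot encoding in plain Python. The color values are hypothetical and are not taken from the question's picture.

```python
# Hypothetical categorical column.
colors = ["red", "green", "blue", "green"]

# One output column per distinct category, in a fixed (sorted) order.
categories = sorted(set(colors))

# Each row gets a 1 in the column for its category and 0 everywhere else.
encoded = [
    [1 if value == category else 0 for category in categories]
    for value in colors
]

print(categories)  # ['blue', 'green', 'red']
for value, row in zip(colors, encoded):
    print(value, row)
```

Every row contains exactly one 1, which is where the name comes from.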

400

What is the purpose of the validation split in the train, test, and validation split of data?

Optimize the model: tune hyperparameters and compare candidate models without touching the test set
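
One way to picture the three-way split is the sketch below. The helper name `train_val_test_split`, the 60/20/20 proportions, and the seed are assumptions for illustration, not a fixed rule.

```python
import random


def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the rows once, then slice them into three disjoint sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical dataset of 100 row indices.
train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # 60 20 20
```

The model is fit on `train`, settings are tuned against `val`, and `test` is held back for a single final evaluation.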

400

What's the main advantage of numpy arrays over lists?

Improved performance: faster vectorized operations and lower memory use

500

Give another application of reinforcement learning

Can vary

500

What is the Standard Scaler Equation?

New Value = (Old Value - Mean) / Standard Deviation
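
The equation can be checked by hand with the standard library. The values are hypothetical, and the population standard deviation is used here as an assumption about which variant is meant.

```python
import statistics

# Hypothetical feature values with mean 5 and population std 2.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(values)
std = statistics.pstdev(values)  # population standard deviation

# Apply (old value - mean) / std to every value.
scaled = [(v - mean) / std for v in values]

print(mean, std)
print(scaled)
```

After scaling, the values have mean 0 and standard deviation 1, which puts features measured on different scales on a comparable footing.
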
500

Explain overfitting

The model memorizes the training data, including its noise, so it generalizes poorly and scores worse on the test data

500

What package does this line of code come from: plt.show()

Matplotlib
