vibe check
bet
big max energy
low-key min
bffr
100

The process of replacing missing data with substituted values is called:

imputation

100

What is a common metric to evaluate linear regression models (describes how much variance is captured by the input variables)?

R^2
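
As a minimal sketch of what this metric measures, R^2 can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The function name and the toy data below are illustrative assumptions, not course material.

```python
def r_squared(y_true, y_pred):
    """Fraction of the variance in y_true that the predictions account for."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: what the model fails to explain.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Total sum of squares: variation around the mean of the observations.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

# Toy data (made up for illustration).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
score = r_squared(y_true, y_pred)
```

Here the residual sum of squares is 1 and the total sum of squares is 5, so the score comes out to 0.8.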

100

What are the three main components of optimization formulations?

Decision variables, objective function, constraints

100

Describe one reason why integer programs are harder to solve than linear programs.

The feasible region is discrete, so in the worst case we must fall back on complete enumeration of the feasible solutions.

100

In Mohammad's work with DHS and Immigration Policy, what did he notice about some of the detainer IDs in the dataset?

There were many non-matching IDs in the dataset; during data cleaning, he found that he had to multiply them by 2 and add 3.

200

Fill in the blanks: In analytics, we use _____ to build _____ to make _______.

data, models, predictions

200

Describe one reasonable method of modeling text data.

One-hot encoding, Bag-of-Words, TF-IDF, Trained Embeddings (e.g., GloVe)
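
Two of these methods can be sketched in a few lines. The example below builds Bag-of-Words count vectors and a simple TF-IDF weighting for a made-up three-document corpus. The corpus, the function names, and the unsmoothed idf = log(N / df) variant are all assumptions chosen for illustration; libraries typically use smoothed variants.

```python
import math
from collections import Counter

# Made-up corpus, tokenized by whitespace.
docs = ["the cat sat", "the dog sat", "the dog barked"]
tokenized = [d.split() for d in docs]
vocab = sorted({w for doc in tokenized for w in doc})

def bag_of_words(tokens):
    """Count how often each vocabulary word appears in one document."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]

def tf_idf(tokens):
    """Term frequency times log(N / document frequency), unsmoothed."""
    counts = Counter(tokens)
    n_docs = len(tokenized)
    vec = []
    for w in vocab:
        tf = counts[w] / len(tokens)
        df = sum(1 for doc in tokenized if w in doc)
        vec.append(tf * math.log(n_docs / df))
    return vec

bow = [bag_of_words(t) for t in tokenized]
tfidf = [tf_idf(t) for t in tokenized]
```

A word such as "the" that appears in every document gets a TF-IDF weight of zero, while a word unique to one document ("cat") is weighted more heavily than one shared by two ("sat").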

200

Consider the following optimization formulation. The decision variables are x1 and x2. Is it linear? 

max 1/3*x1 + pi*x2

s.t. sin(y1)*x1 + cos(y2)*x2 <= cos(y1)*x1 + sin(y2)*x2

(y1/y2)*x1 = x2

x1, x2 >= 0

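Yes. y1 and y2 are not decision variables, so sin(y1), cos(y2), and y1/y2 are constant coefficients, and the objective and every constraint are linear in x1 and x2.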

200

A class of non-linear programs where we can guarantee global optimality is known as:

convex optimization

200

Other than "linear programming", the World Food Programme application for humanitarian food aid discussed in lecture is an example of what class of optimization problems?

Network Optimization

300

Describe the difference between training sets and testing sets.

Training set is used to build the model and fit parameters.

Testing set is used to evaluate the model on unseen data.
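
A minimal sketch of such a split, assuming the rows can be shuffled freely (which would not be appropriate for time series data). The function name, the 25% test fraction, and the seed are illustrative choices.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    # The model is fit on the first list only; the second is kept unseen.
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(20))          # stand-in for 20 observations
train, test = train_test_split(rows)
```

The two sets are disjoint and together cover every observation, so the test set really is data the model has not seen.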

300

Name three metrics we can use to evaluate the predictive performance of models.

MAD, MSE, MAPE
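
These three metrics can be sketched as follows, taking MAD to mean the mean absolute deviation of the forecast errors (an assumption about the intended definition). The data are made up.

```python
def mad(actual, forecast):
    """Mean absolute deviation of the forecast errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mse(actual, forecast):
    """Mean squared error; penalizes large errors more heavily."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percentage error; undefined if any actual value is 0."""
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

# Toy data (made up for illustration).
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
scores = (mad(actual, forecast), mse(actual, forecast), mape(actual, forecast))
```

For this data the absolute errors are 10, 20, and 0, so MAD is 10, MSE is 500/3, and MAPE is about 6.67%.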

300

What is the term for the set of possible solutions that abide by all the constraints in an optimization formulation?

Feasible region

300

Describe one reason why non-linear problems are difficult to solve.

Optimum does not necessarily lie in a corner of the feasible region

No certificate of optimality (local vs. global)

300

TNG and the Airline Industry were examples of Revenue Management. In Revenue Management, what are some of the main ideas to consider (hint: there are three)?

Selling the right product to the right customers at the right price and the right time

- Customer Segmentation

- Pricing

- Holding Sales for High-Demand Periods

400

In practice, the data we work with is imperfect and requires cleaning. Name three methods for data cleaning.

Imputing missing values with KNN, the mean, the median, an arbitrary value, etc.
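
Mean and median imputation can be sketched with the standard library alone. Missing entries are represented as None here, and the function name and data are hypothetical.

```python
from statistics import mean, median

def impute(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in values]

# Made-up column with two missing entries and one outlier.
ages = [20, None, 30, 100, None]
mean_filled = impute(ages, "mean")
median_filled = impute(ages, "median")
```

The outlier pulls the mean fill value up to 50, while the median fill value stays at 30, which is one reason to prefer the median for skewed data.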

400

What are the three components of ARIMA models?

AR - autoregressive (order p), MA - moving average (order q), I - integrated (degree of differencing d)
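
The "integrated" component is the easiest to illustrate: differencing the series some number of times to remove trend before the AR and MA terms are fit. A minimal sketch with made-up series and a hypothetical helper name:

```python
def difference(series, d=1):
    """Apply first differencing d times: y[t] - y[t-1]."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

linear_trend = [3, 5, 7, 9, 11]
once = difference(linear_trend, 1)

quadratic_trend = [t * t for t in range(6)]   # 0, 1, 4, 9, 16, 25
twice = difference(quadratic_trend, 2)
```

One round of differencing flattens the linear trend to a constant series, and two rounds do the same for the quadratic trend. Each round shortens the series by one observation.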

400

Define shadow price.

For a given constraint, the change in the optimal objective value if the corresponding RHS is increased by 1 unit; in other words, the "marginal value" of that constraint.
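
One way to see this definition in action is to solve a tiny LP, increase one RHS by 1, and solve again. The sketch below does this for a made-up problem: max 3x + 2y subject to x + y <= 4, x <= 2, x, y >= 0. The solver is a brute-force check of constraint intersections that only works for small, bounded two-variable problems; it is an illustration, not the simplex method.

```python
from itertools import combinations

def solve_lp_2d(c, constraints):
    """Maximize c[0]*x + c[1]*y subject to a*x + b*y <= rhs for each (a, b, rhs).

    Brute force: intersect every pair of constraint boundaries and keep the
    best feasible intersection point. Assumes a bounded, feasible problem.
    """
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue  # parallel boundaries never intersect
        x = (r1 * b2 - r2 * b1) / det
        y = (a1 * r2 - a2 * r1) / det
        if all(a * x + b * y <= r + 1e-9 for a, b, r in constraints):
            value = c[0] * x + c[1] * y
            if best is None or value > best:
                best = value
    return best

def build(rhs_sum, rhs_x):
    """Constraints: x + y <= rhs_sum, x <= rhs_x, x >= 0, y >= 0."""
    return [(1, 1, rhs_sum), (1, 0, rhs_x), (-1, 0, 0), (0, -1, 0)]

c = (3, 2)
base = solve_lp_2d(c, build(4, 2))
shadow_sum = solve_lp_2d(c, build(5, 2)) - base   # raise RHS of x + y <= 4
shadow_x = solve_lp_2d(c, build(4, 3)) - base     # raise RHS of x <= 2
```

The base optimum is 10 at (2, 2). Relaxing x + y <= 4 by one unit raises it to 12, so that constraint's shadow price is 2; relaxing x <= 2 raises it to 11, a shadow price of 1.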

400

What are two approaches to multi-objective programming we discussed in lecture? Briefly describe both.

Weight-based Approach (assign weights to each objective)

Goal Programming (single objective, other objectives as constraints)

400

In the MAMD case, we wanted to write constraints that modeled non-linear scenarios such as "If I have enough raw milk available, then I must produce a minimum amount of butter; otherwise, I cannot produce any butter at all."

What linearization technique did we employ to model such constraints?

Big-M
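
As a hedged sketch of the idea (not the exact MAMD formulation), the "either zero or at least a minimum" part of such a rule can be written with a binary variable and a large constant. The symbols are assumptions for illustration: b is the butter quantity, B_min the minimum production level, z a binary variable equal to 1 when butter is produced, and M a constant at least as large as any possible value of b. How the case tied z to raw milk availability is not reproduced here.

```latex
\begin{aligned}
b &\ge B_{\min}\, z, \\
b &\le M\, z, \\
z &\in \{0, 1\}.
\end{aligned}
```

When z = 0 both constraints force b = 0; when z = 1 they reduce to B_min <= b <= M, so the second constraint is effectively switched off by the size of M.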

500

Name the three major classes of analytics.

Descriptive, Predictive, and Prescriptive

500

What is one key assumption we make when conducting time series analysis? 

The data is stationary

500

What is the algorithm used to efficiently solve linear programs called? Briefly describe how it works.

Simplex Method; move along the edges of the feasible region from one extreme point to an adjacent one with a better objective value, stopping when no adjacent extreme point improves it.

500

What is the term to describe candidate solutions in multi-objective programming where both objectives cannot be improved simultaneously?

Pareto-Optimal
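
A minimal sketch of picking out the Pareto-optimal candidates from a list, assuming both objectives are to be maximized. The function name and the candidate points are made up.

```python
def pareto_front(points):
    """Return the points that no other point dominates (maximizing both)."""
    def dominates(p, q):
        # p dominates q if it is at least as good everywhere and better somewhere.
        return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Each tuple is (objective 1, objective 2) for one candidate solution.
candidates = [(1, 9), (3, 7), (2, 6), (5, 5), (4, 2), (6, 1)]
front = pareto_front(candidates)
```

Here (2, 6) is dropped because (3, 7) beats it on both objectives, and (4, 2) is dropped because (5, 5) does. For every remaining point, improving one objective means giving up some of the other.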

500

In the ICBC China Bank case, what was the biggest challenge in estimating market potential, and how did they address that challenge?

Market potential cannot be concretely quantified, so they had no data with which to train a typical regression model. Instead, they created an optimization problem to estimate the weights by minimizing the difference in predicted market potential between similar cells (with regularization), subject to the constraint that the calculated market potentials must satisfy the relative rankings that the experts suggested.