Data Science Edition!

XGBoost Hyperparameters

Statistics

Answer the Client Question

All About Models

Fix the Code

100

This hyperparameter controls how deep each tree can grow, which caps how many successive splits it can make.

Smaller values create simpler trees that focus on broad patterns in the data, while larger values allow the tree to capture more complex patterns and interactions.

What is max_depth (maximum depth of tree)

100

This hypothesis test is used to compare the means of two independent groups.

What is a t-test.
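As a rough illustration, the t statistic for two independent groups can be computed by hand. This is a minimal sketch using Welch's version of the test (which does not assume equal variances); the function name `welch_t_statistic` and the sample data are made up for this example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_statistic(group_a, group_b):
    """Return Welch's t statistic for two independent samples."""
    # statistics.variance uses the n - 1 (sample) denominator.
    var_a, var_b = variance(group_a), variance(group_b)
    standard_error = sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical data: same spread, means differ by 2.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
treated = [3.0, 4.0, 5.0, 6.0, 7.0]
t_stat = welch_t_statistic(treated, control)
```

A larger absolute t statistic means the difference in means is large relative to the noise in the samples; turning it into a p-value additionally requires the t distribution's degrees of freedom.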

100

Can you give me a specific example showcasing that personalization works?

Any successful campaign results!

100

This metric is used to assess the tradeoff between precision and recall, providing a single score that balances both. The harmonic mean of the two.

What is the F1 Score
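A small sketch of the harmonic-mean calculation, assuming the counts of true positives, false positives, and false negatives are already known. The function name and the example counts are illustrative only.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Return the harmonic mean of precision and recall."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    if predicted_positive == 0 or actual_positive == 0:
        return 0.0  # Convention chosen for this sketch when a ratio is undefined.
    precision = true_positives / predicted_positive
    recall = true_positives / actual_positive
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: precision = 0.8, recall = 0.5.
score = f1_score(true_positives=8, false_positives=2, false_negatives=8)
```

With precision 0.8 and recall 0.5 the arithmetic mean would be 0.65, but the harmonic mean is lower, which is how F1 penalizes an imbalance between the two.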

100

SELECT COUNT(*) AS num_members,
       home_bedroom_count
FROM model_run_20250306.std_members_enhanced
GROUP BY 2

What is the Null Check

This is the most commonly missed check for developers.
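One way a null check might look in practice is sketched below with Python's built-in `sqlite3` module. The in-memory table is a hypothetical stand-in for the real one (the schema prefix is dropped, and the `member_id` column and all rows are invented for illustration).

```python
import sqlite3

# Hypothetical in-memory stand-in for the std_members_enhanced table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE std_members_enhanced (member_id INTEGER, home_bedroom_count INTEGER)"
)
conn.executemany(
    "INSERT INTO std_members_enhanced VALUES (?, ?)",
    [(1, 3), (2, 3), (3, None), (4, 2), (5, None)],
)

# Count missing values explicitly before trusting any grouped counts.
null_count, total = conn.execute(
    """
    SELECT SUM(CASE WHEN home_bedroom_count IS NULL THEN 1 ELSE 0 END),
           COUNT(*)
    FROM std_members_enhanced
    """
).fetchone()
null_rate = null_count / total
conn.close()
```

Reporting the null rate alongside the grouped counts makes it obvious when a large share of members has no value for the field.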

200

This hyperparameter sets the minimum amount of data (sum of instance weights) that each child node must contain for a split to be made.

If the value is too low, the tree can make splits even on small subsets of data. If it is higher, the tree only makes splits where a significant amount of data supports them.

What is min_child_weight
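The rule can be sketched as a simple check on both children of a candidate split. This is a simplified illustration, not XGBoost's actual code; it assumes the "weight" of a child is the sum of its rows' hessians, which for squared-error loss is 1 per row, so the weight equals the row count. The function name is hypothetical.

```python
def split_allowed(left_hessians, right_hessians, min_child_weight):
    """Return True if both children carry at least min_child_weight."""
    return (
        sum(left_hessians) >= min_child_weight
        and sum(right_hessians) >= min_child_weight
    )


# Hypothetical split: 3 rows go left, 10 rows go right (hessian 1.0 each).
left = [1.0] * 3
right = [1.0] * 10
permissive = split_allowed(left, right, min_child_weight=1)
strict = split_allowed(left, right, min_child_weight=5)
```

Here the split passes with a threshold of 1 but is rejected with a threshold of 5, because the left child only holds 3 rows.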

200

This is the term for an error that occurs when a true null hypothesis is rejected.

The probability of making this error is denoted by alpha (significance level).

What is a Type I error (false positive).

200

Your client asks if you can build them a model for data you just got and haven't gotten a chance to analyze. What is the correct response?

A. Of course! We'll build you an amazing model right away that has perfect accuracy.

B. Let us take a look at a baseline model to see what insights we can gather and assess if it's feasible.

C. Let us take a look at the data and see what trends and relationships we can see between data points that will be insightful in predicting XYZ. Once we establish some insights, we can build a baseline model and assess if it has predictive power.

What is C

200

This supervised algorithm predicts a data point's class or value based on the classes or values of the data points closest to it. It is a simple yet effective method for low-dimensional datasets.

What is k-nearest neighbors (KNN)
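A minimal classification sketch of the idea, using Euclidean distance and a majority vote among the k closest training points. The function name and the toy data are invented for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest neighbors."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D data: one cluster near (1, 1), another near (8, 8).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["a", "a", "a", "b", "b", "b"]
near_first_cluster = knn_predict(points, labels, (2, 2))
near_second_cluster = knn_predict(points, labels, (8, 7))
```

For regression, the vote would be replaced by an average of the neighbors' values.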

200

for order in orders:
    if order.orderStatus == "Paid":
        order.orderStatus = "Shipped"
        print(f"Order {order.order_id} is now Shipped.")
    else:
        print(f"Order {order.order_id} is now Completed.")

What is variable naming.

order_id vs orderStatus (one is camel case while another is snake case)
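One possible fix is to use snake case throughout, which is the usual Python convention. In this sketch the `Order` class and the `advance_orders` function are hypothetical, and messages are collected in a list rather than printed so the result is easy to inspect.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical order record; both fields use snake case.
    order_id: int
    order_status: str


def advance_orders(orders):
    """Ship paid orders and return one status message per order."""
    messages = []
    for order in orders:
        if order.order_status == "Paid":
            order.order_status = "Shipped"
            messages.append(f"Order {order.order_id} is now Shipped.")
        else:
            messages.append(f"Order {order.order_id} is now Completed.")
    return messages


orders = [Order(1, "Paid"), Order(2, "Shipped")]
messages = advance_orders(orders)
```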

300

This hyperparameter controls the fraction of training data used to grow each tree.

Training each tree on a random subset of rows helps reduce overfitting and makes the model more robust.

What is subsample.

300

This statistical test is used to compare the means of more than two independent groups.

What is ANOVA (analysis of variance)
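The F statistic behind a one-way ANOVA compares the variation between group means to the variation within groups. The sketch below computes it by hand; the function name and the three toy groups are made up for illustration.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical groups with means 2, 3, and 4.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

Converting the F statistic to a p-value requires the F distribution with `df_between` and `df_within` degrees of freedom, which is left out of this sketch.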

300

Your client has received your Faraday client export but is sus on the accuracy of the fields. They ask how you validated Faraday information.

Check means, medians, and outliers

Compare against a similar client population

Compare against census data

300

This supervised learning algorithm is widely used for classification and regression tasks by finding the best decision boundary that maximizes the margin between classes.

What is Support Vector Machine (SVM)

300

public double CalculateTax(double income)
{
    if (income <= 50000)
        return income * 0.10; // 10% tax for income <= 50,000

    if (income <= 100000)
        return income * 0.15; // 15% tax for income between 50,001 and 100,000

    if (income <= 200000)
        return income * 0.20; // 20% tax for income between 100,001 and 200,000

    return income * 0.30; // 30% tax for income > 200,000
}

What is magic number code.

Magic numbers make future updates cumbersome and increase the risk of errors. Remove them entirely and centralize these values in a configuration file, database, or other central place, so they can be updated without modifying the code.
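A sketch of the same calculation with the thresholds and rates pulled into one table, written in Python rather than C# for brevity. The names `TAX_BRACKETS`, `TOP_RATE`, and `calculate_tax` are hypothetical; in a real system the table might be loaded from a configuration file or database instead of being defined in the module.

```python
# Hypothetical bracket table: (upper income limit, flat rate).
# The values are copied from the clue above.
TAX_BRACKETS = [
    (50_000, 0.10),
    (100_000, 0.15),
    (200_000, 0.20),
]
TOP_RATE = 0.30  # Applies to income above the last limit.


def calculate_tax(income, brackets=TAX_BRACKETS, top_rate=TOP_RATE):
    """Return the tax owed, using the first bracket whose limit covers the income."""
    for upper_limit, rate in brackets:
        if income <= upper_limit:
            return income * rate
    return income * top_rate
```

Changing a rate or adding a bracket now means editing one table, and the function itself never needs to change.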

400

This hyperparameter performs a type of regularization that adds a penalty on the squared leaf weights.

This discourages overly large weights, smoothing the leaf values, which helps reduce overfitting and makes the model more balanced.

What is lambda (L2 Regularization)
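The shrinking effect can be seen in the closed-form leaf weight used in gradient-boosted trees, w = -G / (H + lambda), where G and H are the sums of gradients and hessians of the rows in the leaf. The sketch below is a simplified illustration with L2 regularization only; the function name and the numbers are invented.

```python
def leaf_weight(gradient_sum, hessian_sum, reg_lambda=1.0):
    """Return the L2-regularized leaf weight -G / (H + lambda)."""
    return -gradient_sum / (hessian_sum + reg_lambda)


# Hypothetical leaf with G = -10 and H = 5.
unregularized = leaf_weight(-10.0, 5.0, reg_lambda=0.0)
regularized = leaf_weight(-10.0, 5.0, reg_lambda=5.0)
```

Raising lambda from 0 to 5 halves the leaf weight in this example, and the effect is strongest on leaves with little data (small H).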

400

This statistical test checks whether multiple groups have equal variances, an assumption that should be verified before performing an ANOVA test.

What is Bartlett's test

400

FREEBIE YOU JUST GOT SOME POINTS

YAY

400

This type of neural network layer automatically learns spatial hierarchies in image data by sliding a small matrix of learned weights (a kernel) over the input.

What is a convolutional layer
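The sliding-kernel operation can be sketched in plain Python. As in most deep learning libraries, this version does not flip the kernel (strictly speaking it is cross-correlation), uses no padding, and moves one pixel at a time. The function name, image, and kernel are illustrative; in a real layer the kernel values are learned.

```python
def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    feature_map = []
    for row in range(out_h):
        output_row = []
        for col in range(out_w):
            # Multiply the kernel with the image patch under it and sum the products.
            total = sum(
                image[row + i][col + j] * kernel[i][j]
                for i in range(kernel_h)
                for j in range(kernel_w)
            )
            output_row.append(total)
        feature_map.append(output_row)
    return feature_map


# Hypothetical image with a vertical edge between the second and third columns.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d_valid(image, edge_kernel)
```

The feature map is nonzero only where the kernel straddles the edge, which is the sense in which a kernel detects a local spatial pattern.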

400

public class DiscountCalculator
{
    public double CalculateDiscount(double amount, double discountPercentage)
    {
        double discount = amount * discountPercentage;
        double discountedPrice = amount - discount;
        return discountedPrice;
    }

    public double ApplyDiscount(double amount, double discountPercentage)
    {
        double discount = amount * discountPercentage;
        double discountedPrice = amount - discount;
        return discountedPrice;
    }
}

This does not follow DRY (Don't Repeat Yourself).

The same logic is being used in multiple places. It means if there is a change in logic we have to manually change it everywhere.

Separate the logic into its own function, which can then be used everywhere without redundant code.
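A sketch of that refactor, written in Python rather than C# for brevity. The helper name `discounted_price` is hypothetical, and `discount_percentage` is assumed to be a fraction such as 0.25, matching the arithmetic in the clue.

```python
def discounted_price(amount, discount_percentage):
    """Single home for the discount arithmetic (discount_percentage is a fraction)."""
    return amount - amount * discount_percentage


def calculate_discount(amount, discount_percentage):
    # Delegates to the shared helper instead of repeating the formula.
    return discounted_price(amount, discount_percentage)


def apply_discount(amount, discount_percentage):
    # Delegates to the shared helper instead of repeating the formula.
    return discounted_price(amount, discount_percentage)
```

If the two public methods truly do the same thing, dropping one of them altogether may be the cleaner option; the delegating wrappers are kept here only to mirror the original class.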

500

This hyperparameter performs a type of regularization that adds a penalty to the loss function based on the absolute magnitude of the leaf weights, encouraging some of the weights to become 0.

This leads to a sparser model where fewer features are effectively used in the splits.

What is alpha (L1 Regularization)
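The "weights become 0" behavior comes from soft-thresholding: the summed gradient of a leaf is pulled toward zero by alpha, and if its magnitude is no larger than alpha the leaf weight is exactly zero. The sketch below is a simplified illustration of that rule, not XGBoost's actual code; the function name and the numbers are invented.

```python
def leaf_weight_l1(gradient_sum, hessian_sum, reg_alpha=0.0, reg_lambda=1.0):
    """Return a leaf weight with L1 soft-thresholding and L2 shrinkage."""
    if abs(gradient_sum) <= reg_alpha:
        return 0.0  # The L1 penalty zeroes out weak leaves entirely.
    if gradient_sum > 0:
        shrunk_gradient = gradient_sum - reg_alpha
    else:
        shrunk_gradient = gradient_sum + reg_alpha
    return -shrunk_gradient / (hessian_sum + reg_lambda)


# Hypothetical leaf with G = -3 and H = 4.
no_penalty = leaf_weight_l1(-3.0, 4.0, reg_alpha=0.0)
mild_penalty = leaf_weight_l1(-3.0, 4.0, reg_alpha=1.0)
strong_penalty = leaf_weight_l1(-3.0, 4.0, reg_alpha=5.0)
```

A mild alpha shrinks the weight, while an alpha larger than the leaf's summed gradient removes its contribution altogether.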

500

This principle states that as the sample size increases, the sampling distribution of the sample mean approaches a normal distribution regardless of the shape of the original population distribution.

What is the Central Limit Theorem
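A quick simulation can make the theorem concrete. This sketch draws repeated samples from a strongly skewed exponential population (mean 1, standard deviation 1) and looks at the distribution of the sample means; the function name, sample sizes, and seed are arbitrary choices for illustration.

```python
import random
from statistics import mean, stdev


def sample_means(draw, sample_size, num_samples, seed=0):
    """Return the means of `num_samples` samples, each of size `sample_size`."""
    rng = random.Random(seed)  # Seeded so the simulation is repeatable.
    return [
        mean(draw(rng) for _ in range(sample_size))
        for _ in range(num_samples)
    ]


# Exponential population: right-skewed, far from normal.
means = sample_means(lambda rng: rng.expovariate(1.0), sample_size=50, num_samples=2000)
center = mean(means)
spread = stdev(means)
```

The sample means should cluster around the population mean of 1 with a spread close to 1 / sqrt(50), about 0.14, and a histogram of `means` would look roughly bell-shaped even though the population itself is not.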

500

You just presented personas to your client one of which was the Busy Household Persona and provided recommendations for benefits.

One benefit you recommended was a family wide wellness program where parents and children could schedule appointments at the same time.

Your client asks "Why is this benefit beneficial for this persona and what analyses have you done in the past to prove this?"

The caretaker's number of office visits and allowed amounts decrease as the child's office visits and allowed amounts increase.

ED visits for parents increase after having a child.

500

This model can generate new data, such as images or text, by learning the distribution of its training data, and it consists of two networks: a generator and a discriminator.

What is a Generative Adversarial Network (GAN)

500

def get_user_info():
    name = input("Enter your name: ")
    email = input("Enter your email: ")
    return name, email


def send_welcome_email():
    name, email = get_user_info()  # Directly calling get_user_info, tightly coupling them
    print(f"Sending welcome email to {name} at {email}")


# Calling the tightly coupled function
send_welcome_email()

What is tightly coupled code.

def get_user_info():
    name = input("Enter your name: ")
    email = input("Enter your email: ")
    return name, email


def send_welcome_email(name, email):
    print(f"Sending welcome email to {name} at {email}")


# Now, calling the functions in a decoupled way
name, email = get_user_info()  # Get user info separately
send_welcome_email(name, email)  # Pass the data to send_welcome_email