What does the 5 number summary include?
The 5 number summary includes the minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum values of a dataset.
What is the mean and how is it calculated?
The mean is a measure of central tendency that is calculated by summing up all the values in a dataset and dividing by the total number of values.
How can you identify a linear relationship from a scatter plot?
In a scatter plot, a linear relationship shows up as points that cluster around a straight line. The points should be scattered roughly evenly on both sides of that line, with no curve in the overall pattern.
What is the general form of the equation for a linear model?
y = mx + b, where m is the slope and b is the y-intercept.
How do you calculate the IQR using the five-number summary?
The IQR can be calculated by subtracting the first quartile (Q1) from the third quartile (Q3): IQR = Q3 - Q1.
Find the 5 number summary: [1,3,9,18,19,19,23]
Min: 1
Q1: 3
Q2: 18
Q3: 19
Max: 23
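The answer above can be checked with a short Python sketch. It assumes the median-of-halves convention that the answers in this set appear to follow: when the count is odd, the median is left out of both halves. Other quartile conventions (for example, interpolating percentiles) can give slightly different Q1 and Q3. The function name `five_number_summary` is made up for illustration, and the input needs at least two values.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the middle position
    upper = data[half + n % 2:]    # values above it (skips the median when n is odd)
    return (data[0], median(lower), median(data), median(upper), data[-1])

print(five_number_summary([1, 3, 9, 18, 19, 19, 23]))  # (1, 3, 18, 19, 23)
```

The same function handles an even count, where each quartile may be the average of two neighboring values.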
Find the MAD (mean absolute deviation) and the range: [19,3,6,7,1,13,13,11,9,8]
MAD: 4
Range: 18
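As a check on the arithmetic, here is a minimal Python sketch. It assumes MAD means the mean absolute deviation (the average distance of each value from the mean), which is the reading that matches the answer of 4; the median absolute deviation would give a different number. The function name is illustrative.

```python
def mean_absolute_deviation(values):
    """Average of the absolute distances between each value and the mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

data = [19, 3, 6, 7, 1, 13, 13, 11, 9, 8]
print(mean_absolute_deviation(data))  # 4.0  (mean is 9, absolute deviations sum to 40)
print(max(data) - min(data))          # 18   (range = 19 - 1)
```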
Identify which equation is linear and which is non-linear:
y = 3x^2 + 5x - 2
y = 2x + 3
Linear: y = 2x + 3 (x appears only to the first power)
Non-linear: y = 3x^2 + 5x - 2 (the x^2 term makes it quadratic)
What does the slope of a line in a linear model represent?
The slope of a line in a linear model represents the rate of change: how much y changes for each one-unit increase in x.
How do you identify outliers using the IQR method?
Using the IQR method, any data point that falls below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR is considered an outlier.
What is the 5 number summary?
A set of five descriptive statistics (minimum, Q1, median, Q3, and maximum) that summarizes the center and spread of a dataset.
When would you use the mean versus the median to describe a dataset?
The mean is typically used when the data is normally distributed or symmetric, while the median is more appropriate when the data is skewed or contains outliers.
How is the strength of a relationship measured in statistics?
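The strength of a relationship is often measured using a correlation coefficient, such as Pearson's r, which ranges from -1 to 1. Values near -1 or 1 indicate a strong linear relationship, and values near 0 indicate a weak one.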
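One common correlation coefficient is Pearson's r. The sketch below computes it from its definition; the function name `pearson_r` is illustrative, and the sample points are generated from the line y = 2x + 3 used elsewhere in this set, so they lie exactly on a line.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

xs = [1, 2, 3, 4, 5]
ys = [2 * x + 3 for x in xs]   # points that lie exactly on y = 2x + 3
print(pearson_r(xs, ys))       # 1.0, a perfect positive linear relationship
```

Reversing the order of `ys` makes y fall as x rises, and the same function then returns -1.0.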
What is the slope and Y-intercept of this linear equation? y=2x + 3
Slope=2
Y-intercept=3
How can outliers impact statistical analysis and interpretation of data?
Outliers can distort statistical analysis: they pull the mean toward themselves, inflate measures of spread such as the range and standard deviation, and can change the apparent relationship between variables.
Find the 5 number summary: [5,7,9,8,20,6,12,5,10,15]
Min: 5
Q1: 6
Q2: 8.5
Q3: 12
Max: 20
What is the mode and how is it determined?
The mode is the value that appears most frequently in a dataset. It can be determined by observing which value occurs with the highest frequency.
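Counting frequencies by hand can be mirrored in a few lines of Python. This is a minimal sketch with an illustrative function name; it returns a list because a dataset can have more than one mode.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return [v for v, c in counts.items() if c == top]

print(modes([1, 3, 9, 18, 19, 19, 23]))            # [19]
print(modes([19, 3, 6, 7, 1, 13, 13, 11, 9, 8]))   # [13]
```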
Can a strong relationship between variables always be considered as causation?
No, a strong relationship between variables does not imply causation. Correlation does not prove causation, as there may be other factors influencing the relationship.
What happens if the y-intercept is negative in a linear model?
If the y-intercept is negative it means that the line crosses the y-axis below the origin (0,0) on a graph.
Find the outliers of the following data set using the IQR method: 22,38,40,42,41,24,28,29,35,34,33,9,32,25,23,27,26,31,37,30,39,36
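Q1 = 26
Q3 = 37
IQR = 37 - 26 = 11
1.5 * IQR = 1.5 * 11 = 16.5
Lower fence: Q1 - 16.5 = 26 - 16.5 = 9.5
Upper fence: Q3 + 16.5 = 37 + 16.5 = 53.5
The only outlier is 9, because it falls below the lower fence of 9.5. No value is above the upper fence of 53.5.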
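The steps for this problem can be sketched in Python as follows. The sketch assumes quartiles are found with the median-of-halves method (the median is excluded from both halves when the count is odd); the function name `iqr_outliers` is made up for illustration.

```python
from statistics import median

def iqr_outliers(values):
    """Return (Q1, Q3, lower fence, upper fence, outliers) for the IQR method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = median(data[:half])           # median of the lower half
    q3 = median(data[half + n % 2:])   # median of the upper half
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return q1, q3, lower_fence, upper_fence, outliers

data = [22, 38, 40, 42, 41, 24, 28, 29, 35, 34, 33, 9,
        32, 25, 23, 27, 26, 31, 37, 30, 39, 36]
print(iqr_outliers(data))  # (26, 37, 9.5, 53.5, [9])
```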
Calculate the five-number summary of the following numbers: [12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34]
Minimum: 12
Q1: 17
Q2: 23
Q3: 29
Maximum: 34
How does the standard deviation measure variability in a dataset?
The standard deviation measures the average amount of deviation or dispersion from the mean in a dataset. It provides a measure of how spread out the data points are from the mean.
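To make the definition concrete, here is a small sketch of the population standard deviation: the square root of the average squared distance from the mean. It reuses the dataset from the MAD question in this set. Dividing by n rather than n - 1 is an assumption here; the sample standard deviation divides by n - 1 and comes out slightly larger.

```python
from math import sqrt

def population_std_dev(values):
    """Square root of the mean squared deviation from the mean."""
    mean = sum(values) / len(values)
    variance = sum((x - mean) ** 2 for x in values) / len(values)
    return sqrt(variance)

data = [19, 3, 6, 7, 1, 13, 13, 11, 9, 8]
print(population_std_dev(data))  # 5.0  (mean is 9, squared deviations sum to 250)
```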
A researcher is interested in examining the correlation between students' time spent using social media and their academic performance. The researcher hypothesizes that spending more time on social media is associated with lower academic performance. What type of correlation is this?
This will be a negative correlation. As one variable (time spent on social media) increases, the other variable (academic performance) decreases.
Write the following in slope-intercept form and find the slope: 3x - 2y = 7
-2y = -3x + 7, so y = (3/2)x - 7/2
Slope = 3/2
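The rearrangement generalizes: for any line Ax + By = C with B not equal to 0, solving for y gives slope -A/B and y-intercept C/B. The sketch below applies that with exact fractions; the function name `slope_intercept` is illustrative.

```python
from fractions import Fraction

def slope_intercept(a, b, c):
    """Convert a*x + b*y = c (with b != 0) to y = m*x + k and return (m, k)."""
    if b == 0:
        raise ValueError("vertical line: no slope-intercept form")
    return Fraction(-a, b), Fraction(c, b)

m, k = slope_intercept(3, -2, 7)   # 3x - 2y = 7
print(m, k)                        # 3/2 -7/2
```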
What are some alternative methods for identifying outliers besides the IQR method?
Some alternative methods for identifying outliers include box plots, scatter plots, and clustering techniques.