
Techniques in Quantitative Data Analysis:

  • Descriptive statistics
  • Inferential statistics
  • Regression analysis
  • Correlation analysis
  • Hypothesis testing
  • T-tests
  • Analysis of variance (ANOVA)

Descriptive Statistics

  • Mean (Average): $\text{Mean} = \dfrac{\text{Sum of values}}{\text{Number of values}}$

  • Median: For an odd number of values, the median is the middle value when the data is sorted. For an even number of values, the median is the average of the two middle values when the data is sorted.

  • Mode: The mode is the value that appears most frequently in the dataset.

  • Variance: $\text{Variance} = \dfrac{\sum (X_i - \text{Mean})^2}{\text{Number of values}}$

  • Standard Deviation: $\text{Standard Deviation} = \sqrt{\text{Variance}}$
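The measures listed above can be computed with Python's standard `statistics` module. This is a minimal sketch: the `values` list is made up for illustration, and the population functions (`pvariance`, `pstdev`) are used because the variance formula above divides by the number of values.

```python
import statistics

# Hypothetical sample values, chosen only for illustration.
values = [4, 8, 8, 10, 15]

mean = statistics.mean(values)           # sum of values / number of values
median = statistics.median(values)       # middle value of the sorted data
mode = statistics.mode(values)           # most frequent value
variance = statistics.pvariance(values)  # population variance: divides by N
std_dev = statistics.pstdev(values)      # square root of the population variance

print(mean, median, mode, variance, std_dev)
```

If the data is a sample rather than a whole population, `statistics.variance` and `statistics.stdev` divide by N - 1 instead.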



A t-test is a statistical test used to determine if there is a significant difference between the means of two groups. It is commonly employed when comparing means from two independent samples.


Standard Deviation

Determine the standard deviation of a dataset whose variance is 9. $\text{Standard Deviation} = \sqrt{9} = 3$


Purpose of Correlation

Correlation measures the strength and direction of a linear relationship between two variables. It helps identify if and how changes in one variable are associated with changes in another.
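One common measure of linear correlation is the Pearson coefficient, which ranges from -1 (perfect negative relationship) to +1 (perfect positive relationship). The sketch below computes it by hand; the function name `pearson_r` and the `hours`/`scores` data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator of the coefficient).
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)

# Hypothetical data: hours studied and test scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]
r = pearson_r(hours, scores)
print(round(r, 3))
```

A value of `r` close to +1 here would indicate that scores tend to rise as hours rise; it does not by itself show that one causes the other.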


Difference between Quantitative and Qualitative Data:

  • Quantitative data consists of numerical values and can be measured and counted (e.g., height, weight, income).
  • Qualitative data consists of non-numerical information and is often categorical, capturing qualities or characteristics (e.g., colors, opinions).


  • Identify the mode in the dataset: 5, 8, 12, 8, 16, 20, 8.
  • The mode is 8, as it appears more frequently than any other value.
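The mode can be checked by counting occurrences. A short sketch, assuming the dataset is 5, 8, 12, 8, 16, 20, 8:

```python
from collections import Counter

data = [5, 8, 12, 8, 16, 20, 8]

# Count how often each value occurs, then take the most frequent one.
counts = Counter(data)
mode_value, mode_count = counts.most_common(1)[0]

print(mode_value, mode_count)
```

Here 8 occurs three times, while every other value occurs once.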


Calculate the variance of the dataset: 12, 15, 18, 21, 24. The mean is 18, so $\text{Variance} = \dfrac{(12-18)^2 + (15-18)^2 + (18-18)^2 + (21-18)^2 + (24-18)^2}{5} = \dfrac{36 + 9 + 0 + 9 + 36}{5} = \dfrac{90}{5} = 18$


Frequency Distribution

  • Suppose you have a dataset of test scores: 85, 92, 78, 95, 88, 92, 78, 90, 92, 85.
  • Create a frequency distribution to show how many times each score occurs.
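One way to build the frequency distribution is to count each score, sketched here with `collections.Counter` and assuming the ten test scores 85, 92, 78, 95, 88, 92, 78, 90, 92, 85:

```python
from collections import Counter

scores = [85, 92, 78, 95, 88, 92, 78, 90, 92, 85]

# Map each distinct score to the number of times it occurs.
frequency = Counter(scores)

# Print the distribution in ascending score order.
for score in sorted(frequency):
    print(score, frequency[score])
```

The counts come out as 78: 2, 85: 2, 88: 1, 90: 1, 92: 3, 95: 1, which sum to the 10 scores in the dataset.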

Hypothesis Testing (t-test)

Perform a t-test to determine if there is a significant difference between the means of two groups.
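As an illustration, the t statistic for two independent samples can be computed as the difference in means divided by the standard error of that difference. The sketch below uses the unequal-variance (Welch) form of the standard error, which is one of several variants; the function name `welch_t` and both groups of numbers are hypothetical.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """t statistic for two independent samples (Welch's unequal-variance form)."""
    mean_difference = statistics.mean(sample_a) - statistics.mean(sample_b)
    # Standard error of the difference, built from the sample variances (N - 1).
    standard_error = math.sqrt(
        statistics.variance(sample_a) / len(sample_a)
        + statistics.variance(sample_b) / len(sample_b)
    )
    return mean_difference / standard_error

# Hypothetical measurements for two groups.
group_a = [20, 22, 19, 24, 25]
group_b = [28, 30, 27, 26, 29]
t_stat = welch_t(group_a, group_b)
print(round(t_stat, 3))
```

Deciding whether the difference is significant still requires comparing the statistic with a t distribution. If SciPy is available, `scipy.stats.ttest_ind(group_a, group_b, equal_var=False)` returns both the statistic and a p-value.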


Expected Value 

Consider a game where you win $10 with a 1/6 probability and lose $5 with a 5/6 probability. Calculate the expected value.
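The expected value is the sum of each payoff multiplied by its probability. A short sketch using exact fractions, with the loss written as a negative payoff:

```python
from fractions import Fraction

# (payoff in dollars, probability) for each outcome of the game.
outcomes = [
    (10, Fraction(1, 6)),  # win $10
    (-5, Fraction(5, 6)),  # lose $5
]

# The probabilities of all outcomes must add up to 1.
total_probability = sum(probability for _, probability in outcomes)

# Expected value: sum of payoff * probability over all outcomes.
expected_value = sum(payoff * probability for payoff, probability in outcomes)

print(expected_value, float(expected_value))
```

The result is -5/2, so a player loses $2.50 per game on average.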


Mean (Average)

Calculate the mean of the following dataset: 10, 15, 20, 25, 30. $\text{Mean} = \dfrac{10 + 15 + 20 + 25 + 30}{5} = 20$


Mean, Median, and Mode

  • Mean: The average of a set of values.
  • Median: The middle value in a sorted list of numbers.
  • Mode: The value that appears most frequently in a dataset.

Quantitative Data Analysis

Quantitative data analysis involves the use of statistical methods to analyze numerical data. It is used to uncover patterns, trends, relationships, or associations within the data and to draw meaningful conclusions.


Normal Distribution

A normal distribution is a symmetric, bell-shaped probability distribution. In a normal distribution, the mean, median, and mode are equal, and fixed percentages of the data fall within a given number of standard deviations of the mean: about 68% within one, about 95% within two, and about 99.7% within three.
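The share of data within a given number of standard deviations of the mean can be computed from the cumulative distribution function. A minimal sketch using `statistics.NormalDist` from the standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

Because the shares depend only on the number of standard deviations, the same values hold for any normal distribution, whatever its mean and spread.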



Calculate the 75th percentile of a dataset, representing the value below which 75% of the data falls.
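Several conventions exist for computing percentiles, and they can give slightly different answers on small datasets. The sketch below assumes linear interpolation between the two nearest sorted values; the function name `percentile` and the `scores` data are hypothetical.

```python
def percentile(data, p):
    """Return the p-th percentile (0 <= p <= 100) using linear interpolation."""
    ordered = sorted(data)
    # Fractional position of the percentile within the sorted data.
    rank = (p / 100) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    # Interpolate between the two neighbouring values.
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

# Hypothetical dataset of nine scores.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95]
p75 = percentile(scores, 75)
print(p75)
```

With nine sorted values, the 75th percentile lands exactly on the seventh value, 85.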



Find the median of the dataset: 8, 12, 16, 20, 24.

Since there's an odd number of values, the median is the middle value, which is 16.


Discrete vs. Continuous Data

  • Discrete data can only take specific, distinct values and cannot be subdivided indefinitely (e.g., whole numbers).
  • Continuous data can take any value within a given range and can be subdivided into smaller and smaller parts (e.g., height, weight).

Regression Analysis 

Regression analysis is a statistical method used to examine the relationship between one dependent variable and one or more independent variables. It helps in understanding the strength and nature of the relationship.
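The simplest case is a straight-line fit with one independent variable, where ordinary least squares picks the slope and intercept that minimize the squared prediction errors. A minimal sketch; the function name `fit_line` and the `ad_spend`/`sales` data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-deviation of x and y divided by squared deviation of x.
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: independent variable (ad spend) and dependent variable (sales).
ad_spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]
slope, intercept = fit_line(ad_spend, sales)
print(slope, intercept)
```

Here the data lies exactly on a line, so the fit recovers a slope of 2 and an intercept of 1. The slope describes how much the dependent variable changes per unit change in the independent variable.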


Hypothesis Testing

Hypothesis testing is a statistical method used to make inferences about a population based on a sample of data. It involves formulating a hypothesis, collecting data, and assessing whether the evidence supports or contradicts the hypothesis.


Inferential Statistics

  • Hypothesis Testing (t-test): $t = \dfrac{\text{Mean difference}}{\text{Standard Error of the difference}}$

  • Confidence Interval: $\text{Confidence Interval} = \text{Mean} \pm (\text{Critical Value} \times \text{Standard Error})$
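The confidence interval formula can be sketched as follows for a 95% interval around a sample mean. The `sample` values are hypothetical, and the critical value is taken from the normal distribution as a simplifying assumption. For small samples, a critical value from the t distribution is usually more appropriate and gives a wider interval.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample measurements.
sample = [48, 52, 50, 49, 51, 53, 47, 50]

mean = statistics.mean(sample)
# Standard error of the mean: sample standard deviation / sqrt(n).
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
# Two-sided 95% critical value from the normal distribution (about 1.96).
critical_value = NormalDist().inv_cdf(0.975)

lower = mean - critical_value * standard_error
upper = mean + critical_value * standard_error
print(round(lower, 2), round(upper, 2))
```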


Data Set in Quantitative Analysis

A data set is a collection of data points or observations. In quantitative analysis, a data set typically includes numerical values that can be analyzed statistically.


Dependent vs. Independent Variable

  • The dependent variable is the outcome being studied and is affected by the independent variable.
  • The independent variable is the factor that is manipulated or controlled to observe its effect on the dependent variable.

Importance of Data Cleaning

Data cleaning involves identifying and correcting errors or inconsistencies in datasets. Clean data is crucial for accurate and reliable analysis, preventing errors and ensuring meaningful results.
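As a small illustration, one routine cleaning step is converting raw text entries to numbers and setting aside entries that cannot be parsed. The records and the function name `clean_numeric` below are hypothetical. Real cleaning rules depend on the dataset.

```python
def clean_numeric(records):
    """Split raw text entries into parsed numbers and rejected entries."""
    cleaned, rejected = [], []
    for record in records:
        text = record.strip()  # remove stray whitespace
        try:
            cleaned.append(float(text))
        except ValueError:
            # Empty or non-numeric entries are kept aside for review.
            rejected.append(record)
    return cleaned, rejected

# Hypothetical raw survey entries with blanks and placeholders.
raw_records = ["72", " 68 ", "", "n/a", "72", "181.5"]
cleaned, rejected = clean_numeric(raw_records)
print(cleaned, rejected)
```

Keeping the rejected entries instead of discarding them makes it possible to check later whether they were true errors or recoverable values.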


Role of a Data Analyst

A data analyst gathers, processes, and analyzes data to provide insights and support decision-making. They use statistical and analytical techniques to interpret complex datasets.