What is a colon (:)?
The method to display summary statistics of ALL features
What is .describe(include = 'all')?
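A minimal sketch of the difference between the default call and `include="all"`, assuming pandas is installed. The column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data: one numeric column and one text column.
df = pd.DataFrame({
    "age": [23, 35, 35, 41],
    "city": ["Oslo", "Lima", "Lima", "Pune"],
})

numeric_only = df.describe()               # default: numeric columns only
all_columns = df.describe(include="all")   # numeric and non-numeric columns

print(all_columns)
```

Statistics that do not apply to a column (for example, the mean of `city`) show up as NaN.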
Calculated by taking the sum of all numbers in a list and dividing by the number of elements
What is the mean?
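The definition above can be sketched in plain Python. The sample values are hypothetical.

```python
values = [4, 8, 15, 16, 23, 42]  # hypothetical sample

# Sum of all numbers divided by the number of elements.
mean = sum(values) / len(values)

print(mean)  # 18.0
```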
Plot used to illustrate change over time
What is a line chart?
Data that can be counted or measured
What is quantitative or numerical data?
These characters allow us to check if two values are equivalent
What is ==?
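A short sketch contrasting `==` (comparison) with `=` (assignment). The variable names are illustrative only.

```python
a = 3        # a single = assigns a value
b = 3.0

same_value = (a == b)        # == compares values and returns a bool
text_vs_int = ("3" == 3)     # a string is never equal to an int

print(same_value)    # True
print(text_vs_int)   # False
```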
Method to return a copy of the DataFrame without repeated observations
What is .drop_duplicates()?
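A minimal sketch, assuming pandas is installed and using made-up rows. Note that the original DataFrame is left unchanged.

```python
import pandas as pd

# Hypothetical data: the first and third rows are identical.
df = pd.DataFrame({"name": ["Ana", "Ben", "Ana"], "score": [90, 85, 90]})

deduped = df.drop_duplicates()  # keeps the first occurrence of each row

print(len(df), len(deduped))  # 3 2
```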
Value retrieved by sorting all numbers and selecting the number in the middle
What is the median?
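A small sketch of the sort-and-pick-the-middle idea. The function name `median` is just for illustration, and the even-length case (averaging the two middle values) is the usual convention rather than something stated on the card.

```python
def median(values):
    """Return the middle value of the sorted data."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]                        # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average the two middle values

print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```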
A plot showing the frequency of a quantitative variable
What is a histogram?
Numeric data that can take only distinct, countable values
What is discrete data?
A function stored within a class
What is a method?
Function to combine data on a column or index
What is pd.merge()?
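A minimal sketch of combining two tables on a shared column, assuming pandas is installed. The table and column names are hypothetical.

```python
import pandas as pd

# Hypothetical tables that share an "id" column.
students = pd.DataFrame({"id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]})
grades = pd.DataFrame({"id": [1, 2, 4], "grade": [90, 85, 70]})

# The default is an inner join: only ids found in both tables are kept.
merged = pd.merge(students, grades, on="id")

print(merged)
```

Passing `how="left"`, `how="right"`, or `how="outer"` keeps unmatched rows from one or both tables.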
Measures of central tendency and variability
What are descriptive statistics?
When a histogram has a tail to the right
What is positive skew?
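A rough illustration with made-up numbers: in a positively skewed sample, a few large values typically pull the mean above the median.

```python
import statistics

# Hypothetical sample with one large value forming a tail to the right.
values = [1, 2, 2, 3, 3, 3, 4, 20]

mean = statistics.mean(values)
median = statistics.median(values)

print(mean, median)  # 4.75 3.0
```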
Numeric data that can take any value within a range
What is continuous data?
A variable stored within an object or class
What is an attribute?
Property to retrieve an entry by integer position
What is .iloc[]?
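A short sketch, assuming pandas is installed, that contrasts position-based `.iloc[]` with label-based `.loc[]`. The data is hypothetical.

```python
import pandas as pd

# Hypothetical data with string labels as the index.
df = pd.DataFrame({"score": [90, 85, 70]}, index=["ana", "ben", "cy"])

first_row = df.iloc[0]      # row at integer position 0
last_score = df.iloc[2, 0]  # row position 2, column position 0
by_label = df.loc["ben"]    # .loc uses index labels instead of positions

print(first_row["score"], last_score, by_label["score"])  # 90 70 85
```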
The square root of variance
What is the standard deviation?
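A worked sketch using the population formula (dividing by n) on made-up values. Note that pandas `.std()` divides by n - 1 by default, so it would give a slightly larger number for the same data.

```python
import math

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample

mean = sum(values) / len(values)

# Variance: the average squared distance from the mean.
variance = sum((x - mean) ** 2 for x in values) / len(values)

# Standard deviation: the square root of the variance.
std_dev = math.sqrt(variance)

print(mean, variance, std_dev)  # 5.0 4.0 2.0
```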
A chart used to show the relationship between two quantitative variables
What is a scatterplot?
Categorical data with a natural order or rank
What is ordinal data?
The order in which a computer executes statements
What is control flow?
Technique for filtering a DataFrame using a Series of True/False values
What is boolean masking?
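A minimal sketch of boolean masking, assuming pandas is installed. The names, scores, and the passing threshold of 60 are all hypothetical.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"name": ["Ana", "Ben", "Cy"], "score": [90, 55, 70]})

# The comparison produces a Series of True/False values (the mask).
mask = df["score"] >= 60

# Indexing with the mask keeps only the rows where it is True.
passed = df[mask]

print(passed)
```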
Inference of some population characteristic from a sample
What is inferential statistics?
Three characteristics every plot should have
Any of:
- Title
- Axis labels
- Appropriate scale/resolution
- Legend
- Accessible coloring (used only to convey meaning)
- Easily interpretable
etc.
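Several of the characteristics listed above can be sketched with matplotlib, assuming it is installed. The data, labels, and title are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical data.
years = [2019, 2020, 2021, 2022]
sales = [10, 12, 9, 15]

fig, ax = plt.subplots()
ax.plot(years, sales, label="Product A")

ax.set_title("Yearly sales of Product A")  # title
ax.set_xlabel("Year")                      # axis labels, with units where relevant
ax.set_ylabel("Units sold (thousands)")
ax.set_xticks(years)                       # whole-year ticks as an appropriate scale
ax.legend()                                # legend
```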
Rounding error that sometimes affects non-integer numbers
What is floating-point error?
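The classic example, with `math.isclose` shown as one common way to compare floats within a tolerance:

```python
import math

total = 0.1 + 0.2

exactly_equal = (total == 0.3)             # the rounding error makes this False
close_enough = math.isclose(total, 0.3)    # compares within a small tolerance

print(exactly_equal)  # False
print(close_enough)   # True
```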