True or False: Matplotlib is open-source and freely available for anyone to use.
What is True?
What is the code to import Matplotlib into your Python script (or Jupyter Notebook)?
What is:
import matplotlib.pyplot as plt
This is the type of data that is well-suited for visualizing with an Area plot using Matplotlib.
What is... data whose cumulative or proportional values you want to show over a continuous range?
Examples:
- Time Series Data
- Proportional Data
- Multi-series data (shown as stacked area plots)
- Hierarchical Data
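To make the cumulative idea concrete, here is a minimal sketch of a stacked area plot. The monthly sales figures and product names are made up for illustration, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical monthly sales for three products (illustrative values only)
months = np.arange(1, 7)
product_a = np.array([3, 4, 6, 5, 7, 8])
product_b = np.array([2, 3, 3, 4, 4, 5])
product_c = np.array([1, 1, 2, 2, 3, 3])

fig, ax = plt.subplots()
# stackplot draws each series on top of the previous one, so the top edge is the total
layers = ax.stackplot(months, product_a, product_b, product_c,
                      labels=['Product A', 'Product B', 'Product C'])
ax.set_xlabel('Month')
ax.set_ylabel('Units sold')
ax.legend(loc='upper left')

total = product_a + product_b + product_c  # height of the top edge at each month
```

Call plt.show() with an interactive backend to display the figure.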
These are insights provided by a Histogram plot of data.
What are... the shape, spread and central tendency of a continuous numerical dataset?
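As a rough illustration of those insights, the sketch below draws a histogram of synthetic, normally distributed data and marks the mean and median. The data, seed, and bin count are assumptions chosen only for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: 1000 draws centred on 50 with a standard deviation of 10
rng = np.random.default_rng(seed=0)
data = rng.normal(loc=50, scale=10, size=1000)

fig, ax = plt.subplots()
# hist returns the per-bin counts, the bin edges, and the drawn bar patches
counts, bin_edges, patches = ax.hist(data, bins=30, color='skyblue', edgecolor='black')

# Mark central tendency; the width of the histogram shows the spread
ax.axvline(data.mean(), color='red', linestyle='--', label='mean')
ax.axvline(np.median(data), color='green', linestyle=':', label='median')
ax.legend()
```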
This type of data is well-suited for visualization in a bar chart.
What is... categorical data?
Each bar can represent a category or string data, and the height of the bar can correspond to a numerical value associated with that category.
I am the brilliant creator of Matplotlib.
Who is John D. Hunter?
What is the Python code to check what version of Matplotlib you have installed?
What is:
import matplotlib
print(matplotlib.__version__)
This is the Matplotlib function that you would use to create a basic area plot.
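What is...
plt.fill_between()
(plt.stackplot() draws stacked areas. There is no plt.area function in Matplotlib; pandas offers df.plot(kind='area'), which draws with Matplotlib.)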
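A minimal sketch of a basic area plot with fill_between follows. The curve y = x squared is an arbitrary choice for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 50)
y = x ** 2

fig, ax = plt.subplots()
# With only x and y given, the area is filled between the curve and y = 0
area = ax.fill_between(x, y, color='skyblue')
ax.plot(x, y, color='steelblue')  # optional outline along the top of the area
```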
This type of data is NOT well suited for a Histogram plot.
What are... categorical or discrete data?
(Categorical data has no natural order or continuity and is better visualized with a bar chart; time-related analyses may be better explored with line charts or density plots.)
A histogram is best for a single continuous variable.
This is the Matplotlib function to show a bar chart using Python.
What is... plt.bar(categories, values)
Example code:
import matplotlib.pyplot as plt
categories = ['Category A', 'Category B', 'Category C']
values = [10, 15, 20]
plt.bar(categories, values)
plt.show()
This is where the Matplotlib source code is found.
What is the matplotlib/matplotlib repository on GitHub?
By default, Matplotlib draws this type of graph when you call the plot() function.
What is a line graph from point to point?
These are used to fill multiple different areas on the same plot to distinguish between the areas on a plot.
What are different colors or patterns?
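One way this can look in practice: the sketch below fills the positive and negative parts of a sine curve with different colors and a hatch pattern. The curve and color names are assumptions picked for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

fig, ax = plt.subplots()
# The where mask restricts each fill to the part of the curve it describes
ax.fill_between(x, y, 0, where=(y >= 0), color='tab:green', label='positive')
ax.fill_between(x, y, 0, where=(y < 0), color='tab:red', hatch='//', label='negative')
ax.legend()
```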
This is the Python code for one of the 2 ways to handle missing data in a Histogram plot.
What is... removing or imputing the missing values? Here's the code:
df.dropna()  # removes rows with missing values
df.fillna(mean_value)  # fills missing values with the mean of the column
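Both options can be tried side by side. This sketch assumes a small pandas DataFrame with a made-up 'score' column; the column name and values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical scores with two missing entries
df = pd.DataFrame({'score': [70.0, 85.0, np.nan, 90.0, np.nan, 75.0]})

# Option 1: drop the missing values
dropped = df['score'].dropna()

# Option 2: impute with the column mean (mean() skips NaN by default)
mean_value = df['score'].mean()
filled = df['score'].fillna(mean_value)

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.hist(dropped, bins=4)
ax1.set_title('Rows dropped')
ax2.hist(filled, bins=4)
ax2.set_title('Filled with mean')
```

Note that imputing with the mean adds extra values at the mean, which changes the shape of the histogram.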
This is how to customize colours of individual bars in a bar chart.
What is... specifying the color parameter within the bar function?
For example:
plt.bar(categories, values, color=colors)
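The colors list is not defined in the line above, so here is a minimal sketch that builds one. Highlighting the tallest bar is just one possible rule, chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

categories = ['Category A', 'Category B', 'Category C']
values = [10, 15, 20]

# One color per bar: highlight the tallest bar, grey out the rest
colors = ['tab:red' if v == max(values) else 'tab:gray' for v in values]

fig, ax = plt.subplots()
bars = ax.bar(categories, values, color=colors)
```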
Name one of the coding languages used to build Matplotlib that is not Python.
What are C, C++, Objective-C or JavaScript?
(Some parts of Matplotlib use these languages instead of Python for platform compatibility.)
This is the keyword argument that emphasizes each point that will be graphed on a chart.
What is the "marker" argument?
Example: to set the marker as an asterisk use:
plt.plot(ypoints, marker='*')
This parameter is useful when you have multiple overlapping areas and want to make a plot visually interpretable by changing the transparency of the filled area in an Area Plot.
What is 'alpha'?
Values range from 0 (completely transparent) to 1 (completely opaque). The alpha parameter controls the opacity of a filled area, allowing underlying content or overlapping areas to show through.
Example (alpha works the same way in other plot types, such as this histogram):
plt.hist(data, bins=30, alpha=0.7, color='skyblue', edgecolor='black')
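For the area-plot case itself, a minimal sketch with two overlapping filled areas might look like this. The two curves are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
y1 = np.sin(x) + 2
y2 = np.cos(x) + 2

fig, ax = plt.subplots()
# alpha=0.5 makes each area half transparent, so the overlap stays visible
a1 = ax.fill_between(x, y1, alpha=0.5, color='skyblue', label='sin(x) + 2')
a2 = ax.fill_between(x, y2, alpha=0.5, color='salmon', label='cos(x) + 2')
ax.legend()
```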
This parameter is used to specify the number of adjacent bars in a histogram.
What are... bins in the plt.hist() function?
Example:
plt.hist(data, bins=20) #creates a histogram with 20 bins
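The bins argument accepts either a number of bins or an explicit list of bin edges. The sketch below tries both on synthetic uniform data; the data and the edge values are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=1)
data = rng.uniform(0, 100, 500)  # 500 values in [0, 100)

fig, (ax1, ax2) = plt.subplots(1, 2)

# An integer asks for that many equal-width bins between min(data) and max(data)
counts, edges, _ = ax1.hist(data, bins=20)

# A list gives the bin edges explicitly: four bins of width 25
counts2, edges2, _ = ax2.hist(data, bins=[0, 25, 50, 75, 100])
```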
This function can display specific values directly on the bars in your Matplotlib bar chart.
What is... the text function, used to add labels to the bars?
Example:
plt.bar(categories, values)
for i, value in enumerate(values):
    plt.text(i, value + 1, str(value), ha='center', va='bottom')  # label each bar with its value
While it is not a requirement, this tool is often used to install Matplotlib.
What is pip?
Matplotlib can be included in many Python distributions, and in some cases, you might not need to explicitly install it using pip.
This is the Matplotlib function that displays multiple plots in one figure.
What is the subplot() function?
Example:
import matplotlib.pyplot as plt
import numpy as np
# plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
plt.subplot(1, 2, 1)
plt.plot(x,y)
#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])
plt.subplot(1, 2, 2)
plt.plot(x,y)
plt.show()
This is the best approach to handle missing data in area plots.
What is... interpolating the missing values to fill the gaps and maintain continuity in the plot?
Example:
# when y has missing values represented as NaN
missing = np.isnan(y)
y_interpolated = np.interp(x, x[~missing], y[~missing])  # interpolate from the known points
plt.fill_between(x, y_interpolated, color='skyblue', alpha=0.5, label='Interpolated Area')
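A small worked example shows what the interpolation does to the numbers. The x and y values here are made up, and linear interpolation with np.interp is only one possible choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical series with three missing points
x = np.arange(6, dtype=float)
y = np.array([1.0, np.nan, 3.0, np.nan, np.nan, 6.0])

missing = np.isnan(y)
y_interpolated = y.copy()
# Estimate each missing point on the straight line between its known neighbours
y_interpolated[missing] = np.interp(x[missing], x[~missing], y[~missing])

fig, ax = plt.subplots()
ax.fill_between(x, y_interpolated, color='skyblue', alpha=0.5, label='Interpolated Area')
ax.legend()
```

Here the gaps at x = 1, 3 and 4 are filled with 2, 4 and 5, so the filled area has no breaks.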
This parameter, when set to True, normalizes the histogram so that the area under the histogram equals 1.
What is... the density parameter?
Normalizing a histogram so that its area equals 1 gives a standardized representation, which makes it possible to compare histograms even when the number of data points varies.
Example: plt.hist(data, bins=30, density=True, alpha=0.7, color='blue', edgecolor='black')
With the default density=False, plt.hist() returns the count of data points in each bin; there is no separate count parameter.
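The normalization can be checked numerically. This sketch uses synthetic data and sums bar height times bar width to confirm that the area is 1.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=2)
data = rng.normal(size=400)  # synthetic data for illustration

fig, (ax1, ax2) = plt.subplots(1, 2)

# density=True: bar heights are probability densities
density, edges, _ = ax1.hist(data, bins=30, density=True, alpha=0.7,
                             color='blue', edgecolor='black')
area = np.sum(density * np.diff(edges))  # sum of height * width over all bars

# Default (density=False): bar heights are raw counts per bin
counts, _, _ = ax2.hist(data, bins=30)
```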
This is one of 3 approaches to handle missing data in Bar Charts.
What is...
1. Remove missing data
2. Fill missing values (by imputation or interpolation; this may introduce bias)
3. Create a separate category for the missing data so that you can see the gaps
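The three approaches above can be sketched with pandas as follows. The DataFrame, its column names, and the 'Missing' label are hypothetical, and mean imputation stands in for whichever fill method suits the data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data with one missing value and two missing category labels
df = pd.DataFrame({
    'category': ['A', 'B', None, 'A', None],
    'value': [10.0, np.nan, 5.0, 7.0, 4.0],
})

# 1. Remove every row that has a missing entry
removed = df.dropna()

# 2. Fill missing numeric values (here with the column mean; this can bias the chart)
filled = df.assign(value=df['value'].fillna(df['value'].mean()))

# 3. Keep unlabeled rows under an explicit 'Missing' category so the gap is visible
labelled = df.assign(category=df['category'].fillna('Missing'))
totals = labelled.groupby('category')['value'].sum()

fig, ax = plt.subplots()
ax.bar(list(totals.index), totals.to_numpy())
```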