How to Show Mean on Boxplot Using Matplotlib

Boxplots are a great way to visualize the distribution (min, max, quartiles, and median) of a dataset. However, they do not display the mean of the dataset by default.

In Matplotlib, you can use the showmeans parameter of the boxplot function to show the mean on the boxplot.

The following example explains how to plot a boxplot and show the mean on it using the matplotlib Python package.

Let’s create a random dataset for three groups using the rand function from NumPy:

# import packages
import numpy as np

data = np.random.rand(50, 3)

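This dataset contains 50 observations for each of the three groups (one group per column). Because rand draws every value from the same uniform distribution on [0, 1), the group means differ only by random sampling variation.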
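Before plotting, it can help to compute the group means and medians directly so you know what the markers on the boxplot should show. The following is a minimal sketch; the fixed seed (0) is an assumption added here only to make the numbers reproducible and is not part of the original example.

```python
import numpy as np

# Assumption: fix the seed so repeated runs give the same dataset
np.random.seed(0)
data = np.random.rand(50, 3)

# Column-wise statistics: one value per group
group_means = data.mean(axis=0)
group_medians = np.median(data, axis=0)

for i, (mean, median) in enumerate(zip(group_means, group_medians), start=1):
    print(f"group {i}: mean={mean:.3f}, median={median:.3f}")
```

These are the values that the mean markers and median lines of the boxplot represent for each group.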

Create a boxplot using the boxplot function and show the mean on it by passing showmeans=True.

# import packages
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.boxplot(data, showmeans=True)
plt.xlabel("groups")
plt.ylabel("values")
plt.show()

/images/posts/71_boxplot_with_mean.png
Boxplot with the mean value shown as a green triangle

In the above boxplot, the green triangle in each box represents the mean of that group. The orange lines represent the median of each group (the default style).
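If the default green triangle is hard to see, you can change the marker style through the meanprops parameter of boxplot, which accepts a dictionary of line and marker properties. The sketch below is one possible styling; the seed and the specific marker, colors, and size are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumption: fixed seed for a reproducible dataset
np.random.seed(0)
data = np.random.rand(50, 3)

# Illustrative style choices for the mean marker (not from the original example)
mean_style = {
    "marker": "D",               # diamond instead of the default triangle
    "markerfacecolor": "red",
    "markeredgecolor": "black",
    "markersize": 7,
}

fig, ax = plt.subplots()
# boxplot returns a dict of the drawn artists, including one "means" entry per group
result = ax.boxplot(data, showmeans=True, meanprops=mean_style)
ax.set_xlabel("groups")
ax.set_ylabel("values")
```

Call plt.show() afterwards to display the figure. The returned dictionary (named result here) gives access to the plotted mean markers if you want to inspect or restyle them later.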

You can also draw the mean as a line across the box (similar to the median line) instead of a triangle marker by additionally passing meanline=True.

# import packages
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.boxplot(data, showmeans=True, meanline=True)
plt.xlabel("groups")
plt.ylabel("values")
plt.show()

/images/posts/71_boxplot_with_mean_line.png
Boxplot with the mean value shown as a dashed green line
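When both the mean and the median are drawn as lines, a legend makes it easier to tell them apart. The following sketch builds a legend from the artists returned by boxplot; the seed, the legend labels, and the legend location are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumption: fixed seed for a reproducible dataset
np.random.seed(0)
data = np.random.rand(50, 3)

fig, ax = plt.subplots()
result = ax.boxplot(data, showmeans=True, meanline=True)

# Use the first median line and the first mean line as legend handles
legend = ax.legend(
    [result["medians"][0], result["means"][0]],
    ["median", "mean"],
    loc="upper right",
)
ax.set_xlabel("groups")
ax.set_ylabel("values")
```

Call plt.show() afterwards to display the figure. Only one handle per line type is needed because all groups share the same style.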