How to Get Count and Mean with pandas GroupBy

The pandas groupby() function splits a DataFrame into groups so that you can compute statistics for each group separately.

You can use the pandas groupby.describe() and groupby.agg() functions to get the count and mean together for groups in a DataFrame.

The following examples show both approaches on a small sample DataFrame.

Using groupby.describe() function

Create a sample pandas DataFrame:

# import package
import pandas as pd

df = pd.DataFrame({'col1': ['X', 'X', 'Y', 'Y', 'Z', 'Z'],
                   'col2': [100, 104, 200, 205, 300, 302]})

# view DataFrame
df

  col1  col2
0    X   100
1    X   104
2    Y   200
3    Y   205
4    Z   300
5    Z   302

In this DataFrame, col1 holds the group labels and col2 holds the values for each group.

You can use the groupby.describe() function to get the count and mean as shown below:

df.groupby(["col1"])['col2'].describe()[["count", "mean"]].reset_index()

# output

  col1  count   mean
0    X    2.0  102.0
1    Y    2.0  202.5
2    Z    2.0  301.0

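In the above example, groupby.describe() computes the full set of summary statistics for each group, and selecting the count and mean columns returns just those two in a single DataFrame. Note that describe() reports count as a float (2.0 rather than 2). The reset_index() function moves the group labels from the index back into a regular col1 column.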
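If integer counts are preferred, the count column returned by describe() can be cast after the fact. The following is a minimal sketch using the same sample data; the variable name summary is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['X', 'X', 'Y', 'Y', 'Z', 'Z'],
                   'col2': [100, 104, 200, 205, 300, 302]})

# describe() computes all summary statistics; keep only count and mean
summary = df.groupby('col1')['col2'].describe()[['count', 'mean']].reset_index()

# describe() returns count as a float, so cast it to an integer type
summary['count'] = summary['count'].astype(int)

print(summary)
```

The cast is safe here because the counts are whole numbers with no missing entries.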

If you want to get complete summary statistics, please read this article.

Using groupby.agg() function

You can also use the groupby.agg() function to get the count and mean together.

Create a sample pandas DataFrame:

# import package
import pandas as pd

df = pd.DataFrame({'col1': ['X', 'X', 'Y', 'Y', 'Z', 'Z'],
                   'col2': [100, 104, 200, 205, 300, 302]})

# view DataFrame
df

  col1  col2
0    X   100
1    X   104
2    Y   200
3    Y   205
4    Z   300
5    Z   302

You can use the groupby.agg() function to get the count and mean as shown below:

df.groupby(["col1"])['col2'].agg(["count", "mean"]).reset_index()

# output

  col1  count   mean
0    X      2  102.0
1    Y      2  202.5
2    Z      2  301.0

In the above example, groupby.agg() computes only the statistics named in the list, so the count and mean are returned together in a single DataFrame and the count stays an integer. The reset_index() function moves the group labels from the index back into a regular col1 column.
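When the default column names count and mean are not descriptive enough, agg() also accepts keyword arguments (named aggregation) that set the output column names directly. This is a small sketch on the same sample data; the names n_values and mean_col2 are hypothetical and chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['X', 'X', 'Y', 'Y', 'Z', 'Z'],
                   'col2': [100, 104, 200, 205, 300, 302]})

# Named aggregation: output_name=(input_column, aggregation_function)
named = (
    df.groupby('col1')
      .agg(n_values=('col2', 'count'),    # number of non-missing values per group
           mean_col2=('col2', 'mean'))    # average of col2 per group
      .reset_index()
)

print(named)
```

Note that count only counts non-missing values, so groups containing NaN entries would report a smaller number than their row count.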