
Calculate Mean of Rows on Selected Columns in pandas DataFrame

In a pandas DataFrame, you can use the mean() function as shown below to calculate the row-wise mean across selected columns.

df[['col1', 'col2']].mean(axis=1)

Here, df is the pandas DataFrame, and col1 and col2 are the selected columns.

The axis=1 parameter computes the mean along the horizontal axis, that is, one mean per row across the selected columns. The result is a Series with the same index as df.
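To make the role of the axis parameter concrete, the following minimal sketch contrasts axis=1 with the default axis=0. The DataFrame demo and its values are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical data, for illustration only
demo = pd.DataFrame({'col1': [1, 2, 3], 'col2': [3, 6, 9]})

# axis=1: one mean per row (across the selected columns)
row_means = demo[['col1', 'col2']].mean(axis=1)

# axis=0 (the default): one mean per column (down the rows)
col_means = demo[['col1', 'col2']].mean(axis=0)

print(row_means.tolist())  # [2.0, 4.0, 6.0]
print(col_means.tolist())  # [2.0, 6.0]
```

With axis=1 the result has one value per row, which is why it can be assigned directly to a new column of the same DataFrame.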

The following examples demonstrate how to calculate the mean of rows on selected columns in a pandas DataFrame.

Example 1 (DataFrame without missing values)

Create a sample pandas DataFrame:

# import package
import pandas as pd

df = pd.DataFrame({'Age': [35, 45, 55, 40, 50],
                   'Height': [5, 7, 6, 5.5, 7.2],
                   'Weight': [55, 67, 68, 95, 65]})

print(df)

   Age  Height  Weight
0   35     5.0      55
1   45     7.0      67
2   55     6.0      68
3   40     5.5      95
4   50     7.2      65

Now, calculate the mean of each row across the selected columns (Age and Weight):

# row-wise mean of Age and Weight, stored in a new 'mean' column
df['mean'] = df[['Age', 'Weight']].mean(axis=1)

print(df)

   Age  Height  Weight  mean
0   35     5.0      55  45.0
1   45     7.0      67  56.0
2   55     6.0      68  61.5
3   40     5.5      95  67.5
4   50     7.2      65  57.5
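As a quick sanity check, the same values can be reproduced by hand: add the selected columns and divide by the number of columns. The sketch below rebuilds the DataFrame from Example 1. The names cols, row_mean, and manual are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'Age': [35, 45, 55, 40, 50],
                   'Height': [5, 7, 6, 5.5, 7.2],
                   'Weight': [55, 67, 68, 95, 65]})

cols = ['Age', 'Weight']  # columns to average, kept in a list for reuse
row_mean = df[cols].mean(axis=1)

# Manual equivalent: sum the selected columns and divide by their count
manual = (df['Age'] + df['Weight']) / len(cols)

print(row_mean.tolist())  # [45.0, 56.0, 61.5, 67.5, 57.5]
print(manual.tolist())    # [45.0, 56.0, 61.5, 67.5, 57.5]
```

The manual form only matches when no selected value is missing, because plain arithmetic propagates NaN instead of skipping it.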

Example 2 (DataFrame with missing values)

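If the selected columns contain missing values (NaN), mean() skips them by default (skipna=True) and averages only the remaining values in each row. If every selected value in a row is NaN, the mean for that row is NaN.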

Create a sample pandas DataFrame with missing values:

# import package
import pandas as pd
import numpy as np

df = pd.DataFrame({'Age': [35, 45, 55, 40, np.nan],
                   'Height': [5, 7, 6, 5.5, 7.2],
                   'Weight': [55, 67, np.nan, 95, 65]})

print(df)

    Age  Height  Weight
0  35.0     5.0    55.0
1  45.0     7.0    67.0
2  55.0     6.0     NaN
3  40.0     5.5    95.0
4   NaN     7.2    65.0

Now, calculate the mean of each row across the selected columns (Age and Weight):

# row-wise mean of Age and Weight; NaN values are skipped by default
df['mean'] = df[['Age', 'Weight']].mean(axis=1)

print(df)

    Age  Height  Weight  mean
0  35.0     5.0    55.0  45.0
1  45.0     7.0    67.0  56.0
2  55.0     6.0     NaN  55.0
3  40.0     5.5    95.0  67.5
4   NaN     7.2    65.0  65.0

You can see that the mean of row 2 is 55.0. The missing Weight value is skipped, so the mean is calculated from the Age value alone. Likewise, the mean of row 4 is 65.0, which comes from the Weight value alone.
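If a mean based on a single remaining value is not what you want, one option is to pass skipna=False so that any missing value makes the row mean NaN. The sketch below rebuilds the DataFrame from Example 2 and compares both behaviours. The names skipped and strict are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Age': [35, 45, 55, 40, np.nan],
                   'Height': [5, 7, 6, 5.5, 7.2],
                   'Weight': [55, 67, np.nan, 95, 65]})

# Default (skipna=True): NaN is ignored, the mean uses the remaining values
skipped = df[['Age', 'Weight']].mean(axis=1)

# skipna=False: any NaN in the selected columns makes that row's mean NaN
strict = df[['Age', 'Weight']].mean(axis=1, skipna=False)

print(skipped.tolist())  # [45.0, 56.0, 55.0, 67.5, 65.0]
print(strict.tolist())   # [45.0, 56.0, nan, 67.5, nan]
```

Which behaviour is appropriate depends on the analysis: skipping keeps every row usable, while skipna=False makes incomplete rows easy to spot.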