How to Convert summary() Output into a Data Frame in R
In R, the summary() function is useful for generating summary statistics (minimum, 1st quartile, median, mean, 3rd quartile, and maximum) for numeric vectors and data frames.
The output of summary() on a data frame is a table object, which makes it inconvenient to access the individual statistics for downstream analysis.
You can use the following methods to convert the output of the summary() function into a data frame.
Method 1: unclass() function
data.frame(unclass(summary(df)), check.names = FALSE)
Method 2: do.call() and lapply() functions
data.frame(do.call(cbind, lapply(df, summary)))
The following examples demonstrate how to use the unclass() and do.call() functions to convert the output of the summary() function into a data frame.
Create a sample data frame:
# create sample data frame
set.seed(1234)
df <- data.frame(
col1 = rnorm(5, mean = 5),
col2 = rnorm(5, mean = 10)
)
# view data
head(df)
col1 col2
1 3.792934 10.506056
2 5.277429 9.425260
3 6.084441 9.453368
4 2.654302 9.435548
5 5.429125 9.109962
Get the summary statistics:
summary(df)
col1 col2
Min. :2.654 Min. : 9.110
1st Qu.:3.793 1st Qu.: 9.425
Median :5.277 Median : 9.436
Mean :4.648 Mean : 9.586
3rd Qu.:5.429 3rd Qu.: 9.453
Max. :6.084 Max. :10.506
# data type
class(summary(df))
[1] "table"
Example 1: unclass() function
Now, we will use the unclass() function to convert the output into a data frame.
new_df <- data.frame(unclass(summary(df)), check.names = FALSE)
new_df
col1 col2
X Min. :2.654 Min. : 9.110
X.1 1st Qu.:3.793 1st Qu.: 9.425
X.2 Median :5.277 Median : 9.436
X.3 Mean :4.648 Mean : 9.586
X.4 3rd Qu.:5.429 3rd Qu.: 9.453
X.5 Max. :6.084 Max. :10.506
# data type
class(new_df)
[1] "data.frame"
From the above example, you can see that the output of the summary() function has been converted into a data frame using the unclass() function.
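Note that each cell produced by Method 1 is a formatted label such as "Min. :2.654" rather than a number. The sketch below rebuilds the sample data so that it runs on its own, checks the column types, and shows one possible way to recover numeric values by stripping everything up to the colon. The names tbl_df and col1_stats are illustrative and do not appear in the example above, and the parsed values keep only the digits that summary() printed.

```r
# rebuild the sample data frame so this block is self-contained
set.seed(1234)
df <- data.frame(
  col1 = rnorm(5, mean = 5),
  col2 = rnorm(5, mean = 10)
)

# Method 1: the cells hold text, not numbers
# (character in R >= 4.0.0; factor in older versions by default)
tbl_df <- data.frame(unclass(summary(df)), check.names = FALSE)
sapply(tbl_df, class)

# summary() pads the column names with spaces, so trim them
names(tbl_df) <- trimws(names(tbl_df))

# drop the "label :" prefix and convert the remainder to numeric
col1_stats <- as.numeric(trimws(sub(".*:", "", as.character(tbl_df$col1))))
col1_stats
```

If full-precision numeric values are needed, building the data frame from the per-column summaries (Method 2) avoids this parsing step.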
Example 2: do.call() and lapply() functions
Similarly, you can use the do.call() and lapply() functions to convert the output into a data frame.
new_df <- data.frame(do.call(cbind, lapply(df, summary)))
new_df
col1 col2
Min. 2.654302 9.109962
1st Qu. 3.792934 9.425260
Median 5.277429 9.435548
Mean 4.647646 9.586039
3rd Qu. 5.429125 9.453368
Max. 6.084441 10.506056
# data type
class(new_df)
[1] "data.frame"
From the above example, you can see that the output of the summary() function has been converted into a data frame using the do.call() and lapply() functions.
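Because Method 2 keeps the statistics as numbers and uses their labels as row names, individual values can be looked up by row and column name. The following is a minimal sketch that rebuilds the sample data so it runs on its own. The names stats_df and iqr_col1 are illustrative and are not part of the example above.

```r
# rebuild the sample data frame so this block is self-contained
set.seed(1234)
df <- data.frame(
  col1 = rnorm(5, mean = 5),
  col2 = rnorm(5, mean = 10)
)

# Method 2: one numeric column per variable, one row per statistic
stats_df <- data.frame(do.call(cbind, lapply(df, summary)))

# the columns are numeric, so they can be used directly in calculations
sapply(stats_df, class)

# look up a single statistic by row name and column name
stats_df["Mean", "col1"]

# combine statistics, e.g. the interquartile range of col1
iqr_col1 <- stats_df["3rd Qu.", "col1"] - stats_df["1st Qu.", "col1"]
iqr_col1
```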