Two-sample t-test with Unequal Sample Sizes in R
The two-sample t-test with unequal sample sizes can be performed using the built-in t.test() function in R.
When the sample sizes are unequal, you should check the assumption of equality of variances (homoscedasticity), because Student's t-test is less robust to unequal variances when the group sizes differ.
If the variances are not equal, you should perform Welch's t-test (which does not assume equal variances between the two samples).
You should use Student’s two-sample t-test with unequal sample sizes when variances are equal
t.test(sample1, sample2, var.equal = TRUE)
If you do not specify the var.equal parameter, t.test() uses its default (var.equal = FALSE) and performs a two-sample t-test assuming unequal variances between the two samples, i.e. Welch's t-test.
You should use Welch's t-test with unequal sample sizes when variances are not equal
# Welch t-test
t.test(sample1, sample2, var.equal = FALSE)
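Welch's t-test is appropriate for unequal sample sizes when variances are not equal because it estimates the variance of each sample separately instead of pooling them, and it adjusts the degrees of freedom according to the sample sizes and variances.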
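To make that adjustment concrete, the following is a minimal sketch that computes the Welch statistic and the Welch-Satterthwaite degrees of freedom by hand and compares them with t.test(). The vectors x and y are hypothetical values made up for illustration; they are not the data used in the examples of this article.

```r
# Hypothetical samples of unequal size (illustration only, not the article's data)
x <- c(12.1, 14.3, 11.8, 15.2, 13.7, 12.9, 14.8, 13.3)
y <- c(9.4, 16.2, 7.9, 18.5, 11.1)

n1 <- length(x)
n2 <- length(y)

# Squared standard error contributed by each sample (variances are NOT pooled)
v1 <- var(x) / n1
v2 <- var(y) / n2

# Welch t statistic and Welch-Satterthwaite degrees of freedom
t_welch  <- (mean(x) - mean(y)) / sqrt(v1 + v2)
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# t.test() without var.equal uses the default var.equal = FALSE (Welch)
fit <- t.test(x, y)

c(manual_t = t_welch, t.test_t = unname(fit$statistic))
c(manual_df = df_welch, t.test_df = unname(fit$parameter))
```

The degrees of freedom are generally not an integer, which is why Welch output from t.test() usually reports a fractional df.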
The following examples demonstrate how to perform a two-sample t-test with unequal sample sizes using the built-in t.test() function in R.
Create dataset
Create two samples of unequal size:
sample1 <- c(28, 35, 45, 65, 44, 56, 40, 42, 35, 34, 44)
sample2 <- c(15, 10, 40, 25, 26, 21)
sample1 has 11 observations, whereas sample2 has 6 observations.
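Before testing anything, it can help to look at the size, mean, and spread of each sample. The snippet below is one possible way to do this with base R; the name summary_stats is only an illustrative choice.

```r
sample1 <- c(28, 35, 45, 65, 44, 56, 40, 42, 35, 34, 44)
sample2 <- c(15, 10, 40, 25, 26, 21)

# One column per sample: number of observations, mean, variance, standard deviation
summary_stats <- sapply(
  list(sample1 = sample1, sample2 = sample2),
  function(s) c(n = length(s), mean = mean(s), var = var(s), sd = sd(s))
)

round(summary_stats, 2)
```

Sample variances of similar size are a first informal hint about homoscedasticity, but they do not replace a formal check.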
Check assumption of equality of variances (homoscedasticity)
Before performing the two-sample t-test with unequal sample sizes, you should check the assumption of the equality of variances.
You can use Levene's test, available as leveneTest() in the car package, to assess whether the variances of the two samples are equal:
# load package
library(car)
leveneTest(c(sample1, sample2),
           group = factor(rep(c("sample1", "sample2"),
                              c(length(sample1), length(sample2)))))
# output
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  1   2e-04 0.9897
      15
As the p-value (0.9897) from Levene's test is greater than the significance level (0.05), you fail to reject the null hypothesis of equal variances, i.e. there is no evidence that the variances differ between the two samples.
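If the car package is not available, the same median-centered Levene's test can be approximated in base R, since it amounts to a one-way ANOVA on the absolute deviations of each observation from its group median. This is a sketch of that idea, not the internal code of leveneTest(); the names values, abs_dev, and levene_by_hand are illustrative.

```r
sample1 <- c(28, 35, 45, 65, 44, 56, 40, 42, 35, 34, 44)
sample2 <- c(15, 10, 40, 25, 26, 21)

values <- c(sample1, sample2)
group  <- factor(rep(c("sample1", "sample2"),
                     c(length(sample1), length(sample2))))

# Absolute deviation of each observation from the median of its own group
abs_dev <- abs(values - ave(values, group, FUN = median))

# One-way ANOVA on the absolute deviations
levene_by_hand <- anova(lm(abs_dev ~ group))
levene_by_hand

# p-value for the group effect
levene_p <- levene_by_hand[["Pr(>F)"]][1]
levene_p
```

The F value and p-value in the first row of this table should agree with the leveneTest() output above.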
Perform two-sample t-test
As there is no evidence of unequal variances between the two samples of unequal size, you can perform Student's two-sample t-test using the t.test() function with var.equal = TRUE (assuming equal variances).
t.test(sample1, sample2, var.equal = TRUE)
# output
Two Sample t-test
data: sample1 and sample2
t = 3.715, df = 15, p-value = 0.002074
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
8.402563 31.021680
sample estimates:
mean of x mean of y
42.54545 22.83333
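As the p-value (0.002) from the two-sample t-test is less than the significance level (0.05), you should reject the null hypothesis and conclude that there is a statistically significant difference between the means of the two samples of unequal size. The 95% confidence interval for the difference in means (8.40 to 31.02) does not include 0, which is consistent with this conclusion.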
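To see where these numbers come from, the pooled-variance calculation behind Student's t-test can be reproduced by hand. The following is a minimal sketch; the variable names (sp2, se, t_pooled, and so on) are illustrative and not part of the t.test() interface.

```r
sample1 <- c(28, 35, 45, 65, 44, 56, 40, 42, 35, 34, 44)
sample2 <- c(15, 10, 40, 25, 26, 21)

n1 <- length(sample1)
n2 <- length(sample2)

# Pooled variance: each sample variance weighted by its degrees of freedom
sp2 <- ((n1 - 1) * var(sample1) + (n2 - 1) * var(sample2)) / (n1 + n2 - 2)

# Standard error of the difference in means; unequal sizes enter via 1/n1 + 1/n2
se <- sqrt(sp2 * (1 / n1 + 1 / n2))

mean_diff <- mean(sample1) - mean(sample2)
t_pooled  <- mean_diff / se
df_pooled <- n1 + n2 - 2

# Two-sided p-value and 95% confidence interval for the difference in means
p_value <- 2 * pt(-abs(t_pooled), df_pooled)
ci      <- mean_diff + c(-1, 1) * qt(0.975, df_pooled) * se

# Compare with the built-in function
fit <- t.test(sample1, sample2, var.equal = TRUE)
c(manual_t = t_pooled, t.test_t = unname(fit$statistic))
c(manual_p = p_value, t.test_p = fit$p.value)
ci
```

The unequal sample sizes need no special handling here: they enter only through the weights in the pooled variance and through the standard error.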