Proportion Test in Python: Similar to R prop.test
A proportion test compares proportions (e.g. the fraction of successes) in two or more groups to determine whether the groups differ significantly.
In Python, you can use the chi2_contingency function from the scipy package to perform a proportion test similar to the prop.test function in R.
The chi2_contingency function performs a chi-squared test of independence (similar to the prop.test function in R) on the observed counts provided in the contingency table.
For two groups, prop.test is equivalent to the chi-squared test on a 2x2 contingency table (both apply Yates' continuity correction by default).
The basic syntax of chi2_contingency for a proportion test is:
# import package
from scipy.stats import chi2_contingency
chi2, p, dof, exp_prop = chi2_contingency(contingency_table)
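The contingency_table contains the observed counts (successes and failures) for the two groups, one row per group, not the proportions themselves.
The function returns the chi-squared statistic, the p value, the degrees of freedom, and the expected frequencies under the null hypothesis (stored in exp_prop here).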
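R's prop.test takes success counts and group sizes, whereas chi2_contingency expects a table of successes and failures. A minimal sketch of the conversion is shown here, using the counts from the example further below. The helper name counts_to_table is hypothetical and is not part of scipy.

```python
import numpy as np

def counts_to_table(successes, totals):
    """Hypothetical helper: build a (groups x 2) table of [successes, failures]."""
    successes = np.asarray(successes)
    totals = np.asarray(totals)
    failures = totals - successes  # everyone who was not a success
    return np.column_stack([successes, failures])

# 300 of 500 and 400 of 500 successes in the two groups
table = counts_to_table([300, 400], [500, 500])
print(table)
```

The resulting array can be passed directly to chi2_contingency.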
The following example demonstrates how to use the proportion test (similar to the prop.test function in R) to compare the proportions in two different groups.
Sample dataset
Suppose a marketing survey about the purchase of a product is completed in two cities (A and B), with 500 individuals surveyed in each city.
In city A, 300 individuals purchased the product, and in city B, 400 individuals purchased the product.
The number of successes in cities A and B are 300 and 400, respectively, and the number of failures are 200 and 100.
# import package
import numpy as np
# Create the contingency table
# rows: city A, city B; columns: purchased, did not purchase
contingency_table = np.array([[300, 200],
[400, 100]])
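Before running the test, it can help to look at the sample proportions the test will compare. The sketch below assumes the first column holds the purchasers, as in the table above. The variable names row_totals and proportions are illustrative.

```python
import numpy as np

contingency_table = np.array([[300, 200],
                              [400, 100]])

# number of individuals surveyed in each city (row sums)
row_totals = contingency_table.sum(axis=1)

# proportion of purchasers in each city (first column / row total)
proportions = contingency_table[:, 0] / row_totals
print(proportions)  # [0.6 0.8]
```

The proportions for cities A and B are 0.6 and 0.8, so the test asks whether this difference is larger than expected by chance.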
Hypothesis
We will test the following null and alternative hypotheses.
Null Hypothesis (H0): There is no difference between the two cities in the proportion of individuals who purchased the product.
Alternative Hypothesis (Ha): There is a difference between the two cities in the proportion of individuals who purchased the product.
Proportion test using chi2_contingency
Now, perform the proportion test using the chi2_contingency function from the scipy package. This function is similar to the prop.test function in R.
# import package
from scipy.stats import chi2_contingency
# perform the chi-square test for proportion
chi2, p, dof, exp_prop = chi2_contingency(contingency_table)
print(chi2, p, dof, exp_prop)
# output
46.67142857142857 8.394401757688147e-12 1 [[350. 150.]
[350. 150.]]
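The statistic above includes Yates' continuity correction, which chi2_contingency applies by default for a 2x2 table (correction=True) and which corresponds to the default correct = TRUE in R's prop.test. The sketch below compares the corrected and uncorrected results and recomputes the corrected statistic by hand as a check. The variable names are illustrative.

```python
import numpy as np
from scipy.stats import chi2_contingency

contingency_table = np.array([[300, 200],
                              [400, 100]])

# default: Yates' continuity correction is applied (2x2 table, dof = 1)
chi2_corr, p_corr, dof, expected = chi2_contingency(contingency_table)

# same test without the continuity correction
chi2_raw, p_raw, _, _ = chi2_contingency(contingency_table, correction=False)

# manual Yates statistic: sum((|O - E| - 0.5)^2 / E) over the four cells
manual = ((np.abs(contingency_table - expected) - 0.5) ** 2 / expected).sum()

print(chi2_corr, chi2_raw, manual)
```

The corrected statistic is slightly smaller than the uncorrected one, so the default is the more conservative choice. To match prop.test(..., correct = FALSE) in R, pass correction=False.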
The chi-squared statistic is 46.67 with 1 degree of freedom, and the p value is 8.39e-12.
As the p value (8.39e-12) is less than the significance level alpha (0.05), we reject the null hypothesis.
We conclude that the proportion of individuals who purchased the product differs significantly between the two cities.
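The decision rule above can be sketched in code as follows. The significance level alpha = 0.05 is taken from the text. The variable names and the printed wording are illustrative assumptions.

```python
import numpy as np
from scipy.stats import chi2_contingency

contingency_table = np.array([[300, 200],
                              [400, 100]])
alpha = 0.05  # significance level used in this example

chi2, p, dof, exp_prop = chi2_contingency(contingency_table)

# observed proportions of purchasers in city A and city B
prop_a, prop_b = contingency_table[:, 0] / contingency_table.sum(axis=1)
diff = prop_b - prop_a

# reject H0 when the p value falls below alpha
reject_h0 = bool(p < alpha)

print(f"difference in proportions (B - A): {diff:.2f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Reporting the observed difference alongside the decision shows the size of the effect as well as its significance.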
If you want to perform a one-sample proportion test, please read our article on the one-sample proportion Z test.