Calculate Probability From Normal Distribution in Python

2024-07-07

You can use the cdf function of scipy.stats.norm, which evaluates the cumulative distribution function (CDF), to calculate a probability from the normal distribution given the mean and standard deviation of the distribution.

The CDF represents the probability that a random variable from the given distribution will be less than or equal to a specific value.
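To make the definition concrete, here is a minimal sketch of the normal CDF written with the error function from the standard library. The helper name normal_cdf is hypothetical and used only for illustration; it is not part of SciPy.

```python
import math


def normal_cdf(x: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """Return P(X <= x) for X ~ Normal(mean, sd).

    Hypothetical helper for illustration: it uses the identity
    CDF(x) = 0.5 * (1 + erf((x - mean) / (sd * sqrt(2)))).
    """
    z = (x - mean) / (sd * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


# Probability that a value from Normal(mean=50, sd=10) is <= 40
p = normal_cdf(40, mean=50, sd=10)
print(round(p, 4))  # 0.1587
```

In practice the library functions shown in the examples are preferable; this sketch only shows what they compute.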

The following examples explain how to calculate the probability given mean and standard deviation using the cdf function from the SciPy package.

Example 1 (probability less than or equal to)

Suppose you have normally distributed data with a mean of 50 and a standard deviation of 10.

You want to calculate the probability that a random value is less than or equal to 40.

Here, you can use the cdf function to calculate the probability.

# load packages
from scipy.stats import norm

norm.cdf(x=40, loc=50, scale=10)

# output
0.158655253931457

The probability that a random value from this normal distribution is less than or equal to 40 is approximately 0.1587.

Alternatively, you can use the cdf method of the NormalDist class from the statistics module in the Python standard library.

# load packages
from statistics import NormalDist

NormalDist(mu=50, sigma=10).cdf(40)

# output
0.15865525393145713
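The same result can be reached by standardizing the value first. The sketch below converts 40 to a z-score and evaluates the standard normal CDF, then compares it with the direct calculation. The variable names are chosen for illustration only.

```python
from statistics import NormalDist

mean, sd, x = 50, 10, 40

# Standardize: how many standard deviations x lies from the mean
z = (x - mean) / sd

# CDF of the standard normal (mean 0, sd 1) evaluated at the z-score
p_standard = NormalDist().cdf(z)

# Direct calculation on the original scale, for comparison
p_direct = NormalDist(mu=mean, sigma=sd).cdf(x)

print(z)                                         # -1.0
print(round(p_standard, 4), round(p_direct, 4))  # 0.1587 0.1587
```

A value of 40 lies exactly one standard deviation below the mean, so the probability matches the familiar lower-tail area for z = -1.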

Example 2 (probability greater than or equal to)

Suppose you have normally distributed data with a mean of 100 and a standard deviation of 20.

You want to calculate the probability that a random value is greater than or equal to 90.

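The cdf function returns the probability of a value less than or equal to 90, so you can subtract it from 1 to get the upper-tail probability. Because the normal distribution is continuous, the probability of "greater than or equal to 90" is the same as "greater than 90".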

# load packages
from scipy.stats import norm

1 - norm.cdf(x=90, loc=100, scale=20)

# output
0.6914624612740131

The probability that a random value from this normal distribution is greater than or equal to 90 is approximately 0.69146.
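An upper-tail probability can also be computed directly instead of as 1 minus the CDF, which avoids losing precision when the tail is very small. The sketch below uses the complementary error function from the standard library; the helper name normal_upper_tail is hypothetical. SciPy offers the same quantity as the survival function, norm.sf.

```python
import math


def normal_upper_tail(x: float, mean: float, sd: float) -> float:
    """Return P(X >= x) for X ~ Normal(mean, sd).

    Hypothetical helper for illustration: it uses the identity
    P(X >= x) = 0.5 * erfc((x - mean) / (sd * sqrt(2))).
    """
    z = (x - mean) / (sd * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


# Probability that a value from Normal(mean=100, sd=20) is >= 90
p_upper = normal_upper_tail(90, mean=100, sd=20)
print(round(p_upper, 5))  # 0.69146
```

For a moderate probability like this one, both approaches agree to many decimal places; the difference only matters far out in the tail.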

Example 3 (probability within range)

Suppose you have normally distributed data with a mean of 200 and a standard deviation of 50.

You want to calculate the probability that a random value is between 400 and 500.

Here, you can subtract the cdf value at the lower bound from the cdf value at the upper bound to calculate this probability.

# load packages
from scipy.stats import norm

norm.cdf(x=500, loc=200, scale=50) - norm.cdf(x=400, loc=200, scale=50)

# output
3.1670255245419554e-05

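The probability that a random value from this normal distribution will fall between 400 and 500 is approximately 3.167e-05 (about 0.003%). The probability is this small because both bounds lie at least four standard deviations above the mean.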
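The range calculation can be wrapped in a small reusable function. The sketch below uses NormalDist from the standard library; the function name prob_between and its argument names are hypothetical and chosen only for this example.

```python
from statistics import NormalDist


def prob_between(low: float, high: float, mean: float, sd: float) -> float:
    """Return P(low <= X <= high) for X ~ Normal(mean, sd).

    Hypothetical helper for illustration: CDF(high) - CDF(low).
    """
    if low > high:
        raise ValueError("low must not exceed high")
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(high) - dist.cdf(low)


# Probability that a value from Normal(mean=200, sd=50) falls in [400, 500]
p = prob_between(400, 500, mean=200, sd=50)
print(f"{p:.3e}")  # 3.167e-05
```

Checking that the lower bound does not exceed the upper bound prevents the function from silently returning a negative "probability".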

If you want to learn how to calculate the probability from the normal distribution in R, please visit this article.