# The normal distribution

We describe the normal distribution, probably the most widely used in statistics, its properties, and the characteristics of the standard normal distribution.

The dictionary says that a bell is a simple device that makes a sound. But a bell can be much more. I think there’s even a plant with that name and a flower with its diminutive.

But undoubtedly, the most famous of all bells is the renowned **Gauss’ bell curve**, the most beloved and revered by statisticians and other species of scientists.

## Let’s put ourselves in context: Gauss’ bell

But what is a bell curve? It’s nothing more, nor less, than the graph of a probability density function. Put another way, it represents a continuous probability distribution with a symmetrical bell shape, hence the first part of its name. And I say the first part because the second one is more controversial, since it is not quite clear that Gauss is the father of the child.

It seems that the first to use this density function was somebody named de Moivre, who was studying what happens to a binomial distribution when the sample size is large. In yet another of the many injustices of History, the name of the function is associated with Gauss, who used it several decades later to analyze data from his astronomical studies. Of course, in defense of Gauss, some people say the two of them discovered the density function independently.

## The normal distribution

To avoid controversy, from now on we will call it by its other name, different from Gauss’ bell: the normal distribution. It seems that it was so named because people used to think that most natural phenomena were consistent with this distribution. Later, it was found that there are other distributions that are very common in biology, such as the **binomial** and Poisson’s.

As with any other density function, the utility of the normal curve is that it represents the probability distribution of the random variable we are measuring. For example, if we measure the weights of a population of individuals and plot their frequencies, the graph will often resemble a normal distribution. The area under the curve between two given points on the x axis represents the probability that the variable takes a value between those two points. The total area under the curve is equal to one, which means that there’s a 100% chance (or a probability of one) that the variable takes one of the possible values of the distribution.
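To make the idea of "area under the curve" concrete, here is a minimal sketch using `NormalDist` from Python's standard library. The mean of 70 kg and the standard deviation of 10 kg are invented for illustration, as is the helper name `prob_between`.

```python
from statistics import NormalDist


def prob_between(mean, sd, low, high):
    """Area under the normal curve between low and high."""
    dist = NormalDist(mu=mean, sigma=sd)
    # The cumulative distribution function gives the area to the left of a
    # point, so the area between two points is the difference of two values.
    return dist.cdf(high) - dist.cdf(low)


# Hypothetical population: weights with mean 70 kg and standard deviation 10 kg.
p = prob_between(70, 10, 60, 80)
print(f"P(60 <= weight <= 80) = {p:.4f}")

# Over a very wide interval the area approaches the total of one.
total = prob_between(70, 10, -1000, 1000)
print(f"Total area = {total:.4f}")
```

The subtraction of two cumulative probabilities is the whole trick: no integration by hand is needed.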

There are infinitely many different normal distributions, each of them perfectly characterized by its mean and standard deviation. Thus, any point on the horizontal axis can be expressed as the mean plus or minus a number of times the standard deviation, and the probability of an interval of values can be calculated from the formula of the density function, which I dare not show you here. We can also use a computer to calculate the probability of a variable within a normal distribution, but what we do in practice is something simpler: we standardize.

## Standard normal distribution

The **standard normal distribution** is the one that has a mean of zero and a standard deviation of one. The advantage of the standard normal distribution is twofold. First, we know how its probability is distributed among different points on the horizontal axis. So, approximately 68% of the population lies within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three.
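If you would rather check these proportions than take them on faith, a short sketch with Python's standard library will do. The function name `coverage` is just a label chosen for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean zero, standard deviation one.
standard = NormalDist(mu=0, sigma=1)


def coverage(k):
    """Proportion of the population within k standard deviations of the mean."""
    return standard.cdf(k) - standard.cdf(-k)


for k in (1, 2, 3):
    print(f"Within {k} standard deviation(s): {coverage(k):.4f}")
```

The printed values are the exact proportions that the rounded figures in the text approximate.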

The second advantage is that any normal distribution can be transformed into a standard one, simply by subtracting the mean from the value and dividing the result by the standard deviation of the distribution. This gives us the z score, which is the equivalent of the value of our variable in a standard normal distribution with a mean of zero and a standard deviation of one.
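As a small illustration of standardizing, the sketch below computes a z score and checks that the probability is the same whether we work in the original distribution or in the standard one. The numbers (a weight of 85 kg in a population with mean 70 kg and standard deviation 10 kg) are made up for the example.

```python
from statistics import NormalDist


def z_score(value, mean, sd):
    """Subtract the mean from the value and divide by the standard deviation."""
    return (value - mean) / sd


# Hypothetical example: a weight of 85 kg, population mean 70 kg, SD 10 kg.
z = z_score(85, 70, 10)

# Probability of a value at or below 85 kg, computed in two ways.
p_original = NormalDist(mu=70, sigma=10).cdf(85)
p_standard = NormalDist(mu=0, sigma=1).cdf(z)

print(f"z = {z}")
print(f"P(weight <= 85) in the original distribution: {p_original:.4f}")
print(f"P(Z <= z) in the standard distribution:       {p_standard:.4f}")
```

Both probabilities agree, which is exactly why a single table for the standard normal distribution is enough for any normal variable.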

So, you can see how useful it is. We do not need software to calculate the probability. We just standardize and use a simple probability table, if we do not know the value by heart. And it goes even further.

Thanks to the magic of the **central limit theorem**, other distributions can be approximated by a normal one and standardized to calculate the probabilities of their variables. For example, if our variable follows a binomial distribution, we can approximate it by a normal distribution when the sample size is large; in practice, when np and n(1-p) are both greater than five. The same applies to Poisson’s distribution, which can be approximated by a normal when its mean is greater than 10.

And the magic is twofold because, besides letting us avoid complex tools and easily calculate probabilities and confidence intervals, it should be noted that both the binomial and Poisson’s distributions are discrete mass functions, while the normal distribution is a continuous density function.
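To see the approximation at work, here is a rough sketch comparing an exact binomial probability with its normal counterpart. The values n = 100, p = 0.5 and k = 55 are arbitrary choices for illustration. The half-unit continuity correction is a common adjustment for approximating a discrete distribution with a continuous one; it is not discussed in this post, so treat it as an optional extra.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for a binomial distribution, summing the mass function."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p, continuity_correction=True):
    """Approximate P(X <= k) with a normal of the same mean and SD."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    # Assumption not taken from the text: add 0.5 to compensate for
    # replacing a discrete variable with a continuous one.
    x = k + 0.5 if continuity_correction else k
    return NormalDist(mu=mean, sigma=sd).cdf(x)


# Arbitrary example where np and n(1-p) are both well above five.
n, p, k = 100, 0.5, 55
exact = binomial_cdf(k, n, p)
plain = normal_approx_cdf(k, n, p, continuity_correction=False)
corrected = normal_approx_cdf(k, n, p)

print(f"Exact binomial:            {exact:.4f}")
print(f"Normal, no correction:     {plain:.4f}")
print(f"Normal, with correction:   {corrected:.4f}")
```

In this example the uncorrected approximation is already close, and the corrected one is closer still.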

## We’re leaving…

And that’s all for now. I only want to add that there are other continuous density functions different from the normal distribution and that they can also be approximated by a normal when the sample is large. But that’s another story…