Last Updated on 13 September 2023
This is an important concept in probability and statistics, useful in the Analyze phase of Six Sigma DMAIC. When you are working with data, it is usually not possible to measure every product you have ever made or every item of data that you can find, so you use sampling instead: you take a small number of the items and use those to predict how the whole population looks and behaves.
What is the central limit theorem?
The central limit theorem states that the distribution of the means of samples will (virtually always) approximately follow the normal distribution, even if the actual measurements follow a different distribution. As the sample size increases, the distribution of the sample means becomes closer to the normal distribution.
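The effect can be seen in a small simulation. This is a minimal sketch using only the Python standard library; the exponential population (mean 1, standard deviation 1), the sample sizes and the helper names `sample_means` and `draw_exponential` are assumptions chosen for illustration.

```python
import random
import statistics


def sample_means(draw, sample_size, n_samples, rng):
    """Return the means of n_samples samples, each made of sample_size draws."""
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


def draw_exponential(rng):
    # A skewed, clearly non-normal population with mean 1 and standard deviation 1.
    return rng.expovariate(1.0)


rng = random.Random(42)  # fixed seed so the run is repeatable
for n in (2, 10, 50):
    means = sample_means(draw_exponential, n, 5000, rng)
    # The mean of the sample means stays near 1; their spread shrinks as n grows.
    print(n, round(statistics.fmean(means), 3), round(statistics.stdev(means), 3))
```

The spread of the sample means should shrink roughly in proportion to 1 / sqrt(n), and a histogram of `means` should look increasingly bell-shaped as n grows, even though the individual measurements are heavily skewed.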
In Six Sigma, the central limit theorem is usually expressed with the following symbols:
- xbar is the sample mean, and mu is the population mean
- s is the sample standard deviation, and sigma is the population standard deviation.
You try to estimate the population figures using the sample figures.
The best single estimate of the population mean is usually taken to be the sample mean, but with a small sample you cannot be very sure of it. You therefore introduce the concept of a confidence interval.
Confidence interval
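This is a range around the sample mean within which you predict the population mean will fall, at a stated level of confidence (for example, 95%). The central limit theorem is what lets you calculate this range, because it tells you how sample means are spread around the population mean.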
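A confidence interval for the mean can be sketched as follows. This assumes a normal (z) critical value and uses the sample standard deviation s in place of sigma, which is reasonable for larger samples; for small samples a t-distribution critical value is normally used instead and gives a wider interval. The function name `mean_confidence_interval` and the measurements are made up for illustration.

```python
import math
import statistics


def mean_confidence_interval(data, confidence=0.95):
    """Approximate confidence interval for the population mean (normal approximation)."""
    n = len(data)
    xbar = statistics.fmean(data)   # sample mean
    s = statistics.stdev(data)      # sample standard deviation
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * s / math.sqrt(n)   # half-width of the interval
    return xbar - margin, xbar + margin


# Hypothetical measurements, for illustration only.
weights = [50.2, 49.8, 50.5, 50.1, 49.7, 50.3, 50.0, 49.9, 50.4, 50.1]
low, high = mean_confidence_interval(weights)
print(f"95% interval for the mean: {low:.2f} to {high:.2f}")
```

Because the margin is divided by sqrt(n), collecting four times as many measurements roughly halves the width of the interval.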
Hypothesis testing
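From the sample, you will want to know whether your process has improved or needs to be improved. Hypothesis testing is an application of the central limit theorem: because sample means are approximately normally distributed, you can judge how likely an observed difference is to be due to chance alone.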
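As one possible illustration, a two-sided one-sample z-test compares a sample mean with a target value. This is a sketch under the assumption that the sample is large enough for the normal approximation to hold; with small samples a t-test is the usual choice. The function name `one_sample_z_test`, the target of 12.0 and the data are hypothetical.

```python
import math
import statistics


def one_sample_z_test(data, target_mean):
    """Return (z, p_value) for a two-sided test of the sample mean against target_mean."""
    n = len(data)
    xbar = statistics.fmean(data)
    s = statistics.stdev(data)
    # Number of standard errors between the sample mean and the target.
    z = (xbar - target_mean) / (s / math.sqrt(n))
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical cycle times (minutes) measured after a process change.
cycle_times = [11.2, 11.8, 11.5, 12.1, 11.4, 11.6, 11.9, 11.3, 11.7, 11.5]
z, p = one_sample_z_test(cycle_times, target_mean=12.0)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value (commonly below 0.05) suggests that the difference between the sample mean and the target is unlikely to be due to sampling variation alone.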
Control charts
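These are used to see whether the process is in control. For an X-bar chart, you plot the mean of each sample (subgroup) against the control limits calculated for the process. These limits are calculated as xbarbar +/- A2 Rbar, where xbarbar is the average of the sample means, Rbar is the average of the sample ranges, and A2 is a constant that depends on the sample size.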
As the sample size increases, the control limits become narrower, because you would expect the sample average to be closer to the true process mean. With only a few items in a sample, the sample mean can vary widely, as you may pick an especially large or small item. As you choose a bigger sample, the mean of that sample is expected to converge on the actual mean.
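The control limit calculation can be sketched as below, using the average of the sample means as the centre line. The A2 values are the standard published constants for sample sizes 2 to 6; the function name `xbar_chart_limits` and the measurements are assumptions made up for illustration.

```python
import statistics

# Standard published A2 constants for X-bar charts, keyed by sample (subgroup) size.
A2 = {2: 1.880, 3: 1.023, 4: 0.729, 5: 0.577, 6: 0.483}


def xbar_chart_limits(subgroups):
    """Return (lower limit, centre line, upper limit) for an X-bar chart."""
    n = len(subgroups[0])
    if any(len(group) != n for group in subgroups):
        raise ValueError("all subgroups must have the same size")
    if n not in A2:
        raise ValueError(f"no A2 constant listed here for subgroup size {n}")
    # Centre line: the average of the subgroup means.
    center = statistics.fmean(statistics.fmean(group) for group in subgroups)
    # Rbar: the average of the subgroup ranges (largest minus smallest value).
    rbar = statistics.fmean(max(group) - min(group) for group in subgroups)
    margin = A2[n] * rbar
    return center - margin, center, center + margin


# Hypothetical measurements: four subgroups of five items each.
measurements = [
    [10.1, 10.4, 9.8, 10.0, 10.2],
    [9.9, 10.3, 10.1, 10.0, 9.7],
    [10.2, 10.0, 10.5, 9.9, 10.1],
    [10.0, 9.8, 10.2, 10.3, 9.9],
]
lcl, center, ucl = xbar_chart_limits(measurements)
print(f"LCL = {lcl:.3f}, centre = {center:.3f}, UCL = {ucl:.3f}")
```

A2 falls from 1.880 for samples of two items to 0.483 for samples of six. Rbar tends to rise as samples get larger, but more slowly than A2 falls, so the limits still narrow overall.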