Last Updated on 16 September 2023
A histogram is a chart (in effect, a bar chart) that shows a single measured variable rather than the usual two or more. In Six Sigma you will often come across a large amount of data for one variable (usually the variable you are trying to control), and a long list of values is hard to interpret. A histogram makes the pattern easy to see by showing how often the values fall into each range.
This will easily show:
- Where certain values make up an unexpectedly high or low proportion of the data
- Where there is a skew (e.g. low values occur more often than high values)
- The spread – is there high or low variation in the results
- How many data points lie far away from the others (outliers) and could otherwise skew the findings
- The shape of the peak: is it tight over a small area, spread over a lot of ‘buckets’ (high variation), or is there even more than one peak?
This can give you a lot of information about your data that would otherwise be difficult to obtain.
How to make a histogram
- Measure your variable as many times as is practical. You want at the very least 25 measurements, but you will get a better result if you have more than this.
- Find the range of your data (smallest to largest value)
- Split the range into equal buckets (sub-ranges of the variable). For my example below I have 20 ‘buckets’ of 10mm each, covering the range 0 to 200mm; you can change the number to get an easily interpretable chart.
- Plot the data in a bar chart with a bar for each ‘bucket’, with the variable on the x (horizontal) axis and the frequency on the y (vertical) axis.
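The steps above can be sketched in a few lines of Python. This is only an illustration: the function name `bucket_counts`, the text-based chart, and the sample lengths are all assumptions made up for the example, not real measurements (and there are fewer than the 25 recommended above, to keep it short).

```python
import math

def bucket_counts(values, bucket_width):
    """Count how many values fall into each equal-width bucket.

    Each bucket covers [start, start + bucket_width).
    Returns a list of (bucket_start, count) pairs, lowest bucket first.
    """
    if not values:
        raise ValueError("need at least one measurement")
    lowest, highest = min(values), max(values)
    # Start the first bucket on a multiple of the width at or below the minimum.
    first = math.floor(lowest / bucket_width) * bucket_width
    n_buckets = int((highest - first) // bucket_width) + 1
    counts = [0] * n_buckets
    for v in values:
        counts[int((v - first) // bucket_width)] += 1
    return [(first + i * bucket_width, c) for i, c in enumerate(counts)]

# Hypothetical lengths in mm, invented purely for illustration.
lengths_mm = [98, 101, 103, 104, 107, 109, 112, 95, 118, 102]

# One row per bucket: the variable on one axis, the frequency as the bar length.
for start, count in bucket_counts(lengths_mm, 10):
    print(f"{start:>3}-{start + 10:<3} mm | {'#' * count}")
```

Changing `bucket_width` is the quickest way to experiment with how many buckets give an interpretable chart.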
If you have cut lengths of metal, you need to know the usual length and how consistent it is. You can create a list of all the values, even put them in order, and find an average, but that still doesn’t give you much information. Plotting the raw values in a bar chart would also be meaningless (they all have different lengths, so all the frequencies would be 1) and wouldn’t add value. You therefore create a histogram – in my example I’ve put them into 10mm buckets:
From this you can easily see that a large number are just over 100mm long, and it shows the spread and variation much more quickly than other ways of presenting the data.
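To see why the raw values on their own are not useful, here is a small sketch comparing the two ways of counting. The lengths are hypothetical values invented for the illustration, and the helper name `bucket_start` is an assumption of this example.

```python
from collections import Counter

# Hypothetical cut lengths in mm (invented for illustration only).
lengths_mm = [101.2, 99.8, 103.5, 104.1, 97.3, 108.9,
              102.7, 111.4, 100.6, 105.0, 93.9, 106.2]

def bucket_start(length, width=10):
    """Return the lower edge of the 10mm bucket that a length falls into."""
    return int(length // width) * width

# Counting the raw values: every length is different, so every frequency is 1.
raw_counts = Counter(lengths_mm)

# Counting by bucket: the frequencies now show where the lengths cluster.
by_bucket = Counter(bucket_start(x) for x in lengths_mm)

print(sorted(raw_counts.values()))
print(sorted(by_bucket.items()))
```

With this made-up data, most of the lengths land in the 100 to 110mm bucket, which is the kind of pattern a histogram brings out and a plain list hides.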