A visual interpretation of numerical data showing the number of data points falling within a specified range of values
A histogram is a column or a bar chart data analytic tool that shows the distribution of continuous numerical data. By counting the values in a dataset, a histogram groups them into bins (intervals) based on the frequency with which they occur.
Components of Histogram Chart
A histogram chart is made of five different parts:
- Title - The title of the histogram describes the represented data in the continuous numerical data.
- X-axis -The X-axis represents the scale for a group of intervals formed based on data for which we need to find the frequency in our histogram.
- Y-axis - The Y-axis represents the scale that shows the occurrence of a value within a group of intervals corresponding to the X-axis.
- Columns/Bars - Bars represent the height and width of the data. Height portrays how many times a particular value falls within an interval while the width represents the area covered.
- Legend - The legend describes what kind of data is represented in the histogram.
Difference between histogram & bar chart
Don't confuse yourself between a histogram and a bar chart. Though they may both look similar,
The differences between them are illustrated in the table below:
|A histogram shows the frequency of numerical data in the form of bars.||A bar chart shows the comparison between different categories of data.|
|The bars cannot be reordered.||The bars can be reordered.|
|The width of each bar doesn't need to be the same.||The width of each bar will be the same.|
|Values are grouped together and are considered as intervals.||Values are not grouped together and are individual entities.|
How to create a histogram
There are several methods with which you can create a histogram in Excel. We will explore each method to find the one that works the best for you.
Method 1: Using the Data Analysis ToolPak
Before we use this method to create a histogram, we need to ensure that you have activated the Data Analysis ToolPak in Excel. To do this, follow the steps given below:
- Click on the File tab and navigate the mouse to select Options.
- This will open up the dialog box for Options. Click on Add-ins to check whether you have the Data Analysis ToolPak activated.
- If the ToolPak is active, it will appear under the 'Active Applications Add-ins'. However, as we can see here, our Analysis ToolPak is still inactive.
- Now select the ToolPak and click on Go.
- This will open up another dialog box called Add-Ins. Now you need to tick the box for Analysis ToolPak and click on Ok.
- Once activated, you will find the data analysis tool in the Data tab at the end in the 'Analyze' section.
Next, we will make a histogram for the salaries earned by employees at company XYZ. The data for employee salary is as below:
After clicking on the Data Analysis tool from Data Tab, we will get the dialog box that displays various options of analysis tools. Since our objective is to create a histogram, we will select it and click on Ok.
Next, Excel asks us to input different parameters such as 'Input Range' and 'Bin Range' to create the histogram. Our input range will be the salaries represented in column C. But what exactly is a bin range?
If you want a really simple method to create the bin range, first you need to find the minimum and maximum value from column C. By using the MIN and MAX formula in cell F4 and F5, i.e., =MIN(C3:C22) and =MAX(C3:C22) respectively, you will get the lower limit and upper limit of salaries.
Next is the creation of a bin range. A bin range is an interval by which our salary data will be grouped. The interval can be of $2,000, $5,000 or even $10,000. It entirely depends on us. We will create a bin range with an interval of $5,000, beginning from $35,000 and ending with $80,000. With bin range in place as well, now our spreadsheet looks as illustrated below:
Now we will go ahead and reference the data to create our histogram. The range C2:C22 will be our 'Input Range' while the range E2:E12 will be our 'Bin Range'. Next, tick the box that says 'Labels' as well. The final and most crucial selection you need to make is to tick the box that says 'Chart Output'. Without it, Excel won't display the histogram to you.
You can either opt to have the histogram in a new spreadsheet, a different workbook, or in the desired cell of the same spreadsheet. For example, we want our output in cell M2. Now click on Ok.
You will get the result as:
As you can see, based on our bin range, Excel calculates the frequency of salary that falls into those intervals. So if you add all the values in the 'Frequency' column, you will get a total of 20, which is equal to the number of employees in company XYZ.
On the left side is our histogram based on the bin range representing the X-axis and the frequency of occurrence of different salaries in 'bin range' intervals on the Y-axis.
You can also include a cumulative percentage line in your histogram by selecting the 'Cumulative Percentage option in the histogram setting. After ticking the checkbox, your histogram will look as:
As you have noticed, using the Data Analysis ToolPak makes it really easy to create a histogram in Excel. However, in case some of the data in your table changes, you will need to create a new histogram. Other methods will help you create dynamic histograms in Excel, such as using formulas or pivot tables.
Method 2: Using formulas
You can create dynamic histograms in excel using the FREQUENCY or COUNTIFS functions that automatically update the histogram when you change values in the table. For example, assume we create a histogram for the salaries earned by employees at company XYZ. The data for employee salary along with the bin range is as below:
Using the FREQUENCY function, we will check the occurrence of a value in the bin range and draw our histogram based on the summary table.
The FREQUENCY function, as the name suggests, checks the occurrence of a value within a specified range while ignoring the blank or text values.
The syntax for the FREQUENCY function is:
data_array = range of values for which we need to calculate the frequency
bins_array = the range of interval to check the occurrence of values from data_array
Using the formula =FREQUENCY(C4:C23, E4:E13) in cell F4, you will get an array that spills up to cell F14.
This completes your frequency table using the FREQUENCY function. However, please note that we365 to get our results. Since =FREQUENCY(C4:C23, E4:E13) is an array formula, press Ctrl + Shift + Enter to get the result in other Excel versions.
WSO’s Pro Tip
Online Excel is an excellent tool if you need array-based formulas results. For example, you could have used the FREQUENCY function in online Excel and then copied the table into your spreadsheet
The COUNTIFS function works similar to the FREQUENCY function but is a bit more complicated than the latter since it consists of three different COUNTIFS formulas that you need to use.
The syntax for the COUNTIFS function is:
criteria_range1 = the range of values that needs to be evaluated for the criteria
criteria1 = condition upon which the range of values will be evaluated.
The formula that you will use for the first bin(interval) will be =COUNTIFS($C$3:$C$22, "<="&$E3) while the formula to calculate the value for over the bin will be =COUNTIFS($C$3:$C$22, ">"&$E12).
This is what the frequency table currently looks like:
Next we will use the final formula =COUNTIFS($C$3:$C$22, ">"&$E3, $C$3:$C$22, "<="&$E4) that will give the values for range F4:F11 such that our result using both the formulas is as:
Now, all you need to do is enter a column chart. Click on the Insert tab and then select the Column chart.
Honestly, you might have to play around a bit by changing the data sources, changing the scale of the Y-axis, renaming the title of the chart but the final result that you will get is as:
You can check out our article on the secondary axis, which covers the section on how to change the data source for the chart.
Method 3: Using a Pivot table
Next, we will see how to make histograms using pivot tables. This is our preferred method since it eliminates the time-consuming task of finding the frequency of values as per the bin intervals.
To insert a Pivot table in the spreadsheet, use the keyboard shortcut of Alt + N + V + T after selecting the data. Press on OK.
Now we need to find the frequency of values in column C. In Pivot table fields, add 'Salary' for both Rows and Values area.
The spreadsheet should look as illustrated below:
Now, there are two crucial things that you must do. First, remember the COUNTIFS formula we had used to return the frequency of values? We will apply a similar approach to get the frequency for each value in range A4:A23. To do that, right-click on any value in the 'Sum of Salary' column and then select the Value Field Settings option.
Select 'Count' to summarize your value fields by count instead of sum. Press Ok. Your table will update as:
Twenty values equal to twenty counts. So far, so good. But we still haven't made our bin intervals. So right-click on any cell in the 'Row Labels' column and select the 'Group' option to add bin intervals. The parameters that we have selected for the group are:
By grouping the values in the 'Row Labels' column, we have formed our bin intervals while the 'Count of Salary' column corresponds to our frequency of salary. The final table will look as illustrated below:
You already know what the next step is! Hit that keyboard shortcut of Alt + N + C + Enter to insert the histogram after selecting the data from the table.
Customizing the histogram
You can remove the spaces between the bars since that's how most traditional histograms look. A different color can also be a preference for people while working on detailed reports. The different customizations that you can do for the histogram are:
Changing the bar color
You can change the bar's color in just a few simple steps. First, select the histogram such that all the columns are highlighted.
Right-click on the column and select the 'Fill' option and select any color that you need to fill the column with. Similarly, you can also change the background color by following the same procedure.
Removing space between the bars
To remove the space between the bars, select all the histogram columns. Then, right-click on any column and select the 'Format Data Series' option.
On the right side of the Excel, you will get an option to change the 'Gap Width', which you need to change to zero. After making the changes, your histogram should look as illustrated below: