|


|

Histogram
A histogram is a basic graphing tool that displays the relative frequency
or occurrence of continuous data values, showing which values occur most
and least frequently. It illustrates the shape, centering, and spread
of the data distribution and indicates whether there are any outliers.
When should we use a histogram?
When you are unsure what to do with a large set of measurements presented
in a table, you can use a histogram to organize and display the data
in a more user-friendly format. A histogram will make it easy to see
where the majority of values fall in a measurement scale, and how much
variation there is. It is helpful to construct a histogram when you
want to do the following:
- Summarize large data sets graphically. When you
look at Viewgraph
2, you can see that a set of data presented in a table is not
easy to use. You can make it much easier to understand by summarizing
it on a tally sheet (Viewgraph
3) and organizing it into a histogram (Viewgraph
8).
- Compare process results with specification limits.
If you add the process specification limits to your histogram, you
can determine quickly whether the current process was able to produce
“good” products. Specification limits may take the form
of length, weight, density, quantity of materials to be delivered,
or whatever is important to produce the required results of a given
process.
- Communicate information graphically. The team members
can easily see the values which occur most frequently. When you use
a histogram to summarize large data sets, or to compare measurements
to specification limits, you are employing a powerful tool for communicating
information.
- Use a tool to assist in decision-making. As we
move along, you will see that the shapes, sizes, and the spread of
data have meanings that can help you in investigating problems and
making decisions. However, always bear in mind that if the data you
have in hand are not the most recent, or you do not know how the
data were collected, it is a waste of time trying to chart
them. Measurements cannot be used for making decisions or predictions
when they were produced by a process that is different from the current
one, or were collected under unknown conditions.
What are the parts of a histogram?
As you can see in Viewgraph
1, a histogram is made up of five (5) parts:
- Title: The title briefly describes the information
that is contained in the histogram.
- Horizontal or X-axis: The horizontal or X-axis
shows you the scale of values into which the measurements fit. These
measurements are generally grouped into intervals to help you summarize
large data sets. Individual data points are not displayed.
- Bars: The bars have two important characteristics
-- height and width. The height represents the number of times the
values within an interval occurred. The width represents the length
of the interval covered by the bar. It is the same for all bars.
- Vertical or Y-axis: The vertical or Y-axis is the
scale that shows you the number of times the values within an interval
occurred. The number of times is also referred to as “frequency.”
- Legend: The legend provides additional information
that documents where the data came from and how the measurements were
gathered.
How do we develop a histogram?
There are many different ways to organize data and build histograms.
You can safely use any of them as long as you follow the basic rules.
The following scenario will be used as an example to provide data as
we go through the process of building a histogram step by step:
During sea trials, a ship conducted test firings
of its MK 75, 76mm gun. The ship fired 135 rounds at a target.
An airborne spotter provided accurate rake data to assess the
fall of shot both long and short of the target. The ship computed
what constituted a hit for the test firing as:
FROM 60 yards short of the target TO 300 yards
beyond the target |
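The hit definition above can be written as a simple check against two limits. The following is a minimal sketch, assuming that miss distances are recorded in yards with negative values for rounds that fall short and positive values for rounds that fall long, and that both limits are inclusive (the scenario does not say). The names `is_hit`, `LOWER_LIMIT`, and `UPPER_LIMIT` are illustrative, and the sample values are made up; they are not the Viewgraph 2 data.

```python
# Hit window from the scenario: 60 yards short (-60) to 300 yards beyond (+300).
# Assumption: negative = short of the target, positive = beyond the target.
LOWER_LIMIT = -60
UPPER_LIMIT = 300


def is_hit(miss_distance):
    """Return True if a miss distance (yards) lies inside the hit window.

    Assumption: both limits are treated as inclusive.
    """
    return LOWER_LIMIT <= miss_distance <= UPPER_LIMIT


# Hypothetical miss distances for illustration only (NOT the Viewgraph 2 data).
sample = [-180, -70, -60, 0, 30, 220, 300, 310, 410]

hits = [d for d in sample if is_hit(d)]
print(f"{len(hits)} of {len(sample)} sample rounds fall inside the hit window")
```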
| Step 1 |
Count the total number of data points you
have listed. Suppose your team collected data on the miss
distance for the gunnery exercise described in the example. The
data you collected were for the fall of shot both long and short
of the target. The data are displayed in Viewgraph
2. Simply counting the total number of entries in the data set
completes this step. In this example, there are 135 data points. |
| Step 2 |
Summarize your data on a tally sheet.
You need to summarize your data to make it easy to interpret. You
can do this by constructing a tally sheet. |
| |
- First, identify all the different values found in Viewgraph
2 (-160, -010…030, 220, etc.). Organize these values
from smallest to largest (-180, -120…380, 410).
- Then, make a tally mark next to the value every time that
value is present in the data set.
- Alternatively, simply count the number of times each value
is present in the data set and enter that number next to the
value, as shown in Viewgraph
3.
This tally helped us organize 135 mixed numbers into a ranked sequence
of 51 values. Moreover, we can see very easily the number of times
that each value appeared in the data set. Forming intervals of values
can summarize this data even further. |
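If the measurements are available electronically, Steps 1 and 2 can be sketched in a few lines. This is only an illustration: it uses Python's standard `collections.Counter` as the tally sheet, and the short `sample` list is made up, not the 135 values from Viewgraph 2.

```python
from collections import Counter

# Hypothetical miss distances in yards (NOT the Viewgraph 2 data).
sample = [30, -160, 220, 30, -10, 220, 30, -160]

# Step 1: count the total number of data points.
total_points = len(sample)

# Step 2: tally how many times each distinct value occurs.
tally = Counter(sample)

# Print the tally sheet with values ordered from smallest to largest.
print(f"Total data points: {total_points}")
for value in sorted(tally):
    print(f"{value:>5}  {tally[value]}")
```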
| Step 3 |
Compute the range for the data set.
Compute the range by subtracting the smallest value in the data
set from the largest value. The range represents the extent of
the measurement scale covered by the data; it is always a positive
number. The range for the data in Viewgraph 2 is 590 yards. Subtracting
-180 from +410 gives this number. The mathematical operation
broken down in Viewgraph 4 is:
+410 – (-180) = 410 + 180 = 590
Remember that subtracting a negative (-) number is the same as adding
the corresponding positive number. |
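The same subtraction can be sketched in code. The function name `data_range` is illustrative; the only values used are the smallest (-180) and largest (+410) data points quoted in this step.

```python
def data_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)


# Smallest and largest miss distances quoted in Step 3 (yards).
smallest, largest = -180, 410

print(data_range([smallest, largest]))  # 410 - (-180) = 590
```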
| Step 4 |
Determine the number of intervals required.
The number of intervals influences the pattern, shape, or spread
of your Histogram. Use the following table (Viewgraph
5) to determine how many intervals (or bars on the bar graph)
you should use. |
| | If you have this many data points: | Use this number of intervals: |
| | Less than 50 | 5 to 7 |
| | 50 to 99 | 6 to 10 |
| | 100 to 250 | 7 to 12 |
| | More than 250 | 10 to 20 |
| |
In this example, 10 has been chosen as an appropriate
number of intervals. |
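The guideline table for Step 4 can be expressed as a small lookup. This is a sketch only; the function name `suggested_intervals` is hypothetical, and it returns the suggested range rather than a single number, because the choice within that range is left to you.

```python
def suggested_intervals(n_points):
    """Return the (low, high) number of intervals suggested by the table."""
    if n_points < 50:
        return (5, 7)
    if n_points < 100:
        return (6, 10)
    if n_points <= 250:
        return (7, 12)
    return (10, 20)


low, high = suggested_intervals(135)
print(f"For 135 data points, use {low} to {high} intervals")
```

For the 135 data points in the example, the lookup gives 7 to 12 intervals, so the choice of 10 falls inside the suggested range.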
| Step 5 |
Compute the interval width. To
compute the interval width (Viewgraph
6), divide the range (590) by the number of intervals (10).
When computing the interval width, you should round the result up
to the next higher whole number to come up with values that are
convenient to use. For example, if the range of data is 17, and
you have decided to use 9 intervals, then your interval width
is about 1.89. You can round this up to 2.
In this example, you divide 590 yards by 10 intervals, which
gives an interval width of 59. This means that the length of each
interval is going to be 59 yards. To facilitate later calculations,
it is best to round off the value representing the width of the
intervals. In this case, we will use 60, rather than 59, as the
interval width. |
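The division and rounding in Step 5 can be sketched as follows. The function `interval_width` and its `round_to` parameter are assumptions made for this illustration: `round_to=1` rounds up to the next whole number, and `round_to=10` rounds up to the next multiple of 10, which reproduces the choice of 60 instead of 59.

```python
import math


def interval_width(data_range, n_intervals, round_to=1):
    """Divide the range by the number of intervals and round the result up.

    round_to is a hypothetical convenience parameter: the result is rounded
    up to the next multiple of this value.
    """
    raw_width = data_range / n_intervals
    return math.ceil(raw_width / round_to) * round_to


print(interval_width(17, 9))                 # 17 / 9 is about 1.89, rounded up to 2
print(interval_width(590, 10))               # 590 / 10 = 59
print(interval_width(590, 10, round_to=10))  # 59 rounded up to 60
```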
| Step 6 |
Determine the starting point for each interval.
Use the smallest data point in your measurements as the starting
point of the first interval. The starting point for the second interval
is the sum of the smallest data point and the interval width. For
example, if the smallest data point is -180, and the interval width
is 60, the starting point for the second interval is -120. Follow
this procedure (Viewgraph 7) to determine all of the starting points
(-180 + 60 = -120; -120 + 60 = -60; etc.). |
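The repeated addition in Step 6 can be sketched as a short function. The name `interval_starts` is illustrative; the inputs are the smallest data point (-180), the interval width (60), and the number of intervals (10) from the example.

```python
def interval_starts(smallest, width, n_intervals):
    """Return the starting point of each interval, beginning at the smallest value."""
    return [smallest + i * width for i in range(n_intervals)]


starts = interval_starts(-180, 60, 10)
print(starts)
# [-180, -120, -60, 0, 60, 120, 180, 240, 300, 360]
```

The last interval starts at 360 and ends at 420, so it still contains the largest data point, 410.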
| Step 7 |
Count the number of points that fall within
each interval. These are the data points that are equal
to or greater than the starting value and less than the ending value
(also illustrated in Viewgraph
7). For example, if the first interval begins with -180 and
ends with -120, all data points that are equal to or greater than
-180, but still less than -120, will be counted in the first interval.
Keep in mind that EACH DATA POINT can appear in only one interval. |
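The counting rule in Step 7 (equal to or greater than the starting value, less than the ending value) can be sketched with integer division, which assigns each data point to exactly one interval. The function name `count_per_interval` is hypothetical, and the sample values are made up to exercise the interval boundaries; they are not the Viewgraph 2 data.

```python
def count_per_interval(values, smallest, width, n_intervals):
    """Count the values in each interval [start, start + width)."""
    counts = [0] * n_intervals
    for value in values:
        # Floor division maps a value to the index of the interval containing it.
        index = (value - smallest) // width
        if 0 <= index < n_intervals:
            counts[index] += 1
    return counts


# Hypothetical miss distances chosen to sit on or near interval boundaries.
sample = [-180, -121, -120, 0, 59, 60, 410]

counts = count_per_interval(sample, smallest=-180, width=60, n_intervals=10)
print(counts)
# [2, 1, 0, 2, 1, 0, 0, 0, 0, 1]
```

Note that -120 is counted in the second interval, not the first, because the ending value of an interval is excluded.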
| Step 8 |
Plot the data. A more precise
and refined picture comes into view once you plot your data (Viewgraph
8). You bring all of the previous steps together when you construct
the graph.
- The horizontal scale across the bottom of the graph contains
the intervals that were calculated previously.
- The vertical scale contains the count or frequency of observations
within each of the intervals.
- A bar is drawn for each interval. The bars look
like columns.
- The number of observations, or the percentage of the total observations,
within an interval determines the height of its bar.
- The histogram may not be perfectly symmetrical. Variations
will occur. Ask yourself whether the picture is reasonable and
logical, but be careful not to let your preconceived ideas influence
your decisions unfairly.
|
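Before drawing the final chart, a quick text version can help you check the shape of the data. The sketch below is only an illustration: it prints one row per interval with horizontal bars instead of the vertical columns described above. The function name `text_histogram` is hypothetical, and the counts are made up, not the Viewgraph 8 frequencies.

```python
def text_histogram(starts, width, counts):
    """Return a text histogram with one row of '#' marks per interval."""
    lines = []
    for start, count in zip(starts, counts):
        label = f"{start:>5} to <{start + width:<5}"
        lines.append(f"{label} | {'#' * count} ({count})")
    return "\n".join(lines)


# Interval starting points from the example; the counts are hypothetical.
starts = [-180, -120, -60, 0, 60]
counts = [1, 3, 6, 4, 2]

print(text_histogram(starts, 60, counts))
```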
| Step 9 |
Add the title and legend. A title
and a legend provide the Who, What, When, Where, and Why (also illustrated
in Viewgraph
8) that are important for understanding and interpreting the
data. This additional information documents the nature of the data,
where it came from, and when it was collected. The legend may include
such things as the sample size, the dates and times involved, who
collected the data, and identifiable equipment or work groups. It
is important to include any information that helps clarify what
the data describes. |
|