2.1 Grouping Data in Tables
Data are classified into two types:
Categorical Data (or Qualitative Data)
Numerical Data (or Quantitative Data)
2.1.1 Table for Categorical Data
Example: A survey of company CEOs about their highest college degree gave the following responses:
MBA        MBA        Law        Law        MBA
PhD        None       Masters    Bachelors  Bachelors
MBA        Bachelors  MBA        MBA        Masters
Law        Bachelors  MBA        Bachelors  Bachelors
In this example, the categories are: Bachelors, Law, Masters, MBA, None and PhD.
The degree for each individual CEO is called an observation.
The number of observations in a category is called the frequency of the category.
We use the following frequency table to summarize the above data:
Degree      Frequency
Bachelors   6
Law         3
Masters     2
MBA         7
None        1
PhD         1
Total       20
For example, there are 6 Bachelors, so the frequency of Bachelors is 6.
Total frequency: 6+3+2+7+1+1 = 20
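The counting above can also be done by a short program. The following is a minimal sketch, assuming the 20 survey responses are stored in a Python list; the names degrees and frequency_table are made up for this illustration and are not part of the course material.

```python
from collections import Counter

# The 20 observations from the CEO survey, in the order listed above.
degrees = [
    "MBA", "MBA", "Law", "Law", "MBA",
    "PhD", "None", "Masters", "Bachelors", "Bachelors",
    "MBA", "Bachelors", "MBA", "MBA", "Masters",
    "Law", "Bachelors", "MBA", "Bachelors", "Bachelors",
]


def frequency_table(observations):
    """Return {category: frequency}, categories sorted alphabetically (ignoring case)."""
    counts = Counter(observations)
    return {category: counts[category] for category in sorted(counts, key=str.lower)}


freq = frequency_table(degrees)
total = sum(freq.values())  # total frequency

# Print the table in two columns, followed by the total row.
print(f"{'Degree':<12}{'Frequency':>9}")
for category, count in freq.items():
    print(f"{category:<12}{count:>9}")
print(f"{'Total':<12}{total:>9}")
```

Sorting without regard to case keeps Masters ahead of MBA, which matches the order of the categories used in the table.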
The ratio of the frequency of a category to the total frequency is called the relative frequency of the category:
relative frequency = (frequency of the category) / (total frequency)
We use the following table (frequency and relative frequency table) to show the frequency distribution and relative frequency distribution:
Degree      Frequency   Relative Frequency
Bachelors   6           6/20 = 0.3
Law         3           3/20 = 0.15
Masters     2           2/20 = 0.1
MBA         7           7/20 = 0.35
None        1           1/20 = 0.05
PhD         1           1/20 = 0.05
Total       20          1
You may also present the table as below, without showing the computation of the relative frequencies.
Degree      Frequency   Relative Frequency
Bachelors   6           0.3
Law         3           0.15
Masters     2           0.1
MBA         7           0.35
None        1           0.05
PhD         1           0.05
Total       20          1
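As a check on the relative frequency column, here is a minimal sketch that divides each frequency by the total frequency. The dictionary freq simply restates the frequencies from the table, and the function name relative_frequencies is a made-up name for this illustration.

```python
# Frequencies taken from the frequency table for the CEO survey.
freq = {"Bachelors": 6, "Law": 3, "Masters": 2, "MBA": 7, "None": 1, "PhD": 1}


def relative_frequencies(freq):
    """Return {category: frequency / total frequency}."""
    total = sum(freq.values())
    return {category: count / total for category, count in freq.items()}


rel = relative_frequencies(freq)

# Print each category with its frequency and relative frequency.
for category, count in freq.items():
    print(f"{category:<12}{count:>3}{rel[category]:>8.2f}")

# The relative frequencies of all categories add up to 1
# (up to floating-point error in the computer arithmetic).
print(f"{'Total':<12}{sum(freq.values()):>3}{sum(rel.values()):>8.2f}")
```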
Sometimes we use percentage instead of relative frequency. The following table is a frequency and percent distribution table.
Degree      Frequency   Percent
Bachelors   6           6/20 × 100% = 30%
Law         3           3/20 × 100% = 15%
Masters     2           2/20 × 100% = 10%
MBA         7           7/20 × 100% = 35%
None        1           1/20 × 100% = 5%
PhD         1           1/20 × 100% = 5%
Total       20          100%
Note: A relative frequency distribution and a percent distribution carry the same information, since percent = relative frequency × 100%. Therefore, if we have constructed a relative frequency table, there is no need to construct a percent table, and vice versa.
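To see why the two tables are interchangeable, the sketch below converts each frequency to a percent and then back to a relative frequency. The helper names to_percent and percent_to_relative are made up for this illustration.

```python
# Frequencies taken from the frequency table for the CEO survey.
freq = {"Bachelors": 6, "Law": 3, "Masters": 2, "MBA": 7, "None": 1, "PhD": 1}
total = sum(freq.values())


def to_percent(count, total):
    """Percent of the total: count/total x 100."""
    return count * 100 / total


def percent_to_relative(percent):
    """Undo the conversion: percent / 100 gives the relative frequency."""
    return percent / 100


# Each row shows the frequency, the percent, and the relative frequency
# recovered from the percent.
for category, count in freq.items():
    pct = to_percent(count, total)
    print(f"{category:<12}{count:>3}{pct:>6.0f}%{percent_to_relative(pct):>8.2f}")
```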
2.1.2 Table for Numerical Data
Example: The following shows monthly electricity bills (in dollars) for a sample of 18 households:
130 55 45 64 155 66 60 80 102 62
58 75 111 139 81 55 66 90
For numerical data, we often use intervals to classify the data. For example, we may choose the following intervals:
40-59, 60-79, 80-99, 100-119, 120-139 and 140-159
Then we have the following frequency table:
Monthly electricity bill   Frequency
40 – 59                    4
60 – 79                    6
80 – 99                    3
100 – 119                  2
120 – 139                  2
140 – 159                  1
Total                      18
Note: The above shows that four households have bills between 40 and 59 (55, 45, 58, 55).
Note: For homework assignments or tests, I will give you the intervals.
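Sorting the bills into the intervals can be sketched as follows. This is one possible way to do the counting, assuming each interval includes both of its endpoints (so 40-59 covers 40 through 59); the names bills, intervals and interval_frequencies are made up for this illustration.

```python
# Monthly electricity bills (in dollars) for the 18 households.
bills = [130, 55, 45, 64, 155, 66, 60, 80, 102, 62,
         58, 75, 111, 139, 81, 55, 66, 90]

# The given intervals, written as (low, high) with both endpoints included.
intervals = [(40, 59), (60, 79), (80, 99), (100, 119), (120, 139), (140, 159)]


def interval_frequencies(values, intervals):
    """Count how many values fall in each interval."""
    counts = {interval: 0 for interval in intervals}
    for value in values:
        for low, high in intervals:
            if low <= value <= high:
                counts[(low, high)] += 1
                break
        else:
            # No interval contained the value: the intervals do not cover the data.
            raise ValueError(f"{value} does not fall in any interval")
    return counts


counts = interval_frequencies(bills, intervals)
for (low, high), count in counts.items():
    print(f"{low} - {high}: {count}")
print(f"Total: {sum(counts.values())}")
```

Raising an error for a value outside every interval is a design choice of this sketch: it guards against choosing intervals that silently leave out some observations.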
The frequency and relative frequency table is as follows:
Monthly electricity bill   Frequency   Relative Frequency
40 – 59                    4           4/18 ≈ 0.22
60 – 79                    6           6/18 ≈ 0.33
80 – 99                    3           3/18 ≈ 0.17
100 – 119                  2           2/18 ≈ 0.11
120 – 139                  2           2/18 ≈ 0.11
140 – 159                  1           1/18 ≈ 0.06
Total                      18          1
Rounding Numbers
In the above, 4/18 = 0.222.... Keeping more decimal places is more accurate but more tedious. In this course, in most cases, you are required to keep at least two decimal places.
We use the following rule to round to 2 decimal places:
If the 3rd decimal digit is 5, 6, 7, 8, or 9, then increase the 2nd decimal digit by 1. For example, in the above, 1/18 = 0.05555.... Then 1/18 ≈ 0.06.
If the 3rd decimal digit is 0, 1, 2, 3, or 4, then keep the 2nd decimal digit unchanged. For example, in the above, 4/18 = 0.222.... Then 4/18 ≈ 0.22.
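The rounding rule can be sketched in Python with the standard decimal module and its ROUND_HALF_UP mode, which behaves like the rule above for the positive numbers used here. The function name round_half_up is made up for this illustration. Python's built-in round is deliberately not used, because it rounds exact halves to the nearest even digit and works on binary floating-point values, so it does not always follow the classroom rule.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(numerator, denominator, places=2):
    """Divide and round to `places` decimal places: digits 5-9 round up, 0-4 round down."""
    quantum = Decimal(10) ** -places  # 0.01 when places is 2
    ratio = Decimal(numerator) / Decimal(denominator)
    return ratio.quantize(quantum, rounding=ROUND_HALF_UP)


# Frequencies from the electricity bill table, in interval order.
frequencies = [4, 6, 3, 2, 2, 1]
total = sum(frequencies)

rounded = [round_half_up(count, total) for count in frequencies]
for count, value in zip(frequencies, rounded):
    print(f"{count}/{total} is approximately {value}")
```

In this example the rounded relative frequencies happen to add up to exactly 1.00; with other data the rounded values may add up to slightly more or less than 1.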