Web Lesson 13: Measures of Spread I

Measures of Spread

There are three different measures of spread:

Range
(often used together with the mode)

The range tells us the difference between the highest and lowest values in the data

Inter-quartile Range (often used together with the median)

The inter-quartile range tells us the range of the central 50% of the data. This is more useful as it excludes the extreme values which make up the range

Standard Deviation (often used together with the mean)

The standard deviation is more complicated. It is used a lot in 'A' level statistics. As a rough guide, for bell-shaped (normally distributed) data, about 68% of the values lie within one standard deviation either side of the mean

Each one is worked out in a different way:
  


Finding the Range

1. If you have a list of data:

  • Find the highest value in the data

  • Find the lowest value in the data

  • The range is found by subtracting these 

e.g. Find the range of this data: 2, 4, 6, 5, 3, 5, 3, 2, 7, 3, 8, 4, 3, 6, 3, 4, 2, 3
      Here we have a "list of data"
 
      We can see that the highest value is '8'
      And the lowest value is '2'
      So the range is 8 - 2  =  6
 
      Note: The range can also be written as '2 - 8', meaning the values lie between 2 & 8
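
As a quick check of the steps above, here is a minimal Python sketch. The function name list_range is only an illustrative choice and is not part of the lesson:

```python
def list_range(data):
    """Range of a list of data: highest value minus lowest value."""
    return max(data) - min(data)


data = [2, 4, 6, 5, 3, 5, 3, 2, 7, 3, 8, 4, 3, 6, 3, 4, 2, 3]
print(list_range(data))  # 8 - 2 = 6
```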
 

2. If you have a table of ungrouped data:

  • Label the rows as ‘x’ (for the values) and ‘f’ (for the frequencies)

  • Subtract the lowest value of 'x' from the highest value of 'x' 
    (Note: First cross out any classes where the frequency is zero)

e.g. Find the range of this data:
    No of Wins    0    1    2    3    4
    frequency    12   27   25   13    2

      We have a "table of ungrouped data"
 
      The highest value is '4'
      The lowest value is '0'
      So, the range is 4 - 0  =  4
 
      Note: The range can also be written as '0 - 4', meaning the values lie between 0 & 4
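
The same idea can be sketched for a table of ungrouped data. This is a minimal Python sketch; the name table_range and the second, made-up table are assumptions added for illustration only:

```python
def table_range(values, freqs):
    """Range of ungrouped table data, ignoring values whose frequency is zero."""
    present = [x for x, f in zip(values, freqs) if f > 0]
    return max(present) - min(present)


wins = [0, 1, 2, 3, 4]
frequency = [12, 27, 25, 13, 2]
print(table_range(wins, frequency))  # 4 - 0 = 4

# Hypothetical table (not from the lesson): the value 0 has frequency zero,
# so it is "crossed out" before the range is found.
print(table_range([0, 1, 2, 3], [0, 5, 7, 2]))  # 3 - 1 = 2
```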
 

3. If you have a table of grouped data:

  • Once data is grouped we can not find the range any more!

e.g. Find the range of this data:
    Voltage      5.6 – 5.8   5.8 – 5.9   5.9 – 6.0   6.0 – 6.1   6.1 – 6.4
    frequency       20          20          80          50          30

      We have a "table of grouped data"
 
      It is not possible to find the range because, in the 1st class which is '5.6-5.8',
      we don't know what the exact value of the smallest number was…
 
      We might instead use the 10th percentile (P10) and the 90th percentile (P90)
      as alternatives…

        • The 10th percentile (P10) is the (n/10)th value
        • The 90th percentile (P90) is the (9n/10)th value
 
      These can be found using the same interpolation method as you learnt to find the median
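
For the voltage table above, the interpolation might be sketched in Python as follows. The function name interpolate and the way the class boundaries are passed in are assumptions made for this sketch; it assumes the values are spread evenly within each class:

```python
def interpolate(boundaries, freqs, k):
    """Estimate the k-th value of grouped data by linear interpolation.

    boundaries lists the class boundaries in order (one more entry than freqs).
    """
    cumulative = 0
    for lower, upper, f in zip(boundaries, boundaries[1:], freqs):
        if f > 0 and cumulative + f >= k:
            # Share of the class width proportional to how far k lies into the class
            return lower + (k - cumulative) * (upper - lower) / f
        cumulative += f
    raise ValueError("k is larger than the total frequency")


boundaries = [5.6, 5.8, 5.9, 6.0, 6.1, 6.4]
freqs = [20, 20, 80, 50, 30]
n = sum(freqs)                                     # 200
p10 = interpolate(boundaries, freqs, n / 10)       # the 20th value
p90 = interpolate(boundaries, freqs, 9 * n / 10)   # the 180th value
print(round(p10, 2), round(p90, 2))  # 5.8 6.2
```

So, under these assumptions, the central 80% of the voltages is estimated to lie between about 5.8 and 6.2.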
 

 

Question 1: A class of 12 Math'scool students was asked how many questions they think should be set for homework: 
		 23,   28,   21,   20,   31,   29,   26,   31,   33,   37,   48,   34
 
Find the range for these data:
  Clues: Sorry - no clues for this rather easy question…
 

 

Question 2: The number of 'sick days' taken by 200 employees in 2001 and in 2002 are shown in the table:
    No of sick days          3    4    5    6    7    8    9   10   11   12
    frequency (in 2001)      5   12   13   45   52   30   13   15   10    5
    frequency (in 2002)      0   15   22   23   35   35   30   20   10   10
(a): Find the range for the number of sick days in 2001
(b): Find the range for the number of sick days in 2002
  Clues: Sorry - you still can't have any help…
 
 
Question 3: The salaries of the 200 workers were also recorded:
    Salary (£1000s)   10-15   15-20   20-25   25-30   30-40   40-60
    frequency           10      55      68      38      17      12
Explain why it is not possible to find the range for these data
  Clues: Again, sorry - but no-can-do…
 
 

Finding the Inter-quartile Range

1. If you have a list of data:

e.g. Find the quartiles of this data and find the I.Q.R: 1.3, 1.2, 1.4, 1.5, 1.2, 1.6, 1.5, 1.8, 1.5, 2.0, 1.7
  • Re-write all of the values, in numerical order and count how many there are:

    	  1.2,  1.2,  1.3,  1.4,  1.5,  1.5,  1.5,  1.6,  1.7,  1.8,  2.0
    	 └———————————————————— 11 values in the data ————————————————————┘
    	                                n=11
     
  • The lower quartile (Q1) is the (¼n+½)th value; so work out (¼n+½) and then count across your list to find that value:

    Sometimes, we need to find a value that ends in '½'; such as the 12½th value
    In this case, use: ½(12th value + 13th value)
    But if we need to find a value that ends in '¼'; such as the 3¼th value
    In that case, we round DOWN and find the 3rd value instead
       
      So Q1 is the (¼(11)+½)th value = 3¼th value (we round this to the 3rd value):
     
    	               ┌—— 3rd value
    	               ▼
    	  1.2,  1.2,  1.3,  1.4,  1.5,  1.5,  1.5,  1.6,  1.7,  1.8,  2.0
    	               
    	        Q1 is 1.3
     
  • The upper quartile (Q3) is the (¾n+½)th value; so work out (¾n+½) and then count across your list to find that value:

    Sometimes, we need to find a value that ends in '½'; such as the 12½th value
    In this case, use: ½(12th value + 13th value)
    But if we need to find a value that ends in '¾'; such as the 8¾th value
    In that case, we round UP and find the 9th value instead
      
      So Q3 is the (¾(11)+½)th value = 8¾th value (we round this to the 9th value):
     
    	                                                   ┌—— 9th value
    	                                                   ▼
    	  1.2,  1.2,  1.3,  1.4,  1.5,  1.5,  1.5,  1.6,  1.7,  1.8,  2.0
    	 						   
    						    Q3 is 1.7
     
  • I.Q.R. = Q3 - Q1 

      So the Inter-quartile range: I.Q.R = 1.7 - 1.3 = 0.4
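
The counting rules above (average the two neighbours for a position ending in ½, round down for ¼, round up for ¾) can be sketched in Python. The names value_at and quartiles are assumptions made for this sketch, and exact fractions are used so that the ¼, ½ and ¾ cases are compared without rounding errors:

```python
from fractions import Fraction


def value_at(ordered, position):
    """Value at a 1-based position that may end in 1/4, 1/2 or 3/4."""
    whole = position.numerator // position.denominator
    part = position - whole
    if part == Fraction(1, 2):
        # e.g. the 3 1/2 th value is half-way between the 3rd and 4th values
        return (ordered[whole - 1] + ordered[whole]) / 2
    if part > Fraction(1, 2):
        whole += 1  # a position ending in 3/4 is rounded UP
    return ordered[whole - 1]  # a position ending in 1/4 is rounded DOWN


def quartiles(data):
    ordered = sorted(data)
    n = len(ordered)
    q1 = value_at(ordered, Fraction(n, 4) + Fraction(1, 2))      # (1/4)n + 1/2
    q3 = value_at(ordered, Fraction(3 * n, 4) + Fraction(1, 2))  # (3/4)n + 1/2
    return q1, q3


data = [1.3, 1.2, 1.4, 1.5, 1.2, 1.6, 1.5, 1.8, 1.5, 2.0, 1.7]
q1, q3 = quartiles(data)
print(q1, q3, round(q3 - q1, 1))  # 1.3 1.7 0.4
```

Note that statistics packages often use slightly different quartile conventions, so their answers may not match this hand method exactly.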
     

2. If you have a table of ungrouped data:

e.g. Find the quartiles and the inter-quartile range of this data:
    No of Wins    0    1    2    3    4
    frequency    12   27   25   13    2

  • Write a Cumulative Frequency Table. The last number in the ‘F’ row is called ‘n’

    Upper Boundary          up to 0   up to 1   up to 2    up to 3       up to 4
    Cumulative Frequency      12      12+27     12+27+25   12+27+25+13   12+27+25+13+2
                                      = 39      = 64       = 77          = 79

  • The lower quartile (Q1) is the (¼n+½)th value; so work out (¼n+½) and then look along the cumulative frequencies for this number (or above)
    Read across to the value of ‘x’. This is Q1 

    Upper Boundary          up to 0   up to 1   up to 2   up to 3   up to 4
    Cumulative Frequency      12        39        64        77        79

    Since we know n = 79
    Q1 is the (¼(79)+½)th value = 20¼th value, so we look up the 20th value

    Since 20 is NOT THERE, we must look up the next number ABOVE 20
    in the cumulative frequencies (which is 39)

    So, Q1 is '1'
  • The upper quartile (Q3) is the (¾n+½)th value; so work out (¾n+½) and then look along the cumulative frequencies for this number (or above)
    Read across to the value of ‘x’. This is Q3 

    Upper Boundary          up to 0   up to 1   up to 2   up to 3   up to 4
    Cumulative Frequency      12        39        64        77        79

    Since we know n = 79
    Q3 is the (¾(79)+½)th value = 59¾th value, so we look up the 60th value

    Since 60 is NOT THERE, we must look up the next number ABOVE 60
    in the cumulative frequencies (which is 64)

    So, Q3 is '2'
  • I.Q.R. = Q3 - Q1 

      So the Inter-quartile range: I.Q.R = 2 - 1 = 1
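
The cumulative-frequency lookup above can be sketched in Python. The names kth_value and table_quartile are assumptions for this sketch. The lesson only shows positions ending in ¼ and ¾ for tables; treating a position ending in ½ by averaging the two neighbouring values is an assumption carried over from the rule for lists:

```python
from fractions import Fraction
from itertools import accumulate


def kth_value(values, freqs, k):
    """First x whose cumulative frequency is k or above."""
    for x, cumulative in zip(values, accumulate(freqs)):
        if cumulative >= k:
            return x
    raise ValueError("k is larger than the total frequency")


def table_quartile(values, freqs, position):
    """Look up a position that may end in 1/4, 1/2 or 3/4."""
    whole = position.numerator // position.denominator
    part = position - whole
    if part == Fraction(1, 2):
        # Assumption: average the two neighbouring values, as for a list
        return (kth_value(values, freqs, whole) + kth_value(values, freqs, whole + 1)) / 2
    if part > Fraction(1, 2):
        whole += 1  # round UP for 3/4, otherwise round DOWN
    return kth_value(values, freqs, whole)


wins = [0, 1, 2, 3, 4]
frequency = [12, 27, 25, 13, 2]
n = sum(frequency)                                                           # 79
q1 = table_quartile(wins, frequency, Fraction(n, 4) + Fraction(1, 2))       # 20th value
q3 = table_quartile(wins, frequency, Fraction(3 * n, 4) + Fraction(1, 2))   # 60th value
print(q1, q3, q3 - q1)  # 1 2 1
```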
      
Note: Strictly speaking, if n = 79, then Q1 is the (¼(79)+½)th value (as we did above) - but in practice, when n is large (bigger than 30) then the difference between using ¼n+½ and just ¼n isn't really worth bothering with...
Similarly, for Q3, if n is bigger than 30, then just use: ¾n

3. If you have a table of grouped data:

e.g. What with all this extra tuition, the amount of stuff a student has to carry around is too much!
To investigate this, the weights of the school bags of a class of students were measured:
    Mass         5 – 15   15 – 25   25 – 30   30 – 40   40 – 50
    frequency       8        10         9        10         3
Estimate the I.Q.R. of this data

Rather than use a "Cumulative Frequency Curve" (as we did at G.C.S.E. level) we can use the same Interpolation method that we used to find the median:

  • Write a Cumulative Frequency Table (remember to use the upper class boundary of each class). The last number in this row is called ‘n’

    Mass (U.C.B.)           15   25   30   40   50
    cumulative frequency     8   18   27   37   40

  • The lower quartile (Q1) is the (¼n+½)th value; so we work out what (¼n+½) gives:

      n = 40

      If n is larger than 30 then we can choose whether we want to use
      the (¼n+½)th value or just the ¼nth value.
      In this case, it was easier to use ¼n:

      ¼(n)  =  ¼(40)  =  10th value
    
  • Squeeze an extra column into our table to help us find this value:

    Mass (U.C.B.)           15   Q1   25   30   40   50
    cumulative frequency     8   10   18   27   37   40

    ● We need to find some differences using our table: 'Δ1',  'Δ2', 'D1' &  'D2' need to be found

    	D1 = Q1 - 15 = ???	(from 15 up to Q1)
    	D2 = 25 - 15 = 10	(from 15 up to 25)
    	Δ1 = 10 - 8  = 2	(from F = 8 up to F = 10)
    	Δ2 = 18 - 8  = 10	(from F = 8 up to F = 18)

    ● Among these differences, only D1 is unknown - but it can be found using:
    				D1  =  Δ1
    				D2     Δ2 
     
    			    =>  D1  =   2
    				10     10
     
    			    =>  D1  =   2
     
    ● D1 tells us what to add to the class boundary to the left of Q1 to estimate Q1:

      So: Q1 (estimate) = 15 + 2 = 17

    Mass (U.C.B.)           15   17   25   30   40   50
    cumulative frequency     8   10   18   27   37   40

    ●  If the data is discrete, then round the answer
     
  • Find Q3 in a similar way

      To find Q3, we need to locate the: ¾nth value = 30th value
     
      Start by inserting a column at F = 30:
     
    Mass (U.C.B.)           15   25   30   Q3   40   50
    cumulative frequency     8   18   27   30   37   40
     
    	Δ1 = 30 - 27 = 3
    	Δ2 = 37 - 27 = 10		    =>	D1  =  3
    	D1 = ???					10    10
    	D2 = 10	
    					    =>  D1  =  3
    
      So: Q3 (estimate) = 30 + 3 = 33
     
  • I.Q.R. = Q3 - Q1

      So the Inter-quartile range: I.Q.R = 33 - 17 = 16
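
The whole grouped-data calculation above can be sketched in Python. The function name interpolate and the list of class boundaries are assumptions for this sketch; since n is larger than 30 it uses the ¼nth and ¾nth values, as in the worked example:

```python
def interpolate(boundaries, freqs, k):
    """Estimate the k-th value of grouped data by linear interpolation.

    boundaries lists the class boundaries in order (one more entry than freqs).
    """
    cumulative = 0
    for lower, upper, f in zip(boundaries, boundaries[1:], freqs):
        if f > 0 and cumulative + f >= k:
            delta1 = k - cumulative      # Δ1: how far into the class k lies
            delta2 = f                   # Δ2: frequency of the class
            d2 = upper - lower           # D2: width of the class
            d1 = d2 * delta1 / delta2    # from D1/D2 = Δ1/Δ2
            return lower + d1
        cumulative += f
    raise ValueError("k is larger than the total frequency")


boundaries = [5, 15, 25, 30, 40, 50]
freqs = [8, 10, 9, 10, 3]
n = sum(freqs)                                  # 40
q1 = interpolate(boundaries, freqs, n / 4)      # 10th value
q3 = interpolate(boundaries, freqs, 3 * n / 4)  # 30th value
print(q1, q3, q3 - q1)  # 17.0 33.0 16.0
```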
    

 

Question 4: Find the quartiles and I.Q.R for these data: 23,  30,  29,  20,  22,  29,  27,  23,  27,  24
  Clues: This is a "LIST OF DATA"
 
         Start by writing the list in order:

			  20,  22,  23,  23,  24,  27,  27,  29,  29,  30
 			└———————————— 10 values in the data ——————————————┘
					       n=10
 
         So Q1 is the (¼(10)+½)th value = 3rd value:
 
			             ┌—— 3rd value
			             ▼ 
			  20,  22,  23,  23,  24,  27,  27,  29,  29,  30
			            
			      Q1 is 23
 
         So Q3 is the (¾(10)+½)th value = 8th value:

         Counting across the list, the 8th value is: …, so Q3 is …

         So the Inter-quartile range: I.Q.R = … - 23 = ...
  

 

Question 5: A class of 12 Math'scool students was asked how many questions they think should be set for homework: 
		 23,   28,   21,   20,   31,   29,   26,   31,   33,   37,   48,   34
 
Find the quartiles and the I.Q.R. for these data:
  Clues: This is a "LIST OF DATA"
 
         Firstly, re-write the numbers in numerical order:
 
		 20,   21,   23,   26,   28,   29,   31,    …,    …,    …,    …,    …
		╘══════════════════════════════════╤═════════════════════════════════╛ 
						 n = 12
  
         Since n = 12, Q1 is the (¼(12)+½)th value, which is the 3½th value:
 
		 20,   21,   23,   26,   28,   29,   31,    …,    …,    …,    …,    …
		                  
	       3rd value ═════╝    ╚═════ 4th value
		                 
		        3½th value would be
		         between these two
 
         So Q1 = ½(23+26) = ...
 
         Q3 is the (¾(12)+½)th value, which is the 9½th value
 
         Now the 9th value is ... and the 10th value is ..., so Q3 = ½(...+...) = ...
 
         Finally: I.Q.R. = Q3 – Q1 = ... – ... = ...
  
  

 

Question 6: A group of old people were asked to count how many grey hairs they have: 
	  192,   180,   131,   187,   200,   210,   194,   199,   204,   203,   200
 
Find the quartiles and the I.Q.R. for these data:
  Clues: This is "A LIST OF DATA"
 
         Start by putting the data in numerical order
 
         n = …
 
         Q1 is the (¼(…)+½)th value, which is the ...th value

         If we need to find a value that ends in '¼'; such as the 3¼th value
         In that case, we round DOWN and find the 3rd value instead
 
         Similarly, to find Q3, we will need to round UP after working out ¾n+½
  

 

Question 7: I'm fed up that some students don't hand in all their corrections. I decided to investigate the number of outstanding corrections for each statistics student before deciding upon a plan of action!
    No of outstanding corrections    0    1    2    3    4    5    6    7    8
    No of students                   2    3    3    4    4    3    3    2    2
(a): Determine the quartiles and the I.Q.R. of these data
(b): I decided that students whose number of outstanding corrections EXCEEDS the 3rd quartile will be expelled. How many students will be expelled?
    Clues: This is "A TABLE OF UNGROUPED DATA"
     
    We need to start with a table of cumulative frequencies:
     
    No of outstanding corrections    0    1    2    3    4    5    6    7    8
    F                                2    5    8   12    …    …    …    …   26
     
    n = 26
     
    Q1 is the (¼(26)+½)th value, which is the 7th value 
     
    But, looking along the cumulative frequencies, ‘7’ is NOT THERE!
     
    In that case, we must use the next cumulative frequency that is ABOVE 7 
     
    No of outstanding corrections    0    1    2    3    4    5    6    7    8
    F                                2    5    8   12    …    …    …    …   26

    Since 7 is NOT THERE, we must look up the next number ABOVE 7
    in the cumulative frequencies (which is 8)
     
    So, Q1 is 2
     
    We find Q3 in the same way
       

 

Question 8: The number of 'sick days' taken by 200 employees in 2001 and in 2002 are shown in the table:
    No of sick days          3    4    5    6    7    8    9   10   11   12
    frequency (in 2001)      5   12   13   45   52   30   13   15   10    5
    frequency (in 2002)      0   15   22   23   35   35   30   20   10   10
(a): Find the I.Q.R. for the number of sick days in 2001
(b): Find the I.Q.R. for the number of sick days in 2002
    Clues: This is "A TABLE OF UNGROUPED DATA"
     
    Firstly deal with the data for 2001:
     
    No of sick days          3    4    5    6    7    8    9   10   11   12
    frequency (in 2001)      5   12   13   45   52   30   13   15   10    5
     
    Write a table of cumulative frequencies and then proceed in a similar way to question 7…
     
    Then do the data for 2002
     

 

Question 9: I asked last year's students how long it took them to do this question
    Time taken (mins)     4-5   5-6   6-7   7-8   8-10   10-15
    Number of students     2     5     7     2      4       2
(a): Determine the interquartile range of the time taken to do this question
(b): How long did it take you to do that?
    Clues: We have a "TABLE OF GROUPED DATA"
     
    So this time, to form our cumulative frequency table, we need to use the UPPER BOUNDARY of each class:
     
    U.C.B.    5    6    7    8   10   15
    F         2    7   14   16   20   22

    Q1 is the 6th value
    Let's make a little space between F = 2 and F = 7

    U.C.B.    5   Q1    6    7    8   10   15
    F         2    6    7   14   16   20   22
     
    ● We need to find some differences using our table: 'Δ1',  'Δ2', 'D1' &  'D2' need to be found

    	D1 = Q1 - 5 = ???	(from 5 up to Q1)
    	D2 = 6 - 5  = 1		(from 5 up to 6)
    	Δ1 = 6 - 2  = 4		(from F = 2 up to F = 6)
    	Δ2 = 7 - 2  = 5		(from F = 2 up to F = 7)
     
    ● Among these differences, only D1 is unknown - but it can be found using:
     
    				D1  =  Δ1
    				D2     Δ2 
     
    			    =>  D1  =  4
    				1      5
     
    			    =>  D1  = 0.8
     
    ● D1 tells us what to add to the class boundary to the left of Q1 to estimate Q1:

      So: Q1 (estimate) = 5 + ... = ...

    U.C.B.    5   ...   6    7    8   10   15
    F         2    6    7   14   16   20   22
     
    So, that's Q1 found!
     
    Now to find Q3:
     
    Q3 is the …th value, so let's make a little space between the column where F = 16
    						      and the column where F = 20 
     
    U.C.B.    5    6    7    8   Q3   10   15
    F         2    7   14   16   ...  20   22
     
    And, in the same way working out D1 
     
    And adding it to 8 to get Q3 
     

 

Question 10: The salaries of the 200 workers were also recorded:
    Salary (£1000s)   10-15   15-20   20-25   25-30   30-40   40-60
    frequency           10      55      68      38      17      12
Find the median and quartiles of the salaries
  Clues: This time, as n is large, we don't HAVE to use the exact positions; we can just use the simpler ones:

                    We don't HAVE to use      We can just use
         Q1              ¼n+½                      ¼n
         Median          ½(n+1)                    ½n
         Q3              ¾n+½                      ¾n
  

The pass mark (to avoid additional homework on this topic) is 8/10

Hand in your workings and answers