Lesson 22 Comparing correlations

# Episode 1

Reasoning Resources: Worksheet 1 - A2/A3 size or on board, Worksheet 1 for pupils
Whole class introduction
Looking at a correlation scatter in terms of predicting from one variable to another, and the degree of uncertainty of a prediction.

There is a rather large hierarchical step between intuitive concepts of correlation, processed by looking at the degree of scatter between the two variables concerned, and quantitative concepts intended to provide a measure of correlation.

This activity attempts to treat correlation at a level where most 12-year-old pupils can handle it ‘in some intellectually honest form’ consistent with the underlying principles of coefficients of correlation.
Display the maths and science grades: How might one person's grade in science compare with their grade in maths? Some pupils would talk about the overall shape of the relationship, including some caveats: ‘high in maths high in science, low and low but not exactly’, ‘the middle grades are confusing’.

Some will recognise that for each grade in maths there is a range of grades in science. How easy is it to estimate someone's grade in one subject if you know their grade in the other?

A line of best fit (or ‘best prediction’) could be arrived at either visually between upper and lower straight lines sandwiching all the points, or by roughly using medians at each grade. For the median accept any point half-way through the vertical range in the science grades that corresponds to a maths grade. How useful is this line in predicting from maths to science?

Simple issues such as not having a half-grade could be aired, contributing to the approximate nature of the prediction.
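The half-way point described above can be sketched in a few lines of Python, should a teacher wish to check a line of best prediction by computer. This is only an illustrative sketch: the grade pairs are invented (they are not the Worksheet 1 data), the coding of grades as 1 (lowest) to 7 (highest) is an assumption, and the function name `midpoints_by_maths_grade` is hypothetical.

```python
from collections import defaultdict

# Hypothetical (maths, science) grade pairs, coded 1 (lowest) to 7 (highest).
# Invented for illustration only; these are not the Worksheet 1 data.
GRADE_PAIRS = [
    (1, 1), (1, 2), (1, 3),
    (2, 1), (2, 3), (2, 4),
    (3, 2), (3, 3), (3, 5),
    (4, 3), (4, 4), (4, 6),
    (5, 4), (5, 6), (5, 7),
]


def midpoints_by_maths_grade(pairs):
    """Return {maths grade: point half-way through the range of science grades}."""
    science_by_maths = defaultdict(list)
    for maths, science in pairs:
        science_by_maths[maths].append(science)
    return {
        maths: (min(grades) + max(grades)) / 2
        for maths, grades in sorted(science_by_maths.items())
    }


for maths, midpoint in midpoints_by_maths_grade(GRADE_PAIRS).items():
    print(f"maths grade {maths}: predicted science grade about {midpoint}")
```

With these invented values some half-way points fall on a half-grade (for example 2.5), which is one way of raising the point that the prediction can only be approximate.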

Draw pupils’ attention to comparing the width of the range of science grades for each maths grade (between 2 and 4 grades, an average of 3) to the full range of science grades possible (7 grades). Explain that there are other such relationships; show the graphs for the English and PE marks and for the sprint race times.
Pair work
First they look at the three scatters in terms of the range of prediction (related to TM 18: Prediction & correlation scatters). Then they look in a semi-quantitative way at the ratio of the range of one variable for each value of the second to the full range of variation possible. Give pupils copies of Worksheet 1. Pupils should draw the lines of best fit on the graphs, either through medians or relying on upper and lower limiting lines. For each maths grade they should find the median science grade; for each group of PE marks they should find the median English mark; for the time graph they can draw the line of best prediction without finding median marks.

Depending on the ability range of the class, pupils could either: work on all three of the graphs on Worksheet 1 in pairs first, followed by a class discussion of each graph, or have short whole class sessions after they work on each graph. For a maths grade E, C or B, what is the range of science grades possible?
Whole class sharing and discussion
Here we ask them to look at a correlation in terms of its strength. The lower the degree of uncertainty (the range of a prediction), the greater the correlation.

In the second episode they look at the same scatters by reducing them to four-cell tables, and counting the confirming and disconfirming cases.
Compare the graphs in terms of which graph gives the best, middle and worst prediction, and then in terms of the range of prediction in each case. Explain to someone else how the graphs differ from each other in how useful they are for prediction.

Get the pupils to quantify the ranges in approximate terms, e.g. that for the 100 m and 200 m sprints the range of the 200 m times for each 100 m time is about 0.2 seconds while the total range of results is about 0.8 seconds. They should compare that with the maths/science ratio of 3 to 7 grades, and with the English/PE ratio of 50 to 50 marks. They should see that these ratios correspond to the ‘narrow strip/oval’, the ‘middle or thick strip/oval’ and the ‘near circular’ overall shapes of the data on the graphs.
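The three comparisons can be put side by side as simple fractions. The sketch below uses only the approximate figures quoted above (0.2 out of 0.8 seconds, 3 out of 7 grades, 50 out of 50 marks); the function name `uncertainty_ratio` and the dictionary labels are hypothetical, chosen for illustration, and the ratio is a rough classroom measure rather than a coefficient of correlation.

```python
def uncertainty_ratio(prediction_range, full_range):
    """Range of the prediction as a fraction of the full range possible."""
    return prediction_range / full_range


# Approximate figures taken from the discussion of the three graphs.
ratios = {
    "100 m / 200 m sprint times": uncertainty_ratio(0.2, 0.8),
    "maths / science grades": uncertainty_ratio(3, 7),
    "English / PE marks": uncertainty_ratio(50, 50),
}

# The smaller the ratio, the narrower the strip and the better the prediction.
ranked = sorted(ratios, key=ratios.get)

for name in ranked:
    print(f"{name}: ratio about {ratios[name]:.2f}")
```

Sorting from smallest to largest ratio puts the sprint times first, the maths/science grades second and the English/PE marks last, matching the order of best, middle and worst prediction.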