Page 338 - AI_Ver_3.0_class_11
Example 1: The age and income of five people are given below. Calculate the Pearson coefficient. What does it depict?
Age (x)   Income (y)
20        2000
30        4500
40        5700
50        6800
60        8000
Solution: To calculate the coefficient, we need to calculate the following values.
x      y        xy         x²      y²
20     2000     40000      400     4000000
30     4500     135000     900     20250000
40     5700     228000     1600    32490000
50     6800     340000     2500    46240000
60     8000     480000     3600    64000000
∑x = 200   ∑y = 27000   ∑xy = 1223000   ∑x² = 9000   ∑y² = 166980000
Putting the values in the formula,
\[
r = \frac{5(1223000) - (200)(27000)}{\sqrt{\left[5(9000) - (200)^2\right]\left[5(166980000) - (27000)^2\right]}}
  = \frac{715000}{727667.5065}
  \approx 0.98
\]
A value of 0.98 indicates a strong positive relationship between the two variables: as a person's age increases,
the person's income tends to increase as well.
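The calculation above can be checked with a short Python sketch. It assumes the age and income values used in the solution table, and the function name pearson_r is only an illustrative choice, not something defined in this book.

```python
from math import sqrt

# Values taken from the worked solution table (x = age, y = income).
ages = [20, 30, 40, 50, 60]
incomes = [2000, 4500, 5700, 6800, 8000]


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from the raw sums."""
    n = len(xs)
    sum_x = sum(xs)                               # sum of x
    sum_y = sum(ys)                               # sum of y
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # sum of x*y
    sum_x2 = sum(x * x for x in xs)               # sum of x squared
    sum_y2 = sum(y * y for y in ys)               # sum of y squared

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(ages, incomes)
print(round(r, 2))  # 0.98
```

The intermediate sums inside the function match the last row of the table, so printing them is a simple way to find an arithmetic slip when working by hand.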
Example 2: Amit is an ideal student who is good at both academics and sports. However, after some time, he reduced his
sports activity and observed that he also scored lower marks in his tests. To investigate this, he noted his score in
each school test along with the number of hours he had played any sport before appearing in that test. He used
this data to check the correlation between the number of hours of sports activity and his test scores, and
calculated a Pearson correlation coefficient of 0.95. Explain his observation.
Solution: A value of 0.95 shows a strong positive association between the two variables. This means that Amit tended to
score better marks when he played sports for more hours, and his marks tended to drop when he reduced his playing hours.
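The interpretation used in both examples (read the sign for direction and the size for strength) can be sketched in Python. The cut-off values 0.7 and 0.3 are an assumed rule of thumb, not values given in this text, and describe_correlation is a hypothetical helper name.

```python
def describe_correlation(r, strong_cutoff=0.7, weak_cutoff=0.3):
    """Describe a Pearson coefficient in words.

    The cut-offs are assumed rule-of-thumb values, not fixed standards.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no linear relationship"

    # The sign gives the direction of the relationship.
    direction = "positive" if r > 0 else "negative"

    # The absolute value gives the strength of the relationship.
    size = abs(r)
    if size >= strong_cutoff:
        strength = "strong"
    elif size >= weak_cutoff:
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {direction}"


print(describe_correlation(0.95))  # strong positive
```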
Assumptions
There are four assumptions for Pearson's correlation coefficient which are as follows. If any of these four requirements
are not met, analysis of data using Pearson's correlation coefficient might not yield a valid result:
1. The data type of the two variables should be continuous. Examples of such continuous variables include height
(measured in feet and inches), temperature (measured in °C), income (measured in INR), study time (measured in
hours), intelligence (measured through IQ score), exam performance (measured from 0 to 100), sales (measured
in number of transactions every month), etc.
2. There must be a linear relationship between the two variables. Create a scatterplot by plotting the two variables
against each other. The scatterplot can then be used to check for linearity.
336 Touchpad Artificial Intelligence (Ver. 3.0)-XI

