
Example 1: The age and income of five people are given below. Calculate the Pearson correlation coefficient. What does it indicate?
                                                      Age (x)     Income (y)
                                                        20           2000
                                                        30           4500
                                                        40           5700
                                                        50           6800
                                                        60           8000

              Solution: To calculate the coefficient, we need to calculate the following values.
                               x             y              xy              x²                y²
                               20          2000           40000            400              4000000
                               30          4500          135000            900             20250000
                               40          5700          228000           1600             32490000
                               50          6800          340000           2500             46240000
                               60          8000          480000           3600             64000000
                            ∑x=200       ∑y=27000      ∑xy=1223000      ∑x²=9000         ∑y²=166980000
              Putting the values in the formula,
              \[
              r = \frac{5(1223000) - (200)(27000)}{\sqrt{\left[5(9000) - (200)^2\right]\left[5(166980000) - (27000)^2\right]}}
                = \frac{715000}{727667.5065}
                \approx 0.98
              \]
              A value of 0.98 represents a strong positive relationship between the two variables: as a person's age increases,
              the person's income also goes up.
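
              The arithmetic above can be checked with a short Python sketch. It uses the x and y values from the solution
              table and the same sums-based form of the formula. The function name pearson_r and the variable names are
              illustrative assumptions, not part of the text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from raw sums (illustrative helper)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    # Numerator: n * sum(xy) - sum(x) * sum(y)
    numerator = n * sum_xy - sum_x * sum_y
    # Denominator: square root of the product of the two bracketed terms
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Values taken from the solution table of Example 1
ages = [20, 30, 40, 50, 60]
incomes = [2000, 4500, 5700, 6800, 8000]

r = pearson_r(ages, incomes)
print(round(r, 2))  # 0.98
```

              Working from the sums rather than from deviations about the mean keeps the code in step with the columns
              of the table, so each intermediate value can be compared with the hand calculation.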

              Example 2: Amit is an ideal student who is good at both academics and sports. However, after some time, he reduced
              his sports activity and observed that he also scored lower marks in his tests. To investigate this, he recorded his
              test scores along with the number of hours he played any sport before appearing in each school test. He used this
              data to check the correlation between the number of hours of his sports activity and his test scores, and
              calculated a Pearson correlation coefficient of 0.95. Explain his observation.

              Solution: A value of 0.95 shows a strong positive association between the two variables. This means that Amit scored
              better marks when he kept up his sports activities; when he reduced his playing hours, his marks also fell.

              Assumptions
              Pearson's correlation coefficient rests on the four assumptions listed below. If any of these four assumptions
              is not met, analysing the data with Pearson's correlation coefficient might not yield a valid result:

              1.   The data type of the two variables should be continuous. Examples of such continuous variables include height
                  (measured in feet and inches), temperature (measured in °C), income (measured in INR), study time (measured in
                  hours), intelligence (measured through IQ score), exam performance (measured from 0 to 100), sales (measured
                  in number of transactions every month), etc.

              2.   There must be a linear relationship between the two variables. Create a scatterplot by plotting the two variables
                  against each other. The scatterplot can then be used to check for linearity.





                    336     Touchpad Artificial Intelligence (Ver. 3.0)-XI