
                 ●   Positive correlation: Positive correlation is the relationship between two variables in which both variables move
                    in the same direction. As one variable increases, the second variable also increases; as one decreases, the
                    second also decreases. For example, when fuel prices increase, prices of airline tickets also increase.
                 ●   Negative correlation: Negative correlation is the relationship between two variables, where one variable increases
                    as the second variable decreases, and vice versa. For example, more exercising leads to a decrease in body weight.

                 ●   No correlation: No correlation means that there is no relationship between two variables. If the value of one
                    variable changes, the other variable is not affected. For example, shirt size and monthly expense, or body weight
                    and intelligence.

                 ●   Non-linear correlation: A non-linear correlation is a correlation in which the relationship between the variables
                    is not a straight line; instead, the points of a scatter plot tend to lie near a smooth curve.

                 Pearson's r—Correlation Coefficient

                 The degree of association between two sets of data is measured by a correlation coefficient, represented by r. It is
                 also called Pearson's correlation coefficient and measures linear association between two variables. If a curved line is
                 needed to state the relationship, more complicated measures of correlation should be used.

                 The correlation coefficient is measured on a scale that varies from +1 to –1.

                 ●  1 is a perfect positive correlation.
                 ●  0 is no correlation (the values don't seem to be linked at all).

                 ●  –1 is a perfect negative correlation.
                 Pearson’s coefficient, r, is given by:

                 \[
                 r = \frac{N\sum xy - (\sum x)(\sum y)}{\sqrt{\left[N\sum x^{2} - (\sum x)^{2}\right]\left[N\sum y^{2} - (\sum y)^{2}\right]}}
                 \]
                 Where,
                 N = Number of Values or Elements
                 x = First Score
                 y = Second Score

                 Σxy = Sum of the Product of First and Second Scores

                 Σx = Sum of First Scores
                 Σy = Sum of Second Scores

                 Σx² = Sum of Squares of First Scores
                 Σy² = Sum of Squares of Second Scores

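
                 The formula can be turned into code almost term by term. The following is a minimal sketch, not a library
                 routine: the function name pearson_r and the sample values (hours studied and test scores) are hypothetical and
                 chosen only to keep the arithmetic easy to check by hand.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the raw sums N, Σx, Σy, Σxy, Σx² and Σy²."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of scores")

    n = len(xs)                                   # N
    sum_x = sum(xs)                               # Σx
    sum_y = sum(ys)                               # Σy
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # Σxy
    sum_x2 = sum(x * x for x in xs)               # Σx²
    sum_y2 = sum(y * y for y in ys)               # Σy²

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        # Happens when every x (or every y) is identical, or N < 2.
        raise ValueError("r is undefined when a variable does not vary")
    return numerator / denominator


# Hypothetical data: hours studied (x) and test score (y).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = pearson_r(hours, scores)
print(f"r = {r:.4f}")
```

                 For these values N = 5, Σx = 15, Σy = 20, Σxy = 66, Σx² = 55 and Σy² = 86, so the numerator is 330 – 300 = 30
                 and the denominator is √(50 × 30) = √1500, giving r ≈ 0.7746.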
                 Following are the guidelines given for interpreting the Pearson’s coefficient ‘r’:

                                            Strength of Association    Coefficient, r (Positive)    Coefficient, r (Negative)
                                            Small                      0.1 to 0.3                   –0.1 to –0.3
                                            Medium                     0.3 to 0.5                   –0.3 to –0.5
                                            Large                      0.5 to 1.0                   –0.5 to –1.0
                 Note that these ranges are only guidelines: how strong an association should be considered depends on what you
                 are measuring and on the sample size.
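
                 The guideline table can be expressed as a small lookup. This is a sketch under two assumptions that the table
                 itself does not settle: a value on a shared boundary (0.3 or 0.5) is placed in the stronger category, and values
                 with magnitude below 0.1 are labelled "negligible". The function name association_strength is hypothetical.

```python
def association_strength(r):
    """Return (strength, direction) for a Pearson coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")

    magnitude = abs(r)
    if magnitude >= 0.5:
        strength = "large"
    elif magnitude >= 0.3:      # assumption: 0.3 counts as medium
        strength = "medium"
    elif magnitude >= 0.1:
        strength = "small"
    else:                       # assumption: the table has no row below 0.1
        strength = "negligible"

    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction


for value in (0.77, -0.35, 0.12, 0.0):
    print(value, association_strength(value))
```

                 The labels describe only the size and sign of r; as noted above, whether a given value matters in practice still
                 depends on the subject being measured and on the sample size.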
