● Positive correlation: Positive correlation is the relationship between two variables in which both variables move in
the same direction. As one variable increases, the second also increases; as one decreases, the second also decreases.
For example, when fuel prices increase, prices of airline tickets also increase.
● Negative correlation: Negative correlation is the relationship between two variables, where one variable increases
as the second variable decreases, and vice versa. For example, more exercising leads to a decrease in body weight.
● No correlation: No correlation means that there is no relationship between two variables. If the value of one variable
changes, the other variable is not affected. For example, shirt size and monthly expense, or body weight and
intelligence.
● Non-linear correlation: A non-linear correlation is a correlation in which the relationship between the variables is
not a straight line; instead, the points of a scatter plot tend to lie near a smooth curve.
Pearson's r—Correlation Coefficient
The degree of association between two sets of data is measured by a correlation coefficient, represented by r. It is
also called Pearson's correlation coefficient and measures linear association between two variables. If a curved line is
needed to state the relationship, more complicated measures of correlation should be used.
The correlation coefficient is measured on a scale that varies from + 1 to – 1.
● 1 is a perfect positive correlation.
● 0 is no correlation (the values don't seem to be linked at all).
● –1 is a perfect negative correlation.
Pearson’s coefficient, r, is calculated as:
$$r = \frac{N\sum xy - (\sum x)(\sum y)}{\sqrt{\left[N\sum x^2 - (\sum x)^2\right]\left[N\sum y^2 - (\sum y)^2\right]}}$$
Where,
N = Number of Values or Elements
x = First Score
y = Second Score
Σxy = Sum of the Product of First and Second Scores
Σx = Sum of First Scores
Σy = Sum of Second Scores
Σx² = Sum of Squares of First Scores
Σy² = Sum of Squares of Second Scores
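To see how these sums combine, here is a minimal sketch that applies the standard form of Pearson's formula term by term, with a square root over the product in the denominator. The function name `pearson_r` and the sample scores are assumptions made for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's r computed from the raw sums in the formula."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of values")
    n = len(x)                                   # N
    sum_x = sum(x)                               # Σx
    sum_y = sum(y)                               # Σy
    sum_xy = sum(a * b for a, b in zip(x, y))    # Σxy
    sum_x2 = sum(a * a for a in x)               # Σx²
    sum_y2 = sum(b * b for b in y)               # Σy²

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when x or y is constant")
    return numerator / denominator

# Hypothetical scores, chosen so the sums are easy to verify by hand:
# N = 5, Σx = 15, Σy = 20, Σxy = 66, Σx² = 55, Σy² = 86
first_scores = [1, 2, 3, 4, 5]
second_scores = [2, 4, 5, 4, 5]

r = pearson_r(first_scores, second_scores)
print(round(r, 4))  # (5*66 - 15*20) / sqrt(50 * 30) = 30 / sqrt(1500) → 0.7746
```

The check for a zero denominator covers the case where every x (or every y) is the same value, for which r is not defined.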
Following are the guidelines given for interpreting the Pearson’s coefficient ‘r’:
Strength of Association   Positive r    Negative r
Small                     0.1 to 0.3    –0.1 to –0.3
Medium                    0.3 to 0.5    –0.3 to –0.5
Large                     0.5 to 1.0    –0.5 to –1.0
Note that these ranges are guidelines only: what counts as a strong association depends on what you are measuring and on the sample size.
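The guidelines can be expressed as a small lookup, sketched below. The function name `strength_of_association` is hypothetical, and two choices are assumptions not stated in the guidelines: a value on a shared boundary (0.3 or 0.5) is placed in the stronger category, and values with magnitude below 0.1 are labelled "Negligible".

```python
def strength_of_association(r):
    """Label Pearson's r using the small / medium / large guidelines."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")

    size = abs(r)  # strength depends on magnitude; the sign gives direction
    if size >= 0.5:
        strength = "Large"
    elif size >= 0.3:
        strength = "Medium"
    elif size >= 0.1:
        strength = "Small"
    else:
        strength = "Negligible"  # assumed label: the guidelines start at 0.1

    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction

for value in (0.77, -0.35, 0.12, 0.0):
    print(value, strength_of_association(value))
```

Because the same cut-offs apply to positive and negative values, the function classifies on the absolute value of r and reports the direction separately.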

