Page 291 - Ai_V3.0_c11_flipbook

                 Mean
                    No Outliers: The mean is the measure most affected by extreme values (outliers), so it is
                    best used when the data have none. If used with data that contain extreme values, it
                    could skew the results.
                    Quantitative Analysis: It’s useful for further statistical analysis, such as variance and
                    standard deviation calculations.
                    Example:
                    • Calculating the average score of students in a class.
                    • Determining the average income of a population when there are no extreme income
                      disparities.

                 Median
                    Outliers Present: When the dataset contains outliers, the median provides a better
                    central tendency measure.
                    Ordinal Data: When dealing with ordinal data (ranked data), the median is appropriate
                    as it considers the order but not the magnitude of differences.
                    Example:
                    • Analysing household income in a region with a few very high incomes.
                    • Reporting the central value in real estate prices, where there can be a significant
                      range.

                 Mode
                    Multimodal Distributions: When the distribution has multiple peaks, the mode helps to
                    identify the frequent values.
                    Simple Understanding: It provides a straightforward understanding of the most common
                    value.
                    Example:
                    • Identifying the most common brand preference in a survey.
                    • Finding the most frequent diagnosis in a medical dataset.
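The effect of an outlier on the three measures can be checked with a short sketch. The income figures below are hypothetical, chosen only for illustration, and the functions come from Python's standard statistics module.

```python
from statistics import mean, median, mode

# Hypothetical incomes (in thousands); illustrative values, not from the text.
incomes = [30, 32, 32, 35, 38]
incomes_with_outlier = incomes + [500]  # add one very high income

mean_clean = mean(incomes)
median_clean = median(incomes)
mode_clean = mode(incomes)

mean_outlier = mean(incomes_with_outlier)
median_outlier = median(incomes_with_outlier)
mode_outlier = mode(incomes_with_outlier)

print("Without outlier:", mean_clean, median_clean, mode_clean)
print("With outlier:   ", mean_outlier, median_outlier, mode_outlier)
```

In this sketch the single extreme value pulls the mean far above every typical income, while the median moves only slightly and the mode does not change.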



                        Variance and Standard Deviation

                 Measures of central tendency (mean, median, and mode) provide the central value of the data set. Measures of
                 dispersion (such as quartiles, percentiles, and range) provide information about the distribution of data around
                 the center.
                 In this section, we will learn two other measures of dispersion: variance and standard deviation.


                 Variance
                 Variance measures the distance of each number in the data set from the mean and also from every other number in
                 the set. Variance is often depicted by the symbol σ², so the variance of x will be denoted by σₓ².
                 Calculating the variance:
                    Data:  10, 8, 10, 8, 8, 4     (n = 6)
                    Sum:   10 + 8 + 10 + 8 + 8 + 4 = 48
                    Mean:  48 ÷ n ⇒ 48 ÷ 6 = 8
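The mean calculation above can be reproduced in a few lines of Python. This is a minimal sketch; the variable names are chosen here for illustration.

```python
# Sample from the worked example
data = [10, 8, 10, 8, 8, 4]

n = len(data)          # number of values
total = sum(data)      # 10 + 8 + 10 + 8 + 8 + 4
mean = total / n       # sum divided by the number of values

print(total, n, mean)  # 48 6 8.0
```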
                    • The variance represents how far the data in your sample are spread around the mean.
                    • Data sets with low variance have data grouped closely around the mean.
                    • Data sets with high variance have data spread far from the mean.

                    [Figure: high variance and low variance distributions around the MEAN]
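To make the difference between low and high variance concrete, the sketch below compares two hypothetical data sets that share the same mean. It uses statistics.pvariance, which divides by n. Whether the book divides by n or by n − 1 is not shown in this part of the text, so that choice is an assumption.

```python
from statistics import mean, pvariance

# Two hypothetical data sets with the same mean (8); values are illustrative.
low_spread = [7, 8, 8, 8, 9]      # values grouped closely around the mean
high_spread = [1, 4, 8, 12, 15]   # values spread far from the mean

low_var = pvariance(low_spread)    # assumption: population variance (divide by n)
high_var = pvariance(high_spread)

print("Means:    ", mean(low_spread), mean(high_spread))
print("Variances:", low_var, high_var)
```

Both sets have the same central value, yet the second has a much larger variance because its values lie further from the mean.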
                 Step 1:     Subtract the mean from each of the numbers in your sample.

                                   10    8   10    8    8    4
                                  − 8  − 8  − 8  − 8  − 8  − 8
                                    2    0    2    0    0  − 4
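Step 1 can be sketched in Python as follows, using the same sample. The name deviations is introduced here for illustration only.

```python
# Sample from the worked example
data = [10, 8, 10, 8, 8, 4]
mean = sum(data) / len(data)  # 8.0

# Step 1: subtract the mean from each number in the sample
deviations = [x - mean for x in data]

print(deviations)       # [2.0, 0.0, 2.0, 0.0, 0.0, -4.0]
print(sum(deviations))  # 0.0
```

Note that the last difference is negative (4 − 8 = −4), and that the differences add up to zero, which is always the case for deviations from the mean.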

                                                                   Data Literacy—Data Collection to Data Analysis  289