Page 123 - Data Science class 10

            Preference               Boys    Girls
            Landscaping               35      20
            Interior Decoration       15      30
            Total                     50      50

            To convert this into a relative two-way frequency table, express each cell as a percentage of its column total. For
            example, 35 of the 50 boys prefer landscaping, so that cell becomes 35 / 50 × 100 = 70%.

            Preference               Boys (%)    Girls (%)
            Landscaping                 70          40
            Interior Decoration         30          60
            Total                      100         100
            Two-way relative frequency tables are especially helpful when the groups in a dataset have different sample sizes,
            because percentages put every group on the same scale and make the preferences easier to compare.
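The conversion can be sketched in a few lines of Python. This is only an illustration: the counts come from the table above, while the dictionary layout and the function name `to_relative` are assumptions made for this example.

```python
# Counts taken from the two-way frequency table in the text.
counts = {
    "Boys": {"Landscaping": 35, "Interior Decoration": 15},
    "Girls": {"Landscaping": 20, "Interior Decoration": 30},
}


def to_relative(table):
    """Convert each column's counts into percentages of that column's total.

    `to_relative` is a hypothetical helper written for this example.
    """
    relative = {}
    for group, prefs in table.items():
        total = sum(prefs.values())  # column total, e.g. 50 boys
        relative[group] = {pref: 100 * n / total for pref, n in prefs.items()}
    return relative


relative = to_relative(counts)
for group, prefs in relative.items():
    print(group, prefs)
```

Because each column is divided by its own total, the same function would still work if the number of boys and girls surveyed were different.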


            1.5. CENTRAL TENDENCY
            Central tendency is a single value, derived from the values in a dataset, that reflects the midpoint of the data
            distribution. It does not tell you specifics about the individual pieces of data, but it does give you an overall
            picture of what is going on in the complete dataset. Hence, a measure of central tendency is the central or typical
            value of a distribution.

            Measures of central tendency help you find the middle, or the average, of a dataset. The three most common
            measures of central tendency are the mean, median and mode. The mean is the sum of the values divided by the
            number of values. The median is the middle number in an ordered dataset. The mode is the most frequent value.
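As a quick illustration, the three measures can be computed with Python's built-in `statistics` module. The list `scores` is made-up data for this sketch and does not come from the text.

```python
import statistics

# Hypothetical dataset, invented for this example.
scores = [2, 3, 3, 5, 12]

mean = statistics.mean(scores)      # (2 + 3 + 3 + 5 + 12) / 5
median = statistics.median(scores)  # middle value of the sorted list
mode = statistics.mode(scores)      # most frequent value

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

In this sample the single large value 12 pulls the mean above the median, which shows that the three measures do not always agree.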

            [Figure: Central Tendency, branching into Mean, Median and Mode]


            1.5.1. Mean

            The mean is a measure of central tendency. The "mean," also known as the "simple average" in data science, is the
            average value of a dataset. It is the value around which the entire data is spread out, and it need not be one of
            the values in the dataset. When the mean is calculated, all values are weighted equally. There are three types of
            mean, which are as follows:

               • Arithmetic mean
               • Geometric mean
               • Harmonic mean

            When the word mean is used on its own, it usually refers to the arithmetic mean.
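The sketch below compares the three types of mean using Python's `statistics` module (`geometric_mean` needs Python 3.8 or later). The text does not define the geometric and harmonic means, so the comments rely on their standard definitions: the geometric mean is the n-th root of the product of n values, and the harmonic mean is the reciprocal of the average of the reciprocals. The list `values` is made-up data.

```python
import statistics

# Hypothetical positive values, invented for this example.
values = [2, 8]

am = statistics.mean(values)            # (2 + 8) / 2
gm = statistics.geometric_mean(values)  # square root of 2 * 8
hm = statistics.harmonic_mean(values)   # 2 / (1/2 + 1/8)

print("arithmetic mean:", am)
print("geometric mean:", round(gm, 4))
print("harmonic mean:", round(hm, 4))
```

For positive values that are not all equal, the arithmetic mean is the largest of the three and the harmonic mean is the smallest.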

            [Figure: Mean, branching into Arithmetic Mean (AM), Geometric Mean (GM) and Harmonic Mean (HM)]
            The mean of a dataset is calculated by adding up all the values in the dataset and then dividing the sum by the
            number of values in the dataset.
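The calculation described above can be written out directly. This is a minimal sketch: the function name `arithmetic_mean` and the list `marks` are assumptions made for this example.

```python
def arithmetic_mean(values):
    """Add up all the values, then divide the sum by how many values there are.

    `arithmetic_mean` is a hypothetical helper written for this example.
    """
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)


# Hypothetical marks, invented for this example.
marks = [10, 20, 30, 40]
result = arithmetic_mean(marks)  # 100 / 4
print(result)
```

The explicit check for an empty list avoids a division by zero, since a dataset with no values has no mean.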

