Page 123 - Data Science class 10
P. 123
Preference Boys Girls
Landscaping 35 20
Interior Decoration 15 30
Total 50 50
To convert this into a relative two-way frequency table, you will have to convert individual cells into percentages.
Preference Boys Girls
Landscaping 70 40
Interior Decoration 30 60
Total 100 100
Two-way relative frequency tables are helpful when there are different sample sizes in a dataset. Percentages make
it easier to compare the preferences.
1.5. CENTRAL TENDENCY
Central tendency means the value derived from the random variables in the set of data that reflects the midpoint
of the data distribution. Central tendency not only tells you specifics about the individual pieces of data, but it also
gives you an overall picture of what is going on in the complete dataset. Hence, a central tendency is the central
or typical value of a probability distribution.
Measures of central tendency help you find the middle, or the average, of a dataset. The three most common
measures of central tendency are the mean, median and mode. The mode is the most frequent value. The median
is the middle number in an ordered dataset.
Central Tendency
Mean Median Mode
1.5.1. Mean
The mean is a measure of central tendency. The "mean," also known as the "simple average" in data science, is the
average value of a dataset. The mean is a value in the dataset around which the entire data is spread out. While
mean is calculated, all values used in calculating the average are weighted equally. There are three types of mean
which are as follows:
• Arithmetic mean
• Geometric mean
• Harmonic mean
When you simply say mean, it always means arithmetic mean.
Mean
Arithmetic Geometric Harmonic
Mean (AM) Mean (GM) Mean (HM)
The mean of a dataset is calculated by adding up all the values in the dataset and later dividing them by the
number of values present in the data frame.
Use of Statistics in Data Science 121

