Page 291 - Ai_V3.0_c11_flipbook
Mean
No Outliers: The mean is the measure most affected by extreme values (outliers), so use it when the data has no outliers. If used with data containing extreme values, it could give skewed results.
Quantitative Analysis: It is useful for further statistical analysis, such as variance and standard deviation calculations.
Example:
• Calculating the average score of students in a class.
• Determining the average income of a population when there are no extreme income disparities.

Median
Outliers Present: When the dataset contains outliers, the median provides a better measure of central tendency.
Ordinal Data: When dealing with ordinal data (ranked data), the median is appropriate as it considers the order but not the magnitude of differences.
Example:
• Analysing household income in a region with a few very high incomes.
• Reporting the central value in real estate prices, where there can be a significant range.

Mode
Multimodal Distributions: When the distribution has multiple peaks, the mode helps to identify the most frequent values.
Simple Understanding: It provides a straightforward understanding of the most common value.
Example:
• Identifying the most common brand preference in a survey.
• Finding the most frequent diagnosis in a medical dataset.
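The guidance above can be illustrated with a minimal Python sketch. The income figures are hypothetical values made up for illustration (they are not from the text); the last value plays the role of an outlier. Only Python's standard `statistics` module is used.

```python
import statistics

# Hypothetical household incomes (in thousands); 500 is an outlier.
incomes = [30, 32, 35, 35, 38, 40, 500]

mean_income = statistics.mean(incomes)      # pulled upward by the outlier
median_income = statistics.median(incomes)  # middle value of the sorted data
mode_income = statistics.mode(incomes)      # most frequent value

print("Mean:", mean_income)
print("Median:", median_income)
print("Mode:", mode_income)
```

In this sketch the mean lands above 100 even though six of the seven incomes are 40 or less, while the median and mode both stay at 35. This is why the median is preferred when outliers are present.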
Variance and Standard Deviation
Measures of central tendency (mean, median, and mode) provide the central value of the data set. Measures of dispersion (such as quartiles, percentiles, and range) provide information about the distribution of data around the center.
In this section, we will learn two other measures of dispersion: variance and standard deviation.
Variance
Variance measures how far each number in the data set is from the mean, and thus from every other number in the set. Variance is often denoted by the symbol $\sigma^2$, so the variance of x is written $\sigma_x^2$.
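For reference, the usual textbook definition of the variance of a whole population can be written as below. This formula does not appear on this page and is given only as an aid. The symbols $N$ (number of values), $x_i$ (the i-th value), and $\mu$ (the mean) are notation assumed here.

```latex
\sigma_x^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2
```

When the data is a sample rather than the whole population, the sum is commonly divided by $n - 1$ instead of $N$, and the result is usually written $s^2$.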
Calculating the variance:
Data: 10, 8, 10, 8, 8, 4
n = 6
Sum = 10 + 8 + 10 + 8 + 8 + 4 = 48
Mean = 48 ÷ n = 48 ÷ 6 = 8
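The same arithmetic can be checked with a short Python sketch. The variable names are chosen here for illustration.

```python
# The six values from the worked example.
data = [10, 8, 10, 8, 8, 4]

total = sum(data)   # 10 + 8 + 10 + 8 + 8 + 4
n = len(data)       # number of values
mean = total / n    # sum divided by the count

print("Sum:", total)
print("n:", n)
print("Mean:", mean)
```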
• The variance represents how closely the data in your sample are grouped around the mean.
• Data sets with low variance have data grouped closely around the mean.
• Data sets with high variance have data spread far from the mean.
[Figure: data points around the MEAN, illustrating low variance and high variance]
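To make the difference between low and high variance concrete, here is a small Python sketch comparing two hypothetical samples (invented for illustration) that share the same mean but differ in spread. It uses `statistics.pvariance`, which divides by the number of values. That choice is an assumption for this sketch; a sample variance, which divides by n − 1, would show the same contrast.

```python
import statistics

# Two hypothetical samples with the same mean (8) but different spread.
low_spread = [7, 8, 8, 8, 9]
high_spread = [1, 4, 8, 12, 15]

low_var = statistics.pvariance(low_spread)    # values sit close to the mean
high_var = statistics.pvariance(high_spread)  # values sit far from the mean

print("Means:", statistics.mean(low_spread), statistics.mean(high_spread))
print("Variance (low spread):", low_var)
print("Variance (high spread):", high_var)
```

Both samples have a mean of 8, yet the second has a far larger variance because its values lie much farther from that mean.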
Step 1: Subtract the mean from each of the numbers in your sample.
Number:       10    8   10    8    8    4
Mean:        − 8  − 8  − 8  − 8  − 8  − 8
Difference:    2    0    2    0    0   −4
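Step 1 can be sketched in Python as follows. The name `deviations` is chosen here for illustration. Note that the last difference is negative, since 4 − 8 = −4.

```python
# The six values from the worked example.
data = [10, 8, 10, 8, 8, 4]
mean = sum(data) / len(data)  # 48 / 6 = 8.0

# Step 1: subtract the mean from each number.
deviations = [x - mean for x in data]

print(deviations)       # [2.0, 0.0, 2.0, 0.0, 0.0, -4.0]
print(sum(deviations))  # 0.0
```

The differences add up to zero, which is always the case for deviations from the mean and serves as a quick check on the subtraction.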
Data Literacy—Data Collection to Data Analysis 289

