Page 164 - Data Science class 11
Example
In stratified sampling, the subgroups (strata) can be formed on the basis of characteristics such as age, socioeconomic
factors, nationality, religious beliefs and educational achievements.
Case Study
Consider researchers conducting a study to analyse the political preferences of history students at a major
university. The researchers want to make sure the random sample closely approximates the student population in
terms of gender and level of study (undergraduate or graduate). The subgroups are created out of the total
population of 1000 students.
The researchers identify the following four groups and calculate each group's share of the total population:
Male undergraduates = 350 students or 35%
Female undergraduates = 300 students or 30%
Male graduate students = 250 students or 25%
Female graduate students = 100 students or 10%
Each subpopulation is then sampled at random, in proportion to its share of the population as a whole. For a sample
of 100 students, male undergraduates are 35% of the population, so 35 male undergraduates are randomly chosen from
that subgroup. Male graduate students make up 25% of the population, so 25 are selected. In the same way, 30 female
undergraduates and 10 female graduate students are selected.
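The allocation described above can be sketched in Python. This is only an illustration under stated assumptions: the function names, the sample size of 100 and the use of ID numbers in place of real students are not from the text.

```python
import random

# Subgroup sizes from the case study (total population: 1000 students).
strata_sizes = {
    "male undergraduates": 350,
    "female undergraduates": 300,
    "male graduate students": 250,
    "female graduate students": 100,
}

def proportional_allocation(sizes, sample_size):
    """Number to draw from each stratum, in proportion to its share of the population."""
    total = sum(sizes.values())
    return {name: round(sample_size * n / total) for name, n in sizes.items()}

def stratified_sample(sizes, sample_size, seed=0):
    """Draw a simple random sample inside each stratum separately."""
    rng = random.Random(seed)
    allocation = proportional_allocation(sizes, sample_size)
    sample = {}
    for name, n in sizes.items():
        # Assumption: ID numbers 0..n-1 stand in for the real members of the stratum.
        sample[name] = rng.sample(range(n), allocation[name])
    return sample

allocation = proportional_allocation(strata_sizes, 100)
sample = stratified_sample(strata_sizes, 100)
for name, members in sample.items():
    print(name, len(members))
```

With these figures every share works out to a whole number. For other populations the rounded counts may not add up exactly to the intended sample size and would need a small adjustment.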
Although stratified random sampling accurately reflects the population being studied, certain conditions must be
satisfied before it can be applied, so this technique can’t be used in every study.
Advantages
Stratified random sampling has the following advantages over simple random sampling:
• Accurately reflects population studied: Stratified random sampling accurately reflects the population being
studied because researchers divide the entire population into strata before applying random sampling methods.
Each subgroup within the population therefore receives proper representation within the sample. The researchers
control the subgroups, so the sample covers the population better than one drawn without stratification.
• Good representation of attributes of population: With simple random sampling, there isn’t any guarantee that
any particular subgroup or type of person is chosen. In our earlier example of the university students, using simple
random sampling to draw a sample of 100 from the population might result in the selection of only 20 male
undergraduates, or 20% of the sample, although they make up 35% of the population. Also, 25 female graduate
students might be selected (25% of the sample), although they make up only 10% of the population. Male
undergraduates would then be under-represented and female graduate students over-represented. Any such errors
in the representation of the population can diminish the accuracy of the study.
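The variation that simple random sampling allows can be seen in a small simulation. The sketch below is illustrative only: the function names, the number of trials and the seeds are assumptions, and the counts it prints depend on the random draws.

```python
import random
from collections import Counter

# Population from the case study: one label per student (1000 in total).
population = (
    ["male undergraduates"] * 350
    + ["female undergraduates"] * 300
    + ["male graduate students"] * 250
    + ["female graduate students"] * 100
)

def srs_counts(sample_size=100, seed=None):
    """Draw one simple random sample and count how many come from each group."""
    rng = random.Random(seed)
    return Counter(rng.sample(population, sample_size))

def spread(group, trials=1000, sample_size=100, seed=1):
    """Smallest and largest count of `group` seen across repeated samples."""
    counts = [srs_counts(sample_size, seed + t)[group] for t in range(trials)]
    return min(counts), max(counts)

low, high = spread("male undergraduates")
print("male undergraduates per sample of 100: between", low, "and", high)
```

A stratified sample of 100 would contain exactly 35 male undergraduates every time, whereas the simple random samples vary from one draw to the next.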
Disadvantages
Stratified random sampling also has the following disadvantage:
• Can’t be used in all studies: This method cannot be used in every study, because several conditions must be met
for it to be used properly. Researchers must identify every member of the population being studied and classify
each of them into one, and only one, subpopulation. Stratified random sampling is therefore unsuitable when
researchers can’t confidently classify every member of the population into a subgroup. Also, finding an exhaustive
and definitive list of an entire population can be challenging.
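The condition that every member belongs to one, and only one, subgroup can be checked before sampling. The following is a minimal sketch. The function name and the student IDs are hypothetical and are used only for illustration.

```python
def check_strata(population, strata):
    """Return (unclassified, multiply_classified) members of the population.

    population: iterable of member IDs
    strata: dict mapping a stratum name to the set of member IDs placed in it
    """
    seen = {}
    for name, members in strata.items():
        for member in members:
            seen.setdefault(member, []).append(name)
    # Members that appear in no stratum at all.
    unclassified = sorted(m for m in population if m not in seen)
    # Members that were placed in more than one stratum.
    multiply_classified = sorted(m for m, names in seen.items() if len(names) > 1)
    return unclassified, multiply_classified

# Hypothetical IDs: "s2" is in two strata and "s4" is in none.
students = ["s1", "s2", "s3", "s4"]
groups = {"undergraduate": {"s1", "s2"}, "graduate": {"s2", "s3"}}
unclassified, multiply_classified = check_strata(students, groups)
print(unclassified, multiply_classified)
```

Stratified random sampling is appropriate only when both returned lists are empty.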
Systematic Sampling
Systematic sampling selects a random starting point in the population and then picks every member that falls at a
fixed, regular interval from that point. The interval is obtained by dividing the population size by the required
sample size.
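A minimal Python sketch of systematic sampling follows. It assumes the population is available as an ordered list and takes the interval as the population size divided by the sample size, rounded down. The function name and the figures in the usage example are illustrative.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick a random start, then take every k-th member, where k = N // n."""
    n = len(population)
    if not 0 < sample_size <= n:
        raise ValueError("sample_size must be between 1 and the population size")
    interval = n // sample_size
    rng = random.Random(seed)
    # The random starting point lies within the first interval.
    start = rng.randrange(interval)
    return [population[start + i * interval] for i in range(sample_size)]

# Usage: 100 students out of 1000 gives an interval of 10.
student_ids = list(range(1000))
chosen = systematic_sample(student_ids, 100, seed=4)
print(chosen[:5])
```

Only the starting point is random. Once it is fixed, the rest of the sample is fully determined by the interval.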
162 Touchpad Data Science-XI

