Page 164 - Data Science class 11

Example
        In stratified sampling, the population is divided into subgroups (strata) based on attributes such as age,
        socioeconomic factors, nationality, religious beliefs and educational achievements.
        Case Study

        Consider researchers conducting a study to analyse the political preferences of history students at a major
        university. The researchers want the random sample to approximate the student population as closely as possible
        with respect to gender and level of study (undergraduate or graduate). The subgroups are created out of the total
        population of 1000 students.
        The researchers calculate the following four groups and their percentage share of the total population:

        Male undergraduates = 350 students or 35%
        Female undergraduates = 300 students or 30%
        Male graduate students = 250 students or 25%
        Female graduate students = 100 students or 10%
        Each subpopulation is then sampled at random in proportion to its share of the population as a whole. For a
        sample of 100 students, male undergraduates are 35% of the population, so 35 male undergraduates are randomly
        chosen from that subgroup. Male graduate students make up 25% of the population, so 25 are selected for the
        sample, and so on.
        Stratified random sampling accurately reflects the population being studied, but it can be applied only when
        certain conditions are satisfied, so this technique cannot be used in every study.
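
        The proportional allocation in the case study can be sketched in Python as follows. This is a minimal
        illustration, not a prescribed method: the sample size of 100 is inferred from the counts 35 and 25 in the
        text, and the function names and student IDs are hypothetical.

```python
import random

# Stratum sizes from the case study (total population of 1000 students).
strata_sizes = {
    "male_undergraduate": 350,
    "female_undergraduate": 300,
    "male_graduate": 250,
    "female_graduate": 100,
}


def proportional_allocation(strata_sizes, sample_size):
    """Return how many members to draw from each stratum."""
    total = sum(strata_sizes.values())
    # Integer arithmetic is exact here because every share is a whole number.
    # Other data sets may need an explicit rounding rule (an assumption to revisit).
    return {name: size * sample_size // total for name, size in strata_sizes.items()}


def stratified_sample(strata_members, allocation, seed=None):
    """Draw a simple random sample of the allocated size inside each stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, allocation[name])
        for name, members in strata_members.items()
    }


allocation = proportional_allocation(strata_sizes, sample_size=100)

# Hypothetical student IDs, generated only so the sketch can run.
strata_members = {
    name: [f"{name}_{i}" for i in range(size)]
    for name, size in strata_sizes.items()
}
sample = stratified_sample(strata_members, allocation, seed=0)

print(allocation)
# {'male_undergraduate': 35, 'female_undergraduate': 30, 'male_graduate': 25, 'female_graduate': 10}
```

        Every stratum contributes exactly its share of the sample, whichever individual students are drawn.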

        Advantages
        Compared with simple random sampling, stratified random sampling has the following advantages:

           • Accurately reflects population studied: Stratified random sampling accurately reflects the population being
          studied because researchers stratify the entire population before applying random sampling methods. In
          short, it ensures that each subgroup within the population receives proper representation within the sample.
          Because the researchers control the subgroups, they can make sure that all of them are represented, which
          gives better coverage of the population.
           • Good representation of attributes of population: With simple random sampling, there is no guarantee that
          any particular subgroup or type of person is chosen. In the earlier example of the university students, a simple
          random sample of 100 drawn from the population might contain only 20 male undergraduates (20% of the sample,
          although they make up 35% of the population). It might also contain 25 female graduate students (25% of the
          sample, although they make up only 10% of the population). Male undergraduates would then be under-represented
          and female graduate students over-represented. Any such error in representation can diminish the accuracy of
          the study. Stratified random sampling avoids this by fixing each subgroup's share of the sample in advance.
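
        The variation described above can be observed with a small simulation. The sketch below is illustrative
        only: it rebuilds the case-study population as a list of group labels and draws a few simple random samples
        of 100. The function name and the seeds are hypothetical choices, and the counts differ from run to run.

```python
import random
from collections import Counter

# One label per student, using the group sizes from the case study.
population = (
    ["male_undergraduate"] * 350
    + ["female_undergraduate"] * 300
    + ["male_graduate"] * 250
    + ["female_graduate"] * 100
)


def simple_random_sample_counts(population, sample_size, seed=None):
    """Draw one simple random sample and count how many fall in each group."""
    rng = random.Random(seed)
    return Counter(rng.sample(population, sample_size))


# Each seed gives a different sample; group counts are not fixed in advance.
for seed in range(3):
    counts = simple_random_sample_counts(population, 100, seed=seed)
    print(seed, dict(counts))
```

        The group counts drift around 35, 30, 25 and 10 rather than matching them exactly, which is the
        representation error that stratification removes.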

        Disadvantages
        Stratified random sampling also has a disadvantage:
           • Cannot be used in all studies: Several conditions must be met for the method to be used properly.
          Researchers must identify every member of the population being studied and classify each of them into one,
          and only one, subpopulation. As a result, stratified random sampling is unsuitable when researchers cannot
          confidently classify every member of the population into a subgroup. Finding an exhaustive and definitive
          list of an entire population can also be challenging.
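
        The condition that every member belongs to one, and only one, subpopulation can be checked before sampling.
        The following is a minimal sketch, assuming members are identified by hashable IDs; the function name
        check_strata and the example data are hypothetical.

```python
def check_strata(population_ids, strata):
    """Return (unclassified, duplicated) member IDs.

    unclassified: members of the population that appear in no stratum.
    duplicated:   members that appear in more than one place across the strata.
    Both sets must be empty for stratified sampling to be valid.
    """
    seen = set()
    duplicated = set()
    for members in strata.values():
        for member in members:
            if member in seen:
                duplicated.add(member)
            seen.add(member)
    unclassified = set(population_ids) - seen
    return unclassified, duplicated


# Hypothetical example: member 5 is unclassified and member 2 is in two strata.
population_ids = {1, 2, 3, 4, 5}
strata = {"a": [1, 2], "b": [2, 3], "c": [4]}
unclassified, duplicated = check_strata(population_ids, strata)
print(unclassified, duplicated)  # {5} {2}
```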
        Systematic Sampling
        Systematic sampling selects a random starting point in the population and then takes a member at regular, fixed
        intervals from that point. The interval depends on the size of the population and the required sample size.
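
        Systematic sampling can be sketched as follows. The text does not state how the interval is chosen; this
        sketch assumes the common convention of dividing the population size by the required sample size and
        discarding any remainder. The function name and the population of IDs 0 to 999 are hypothetical.

```python
import random


def systematic_sample(population, sample_size, seed=None):
    """Pick a random start, then take every k-th member of the population."""
    if not 0 < sample_size <= len(population):
        raise ValueError("sample_size must be between 1 and the population size")
    # Assumed convention: interval k = population size // sample size.
    interval = len(population) // sample_size
    rng = random.Random(seed)
    # The random start lies inside the first interval.
    start = rng.randrange(interval)
    return [population[start + i * interval] for i in range(sample_size)]


# Hypothetical population of 1000 student IDs; the interval is 1000 // 100 = 10.
students = list(range(1000))
chosen = systematic_sample(students, 100, seed=0)
print(chosen[:5])
```

        Only the starting point is random; once it is fixed, the rest of the sample is fully determined.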



          162   Touchpad Data Science-XI