Page 169 - Data Science class 11
4.4.2 Sampling Error
Sampling error is the difference between a population parameter and the sample statistic used to estimate it. For
example, the difference between a population mean and a sample mean is a sampling error. Sampling error arises
because a sample is only a part of the population, so the results from the sample can differ, sometimes
significantly, from the results from the entire population.
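The definition above can be illustrated with a short Python sketch. The population of student marks, the sample size of 50 and the variable names are assumptions made only for this illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch gives the same result on every run

# Hypothetical population: marks (out of 100) of 1,000 students.
population = [random.randint(30, 100) for _ in range(1000)]
population_mean = statistics.mean(population)   # population parameter

# Draw a simple random sample of 50 students without replacement.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)           # sample statistic

# Sampling error = sample statistic - population parameter.
sampling_error = sample_mean - population_mean

print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
print(f"Sampling error:  {sampling_error:.2f}")
```

Changing the seed draws a different sample and gives a different sampling error, which shows that the error depends on which units happen to be selected.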
Five Common Types of Sampling Errors
The following are five common types of sampling errors:
• Population Specification Error: This error occurs when the researcher does not understand who they should survey.
For example, in a survey about breakfast cereal consumption, it may be unclear whether to ask the person who buys
the cereal or the people who eat it.
• Sample Frame Error: A frame error occurs when the wrong sub-population is used to select a sample.
• Selection Error: This occurs when respondents self-select their participation in the study, so that only those who
are interested respond. Selection error can be controlled by going to extra lengths to get participation. A typical
survey process includes pre-survey contact requesting cooperation, the actual survey, and post-survey follow-up.
If a response is not received, a second survey request follows, and perhaps interviews using alternative modes such
as telephone or person-to-person.
• Non-Response Error: Non-response error occurs when all or almost all of the data for a sampling unit are missing.
This can occur if the respondent is unreachable or temporarily absent, if the respondent is unable or refuses to
participate in the survey, or if the dwelling is vacant.
• Sample Size Error: These errors occur because of variation in the number or representativeness of the sample
that responds. Such errors can be controlled by careful sample design, large samples, and multiple contacts to
ensure a representative response.
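Selection error can be illustrated with a small simulation. This is only a sketch: the "interest" scores and the rule that a person's chance of replying equals their interest score are assumptions made for the illustration, not a model of any real survey.

```python
import random
import statistics

random.seed(7)  # fixed seed for repeatability

# Hypothetical population of 10,000 people. Each person has an "interest"
# score between 0 and 1 in the survey topic.
population = [random.random() for _ in range(10_000)]
true_mean = statistics.mean(population)

# Self-selected responses: everyone is invited, but the chance of replying
# is assumed to equal the person's interest score.
self_selected = [x for x in population if random.random() < x]
self_mean = statistics.mean(self_selected)

# Simple random sample of the same size in which every selected person responds.
random_sample = random.sample(population, len(self_selected))
random_mean = statistics.mean(random_sample)

print(f"True mean interest:        {true_mean:.3f}")
print(f"Self-selected respondents: {self_mean:.3f}")
print(f"Simple random sample:      {random_mean:.3f}")
```

Under this assumed response rule, the self-selected group overstates the average interest because interested people reply more often, while the random sample stays close to the true value.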
Most sampling errors can be reduced by increasing the sample size and ensuring that the selected
respondents adequately represent the rest of the population. The more rigorously you sample and find the right
candidates for your survey, the better the outcome will be.
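The effect of sample size on sampling error can be checked with a short simulation. The population of heights, the three sample sizes and the helper function average_abs_error are hypothetical choices made for this sketch.

```python
import random
import statistics

random.seed(1)  # fixed seed for repeatability

# Hypothetical population: heights (in cm) of 20,000 people.
population = [random.gauss(165, 10) for _ in range(20_000)]
population_mean = statistics.mean(population)

def average_abs_error(sample_size, trials=200):
    """Average absolute sampling error over repeated simple random samples."""
    errors = []
    for _ in range(trials):
        sample = random.sample(population, sample_size)
        errors.append(abs(statistics.mean(sample) - population_mean))
    return statistics.mean(errors)

# Compare small, medium and large samples.
results = {n: average_abs_error(n) for n in (10, 100, 1000)}
for n, err in results.items():
    print(f"Sample size {n:>4}: average absolute error = {err:.3f} cm")
```

The average error should shrink as the sample grows. Note that a larger sample reduces only this random variation; it does not correct a sample that is drawn from the wrong group of people.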
4.4.3 Sampling Bias
Bias means preference or prejudice for or against something. A random sample is a sample drawn and recorded by
a method which is free from bias. This means that the method is free not only from bias in the method of
selection, but also from any bias of procedure, e.g. wrong definitions, non-response, the design
of questions, interviewer bias, etc. Although simple random sampling is intended to be an unbiased approach to
surveying, sample selection bias can still occur.
When census data cannot be gathered, statisticians collect data by developing particular experimental designs and
survey samples. Representative sampling ensures that conclusions can genuinely extend from the sample to the
population as a whole. An experimental study involves taking measurements of the system under research,
manipulating the system, and then taking more measurements using the same procedure to determine whether the
manipulation has changed the values of the measurements. By contrast, an observational study does not involve
experimental manipulation.
A sampling method is biased if it systematically prefers some outcomes over others.
A basic example of bias in everyday life is when a person refers to an individual by his or her occupation, such as
'doctor' or 'data scientist', and it is assumed that the individual is male. Men, however, are also subject to gender
bias: teachers, especially those who teach younger children, are often assumed to be women.
Types of Sampling Bias
There are five main types of bias in research. Let us look at each of them.
• Sampling bias: Sampling bias arises from an error in the way the survey respondents are chosen. This bias occurs
when a survey sample is not completely random. In other words, if specific individuals are more or less likely than
others to be selected for the sample, there is a high chance of sample selection bias.
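The idea of some individuals being more likely to be selected than others can be sketched in Python. The urban and rural groups and the assumption that urban students are three times as likely to be picked are hypothetical values chosen for this illustration.

```python
import random
from collections import Counter

random.seed(3)  # fixed seed for repeatability

# Hypothetical population: 500 urban and 500 rural students.
population = ["urban"] * 500 + ["rural"] * 500

# Unbiased selection: every student is equally likely to be chosen.
fair_sample = random.sample(population, 200)
fair_counts = Counter(fair_sample)

# Biased selection (assumed for illustration): each urban student is three
# times as likely to be picked as each rural student. random.choices samples
# with replacement, which keeps the sketch short.
weights = [3 if group == "urban" else 1 for group in population]
biased_sample = random.choices(population, weights=weights, k=200)
biased_counts = Counter(biased_sample)

print("Fair sample:  ", dict(fair_counts))
print("Biased sample:", dict(biased_counts))
```

The population is half urban and half rural, so the fair sample should be roughly balanced, while the biased sample should contain far more urban students than rural ones.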

