Page 168 - Data Science class 10
We naturally tend to base decisions on information that is already available to us, or on things we hear about often,
without looking at alternatives that might be useful. We thereby confine ourselves to a relatively narrow subset of
information.
How can you recognise that the data you are going to use is biased?
If you notice the following, the source may be biased:
1. Heavily opinionated or one-sided
2. Relies on unsupported or unsubstantiated claims
3. Presents highly selected facts that lean to a certain outcome
4. Pretends to present facts, but offers only opinion
5. Uses extreme or inappropriate language
6. Comes from an organisation tied to a particular religion, caste, race or creed, or to a political party
3.4. PROBABILITY FOR STATISTICS
Probability is about measuring randomness. It is the foundation on which statistical predictions are made. We can
use probability to predict how likely or unlikely particular events are. If needed, we can also make informal
predictions beyond the scope of the data we have analysed. In statistics, probability is a very important tool.
The following problem and the nature of its solution illustrate how a probability is worked out.
When a die is thrown, the possible outcomes are 1, 2, 3, 4, 5 and 6.
The sample space will be S = {1,2,3,4,5,6}.
Probability of an event E is given by the formula
$$P(E) = \frac{\text{Number of favourable outcomes}}{\text{Total number of outcomes}}$$
where the total number of outcomes is the number of elements in the sample space. A prime number is a number
greater than 1 whose only factors are 1 and the number itself. To determine the chance of getting a prime number,
we count the prime numbers in the sample space and substitute the values into the formula for probability.
In S the prime numbers are 2, 3 and 5, so P(prime) = 3/6 = 1/2.
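The same counting can be sketched in Python. This is only an illustration of the formula for P(E); the function names `is_prime` and `probability` are made up for this example and do not come from the text.

```python
from fractions import Fraction

def is_prime(n):
    """Return True if n is greater than 1 and has no factors other than 1 and itself."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

def probability(event, sample_space):
    """P(E) = number of favourable outcomes / total number of outcomes."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))

sample_space = [1, 2, 3, 4, 5, 6]  # outcomes of one throw of a six-sided die
p_prime = probability(is_prime, sample_space)
print(p_prime)  # 1/2
```

Any other event can be checked the same way by passing a different test function, for example `lambda x: x > 4` for "a number greater than 4".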
3.4.1. Populations, Samples, Parameters, and Statistics
The field of inferential statistics enables you to make educated guesses about the numerical characteristics of large
groups. The logic of sampling lets you test generalisations about these groups using only a small sample of their
members.
Often, researchers want to know things about populations but do not have data for every person or thing in the
population. If a business's customer service department wants to find out whether or not its clients are happy, it
would not be practical (or perhaps even possible) to contact every individual who purchased a product. Instead,
the company might select a sample of the population.
A sample is a smaller group of members chosen, without bias, to represent the entire population. In order to use
statistics to learn things about the population, the sample must be random. In random sampling, every member of
the population has an equal probability of being chosen. The most commonly used sample is a simple random
sample. It requires that every possible sample of the selected size has an equal chance of being used.
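A simple random sample can be sketched with Python's standard `random` module. The population of 1000 customer IDs and the sample size of 50 are assumptions made up for this illustration, not figures from the text.

```python
import random

# Hypothetical population: customer IDs 1 to 1000 (illustrative only).
population = list(range(1, 1001))

rng = random.Random(42)  # fixed seed only so that the sketch is repeatable

# Draw 50 distinct members without replacement; every possible
# group of 50 customers is equally likely to be chosen.
sample = rng.sample(population, k=50)

print(len(sample), len(set(sample)))  # 50 50
```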
A parameter is a characteristic of a population. A statistic is a characteristic of a sample. Inferential statistics let
you make an informed prediction about a population parameter based on a statistic computed from a sample
randomly drawn from that population.
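The difference between a parameter and a statistic can be sketched as follows. The satisfaction scores here are randomly generated stand-ins (an assumption for this example), so the printed values have no real-world meaning; the point is only that the sample mean is computed from far fewer values than the population mean it estimates.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed only so that the sketch is repeatable

# Hypothetical population: satisfaction scores (1 to 10) of 10,000 customers.
population = [rng.randint(1, 10) for _ in range(10_000)]

# Parameter: a characteristic of the whole population.
parameter = statistics.mean(population)

# Statistic: the same characteristic computed from a random sample.
sample = rng.sample(population, k=200)
statistic = statistics.mean(sample)

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

In practice the parameter is usually unknown, which is why the statistic is used to estimate it.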
166 Touchpad Data Science-X

