Page 139 - Data Science class 10
02 DISTRIBUTIONS IN DATA SCIENCE
Learning Outcome
2.1. What is Distribution in Data Science?
2.2. Types of Distributions
2.3. Statistical Problem-Solving Process
2.4. Activity - Choosing Groups for School Dance Programs
In the previous chapter, we learnt various statistical terms that are frequently used in data
science. We also learnt about the two-way frequency table and its use in placing data points into each
category. The statistical terms included measures of central tendency such as Mean, Median and Mode. Finally, you were
introduced to Mean Absolute Deviation, Variance and Standard Deviation.
In this chapter, we will learn about distribution of data in statistics. We will also learn about different types of data
distributions and characteristics of each distribution in detail.
2.1. WHAT IS DISTRIBUTION IN DATA SCIENCE?
A distribution is a simple way to visualise a set of data. It can be shown either as a graph or as a list, revealing which
values of a random variable have lower or higher chances of occurring. In data science, the word "distribution"
typically refers to a probability distribution. A probability distribution is a mathematical description of the
values a variable can take and how likely each value is to occur. Probability is one of the main building blocks of data
science and machine learning.
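As a minimal sketch of the "list" form of a distribution, the following Python snippet counts how often each value occurs in a small data set and converts the counts into relative frequencies. The quiz scores and the function name `empirical_distribution` are made up for illustration and are not part of the chapter.

```python
from collections import Counter


def empirical_distribution(values):
    """Return a dict mapping each distinct value to its relative frequency."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in sorted(counts.items())}


# Hypothetical data: quiz marks of ten students (invented for this example).
scores = [7, 8, 8, 9, 7, 8, 10, 9, 8, 7]

distribution = empirical_distribution(scores)
for value, share in distribution.items():
    print(f"score {value}: relative frequency {share:.1f}")
```

Values with a larger relative frequency are the ones that occurred more often in this data set. Plotting the same numbers as a bar chart would give the "graph" form of the distribution.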
A distribution helps us visualise what is going on underneath the data, while the concept of probability provides the
mathematical computations.
Probability refers to the chance of something happening. It is a mathematical concept that measures how likely
an event is to occur.
For example, consider a coin which has two sides, head and tail.
[Figure: the two sides of a coin, Tail and Head]
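The coin example can be sketched in Python. This is an illustration only, and it assumes the coin is fair, so that both sides are equally likely. The function name `simulate_tosses`, the number of tosses and the seed are arbitrary choices made for this sketch.

```python
import random
from fractions import Fraction

# Assumption: a fair coin, so each of the two sides is equally likely.
outcomes = ["Head", "Tail"]
theoretical = {side: Fraction(1, len(outcomes)) for side in outcomes}


def simulate_tosses(n_tosses, seed=0):
    """Toss a simulated fair coin n_tosses times; return the observed share of each side."""
    rng = random.Random(seed)  # fixed seed so that repeated runs give the same tosses
    tosses = [rng.choice(outcomes) for _ in range(n_tosses)]
    return {side: tosses.count(side) / n_tosses for side in outcomes}


observed = simulate_tosses(10_000)
for side in outcomes:
    print(side, "theoretical:", theoretical[side], "observed:", round(observed[side], 3))
```

Under the fairness assumption, the theoretical probability of each side is 1/2. The observed shares from the simulation should be close to that value, but they need not match it exactly.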

