Page 232 - AI Ver 1.0 Class 10
Statistical Learning with Python
We have now seen that Data Science deals with data analysis and data manipulation. However, analysing and
manipulating numeric and alphanumeric data is not possible without Mathematical Statistics.
Python, together with supporting libraries such as NumPy and Matplotlib, provides many pre-defined functions that
implement Mathematical Statistics, so we do not have to do the calculations or create the formulas and
equations ourselves. All we need to do is call the function and pass the data to it. It's that simple!
Let us take a look at some basic statistical tools used in Python.
Mean
Mean is the average of the numeric data in a given dataset. To find it, we add all the numbers in the dataset
together and divide the sum by the number of elements in the dataset.
It is calculated as:
$$\text{Mean} = \frac{\text{sum of all values}}{\text{total no. of values}}$$
To calculate mean in Python:
import statistics
# marks scored by the students of the class
marks = [45,34,41,46,47,39,38,48,45,34,41,39,39]
m = statistics.mean(marks)  # average of all the marks
print("the average marks of the class :",round(m,2))
Output will be:
the average marks of the class : 41.23
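To see what statistics.mean() is doing for us, the formula for the mean can also be applied by hand. The following is only a minimal sketch for comparison. The variable names total, count and manual_mean are chosen here for illustration and are not part of the statistics module.

```python
import statistics

marks = [45, 34, 41, 46, 47, 39, 38, 48, 45, 34, 41, 39, 39]

total = sum(marks)            # sum of all values
count = len(marks)            # total no. of values
manual_mean = total / count   # mean = sum of all values / total no. of values

print("sum of marks :", total)
print("number of marks :", count)
print("mean by formula :", round(manual_mean, 2))
print("mean by statistics.mean() :", round(statistics.mean(marks), 2))
```

Both approaches give 41.23 for this list (536 divided by 13), which shows that the library function saves us from writing the calculation ourselves.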
Median
When the data is arranged in ascending order, the median is the middle number of the dataset. If there
are two middle numbers, the mean of these two numbers is the median.
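For an odd number of values (N) in the dataset:
$$\text{Median} = \left(\frac{N+1}{2}\right)^{\text{th}} \text{ value}$$
For an even number of values (N) in the dataset:
$$\text{Median} = \frac{\left(\frac{N}{2}\right)^{\text{th}} \text{ value} + \left(\frac{N}{2}+1\right)^{\text{th}} \text{ value}}{2}$$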
To calculate median in Python:
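import statistics
# marks scored by the students of the class
marks = [45,34,41,46,47,39,38,48,45,34,41,39,39]
m = statistics.median(marks)  # sorts the data internally and picks the middle value
print("the middle value of the sorted marks in class :",m)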
Output will be:
the middle value of the sorted marks in class : 41
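The two cases described for the median (one middle number, or two middle numbers) can be checked with a small hand-written function. This is a sketch under a few assumptions: manual_median is a hypothetical helper written for this example, it expects a non-empty list of numbers, and even_marks is a made-up shorter list used only to show the even case.

```python
import statistics

def manual_median(values):
    """Return the median of a non-empty list (illustrative helper)."""
    ordered = sorted(values)   # arrange the data in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # odd count: the single middle value
        return ordered[mid]
    # even count: mean of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_marks = [45, 34, 41, 46, 47, 39, 38, 48, 45, 34, 41, 39, 39]  # 13 values
even_marks = [45, 34, 41, 46, 47, 39]                             # 6 values

print("odd count :", manual_median(odd_marks), statistics.median(odd_marks))
print("even count :", manual_median(even_marks), statistics.median(even_marks))
```

For the 13 marks, both the helper and statistics.median() return 41. For the 6 marks, the two middle values are 41 and 45, so both return 43.0.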
230 Touchpad Artificial Intelligence-X

