Page 224 - AI Ver 1.0 Class 10
Python for Data Science
Data Science combines Python programming with mathematical concepts such as statistics, data analysis, and
probability. Python is a simple, easy-to-learn language that is well suited to writing such code, and it can
handle the highly complex mathematical processing required to develop AI applications.
A file created in Python and saved with the extension .py is called a module. A collection of related modules
saved together in one directory under a common name is called a package. Many packages for different
purposes are freely available for use in Python. Some of the open-source packages useful for
Artificial Intelligence are:
• NumPy: Numerical Array Data Handling Package. It is used for data analysis and calculation related to large
numerical data sets.
• OpenCV: Image Processing Package. It is used for manipulating and processing images, for example cropping,
resizing, and editing.
• Matplotlib: Data Visualisation Package. It is used to produce high-quality graphical representations of
numerical data.
• NLTK (Natural Language Toolkit): Natural Language Processing Package. It helps in tasks related to textual
data.
• Pandas: Tabular Data Handling Package. It is used to handle data arranged in rows and columns. The data usually
comes in tabular form from spreadsheets or database software.
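Before using these packages, it can help to check which of them are installed. The sketch below is one possible way to do this with Python's built-in importlib module. The dictionary PACKAGES and the function check_installed are names made up for this illustration. Note that OpenCV is imported under the name cv2.

```python
import importlib.util

# Import names of the packages listed above (illustrative dictionary).
# OpenCV is imported under the name "cv2".
PACKAGES = {
    "NumPy": "numpy",
    "OpenCV": "cv2",
    "Matplotlib": "matplotlib",
    "NLTK": "nltk",
    "Pandas": "pandas",
}


def check_installed(packages):
    """Return a dictionary: package name -> True if it can be imported."""
    return {
        name: importlib.util.find_spec(import_name) is not None
        for name, import_name in packages.items()
    }


status = check_installed(PACKAGES)
for name, installed in status.items():
    print(name, "is installed" if installed else "is NOT installed")
```

The result depends on the computer on which the code is run, so no fixed output is shown here.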
Data Access in Python
After the data is collected through different methods, it needs to be accessed through Python code so
that it can be arranged in a structured manner and analyzed as required by the AI model. To help with this,
packages such as NumPy, Pandas, and Matplotlib are available for Python. Let us now study in detail the use of
some of these packages.
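As a rough illustration of accessing data and arranging it in a structured manner, the sketch below reads a small table with Pandas. The student names and marks are made-up sample values, and the data is read from a text string only so that the example needs no separate file. In practice the data would usually come from a spreadsheet or database export.

```python
import io

import pandas as pd

# Made-up sample data in CSV (comma-separated values) form.
collected_data = """name,marks
Asha,82
Ravi,75
Meena,91
"""

# Access the data: read it into a table of rows and columns (a DataFrame).
table = pd.read_csv(io.StringIO(collected_data))

# Analyze the data: number of rows and columns, and the average marks.
rows, columns = table.shape
average = table["marks"].mean()

print(table)
print("Rows:", rows, "Columns:", columns)
print("Average marks:", average)
```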
NumPy
NumPy, short for ‘Numerical Python’, is a powerful open-source package for scientific computing. It supports
mathematical and logical operations on large datasets through a powerful data structure, the n-dimensional
array. Learning NumPy is a first step towards becoming a Python data scientist. Various other libraries like
Pandas, Matplotlib, and Scikit-learn build on NumPy.
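To make the idea of an n-dimensional array concrete, here is a minimal sketch of a 2-dimensional array with one mathematical and one logical operation. The marks are made-up sample values, and the alias npy follows the import statement used in this section.

```python
import numpy as npy

# Made-up marks of three students in two tests (one row per student).
marks = npy.array([[82, 75],
                   [64, 91],
                   [70, 58]])

print(marks.ndim)    # number of dimensions
print(marks.shape)   # (rows, columns)

# Mathematical operation: add 5 bonus marks to every value at once.
bonus_marks = marks + 5

# Logical operation: True wherever the marks are 70 or above.
passed = marks >= 70

# Average marks of each student across the two tests.
student_average = marks.mean(axis=1)

print(bonus_marks)
print(passed)
print(student_average)
```

Each operation is applied to the whole array in one statement, without writing a loop.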
NumPy can be imported into the Jupyter Notebook by using any one of the following statements:
>>> import numpy              # imports the complete numpy package
OR
>>> import numpy as npy       # imports numpy and refers to it as npy
OR
>>> from numpy import array   # imports ONLY the array function from the numpy package
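The three import styles differ only in how the array function is written afterwards. The short sketch below uses all three together to show this. The list [1, 2, 3] is just a sample value. Many programmers use the shorter alias np instead of npy, but any valid name works.

```python
import numpy                 # style 1: use the full package name
import numpy as npy          # style 2: use a short alias
from numpy import array      # style 3: import one name directly

a = numpy.array([1, 2, 3])   # style 1: package name, then function
b = npy.array([1, 2, 3])     # style 2: alias, then function
c = array([1, 2, 3])         # style 3: function name only

# All three statements create the same kind of array with the same values.
same = bool((a == b).all() and (b == c).all())
print(same)
```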
222 Touchpad Artificial Intelligence-X

