Page 157 - Trackpad_V2.1_class8
DATA SCIENCE
Data science is a field that studies data and the ways it can be transformed into valuable input and
resources to create business and IT strategies. This is a science that combines domain expertise,
programming skills and knowledge of mathematics to extract insights from the large and ever-
increasing volumes of data collected by organisations.
WHAT IS BIG DATA?
Big data is a term used for any dataset that is too large or complex to be processed by traditional
data management techniques such as RDBMS (Relational Database Management Systems). It
involves the methods of analysing large amounts of data and extracting knowledge from it. Data
science and big data have evolved from traditional data management and are now treated
as distinct disciplines.
Any dataset can be considered as big data if it possesses at least one of the following four V’s:
Volume: Large volume of data
Velocity: Data movement at high velocity
Variety: Diversity in the types of data
Veracity: Data obtained from authentic sources
[Figure: The four V’s of big data. Volume: The Amount of Data. Velocity: The Speed of Data. Variety: The Different Types of Data. Veracity: The Quality of Data.]
CATEGORIES OF DATA
We come across different types of data and we need different tools to work on this data. Let us
take a look at the different types of data.
Structured Data
Structured data is highly organised and formatted to be easily searchable, typically in databases
using rows and columns (e.g., SQL databases). It follows a predefined schema, making it efficient
for querying and analysis. For example:
Inventory Management Systems: Structured data in inventory systems helps manage stock.
You can query the system to find out how many units of a specific item are available or when
to reorder based on stock levels.
Employee Records: In an HR database, employee information is stored in structured tables.
Each employee has a unique row in the table, and the structured format makes it easy to
generate reports on salaries, hire dates, or department staffing.
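To make the inventory example concrete, here is a minimal sketch of structured data being queried. It uses Python's built-in sqlite3 module with an in-memory database. The table name, column names, items and quantities are all hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical inventory table: the items, quantities and reorder levels
# below are made-up sample values, not data from any real system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory ("
    "item TEXT PRIMARY KEY, units INTEGER, reorder_level INTEGER)"
)
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?)",
    [("pencil", 120, 50), ("notebook", 30, 40), ("eraser", 15, 20)],
)


def units_available(item):
    """Return how many units of an item are in stock, or None if unknown."""
    row = conn.execute(
        "SELECT units FROM inventory WHERE item = ?", (item,)
    ).fetchone()
    return row[0] if row else None


def items_to_reorder():
    """Return the items whose stock is at or below the reorder level."""
    rows = conn.execute(
        "SELECT item FROM inventory "
        "WHERE units <= reorder_level ORDER BY item"
    ).fetchall()
    return [r[0] for r in rows]


print(units_available("pencil"))
print(items_to_reorder())
```

Because every row follows the same predefined schema, each question ("how many units?", "what needs reordering?") becomes a single short query.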
Unstructured Data
Unstructured data lacks a predefined format or organisation, making it more difficult to search
and analyse. It includes diverse data types like text, images, audio, and videos.
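By contrast, the sketch below shows why unstructured text is harder to work with. There are no columns to query, so even a simple question such as "which words appear most often?" needs some processing first. The review sentences are hypothetical sample data, and counting words is only one simple approach, assumed here for illustration.

```python
import re
from collections import Counter

# Hypothetical free-text customer feedback: no rows, columns or schema.
reviews = [
    "The delivery was late, but the product quality is great.",
    "Great product! Delivery took too long though.",
    "Product stopped working after a week.",
]


def word_counts(texts):
    """Split each text into lowercase words and count how often each occurs."""
    words = []
    for text in texts:
        words.extend(re.findall(r"[a-z]+", text.lower()))
    return Counter(words)


counts = word_counts(reviews)
print(counts["product"], counts["delivery"], counts["great"])
```

The text has to be cleaned and split before anything can be counted. With structured data, that organisation is already provided by the schema.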

