Page 67 - KEC Khaitan C8.5 Flipbook
CATEGORIES OF DATA
We come across different types of data, and each type needs different tools to work with it. Let us
take a look at these types.
Structured Data
Structured data is highly organised and formatted to be easily searchable, typically in databases
using rows and columns (e.g., SQL databases). It follows a predefined schema, making it efficient
for querying and analysis. For example:
Inventory Management Systems: Structured data in inventory systems helps manage stock.
You can query the system to find out how many units of a specific item are available or when
to reorder based on stock levels.
Employee Records: In an HR database, employee information is stored in structured tables.
Each employee has a unique row in the table, and the structured format makes it easy to
generate reports on salaries, hire dates, or department staffing.
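As a minimal sketch of the inventory example, the snippet below uses Python's built-in sqlite3 module. The table name, column names, and stock figures are illustrative assumptions, not taken from any real system. It shows how a predefined schema lets us ask precise questions with a query.

```python
import sqlite3

# In-memory database; the table, columns, and rows are made-up examples.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory ("
    "item TEXT PRIMARY KEY, units INTEGER, reorder_level INTEGER)"
)
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?)",
    [("pencil", 120, 50), ("notebook", 30, 40), ("eraser", 15, 20)],
)

# How many units of a specific item are available?
units = conn.execute(
    "SELECT units FROM inventory WHERE item = ?", ("pencil",)
).fetchone()[0]

# Which items have fallen to or below their reorder level?
to_reorder = [
    row[0]
    for row in conn.execute(
        "SELECT item FROM inventory WHERE units <= reorder_level ORDER BY item"
    )
]

print(units)       # 120
print(to_reorder)  # ['eraser', 'notebook']
conn.close()
```

Because every row has the same columns, both questions are answered by a single query each, with no guessing about where the information is stored.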
Unstructured Data
Unstructured data lacks a predefined format or organisation, making it more difficult to search
and analyse. It includes diverse data types like text, images, audio, and videos. For example:
Natural Language Data: Natural language is a type of unstructured data that is difficult to
process because the meaning of a word depends on its context and on the tone of the speaker.
For example, the same word can mean different things when spoken joyfully and when uttered sadly.
Audio, video, and images: This type of data is among the biggest challenges for data scientists
because finding objects and patterns in it is a difficult task for computers.
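The short sketch below illustrates why free text is harder to analyse than a table. The two reviews are made-up examples, and word_counts is a hypothetical helper written for this illustration. Since the text has no columns to query, we first have to break it into words ourselves.

```python
import re
from collections import Counter

# Made-up free-text reviews: there are no rows or columns to query.
reviews = [
    "Great phone, the battery lasts all day!",
    "Battery died in a week. Not great.",
]

def word_counts(texts):
    """Lower-case each text, split it into words, and count the words."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts

counts = word_counts(reviews)
print(counts["battery"])  # 2
print(counts["great"])    # 2
```

The word "great" is counted twice, yet one review is positive and the other is negative. Counting words is easy, but recovering their meaning needs more advanced tools.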
Semi-structured Data
Semi-structured data contains elements of both structured and unstructured data, with some
organisational properties but no rigid schema. It often uses tags or markers to separate data
fields, as JSON and XML files do. For example:
Graph-based or Network Data: Data that describes relationships or connections between objects
is called graph-based or network data. This type of data is common on social media websites,
where a graph is a natural way to represent a network of users.
Streaming Data: Streaming data can take any of the above forms. It is not a different kind of
data; what sets it apart is that it flows into a system continuously instead of being loaded into
a data store in batches.
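The following sketch ties these ideas together under some stated assumptions: the user records are invented, the field names (user, follows, city) are hypothetical, and read_records is a helper written only for this illustration. Each record is a JSON object whose keys act as markers for its fields, but not every record has the same fields. The records are handled one at a time, as a stream would deliver them, and are used to build a small "who follows whom" network.

```python
import json

# Hypothetical records, one JSON object per line. Keys label the fields,
# but the set of fields differs from record to record (no rigid schema).
lines = [
    '{"user": "asha", "follows": ["ravi", "meena"], "city": "Pune"}',
    '{"user": "ravi", "follows": ["meena"]}',
    '{"user": "meena"}',
]

def read_records(raw_lines):
    """Yield one parsed record at a time instead of loading a whole batch."""
    for line in raw_lines:
        yield json.loads(line)

followers = {}  # maps each user to the set of users who follow them
for record in read_records(lines):
    followers.setdefault(record["user"], set())
    # "follows" is optional, so fall back to an empty list when it is missing.
    for target in record.get("follows", []):
        followers.setdefault(target, set()).add(record["user"])

print(sorted(followers["meena"]))  # ['asha', 'ravi']
print(sorted(followers["asha"]))   # []
```

The missing "follows" and "city" fields do not cause an error, which is the flexibility that a fixed table of rows and columns does not offer.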
WHY DATA SCIENCE?
Data has become an important fuel on which industries run today. To grow and flourish,
companies need their data to be analysed; this analysis helps them measure their performance
and gauge the expectations of the market. The healthcare industry also uses data science, for
example to help recognise very small tumours and deformities at an early stage of diagnosis. A
person who works with data to help companies make sound decisions is called a data scientist.

