Page 121 - 2611_SmartGPT Pro V(5.0) C-8

Unstructured Data
                 Unstructured data lacks a predefined format or organisation, making it more difficult to search
                 and analyse. It includes diverse data types like text, images, audio, and videos. For example:
                   Natural Language Data: This is a type of unstructured data that is very difficult to process,
                   because the meaning of the same word can change depending on the mood of the speaker. For
                   example, the same word could have two different meanings when spoken joyfully and when
                   uttered sadly.
                   Audio, video, and images: This type of data is one of the biggest challenges for data scientists
                   because recognising objects and patterns in it is a difficult task for computers.


                 Semi-structured Data
                 Semi-structured data contains elements of both structured and unstructured data, with some
                 organisational properties but no rigid schema. It often uses tags or markers to separate data
                 fields. For example:
                   Graph-based or Network Data: Data that is generated from the relationships or connections
                   between objects is called graph-based or network data. This type of data is found on social
                   media websites and is a natural way to represent networks.
                   Streaming Data: Streaming data can take any of these forms. It is not a different kind of data,
                   but it flows into the system continuously instead of being loaded into a data store in batches.
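                 The ideas above can be illustrated with a short Python sketch. It is only an illustration and not
                 part of the original text: the names, cities and friendships are made-up sample data, and the
                 function names (parse_records, build_graph, stream) are chosen here for the example. It shows
                 records that use tags (keys) without a fixed schema, a small friendship network stored as a
                 graph, and records arriving one at a time as a stream would deliver them.

```python
import json
from collections import defaultdict

# Made-up sample records: each uses named tags (keys),
# but the two records do not share the same set of fields.
raw_records = [
    '{"name": "Asha", "city": "Pune", "hobbies": ["chess", "music"]}',
    '{"name": "Ravi", "email": "ravi@example.com"}',
]


def parse_records(lines):
    """Turn lines of JSON text into Python dictionaries."""
    return [json.loads(line) for line in lines]


def build_graph(friendships):
    """Build an undirected graph (name -> set of friends) from pairs of names."""
    graph = defaultdict(set)
    for a, b in friendships:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def stream(records):
    """Yield records one at a time instead of handing over the whole batch."""
    for record in records:
        yield record


records = parse_records(raw_records)
for person in stream(records):
    # .get() supplies a default when a field is missing from a record
    print(person["name"], "-", person.get("city", "city not given"))

# Made-up friendships forming a tiny social network
friends = build_graph([("Asha", "Ravi"), ("Ravi", "Meena")])
print(sorted(friends["Ravi"]))
```

                 The second record has no "city" field, yet it can still be read because the tags identify each
                 piece of data. A table with fixed columns would need every row to follow the same layout.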




                           WHY DATA SCIENCE?

                 Data has become an important fuel on which industries function today. For companies to
                 grow and flourish, they need their data to be analysed. This analysis helps them measure their
                 performance and gauge the expectations of the market. The healthcare industry also uses data
                 science to recognise microscopic tumours and deformities at an early stage of diagnosis. The
                 person responsible for dealing with data in order to help companies make sound decisions
                 is called a data scientist.




                      Tick (✓) if you know this.
                      ▶   Big data is a term used for large datasets that are complex and cannot be processed easily
                         by traditional data management techniques.
                      ▶   Different types of data include structured, unstructured, semi-structured, etc.






                           ROLE OF DATA SCIENTIST

                 A data scientist is someone who uses data sets to form a hypothesis and then analyses the data,
                 interprets it and makes sense of it. A data scientist is responsible for dealing with all types of
                 data. He/she uses various tools and practices to recognise patterns within the data.





