Page 161 - TrackpadV5.1_class8

                   Natural Language Data: It is a type of unstructured data and is very difficult to process. The
                   meaning of the same word can change depending on the mood of the speaker. For example, the
                   same word could have two different meanings when spoken joyfully or when uttered sadly.
                   Audio, video, and images: This type of data is one of the biggest challenges for data scientists
                   because finding objects and patterns in it is a difficult task for computers.

                 Semi-structured Data

                 Semi-structured data contains elements of both structured and unstructured data, with some
                 organisational properties but no rigid schema. It often uses tags or markers to separate data
                 fields. For example, XML and JSON files are semi-structured.
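
                 As a small illustration of such tags or markers, the sketch below reads a student record
                 written in JSON, a common semi-structured format. The record, its field names and its
                 values are made up for this example and do not come from any real data set.

```python
import json

# A hypothetical student record written in JSON (invented for illustration).
# The keys ("name", "class", "hobbies") act as the markers that separate the data fields.
record_text = '{"name": "Asha", "class": 8, "hobbies": ["chess", "painting"]}'

# Convert the JSON text into a Python dictionary so each field can be looked up by its key.
record = json.loads(record_text)

print(record["name"])          # the value stored under the "name" marker
print(len(record["hobbies"]))  # how many hobbies are listed
```

                 Note that the record has labelled fields, but nothing forces every record to carry the same
                 fields. That is what "no rigid schema" means here.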

                   Graph-based or Network Data: Data that is generated from relationships or connections
                   between objects is called graph-based or network data. This type of data is found on social
                   media websites and is a natural way to represent networks.
                   Streaming Data: Streaming data can take any of the forms described above. It is not a different
                   kind of data, but it flows into the system continuously instead of being loaded into a data
                   store in batches.
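
                 The two ideas above can be sketched in a few lines of Python. This is only an illustration:
                 the friendship network, the names and the temperature readings are all made up, and the
                 function temperature_stream is a hypothetical stand-in for a real sensor or data feed.

```python
# Graph-based data: a made-up friendship network stored as a dictionary.
# Each key is a person; the value is the list of people they are connected to.
friends = {
    "Riya": ["Kabir", "Meera"],
    "Kabir": ["Riya"],
    "Meera": ["Riya"],
}
riya_connections = len(friends["Riya"])


def temperature_stream():
    """Yield made-up temperature readings one at a time, as a sensor might."""
    for reading in [30, 32, 31, 35]:
        yield reading


# Streaming data: update the result as each value arrives,
# instead of waiting for a full batch to be stored first.
count = 0
total = 0
for value in temperature_stream():
    count += 1
    total += value
    running_average = total / count  # average of the readings seen so far

print(riya_connections)
print(running_average)
```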



                           WHY DATA SCIENCE?


                 Data has become an important fuel on which industries function today. For companies to
                 grow and flourish, they need data to be analysed. This analysis then helps them measure their
                 performance and gauge the expectations of the market. Healthcare industries also use data
                 science to recognise microscopic tumours and deformities at an early stage of diagnosis. The
                 person responsible for dealing with data in order to assist companies in making proper decisions
                 is called a data scientist.




                      Tick (✓) if you know this.
                      ▶   Big data is a term used for large datasets that are complex and are difficult to process
                         using traditional data management techniques.
                      ▶   Different types of data include structured, unstructured, semi-structured, etc.





                           ROLE OF DATA SCIENTIST


                 A data scientist is someone who uses data sets to create a hypothesis and then works on the
                 data set to analyse the data, interpret it and make sense of it. A data scientist is responsible
                 for dealing with all types of data. He/she uses various tools and practices to recognise
                 patterns within the data.

                 Subject: Application of Data Science
                 Companies like Netflix, Google, and Amazon are using data science to develop powerful
                 recommendation systems for their users.



                                                                               Introduction to SDGs and Data Science  159