
                     • Natural Language Data: It is a type of unstructured data and is very difficult to process.
                       The meaning of the same word can change depending on the mood of the speaker. For
                       example, the same word could have two different meanings when spoken joyfully or when
                       uttered sadly.
                     • Audio, video, and images: This type of data is the biggest challenge for data scientists
                       because recognising objects and patterns in it is difficult for computers.

                  Semi-structured Data

                  Semi-structured data contains elements of both structured and unstructured data, with some
                  organisational properties but no rigid schema. It often uses tags or markers to separate data
                  fields. For example:

                     • Graph-based or Network Data: Data that is generated from relationships or connections
                       between objects is called graph-based or network data. This type of data is found on
                       social media websites and is a natural way to represent networks.
                     • Streaming Data: Streaming data can take any of the forms described above. It is not a
                       different kind of data, but it flows into a system continuously instead of being loaded
                       into a data store in batches.
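
                  The ideas above can be illustrated with a short Python sketch. It is only an
                  illustration under stated assumptions: the names, cities, and friend lists are made up,
                  and the helper function stream is hypothetical. The records use field names as tags but
                  do not all carry the same fields (semi-structured), the friends field describes
                  connections between people (graph-based), and the loop at the end receives records one
                  at a time instead of in a batch (streaming).

```python
import json

# Made-up records for illustration. Each field is tagged with a name,
# but not every record has the same fields (no rigid schema).
raw = """
[
  {"name": "Asha", "city": "Pune", "friends": ["Ravi", "Meera"]},
  {"name": "Ravi", "friends": ["Asha"]},
  {"name": "Meera", "age": 14, "friends": ["Asha"]}
]
"""

records = json.loads(raw)

# Semi-structured data: a field may be missing, so supply a default value.
cities = {r["name"]: r.get("city", "unknown") for r in records}

# Graph-based data: map each person to the people they are connected to.
network = {r["name"]: r["friends"] for r in records}


def stream(items):
    """Hypothetical helper: hand over records one at a time, like a stream."""
    for item in items:
        yield item


# Streaming data: each record is handled as soon as it arrives.
seen = []
for record in stream(records):
    seen.append(record["name"])

print(cities)
print(network)
print(seen)
```

                  Here the same three records are viewed in three ways, which matches the point that
                  streaming is not a different kind of data but a different way in which data arrives.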



                            WHY DATA SCIENCE?


                  Data has become an important fuel on which industries function today. For companies to
                  grow and flourish, they need data to be analysed. This analysis then helps them measure their
                  performance and gauge the expectations of the market. Healthcare industries also use data
                  science to recognise microscopic tumours and deformities at an early stage of diagnosis. The
                  person responsible for dealing with data in order to assist companies in making proper decisions
                  is called a data scientist.




                       Tick (✓) if you know this.
                       ▶   Big data is a term used for large, complex datasets that are difficult to process using
                          traditional data management techniques.
                       ▶   Different types of data include structured, unstructured, semi-structured, etc.





                            ROLE OF DATA SCIENTIST


                  A data scientist is someone who uses data sets to create a hypothesis and then works on
                  the data set to analyse the data, interpret it, and make sense of it. A data scientist is
                  responsible for dealing with all types of data. He/she uses various tools and practices to
                  recognise patterns within the data.

                  Subject: Application of Data Science
                  Companies like Netflix, Google, and Amazon are using data science to develop powerful
                  recommendation systems for their users.


