Page 160 - trackpad v5.1 class 8 flipbook

DATA SCIENCE

                  Data science is a field that studies data and the ways it can be transformed into valuable input and
                  resources to create business and IT strategies. This is a science that combines domain expertise,
                  programming skills and knowledge of mathematics to extract insights from the large and
                  ever-increasing volumes of data collected by organisations.

                  WHAT IS BIG DATA?

                  Big data is a term used for any dataset that is too large or too complex to be processed by
                  traditional data management techniques such as RDBMS (Relational Database Management
                  Systems). It involves methods of analysing large amounts of data and extracting knowledge
                  from it. Data science and big data have evolved from traditional data management and are now
                  treated as distinct disciplines.
                  Any dataset can be considered big data if it possesses at least one of the following four V’s:
                     Volume: Large volume of data
                     Velocity: Data movement at high velocity
                     Variety: Diversity in the types of data
                     Veracity: Data obtained from authentic sources




                  [Figure: The four V’s of big data. Volume: the amount of data; Velocity: the speed of data;
                  Variety: the different types of data; Veracity: the quality of data.]





                  CATEGORIES OF DATA

                  We come across different types of data and we need different tools to work on this data. Let us
                  take a look at the different types of data.

                  Structured Data
                  Structured data is highly organised and formatted to be easily searchable, typically in databases
                  using rows and columns (e.g., SQL databases). It follows a predefined schema, making it efficient
                  for querying and analysis. For example:
                     Inventory Management Systems: Structured data in inventory systems helps manage stock.
                    You can query the system to find out how many units of a specific item are available or when
                    to reorder based on stock levels.
                     Employee Records: In an HR database, employee information is stored in structured tables.
                    Each employee has a unique row in the table, and the structured format makes it easy to
                    generate reports on salaries, hire dates, or department staffing.
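
                  The inventory example above can be sketched in a few lines of Python. This is only an
                  illustration: it assumes Python's built-in sqlite3 module as the database, and the table
                  name, column names and stock figures are made up for the example. It shows how a
                  predefined schema of rows and columns lets us ask precise questions of the data.

```python
import sqlite3

# In-memory database with a hypothetical inventory table.
# Table name, columns and values are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory ("
    "item TEXT PRIMARY KEY, units INTEGER, reorder_level INTEGER)"
)
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?)",
    [("pencil", 120, 50), ("notebook", 30, 40), ("eraser", 15, 20)],
)

# Question 1: how many units of a specific item are available?
units = conn.execute(
    "SELECT units FROM inventory WHERE item = ?", ("pencil",)
).fetchone()[0]

# Question 2: which items have fallen to or below their reorder level?
to_reorder = [
    row[0]
    for row in conn.execute(
        "SELECT item FROM inventory WHERE units <= reorder_level ORDER BY item"
    )
]

print(units)       # 120
print(to_reorder)  # ['eraser', 'notebook']
conn.close()
```

                  Because every row follows the same schema, each question becomes a single short query.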


                  Unstructured Data
                  Unstructured data lacks a predefined format or organisation, making it more difficult to search
                  and analyse. It includes diverse data types like text, images, audio, and videos. For example:

