Page 167 - Ai_V1.0_Class9

Data Acquisition from Websites

                 Collecting data from websites using software is called Data Scraping, commonly known as Web Scraping. It is
                 a common method for extracting information from websites.

                 Just like you might copy text from a book or from your friend, data scraping involves copying information from
                 websites. But instead of doing it manually, we use special tools or programs to do it automatically. These tools
                 can navigate websites, find the information we want, and copy it into the required format.
                 We scrape websites to get data for different purposes, such as collecting prices for market research,
                 news articles for analysis, or customer reviews for a product.
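
                 The idea of a tool that finds information on a page and copies it into the required format can be
                 sketched in Python. This is only a minimal illustration, not a complete scraper: the page content
                 SAMPLE_HTML, the class names "product", "name" and "price", and the names PriceScraper and
                 scrape_prices are all made up for this example, and the page is stored as text instead of being
                 downloaded from a real website.

```python
from html.parser import HTMLParser

# Hypothetical page content. A real scraper would download this from a website.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Pencil</span> <span class="price">5</span></li>
  <li class="product"><span class="name">Notebook</span> <span class="price">40</span></li>
</ul>
"""


class PriceScraper(HTMLParser):
    """Collects the name and price of every product found on the page."""

    def __init__(self):
        super().__init__()
        self.current_field = None  # "name" or "price" while inside such a <span>
        self.rows = []             # one dictionary per product

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "product":
            self.rows.append({})   # a new product starts here
        elif tag == "span" and css_class in ("name", "price"):
            self.current_field = css_class

    def handle_data(self, data):
        # Copy the text only when we are inside a name or price <span>.
        if self.current_field and self.rows:
            self.rows[-1][self.current_field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self.current_field = None


def scrape_prices(html):
    """Return the products on the page as a list of (name, price) pairs."""
    parser = PriceScraper()
    parser.feed(html)
    return [(row["name"], int(row["price"])) for row in parser.rows]


print(scrape_prices(SAMPLE_HTML))  # [('Pencil', 5), ('Notebook', 40)]
```

                 The program reads the page, picks out only the product names and prices, and returns them as a
                 simple list that could be used for market research.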

                 Web scraping itself is not illegal, but using the scraped data without permission is. Think of a website
                 as someone else's garden: picking fruit without the owner's permission is wrong, and what you do with
                 the fruit afterwards matters too.

                 Using data with permission is legal and ethical, just like getting permission from the owner of the garden
                 before taking fruit. It's all about respecting the rights of the website owner and following the rules.
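
                 One common way a website states its rules for automated tools is a file called robots.txt. It is not
                 mentioned above and is used here only as an assumption to illustrate "following the rules". The
                 sketch below uses Python's built-in urllib.robotparser module; the rules in SAMPLE_RULES, the
                 example.com addresses and the helper name is_allowed are hypothetical. Note that robots.txt
                 expresses the owner's wishes for automated visitors; it does not by itself grant permission to reuse
                 the data.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules. A real site publishes them at <site address>/robots.txt.
SAMPLE_RULES = """
User-agent: *
Disallow: /private/
"""


def is_allowed(rules_text, url, user_agent="*"):
    """Return True if the given rules allow this program to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())  # read the rules from text, no download
    return parser.can_fetch(user_agent, url)


print(is_allowed(SAMPLE_RULES, "https://example.com/products.html"))      # True
print(is_allowed(SAMPLE_RULES, "https://example.com/private/data.html"))  # False
```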


                 Ethical Concerns in Data Acquisition
                 While gathering data and choosing datasets, certain ethical issues can be addressed before they occur:


                 Bias              Take steps to understand and avoid any preferences or partiality in data

                 Consent           Take necessary permissions before collecting or using an individual's data

                 Transparency      Explain how you intend to use the collected data and do not hide intentions

                 Anonymity         Protect the identity of the person who is the source of data

                 Accountability    Take responsibility for your actions in case of misuse of data
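
                 As a small illustration of Anonymity, the sketch below replaces each person's name with a neutral
                 ID before the data is shared. The survey records, the field names "name" and "id", and the function
                 name anonymise are made up for this example. Removing names is only a first step: other details in
                 the data may still reveal who a person is.

```python
def anonymise(records):
    """Return copies of the records with each name replaced by an ID like 'P1'."""
    ids = {}      # remembers which ID was given to which name
    result = []
    for record in records:
        name = record["name"]
        if name not in ids:
            ids[name] = f"P{len(ids) + 1}"
        # Copy every field except the name, then add the neutral ID.
        cleaned = {key: value for key, value in record.items() if key != "name"}
        cleaned["id"] = ids[name]
        result.append(cleaned)
    return result


# Hypothetical survey responses.
survey = [
    {"name": "Asha", "favourite_subject": "Maths"},
    {"name": "Ravi", "favourite_subject": "Science"},
    {"name": "Asha", "favourite_subject": "Art"},
]

for row in anonymise(survey):
    print(row)
```

                 The same person always gets the same ID, so the responses can still be analysed without showing
                 anyone's name.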



                         Usability, Features, and Preprocessing of Data


                 Data is a collection of information gathered through various means such as observations, measurements,
                 research, surveys or analysis. This information can include a wide range of elements like facts, numbers, names,
                 figures, or descriptions of things. To make data easier to understand and analyse, it is often organised into
                 formats such as graphs, charts, or tables.


                 Usability of Data

                 Let's take the example of completing a school project. You need clear instructions, a neat workspace, and
                 accurate information. Similarly, using data effectively relies on its clarity, organisation, and accuracy.
                 There are three primary factors determining the usability of data:

