
Checklist of Factors that make Data Good or Bad

                 Here’s a checklist of factors that can help determine whether data is of good quality (good data) or poor
                 quality (bad data):

                                         Good Data                                     Bad Data


                          •  Data is well structured                    •  Data is scattered
                          •  It is accurate                             •  Contains a lot of incorrect values
                          •  It is consistent                           •  Contains missing and duplicate values
                          •  It is presented well                       •  It is poorly presented
                          •  Contains facts which are relevant to       •  Contains facts which are not relevant
                              our requirement                               to our requirement
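
                 Some items on this checklist can be checked with a short program. The sketch below is only an
                 illustration: the records, the field names, the quality_report function and the 0 to 120 age range
                 are all made up for this example. It counts rows with missing values, duplicate rows, and rows with
                 an obviously incorrect age.

```python
# Hypothetical sample records, invented for illustration only.
records = [
    {"name": "Asha", "age": 15, "city": "Delhi"},
    {"name": "Ravi", "age": None, "city": "Pune"},      # missing value
    {"name": "Asha", "age": 15, "city": "Delhi"},       # duplicate row
    {"name": "Meera", "age": -4, "city": "Chennai"},    # incorrect value
]


def quality_report(rows):
    """Count rows with missing values, duplicate rows and invalid ages."""
    # A row has a missing value if any of its fields is None.
    missing = sum(1 for row in rows if any(v is None for v in row.values()))

    # A row is a duplicate if an identical row was already seen.
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)

    # Assumed rule for this example: a valid age lies between 0 and 120.
    invalid_age = sum(
        1 for row in rows
        if row["age"] is not None and not (0 <= row["age"] <= 120)
    )
    return {"missing": missing, "duplicates": duplicates, "invalid_age": invalid_age}


print(quality_report(records))
# prints {'missing': 1, 'duplicates': 1, 'invalid_age': 1}
```

                 Checks such as relevance or good presentation still need human judgement; a program like this
                 only catches the mechanical problems.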


                 Data Acquisition from Websites
                 The process of collecting data from websites using software is called Data Scraping, commonly known as
                 Web Scraping. It is a common method for extracting information from websites.

                 Just like you might copy text from a book or from your friend, data scraping involves copying information from
                 websites. But instead of doing it manually, we use special tools or programs to do it automatically. These tools
                 can navigate websites, find the information we want, and copy it into the required format.
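
                 A minimal sketch of such a tool is given here, using only Python's standard library. The page
                 content, the class names "product", "name" and "price", and the helper names are assumptions made
                 for this example; a real scraper would first download the page from a website that permits it.
                 The program finds the product names and prices in the page and copies them into CSV format.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page content, written by hand for this example.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Pen</span><span class="price">10</span></li>
  <li class="product"><span class="name">Notebook</span><span class="price">45</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collects the text inside <span class="name"> and <span class="price">."""

    def __init__(self):
        super().__init__()
        self.current_field = None  # field we are currently inside, if any
        self.rows = []             # one dictionary per product

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "product":
            self.rows.append({})   # a new product starts here
        elif tag == "span" and attrs.get("class") in ("name", "price"):
            self.current_field = attrs["class"]

    def handle_endtag(self, tag):
        if tag == "span":
            self.current_field = None

    def handle_data(self, data):
        # Keep text only when it sits inside a name or price span.
        if self.current_field and self.rows:
            self.rows[-1][self.current_field] = data.strip()


def scrape_products(html):
    """Find the information we want in the page."""
    parser = ProductParser()
    parser.feed(html)
    return parser.rows


def to_csv(rows):
    """Copy the extracted rows into the required format (CSV here)."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["name", "price"])
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


products = scrape_products(SAMPLE_HTML)
print(to_csv(products))
```

                 The same three steps (read the page, pick out the wanted items, save them in a chosen format)
                 apply whichever scraping tool is used.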

                 We scrape websites to get data for different purposes, such as collecting prices for market research,
                 news articles for analysis, or customer reviews for a product.

                 While web scraping itself is not illegal, using data without permission is. Think of web scraping like
                 picking fruit from someone else’s garden: what matters is whether you have the owner’s permission and
                 what you do with the fruit afterwards.

                 Using data with permission is legal and ethical, just like getting permission from the owner of the garden
                 to take fruit. It’s all about respecting the rights of the website owner and following the rules.
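
                 One way a program can follow a website's rules is to read its robots.txt file, where the owner
                 lists the pages that automated tools should not visit. The sketch below uses Python's standard
                 urllib.robotparser module. The robots.txt text, the example.com addresses and the "MyScraper"
                 name are assumptions made for this example. Note that robots.txt only states the owner's wishes
                 for automated visits; it does not replace permission to use the data.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, written by hand for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, url, user_agent="MyScraper"):
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "https://example.com/products"))
# prints True
print(is_allowed(ROBOTS_TXT, "https://example.com/private/data"))
# prints False
```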


                 Ethical Concerns in Data Acquisition
                 While gathering data and choosing datasets, certain ethical issues can be addressed before they occur:


                 Bias              Take steps to understand and avoid any preferences or partiality in data

                 Consent           Take necessary permissions before collecting or using an individual's data

                 Transparency      Explain how you intend to use the collected data and do not hide intentions

                 Anonymity         Protect the identity of the person who is the source of data

                 Accountability    Take responsibility for your actions in case of misuse of data

