Data Acquisition from Websites
The process of collecting data from websites using software is called Data Scraping, more commonly known as
Web Scraping. It is a common method for extracting information from websites.
Just like you might copy text from a book or from your friend, data scraping involves copying information from
websites. But instead of doing it manually, we use special tools or programs to do it automatically. These tools
can navigate websites, find the information we want, and copy it into the required format.
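To make the three steps above (navigate, find, copy) concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the page content in SAMPLE_PAGE is made up, the names PriceScraper and scrape_prices are hypothetical, and the page is held in a string so that nothing is downloaded from a real website. A real scraper would first fetch the page, and only with permission.

```python
from html.parser import HTMLParser

# A made-up web page, stored as text so no real website is contacted.
SAMPLE_PAGE = """
<html><body>
  <ul>
    <li><span class="name">Pencil</span> <span class="price">10</span></li>
    <li><span class="name">Notebook</span> <span class="price">45</span></li>
  </ul>
</body></html>
"""


class PriceScraper(HTMLParser):
    """Hypothetical helper: walks through the page and copies names and prices."""

    def __init__(self):
        super().__init__()
        self.current_field = None  # which piece of data we are reading now
        self.rows = []             # one dictionary per product found

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            # Each list item starts a new product.
            self.rows.append({})
        elif tag == "span":
            css_class = dict(attrs).get("class")
            if css_class in ("name", "price"):
                self.current_field = css_class

    def handle_data(self, data):
        # Copy the text only when we are inside a span we care about.
        if self.current_field and self.rows:
            self.rows[-1][self.current_field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self.current_field = None


def scrape_prices(html_text):
    """Return the products as a list of dictionaries (the 'required format')."""
    parser = PriceScraper()
    parser.feed(html_text)
    return [
        {"name": row["name"], "price": int(row["price"])}
        for row in parser.rows
        if "name" in row and "price" in row
    ]


for product in scrape_prices(SAMPLE_PAGE):
    print(product["name"], product["price"])
```

The parser plays the role of the tool that navigates the page, the class names tell it which information to find, and the list of dictionaries is the format the data is copied into.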
We scrape websites to get data for different purposes, such as collecting prices for market research,
news articles for analysis, or customer reviews of a product.
Web scraping is not illegal in itself, but using data without permission can be. Think of it like picking fruit
from someone else's garden: what matters is whether you have the owner's permission and what you do with the
fruit afterwards. Using data with permission is legal and ethical, just like getting permission from the owner
of the garden to take fruit. It's all about respecting the rights of the website owner and following the rules.
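One common way a website states its rules for automated programs is a file called robots.txt. The sketch below uses Python's standard urllib.robotparser module to check such rules before visiting a page. It is an illustration only: the rules in SAMPLE_RULES, the example.com addresses, and the function name is_allowed are assumptions made for this example, and passing this check does not by itself mean you have permission to reuse the data.

```python
from urllib.robotparser import RobotFileParser

# Made-up rules, written the way a robots.txt file might look.
SAMPLE_RULES = [
    "User-agent: *",
    "Disallow: /private/",
]


def is_allowed(rules, url, user_agent="*"):
    """Hypothetical helper: True if the given rules let user_agent visit url."""
    checker = RobotFileParser()
    checker.parse(rules)  # read the rules from a list of lines
    return checker.can_fetch(user_agent, url)


print(is_allowed(SAMPLE_RULES, "https://example.com/products.html"))
print(is_allowed(SAMPLE_RULES, "https://example.com/private/data.html"))
```

With these sample rules, the first address is open to visitors and the second is inside the area the owner has asked programs to stay out of.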
Ethical Concerns in Data Acquisition
While gathering data and choosing datasets, certain ethical issues can be addressed before they occur:
Bias: Take steps to understand and avoid any preferences or partiality in data.
Consent: Take necessary permissions before collecting or using an individual's data.
Transparency: Explain how you intend to use the collected data and do not hide intentions.
Anonymity: Protect the identity of the person who is the source of data.
Accountability: Take responsibility for your actions in case of misuse of data.
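As an illustration of the anonymity point, the following Python sketch removes names from a small set of survey answers and replaces them with simple codes. The records, the field names, and the function name anonymise are all made up for this example. Removing names is only a first step, because other details can sometimes still reveal who a person is.

```python
# Made-up survey answers; the names are invented for this example.
SURVEY = [
    {"name": "Asha", "class": 9, "favourite_subject": "Science"},
    {"name": "Rohan", "class": 9, "favourite_subject": "Maths"},
]


def anonymise(records):
    """Return copies of the records with the name replaced by a code."""
    anonymous = []
    for number, record in enumerate(records, start=1):
        # Copy every field except the name.
        cleaned = {key: value for key, value in record.items() if key != "name"}
        # Use a code such as P001 so the answers can still be told apart.
        cleaned["id"] = f"P{number:03d}"
        anonymous.append(cleaned)
    return anonymous


for row in anonymise(SURVEY):
    print(row)
```

The original list is left unchanged, so the person responsible for the data can decide separately whether the named version needs to be kept at all.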
Usability, Features, and Preprocessing of Data
Data is a collection of information gathered through various means such as observations, measurements,
research, surveys or analysis. This information can include a wide range of elements like facts, numbers, names,
figures, or descriptions of things. To make data easier to understand and analyse, it is often organised into
formats such as graphs, charts, or tables.
Usability of Data
Let's take an example of completing a school project. You need clear instructions, a neat workspace, and accurate
information. Similarly, using data effectively relies on its clarity, organisation, and accuracy. There are three
primary factors determining the usability of data:

