Page 154 - AI Ver 1.0 Class 10
After filling in the 4Ws Problem Canvas, you need to summarise all the cards in one template. The Problem
Statement Template brings all the key points together in a single place, so that whenever there is a need to
look back at the basis of the problem in future, we can refer to the Problem Statement Template and
understand its key elements.
Stage 2: Data Acquisition
This is the second stage of the AI project cycle. It is the process of collecting data required for training the AI
project. Data is raw information that is used to generate meaningful outcomes.
Suppose we have to make an artificial intelligence system that predicts the traffic flow for a particular
geographical location based on previous traffic data. The data for the previous year needs to be fed into the
system so that the machine can be trained to predict the traffic flow effectively. The previous year's traffic flow
data is known as the Training Data, and the data against which the system's predictions are checked is known as
the Testing Data.
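The traffic example can be sketched in a few lines of Python. This is only an illustration, not a real traffic model: the vehicle counts are invented, and the "model" simply predicts the average count seen at each hour in the training data. A small set of testing data, held back from training, is then used to check how far off the predictions are.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (hour_of_day, vehicles_counted) records; all numbers are invented.
training_data = [(8, 120), (8, 130), (9, 200), (9, 210), (10, 150), (10, 160)]
testing_data = [(8, 128), (9, 198), (10, 149)]

def train(records):
    """Learn the average vehicle count for each hour from the training data."""
    counts_by_hour = defaultdict(list)
    for hour, vehicles in records:
        counts_by_hour[hour].append(vehicles)
    return {hour: mean(values) for hour, values in counts_by_hour.items()}

def mean_absolute_error(model, records):
    """Average gap between predicted and actual counts on the testing data."""
    return mean(abs(model[hour] - actual) for hour, actual in records)

model = train(training_data)
error = mean_absolute_error(model, testing_data)
print("Predicted vehicles per hour:", model)
print("Average prediction error on testing data:", error)
```

A smaller error on the testing data suggests that the training data was relevant to the location and period being predicted.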
The efficiency of the AI system depends on the authenticity and relevance of the training data. In the traffic
example above, if the traffic data is not for the same geographical location or not for the same period last year,
the predictions of the machine would not be accurate.
Hence, for the AI system to work efficiently, the training data must be authentic and relevant to the scope of the
problem statement.
Data Features
Data features refer to the type of data to be collected. In our previous example, the data features would be the
day, date and time at which the data was collected.
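As a small sketch, the day, date and time features can be derived from a single recorded timestamp using Python's standard `datetime` module. The timestamp and the dictionary keys below are assumptions made only for illustration.

```python
from datetime import datetime

# Hypothetical timestamp of one traffic reading (invented for illustration).
raw_timestamp = "2023-03-06 08:30"

moment = datetime.strptime(raw_timestamp, "%Y-%m-%d %H:%M")

# The three data features named in the text: day, date and time.
features = {
    "day": moment.strftime("%A"),        # name of the weekday
    "date": moment.strftime("%Y-%m-%d"),
    "time": moment.strftime("%H:%M"),
}
print(features)
```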
Now the next step is to know how and from where the data can be collected. The data can be collected through:
• Surveys: Customers’ feedback and reviews.
• Web scraping: Data extracted from various web pages.
• Sensors: Data collected from various sensors that track the condition of physical things, which can then be
monitored in real time.
• Cameras: Live data from surveillance cameras, web cameras, etc.
• Observations: Reading and analysing trends.
• Application Programming Interface (API): Applications generate data of their own while working, such as the
data stored on their servers, and an API lets other programs request that data.
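Whichever source is used, the collected data usually ends up as a set of records that a program can read. The sketch below assumes survey responses exported as CSV text; the column names and responses are hypothetical, and in practice the text would come from a downloaded file rather than a string.

```python
import csv
import io

# Hypothetical survey export (invented responses).
survey_csv = """respondent,rating,comment
1,4,Signal timing is good
2,2,Too much congestion at 9 am
3,5,Smooth traffic
"""

# Read each row of the survey into a dictionary keyed by column name.
reader = csv.DictReader(io.StringIO(survey_csv))
responses = [
    {
        "respondent": int(row["respondent"]),
        "rating": int(row["rating"]),
        "comment": row["comment"],
    }
    for row in reader
]
print(len(responses), "responses collected")
```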
Nowadays, data is becoming the new raw material for almost all businesses and is often described as the new
gold. There are innumerable sources on the internet that can provide data, but we have to be sure that the source
of the data is authentic and relevant. Only then can the AI project predict precisely and accurately.
Some authentic sources of information are open data websites hosted by the government. These portals
provide information in a format that can be easily downloaded. Some of these open Govt. portals are:
data.gov.in, india.gov.in.
Stage 3: Data Exploration
This is the third stage of the AI project cycle. It refers to exploring large amounts of data to uncover the patterns
or trends needed for the AI project.
It is considered to be the first step in data analysis, where unstructured data is explored, researched, filtered and
visualised in order to decide the strategy for the type of model to be used in a later stage.
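A minimal sketch of these steps in Python is given below. The readings are invented, and the simple text chart stands in for the graphs a real project would draw: faulty readings are filtered out, the average count for each hour is worked out, and the result is shown so that a pattern (the busiest hour) becomes visible.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw readings: (hour_of_day, vehicles_counted).
# None marks a failed reading and a negative count is clearly an error.
readings = [(8, 120), (8, None), (9, 200), (9, 220), (10, 150), (8, 140), (10, -5)]

# Filter: keep only readings with a usable, non-negative count.
clean = [(h, v) for h, v in readings if v is not None and v >= 0]

# Explore: average vehicle count for each hour.
by_hour = defaultdict(list)
for hour, vehicles in clean:
    by_hour[hour].append(vehicles)
averages = {hour: mean(values) for hour, values in sorted(by_hour.items())}

# Visualise: a simple text bar chart, one '#' for every 20 vehicles.
for hour, avg in averages.items():
    print(f"{hour:02d}:00 {'#' * int(avg // 20)} {avg}")

# Pattern uncovered: the hour with the highest average count.
busiest_hour = max(averages, key=averages.get)
print("Busiest hour:", busiest_hour)
```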
152 Touchpad Artificial Intelligence-X

