Page 126 - Ai_V1.0_Class9

While acquiring data, you need to collect it on a regular basis or in a timely manner so that it is up to date and
              reflects current conditions or trends. You also need to ensure that the data you acquire is complete and sufficient.
              If you don't have sufficient data, your analysis may be incomplete or unreliable.
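
A completeness check like the one described above can be sketched in a few lines of Python. This is only an illustrative sketch: the records, the field names and the `minimum_rows` threshold are made-up assumptions, not values from a real dataset.

```python
# Hypothetical records: each dict is one year's data; None marks a missing value.
records = [
    {"year": 2021, "school_average": 78.4},
    {"year": 2022, "school_average": None},
    {"year": 2023, "school_average": 81.0},
]

def find_incomplete(rows, required_fields):
    """Return the rows that are missing any of the required fields."""
    return [
        row for row in rows
        if any(row.get(field) is None for field in required_fields)
    ]

def is_sufficient(rows, minimum_rows):
    """Treat the data as sufficient only if there are at least minimum_rows rows."""
    return len(rows) >= minimum_rows

incomplete = find_incomplete(records, ["year", "school_average"])
complete = [row for row in records if row not in incomplete]

print("Incomplete rows:", incomplete)
print("Enough complete data?", is_sufficient(complete, minimum_rows=3))
```

How many rows count as "sufficient" depends on the project, so the threshold here is just a placeholder.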


              What is Data?
              Data is raw information, such as facts and statistics, collected together for reference or analysis. These raw
              facts need to be processed to yield meaningful information. Whenever we want an AI project to be able to
              predict an output, we need to train it with a dataset first. Data plays an important part in an AI project as it forms
              the base on which the AI project is built.


              Types of Data
              There are two types of data:
                 • Training Data: It is the data on which we train the model of our AI project. It is used to fit the parameters
                 of the model. In training data, the expected output is available to the model.

                 • Testing Data: It is used to check the performance of an AI model. Testing data has not been seen by the model
                 during training, and the model has to make predictions for it.
              For example, if we want to prepare an AI model to predict the school average of students in board examinations,
              we will feed it the marks obtained by students in board examinations in previous years. This will be treated as
              training data. Once the model is ready, it can predict the school average for the coming year. When we
              test it, we feed it a different dataset that it has not seen before, and that is the testing data.
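
The split between training and testing data can be sketched as follows. The yearly averages are invented for illustration, and the "model" is assumed to be a simple straight-line trend fitted by least squares; a real AI project might use a very different model. The helper names `split_by_year` and `fit_trend` are hypothetical.

```python
# Hypothetical school averages (in percent) for past board examinations.
yearly_averages = [
    (2018, 74.0), (2019, 75.5), (2020, 77.0),
    (2021, 78.5), (2022, 80.0),
]

def split_by_year(rows, first_test_year):
    """Earlier years become training data; later years are held back for testing."""
    training = [row for row in rows if row[0] < first_test_year]
    testing = [row for row in rows if row[0] >= first_test_year]
    return training, testing

def fit_trend(rows):
    """Fit a straight line (average = slope * year + intercept) by least squares."""
    n = len(rows)
    mean_x = sum(year for year, _ in rows) / n
    mean_y = sum(avg for _, avg in rows) / n
    slope = (
        sum((year - mean_x) * (avg - mean_y) for year, avg in rows)
        / sum((year - mean_x) ** 2 for year, _ in rows)
    )
    return slope, mean_y - slope * mean_x

training_data, testing_data = split_by_year(yearly_averages, first_test_year=2022)
slope, intercept = fit_trend(training_data)  # the model only sees training data

# Check the model on data it has not seen.
for year, actual in testing_data:
    predicted = slope * year + intercept
    print(year, "predicted:", round(predicted, 2), "actual:", actual)
```

The point of the sketch is that the 2022 row is never used while fitting, so comparing the prediction with the actual value gives an honest check of the model.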


              Data Features

              In the data acquisition stage, it is very important that the data we provide to an AI project is relevant. How do we
              know what data to use for the problem identified during problem scoping?

                 • We need to visualise the factors that affect the problem statement, for which we need to extract the data features.
                 • We need to find out the parameters that will affect the problem statement directly or indirectly.
                 • Data features refer to the type of data that you want to collect. In the above example, the data features would
                 be the average for each subject, the number of students taking the exam, the distribution of theory and
                 practical marks in each subject, etc.
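
One way to picture data features is as the named fields of a single record. The sketch below assumes a hypothetical `SubjectRecord` class whose fields mirror the features listed above; the field names and the sample values are illustrative only.

```python
from dataclasses import dataclass, fields

@dataclass
class SubjectRecord:
    """One row of data: the features for a single subject in a single year."""
    subject: str
    subject_average: float      # average marks in the subject
    students_appeared: int      # number of students taking the exam
    theory_max_marks: int       # theory share of the marks distribution
    practical_max_marks: int    # practical share of the marks distribution

# Hypothetical values, for illustration only.
row = SubjectRecord("Science", 76.5, 120, 80, 20)

# The feature names are simply the names of the fields we decided to collect.
feature_names = [f.name for f in fields(SubjectRecord)]
print(feature_names)
print(row)
```

Deciding on the fields before collecting anything makes it easier to see whether a source actually provides every feature the problem needs.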


              Reliable Sources of Relevant Data
              Data is the base on which any AI project is built. When data is acquired, it's important to check that it comes from a
              reliable and authentic source, as this affects the accuracy of the project.

              Also, the acquisition methods should be authentic so that there's no conflict in achieving project goals.
              There are various sources from which we can collect relevant data for our project:

                 • Surveys: Data can be collected through online, telephonic or in-person surveys. Surveys are a way of
                 collecting data from a group of people to gain information and insights into various topics of interest.
                 The process involves asking people for information through questionnaires, which can be online or
                 offline. The responses collected in this way serve as a data source.

                 • Web Scraping: Data or information can also be extracted from a website. Web scraping, also called data
                 scraping, is the method of downloading information from the World Wide Web (WWW) and storing it on your
                 computer for later reference. The data collected in this way is online data.
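
Web scraping, as described above, can be sketched with Python's standard library alone. This is a minimal sketch under stated assumptions: the sample page is a made-up HTML string used instead of a real download, and the names `download_page`, `CellCollector`, `extract_cells` and `save_cells` are hypothetical helpers written for this example.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

def download_page(url):
    """Download a web page and return its HTML as text (needs internet access)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")

class CellCollector(HTMLParser):
    """Collect the text found inside <td> table cells."""
    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.cells[-1] += data

def extract_cells(html):
    """Return the stripped text of every table cell in the HTML."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return [cell.strip() for cell in parser.cells]

def save_cells(cells, path):
    """Store the extracted data in a file for later reference."""
    with open(path, "w", encoding="utf-8") as output_file:
        output_file.write("\n".join(cells))

# A small hypothetical page, used here instead of a real download.
sample_html = "<table><tr><td>2022</td><td>80.0</td></tr></table>"
cells = extract_cells(sample_html)
print(cells)
```

With a real website, the HTML would come from `download_page(url)` and the result could be kept with `save_cells(cells, "scraped_data.txt")`. Since the source of data should be reliable and authentic, it is worth checking that the website allows its data to be collected in this way.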


                    124     Artificial Intelligence Play (Ver 1.0)-IX