Page 214 - AI Ver 1.0 Class 9

Data Acquisition


        Data acquisition means collecting raw facts, figures or statistics from relevant sources, either for reference or
        for the analysis needed in an AI project. Data acquisition is the second stage of the AI project cycle. It is a
        time-consuming process because different types of data are scattered across many sources and we need to focus
        only on the data relevant to our needs.

        What is Data?


        Data is raw information, that is, facts and statistics collected together for reference or analysis. These raw
        facts need to be processed to get meaningful information. Whenever we want an AI project to be able
        to predict an output, we need to train it with a data set first. Data plays an important part in an AI project as it
        forms the base on which the AI project is built.


        Types of Data

        There are two types of data:
           • Training Data: It is the data on which we train our AI model. It is used to fit the parameters of the
           model. In training data, the expected output is available to the model.
           • Testing Data: It is used to check the performance of an AI model. Testing data is data that the model has not
           seen during training and for which predictions have to be made.

        For example, if we want to prepare an AI model to predict the school average of students in the board examination,
        we will feed it the marks obtained by students in the board examination in previous years. This will be treated as
        training data. Once the model is ready, it will predict the school average for the coming year. When we
        test it, we feed it a different data set, and that is the testing data.
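
        The split described above can be sketched in a few lines of Python. This is only an illustration: the yearly
        averages are invented, the helper names split_train_test and fit_line are hypothetical, and a simple straight-line
        fit is assumed as the model. The earlier years are used as training data and the later years are kept aside as
        testing data that the model never sees while it is being fitted.

```python
# Hypothetical records: (year, school average in the board examination).
# These numbers are invented for illustration only.
records = [
    (2018, 71.2),
    (2019, 72.5),
    (2020, 73.1),
    (2021, 74.0),
    (2022, 74.8),
    (2023, 75.9),
]


def split_train_test(rows, test_count):
    """Keep the last `test_count` rows aside as testing data."""
    if not 0 < test_count < len(rows):
        raise ValueError("test_count must be between 1 and len(rows) - 1")
    return rows[:-test_count], rows[-test_count:]


def fit_line(rows):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Training data: the model sees both the year and the known average.
# Testing data: used only afterwards, to check the predictions.
training_data, testing_data = split_train_test(records, test_count=2)
slope, intercept = fit_line(training_data)

for year, actual in testing_data:
    predicted = slope * year + intercept
    print(year, "predicted:", round(predicted, 2), "actual:", actual)
```

        Comparing the predicted and actual values for the testing years shows how well the model performs on data it
        was not trained on.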


        Data Features

        In the data acquisition stage, it is very important that the data we provide to an AI project is relevant. How do
        we know which data to use for the problem identified during problem scoping?
           • We need to visualize the factors that affect the problem statement, and from them extract the data
           features.
           • We need to find out the parameters that affect the problem statement directly or indirectly.
           • Data features refer to the types of data that we want to collect. In the above example, the data features would
           be the average for each subject, the number of students taking the exam, the distribution of theory and practical
           marks in each subject, etc.
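
        As a rough sketch, the data features listed above could be stored as one record per subject. The class name
        SubjectRecord, its field names, the helper feature_vector and the sample values are all assumptions made for
        illustration; they are not taken from any real data set.

```python
from dataclasses import dataclass


@dataclass
class SubjectRecord:
    """One row of data, holding the features named in the text (hypothetical layout)."""
    subject: str
    subject_average: float      # average marks in this subject
    students_appeared: int      # number of students taking the exam
    theory_max_marks: int       # marks allotted to theory
    practical_max_marks: int    # marks allotted to practical


def feature_vector(record):
    """Turn a record into a list of numbers that a model could be trained on."""
    total = record.theory_max_marks + record.practical_max_marks
    theory_share = record.theory_max_marks / total
    return [record.subject_average, float(record.students_appeared), theory_share]


# Invented sample values, for illustration only.
sample = SubjectRecord("Science", 72.5, 120, 80, 20)
features = feature_vector(sample)
print(features)
```

        Writing the features down in this way makes it clear which data has to be collected before the model can be
        trained.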


        Reliable Sources of Relevant Data
        Data is the base on which any AI project is built. When data is acquired, it is important to check that it comes
        from a reliable and authentic source, because the accuracy of the project depends on it.

        Also, the acquisition methods should be authentic so that there's no conflict in achieving project goals.






                  212   Touchpad Artificial Intelligence-IX