Page 225 - AI Ver 3.0 class 10_Flipbook
P. 225

Train-Test Split

                 It’s a model evaluation technique that reveals how the model performs on new data. This technique is used in
                 machine learning algorithms to evaluate the performance of the model by dividing the dataset into two subsets,
                 the Training subset and the Testing subset. The train-test procedure is appropriate when there is a sufficiently
                 large dataset available.

                 Training subset is used for model training, where it learns patterns from the data. Typically, this subset comprises
                 70% to 80% of the dataset. Testing subset is used to evaluate the model's generalisation ability on unseen data. It
                 typically consists of 20% to 30% of the dataset.




                                                                                  10000 labelled
                                                                                  data for image
                                                   Testing set                  classification model

                                                   Training set
                                                                    7000 labelled data        3000 labelled data
                                                                    used for training          used for testing




                 Need of Train-Test Split

                 The training dataset is used to make the model learn how to recognise patterns and relationships in the data. Once
                 the model is trained, the test dataset is used to evaluate its performance. The inputs from the test set are given
                 to the model, which makes predictions. These predictions are then compared with the actual expected results.
                 The goal is to understand how well the model can perform on new, unseen data that wasn’t part of the training
                 process. It provides an unbiased estimate of performance of the machine learning model in real world scenarios
                 and ensures the model can perform efficiently on the unseen data, rather than on the trained data.



                        Dataset
                                       Training Data
                                                                                       ?





                                     Train The ML
                                      Algorithm
                                                                                                        Successful Model





                                            Model                      Prediction
                                          Input Data



                        Testing
                         Data


                                                      ML Algorithm





                                                                                           Evaluating Models    223
   220   221   222   223   224   225   226   227   228   229   230