Page 225 - AI Ver 3.0 class 10_Flipbook
Train-Test Split
Train-test split is a model evaluation technique that reveals how a model performs on new data. It is used in
machine learning to evaluate a model's performance by dividing the dataset into two subsets,
the training subset and the testing subset. The train-test procedure is appropriate when a sufficiently
large dataset is available.
The training subset is used for model training, where the model learns patterns from the data. Typically, this subset
comprises 70% to 80% of the dataset. The testing subset is used to evaluate the model's generalisation ability on
unseen data. It typically consists of the remaining 20% to 30% of the dataset.
[Figure: 10000 labelled data for an image classification model, divided into a training set (7000 labelled data used for training) and a testing set (3000 labelled data used for testing).]
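The 70/30 division described above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the function name `split_dataset`, its parameters, and the stand-in dataset of (id, label) pairs are assumptions made for this example, not part of any particular library.

```python
import random


def split_dataset(samples, test_fraction=0.3, seed=42):
    """Shuffle a copy of `samples` and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(samples)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    test_size = round(len(shuffled) * test_fraction)
    test = shuffled[:test_size]            # first part becomes the testing subset
    train = shuffled[test_size:]           # the rest becomes the training subset
    return train, test


# Stand-in for 10000 labelled images: (image_id, label) pairs (made-up data).
dataset = [(i, "cat" if i % 2 == 0 else "dog") for i in range(10_000)]

train_set, test_set = split_dataset(dataset, test_fraction=0.3)
print(len(train_set), len(test_set))  # 7000 3000
```

Shuffling before splitting matters: if the data were stored in some order (for example, all "cat" images first), cutting it without shuffling could leave one subset with examples the other never sees.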
Need of Train-Test Split
The training dataset is used to teach the model to recognise patterns and relationships in the data. Once
the model is trained, the test dataset is used to evaluate its performance. The inputs from the test set are given
to the model, which makes predictions. These predictions are then compared with the actual expected results.
The goal is to understand how well the model can perform on new, unseen data that wasn’t part of the training
process. This provides an unbiased estimate of the machine learning model's performance in real-world scenarios
and checks that the model performs well on unseen data, not just on the data it was trained on.
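The evaluation step described above (give test inputs to the trained model, then compare its predictions with the expected results) can be illustrated with a deliberately tiny example. Everything here is an assumption made for illustration: the "model" is a simple cut-off value, the helper names are hypothetical, and the (hours studied, passed) rows are made-up toy data.

```python
def train_threshold_model(train_rows):
    """Learn a cut-off halfway between the average value of each class."""
    zeros = [x for x, label in train_rows if label == 0]
    ones = [x for x, label in train_rows if label == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def predict(threshold, x):
    """Predict class 1 when the input is above the learned cut-off."""
    return 1 if x > threshold else 0


def accuracy(threshold, rows):
    """Fraction of rows where the prediction matches the actual label."""
    correct = sum(1 for x, label in rows if predict(threshold, x) == label)
    return correct / len(rows)


# Toy data: (hours_studied, passed) pairs, invented for this sketch.
train_rows = [(1, 0), (2, 0), (3, 0), (6, 1), (7, 1), (8, 1)]
test_rows = [(2, 0), (5, 0), (6, 1), (9, 1)]  # never shown during training

threshold = train_threshold_model(train_rows)
print(threshold)                        # 4.5
print(accuracy(threshold, train_rows))  # 1.0
print(accuracy(threshold, test_rows))   # 0.75
```

In this toy case the model scores perfectly on the data it was trained on but makes a mistake on the test rows, which is why the test score is the more honest estimate of how it would do on new data.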
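[Figure: Train-test workflow. The dataset is divided into training data and testing data. The training data is used to train the ML algorithm. Input data from the testing data is given to the ML algorithm, and the model prediction is checked to decide whether the model is successful.]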

