Page 129 - Artificial Intellegence_v2.0_Class_12
Experiential Learning
Video Session
Scan the code or visit the following link to watch the video: Training and Testing
https .youtube.com atch v qr p us
After watching the video, answer the following question:
What is the role of data in training a model?
Train-Test Split Procedure in Python
The train_test_split() function in the scikit-learn Python machine learning package implements the train-test split
evaluation procedure. The function accepts a dataset as input and returns the dataset split into two subsets.
You can use any of the following statements:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
OR
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.67)
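To see that the two statements describe the same split, the sketch below runs both on a small made-up dataset and compares the resulting sizes. The toy arrays, the variable names and the fixed random_state are assumptions added for illustration; they are not part of the original example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (illustrative): 1000 rows, 2 feature columns, one label per row
X = np.arange(2000).reshape(1000, 2)
y = np.arange(1000) % 2

# Specify the test fraction; the training set receives the remainder
a_train, a_test, _, _ = train_test_split(X, y, test_size=0.33, random_state=0)

# Specify the training fraction; the test set receives the remainder
b_train, b_test, _, _ = train_test_split(X, y, train_size=0.67, random_state=0)

sizes_from_test_size = (len(a_train), len(a_test))
sizes_from_train_size = (len(b_train), len(b_test))
print(sizes_from_test_size, sizes_from_train_size)
```

Giving only one of test_size or train_size is enough, because scikit-learn sets the other to its complement.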
Example:
# split a dataset into train and test sets
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
# create dataset
X, y = make_blobs(n_samples=1000)
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.50)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
Output:
(500, 2) (500, 2) (500,) (500,)
Out of 1000 samples, 500 (50%) are in the training set and 500 (50%) are in the test set.
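The point of the split is to fit a model on one subset and check it on data it has not seen. A minimal sketch of that follow-on step is given here. The choice of LogisticRegression as the model and the fixed random_state values are assumptions made only for illustration; any scikit-learn classifier could be used in its place.

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic dataset; random_state is fixed only so that repeated runs match
X, y = make_blobs(n_samples=1000, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.50, random_state=1
)

# Fit on the training half only (LogisticRegression is an illustrative choice)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Accuracy on the held-out half estimates performance on unseen data
test_accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.3f}")
```

The test set is never passed to fit(), so the reported accuracy reflects data the model did not learn from.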
Cross-Validation Procedure
Cross-validation is a resampling technique for evaluating machine learning models on a small sample of data. The
process includes only one parameter, k, that specifies the number of groups into which a given data sample should be
divided. As a result, the process is frequently referred to as k-fold cross-validation. For example, k = 10 gives 10-fold
cross-validation. It's a popular strategy since it's straightforward to grasp and produces a less biased or less optimistic
estimate of model competence than other approaches such as a simple train-test split.
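As a rough illustration, scikit-learn provides KFold and cross_val_score for running k-fold cross-validation. The sketch below assumes k = 10, a synthetic dataset from make_blobs and a LogisticRegression model. All three are illustrative choices, not requirements of the technique.

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic dataset; random_state is fixed only so that repeated runs match
X, y = make_blobs(n_samples=1000, random_state=1)

# k = 10 groups; shuffle before splitting so each fold is a random sample
kfold = KFold(n_splits=10, shuffle=True, random_state=1)

# One accuracy score per fold; each fold serves exactly once as the test set
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=kfold)
print(len(scores), scores.mean())
```

The mean of the k scores is commonly reported as the overall estimate of model performance.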
The following is the general procedure:
1. Randomly shuffle the dataset.
2. Organise the data into k groups.
3. For each distinct group:
• Use the group as a holdout or test data set.
• Use the remaining groups as a training data set.
• Fit a model to the training set and evaluate it against the test set.
• Keep the evaluation score but discard the model.
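The general procedure can also be written out by hand, which makes each step visible. This is a minimal sketch under stated assumptions: the function name manual_k_fold, the seed parameter, the LogisticRegression model and the value k = 5 are all hypothetical choices for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

def manual_k_fold(X, y, k, seed=0):
    """Return one test-accuracy score per fold (illustrative helper)."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))      # step 1: randomly shuffle
    folds = np.array_split(indices, k)     # step 2: organise into k groups
    scores = []
    for i, test_idx in enumerate(folds):   # step 3: for each distinct group
        # All other groups together form the training set
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = LogisticRegression(max_iter=1000)
        model.fit(X[train_idx], y[train_idx])
        # Keep only the score; a fresh model is created on the next pass
        scores.append(model.score(X[test_idx], y[test_idx]))
    return scores

X, y = make_blobs(n_samples=1000, random_state=1)
fold_scores = manual_k_fold(X, y, k=5)
print(fold_scores)
```

Each sample lands in the test set exactly once and in the training set k - 1 times.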
Capstone Project

