Page 129 - Artificial Intellegence_v2.0_Class_12

Experiential Learning

                        Video Session


                  Scan the QR code or visit the following link to watch the video: Training and Testing
                  https      .youtube.com  atch v    qr p us
                  After watching the video, answer the following question:
                  What is the role of data in training a model?










            Train-Test Split Procedure in Python
            The train_test_split() function in the scikit-learn Python machine learning package implements the train-test split
            evaluation procedure. The function accepts a dataset as input and returns the dataset split into two subsets.
            You can use any of the following statements:
            X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
            OR
            X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.67)
            Example:
            # split a dataset into train and test sets
            from sklearn.datasets import make_blobs
            from sklearn.model_selection import train_test_split
            # create dataset
            X, y = make_blobs(n_samples=1000)
            # split into train and test sets
            X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.50)
            print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
            Output:
            (500, 2) (500, 2) (500,) (500,)
            Out of 1000 samples, 500 (50%) are in the training set and 500 (50%) are in the test set.
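
To make the idea behind the split concrete, here is a minimal pure-Python sketch of what a train-test split does: shuffle the row indices, then cut them into two parts. The helper name `simple_train_test_split`, its `seed` parameter, and the choice to round the test count up are assumptions made for illustration; this is not scikit-learn's implementation.

```python
import math
import random

def simple_train_test_split(X, y, test_size=0.33, seed=0):
    """Hypothetical helper: shuffle row indices, then cut them into train and test parts."""
    n_samples = len(X)
    n_test = math.ceil(test_size * n_samples)   # assumption: round the test count up
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # shuffle so the split is random
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test

# Toy dataset: 1000 rows with two features each, and one label per row
X = [[i, i * 2] for i in range(1000)]
y = [i % 3 for i in range(1000)]

X_train, X_test, y_train, y_test = simple_train_test_split(X, y, test_size=0.50)
print(len(X_train), len(X_test), len(y_train), len(y_test))  # prints: 500 500 500 500
```

Every row lands in exactly one of the two subsets, and each feature row stays paired with its label because the same shuffled indices are used for both X and y.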

            Cross-Validation Procedure
            Cross-validation is a resampling technique for evaluating machine learning models on a small sample of data. The
            process includes only one parameter, k, that specifies the number of groups into which a given data sample should be
            divided. As a result, the process is frequently referred to as k-fold cross-validation. For example, k = 10 gives 10-fold
            cross-validation. It's a popular strategy since it's straightforward to grasp and produces a less biased or less optimistic
            estimate of model competence than other approaches such as a simple train-test split.
            The following is the general procedure:
            1.  Randomly shuffle the dataset.
            2.  Organise the data into k groups.
            3.  For each distinct group:
                •  Use the group as a holdout or test data set.
                •  Use the remaining groups as a training data set.
                •  Fit a model to the training set and test it against the test set.
                •  Keep the evaluation score but toss out the model.
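
The steps above can be sketched in plain Python as follows. This is only an illustrative sketch: the function names (`make_folds`, `fit_nearest_mean`, `accuracy`, `k_fold_scores`), the toy one-feature dataset, and the nearest-class-mean "model" are all assumptions chosen to keep the example short, not part of any library.

```python
import random

def make_folds(n_samples, k, seed=0):
    """Steps 1 and 2: shuffle the row indices and deal them into k groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_nearest_mean(X, y):
    """Toy model (assumption): remember the mean feature value of each class."""
    means = {}
    for label in set(y):
        values = [x for x, lab in zip(X, y) if lab == label]
        means[label] = sum(values) / len(values)
    return means

def accuracy(model, X, y):
    """Predict the class whose stored mean is closest; return the fraction correct."""
    predictions = [min(model, key=lambda label: abs(x - model[label])) for x in X]
    return sum(p == t for p, t in zip(predictions, y)) / len(y)

def k_fold_scores(X, y, k, seed=0):
    """Step 3: hold out each group once, train on the rest, keep only the score."""
    scores = []
    for test_idx in make_folds(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit_nearest_mean([X[i] for i in train_idx],
                                 [y[i] for i in train_idx])
        scores.append(accuracy(model,
                               [X[i] for i in test_idx],
                               [y[i] for i in test_idx]))
        # the model is not stored; only its evaluation score is kept
    return scores

# Toy data: one numeric feature, two classes of 10 samples each
X = list(range(20))
y = [0] * 10 + [1] * 10

scores = k_fold_scores(X, y, k=5)
print("scores per fold:", scores)
print("mean score:", sum(scores) / len(scores))
```

The mean of the k scores is the usual summary of model competence. In practice, scikit-learn provides a `KFold` class in `sklearn.model_selection` that generates the train and test indices for each fold.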


                                                                                       Capstone Project