Page 229 - AI Ver 3.0 Class 11

Program 60: To separate the IRIS dataset into features (X) and target (y) and print the first 10 values of each

                     # load dataset
                     from sklearn.datasets import load_iris
                     iris = load_iris()

                     # separate the data into features and target
                     X = iris.data
                     y = iris.target

                     # print the first 10 rows of the features and the first 10 target values
                     print("Features (X):")
                     print(X[:10])

                     print("\nTarget (y):")
                     print(y[:10])
                 Output:

                     Features (X):
                     [[5.1 3.5 1.4 0.2]
                      [4.9 3.  1.4 0.2]
                      [4.7 3.2 1.3 0.2]
                      [4.6 3.1 1.5 0.2]
                      [5.  3.6 1.4 0.2]
                      [5.4 3.9 1.7 0.4]
                      [4.6 3.4 1.4 0.3]
                      [5.  3.4 1.5 0.2]
                      [4.4 2.9 1.4 0.2]
                      [4.9 3.1 1.5 0.1]]

                     Target (y):
                     [0 0 0 0 0 0 0 0 0 0]
                 As you know, the first 50 samples in the dataset belong to the Iris-setosa species (class 0), which is why the
                 first 10 target values are all 0.
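
                 The statement above can be checked with a short sketch. It assumes the same load_iris() dataset used in
                 Program 60; the variable name first_50_labels is only illustrative and does not come from the program.

```python
from sklearn.datasets import load_iris

iris = load_iris()
y = iris.target

# target_names maps each numeric label (0, 1, 2) to a species name
print(iris.target_names)

# collect the distinct labels found in the first 50 samples
first_50_labels = set(y[:50].tolist())
print(first_50_labels)
```

                 If the first 50 samples all belong to one species, the set contains the single label 0, which corresponds
                 to the first entry of target_names.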

                 Splitting Data into Training and Testing Sets
                 Splitting data into training and testing sets is a critical step in machine learning because it lets you evaluate the
                 performance of your model on unseen data. Typically, you use a portion of your data to train the model and the rest
                 to test its performance. The train_test_split function, available in sklearn.model_selection, divides a dataset into
                 two parts: one for training the model and another for testing it. The two parts are described below:
                  •    Training data (X_train, y_train): This portion of the dataset is used to train the model. The model learns
                     patterns and relationships from this data.
                  •    Testing data (X_test, y_test): This portion of the dataset is used to evaluate how well the model has learned
                     from the training data. The model is tested on this data to determine how well it can predict outcomes for
                     new, previously unseen data.
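
                 The split described above can be sketched as follows. This is a minimal example, not the only way to do it:
                 the 80/20 ratio (test_size=0.2) and the seed (random_state=42) are assumptions chosen for illustration and
                 do not come from the text.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# load dataset and separate it into features and target
iris = load_iris()
X = iris.data
y = iris.target

# hold out 20% of the samples for testing (assumed ratio);
# random_state fixes the shuffle so the split is repeatable
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# check how many samples ended up in each part
print("Training set:", X_train.shape, y_train.shape)
print("Testing set:", X_test.shape, y_test.shape)
```

                 With 150 samples in the dataset, this split leaves 120 samples for training and 30 for testing. A
                 different test_size changes these proportions.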

