Page 228 - AI Ver 3.0 Class 11

                   [4.6 3.4 1.4 0.3]
                   [5.  3.4 1.5 0.2]
                   [4.4 2.9 1.4 0.2]
                   [4.9 3.1 1.5 0.1]]
              In the above code, we first load the Iris dataset using the load_iris() function and store it in the iris
              variable. Then, the code prints the first 10 rows of the dataset. Here, each row represents an iris flower and each column
              represents a feature (a measurement of the flower). The columns are, in order, sepal length, sepal width, petal length,
              and petal width (in cm). iris.data[:10] uses Python slicing to select the first 10 rows of the feature array. The syntax
              [:10] means "from the start up to, but not including, index 10", which gives the first 10 rows (rows 0 to 9).
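
The slicing behaviour described above can be checked with a short sketch. The variable name first_ten is chosen here only for illustration and does not appear in the original program.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# [:10] keeps rows 0 through 9; the row at index 10 is excluded
first_ten = iris.data[:10]

print(iris.data.shape)   # (150, 4): 150 flowers, 4 measurements each
print(first_ten.shape)   # (10, 4)
print(first_ten[-1])     # last selected row, the same as iris.data[9]
```

The last row of the slice is the row at index 9, which confirms that the stop index 10 is not included.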

              Separating Dataset into Feature and Target Value
              Splitting data into training and testing sets is a critical step in machine learning to evaluate the performance of your
              model on unseen data. Typically, you use a portion of your data to train the model and the rest to test its performance.
              In the context of supervised learning, it is common practice to separate the dataset into features and target values. The
              description of these values is as follows:

               •    Feature values: Features are the variables or attributes that describe the characteristics of the data samples.
                  For example, in the Iris dataset, the features are the measurements of sepal length, sepal width, petal length,
                  and petal width. These features are used as inputs to the machine learning model to make predictions or
                  classifications.
               •    Target values: The target values, also known as labels or classes, are the values we want the model to predict or
                  classify. For example, in the Iris dataset, the target values represent the species of each iris flower: Setosa, Versicolor,
                  or Virginica. The model learns to associate patterns in the features with the corresponding target values during the
                  training process.
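
As a small illustration of the two terms, the object returned by load_iris() also stores the names of the features and of the target classes. The sketch below only prints them; it is meant as a quick way to inspect the dataset, not as part of the numbered programs.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# Names of the four feature columns, in column order
print(iris.feature_names)

# Species names; the integer labels in iris.target index into this array
print(iris.target_names)
```

The first line of output lists the four measurements (with their units), and the second lists the three species: setosa, versicolor, and virginica.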

                Program 59: To separate the data into features and target values for the Iris dataset

                   # load dataset
                   from sklearn.datasets import load_iris
                   iris = load_iris()

                   # separate the data into features and target
                   X = iris.data
                   y = iris.target
              Let us understand the above code:
              X = iris.data
               •    This statement assigns the feature data of the Iris dataset to the variable X.
               •    iris.data contains the measurements for the features (sepal length, sepal width, petal length, and petal width) of the
                  Iris flowers.
               •    X will be a NumPy array where each row corresponds to a sample (flower) and each column corresponds to a feature.
              y = iris.target
               •    This statement assigns the target labels of the Iris dataset to the variable y, i.e. y will hold the labels that the model
                  will learn to predict.
               •    iris.target contains the species labels of the Iris flowers, encoded as integers (0 for Setosa, 1 for Versicolor,
                  2 for Virginica).
               •    y will be a NumPy array where each element is the species label for the corresponding row in X.
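
The relationship between the two arrays can be sketched as follows. The index variable i is introduced here only for illustration.

```python
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data     # feature matrix: one row per flower
y = iris.target   # label vector: one integer per flower

print(X.shape)    # (150, 4)
print(y.shape)    # (150,)

# Row i of X and element i of y describe the same flower
i = 0
print(X[i], "->", iris.target_names[y[i]])
```

Both arrays have 150 entries along their first dimension, so every row of measurements in X has exactly one matching label in y.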
              Let us print the first 10 values of X and y.


                    226     Touchpad Artificial Intelligence (Ver. 3.0)-XI