Page 291 - AI Ver 3.0 class 10_Flipbook
Step 16 Double-click on the Feature Statistics widget. You can now see that the data is clean, with no
missing values.
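If you are curious what this check looks like in code, here is a minimal Python sketch. It is only an illustration and not what the widget runs internally. The column names and values are made up for this example, and None is assumed to mark a missing value.

```python
# Illustrative rows only: the column names and values are invented for
# this sketch and are not taken from the actual dataset.
rows = [
    {"bill_length_mm": 39.1, "body_mass_g": 3750, "species": "Adelie"},
    {"bill_length_mm": 46.5, "body_mass_g": 5200, "species": "Gentoo"},
    {"bill_length_mm": 49.0, "body_mass_g": 3800, "species": "Chinstrap"},
]


def count_missing(table):
    """Return a dict mapping each column name to its count of missing values."""
    counts = {}
    for row in table:
        for column, value in row.items():
            # Assumption: a missing value is stored as None.
            counts[column] = counts.get(column, 0) + (1 if value is None else 0)
    return counts


missing = count_missing(rows)
print(missing)
```

A count of zero for every column corresponds to what you see in the widget after imputation: no column has missing values left.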
When you’re working with a dataset like TrainData, you’ll notice that most of the columns contain numbers. These
columns are called Numeric Features, and they represent the input data that your model will use to learn.
In a supervised learning model, you have two key parts:
1. Features: The input data (like age, weight, or height).
2. Labels: The output or target that you want to predict (like whether someone will like a product, or which
species an animal belongs to).
In the Palmer Penguin dataset, you want to predict the species of the penguins based on other features like size,
weight, or bill length. Since species is the output (what you’re trying to predict), it should be set as the label.
To do this, you need to tell the model that species is a label, not just a regular feature. By using the Select Columns
widget, you can change the Feature Type of the species column from Categorical Feature to Categorical Label. This
means the model will treat species as the output that it should predict, instead of as an input feature.
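The idea of treating species as the label can also be sketched in Python. This is a hedged illustration of the concept, not the widget's own code. The rows, the column names, and the helper function split_features_and_label are all hypothetical.

```python
# Illustrative rows only: the column names and values are invented for
# this sketch and are not taken from the actual dataset.
rows = [
    {"bill_length_mm": 39.1, "body_mass_g": 3750, "species": "Adelie"},
    {"bill_length_mm": 46.5, "body_mass_g": 5200, "species": "Gentoo"},
    {"bill_length_mm": 49.0, "body_mass_g": 3800, "species": "Chinstrap"},
]

LABEL = "species"  # the column the model should predict


def split_features_and_label(table, label):
    """Separate each row into input features and the label value."""
    # Every column except the label stays as an input feature.
    features = [
        {column: value for column, value in row.items() if column != label}
        for row in table
    ]
    # The label column becomes the list of outputs to predict.
    labels = [row[label] for row in table]
    return features, labels


X, y = split_features_and_label(rows, LABEL)
print(X[0])  # input features of the first penguin, without species
print(y)     # the species values the model should learn to predict
```

After the split, species no longer appears among the inputs. It is used only as the answer the model is trying to predict.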
Step 17 Drag and drop the Select Columns widget onto the canvas and connect the output of the Impute widget
to the input of the Select Columns widget.
Statistical Data (Practical) 289

