Page 197 - AI Ver 3.0 class 10_Flipbook
Note: Continuous data includes values that can be measured and can take any value within a range (e.g., height,
temperature). It is analysed using regression and probability distributions.
Discrete data consists of countable values with no in-between (e.g., number of students, dice rolls). It is analysed
using frequency counts and probability tables.
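The difference between the two kinds of data can be sketched in a few lines of Python. This is only an illustration: the dice rolls and heights below are made-up sample values, and fitting a normal distribution is just one assumed way of using a probability distribution for continuous data.

```python
from collections import Counter
from statistics import NormalDist

# Hypothetical sample data, made up for illustration only.
dice_rolls = [1, 3, 3, 6, 2, 3, 6, 1]        # discrete: countable values
heights_cm = [150.2, 148.7, 152.4, 149.9]    # continuous: any value in a range

# Discrete data: count how often each outcome occurs (frequency count).
roll_counts = Counter(dice_rolls)

# Turn the counts into a simple probability table.
roll_probabilities = {
    face: count / len(dice_rolls) for face, count in roll_counts.items()
}

# Continuous data: describe it with a probability distribution.
# Here we assume a normal distribution estimated from the samples.
height_dist = NormalDist.from_samples(heights_cm)

print("Frequency count:", dict(roll_counts))
print("Probability table:", roll_probabilities)
print("Estimated mean height:", height_dist.mean)
print("Chance of a height below 150 cm:", height_dist.cdf(150))
```

The discrete values are counted one by one, while the continuous values are summarised by a curve that can give a probability for any height in the range.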
Task (21st Century Skills: #Critical Thinking)
Identify the Model: Classification or Regression?
1. Estimating the price of a house
2. Determining if an email is spam or not
3. Predicting a student's test score out of 100
4. Identifying the species of a flower
5. Predicting whether a customer is eligible for a bank loan or not
6. Predicting the weather for the next 24 hours
Sub-Categories of Unsupervised Learning
Unsupervised Learning can be further divided into Clustering and Association. Let us discuss these in detail.
Clustering
Clustering is a machine learning approach in which the machine partitions the dataset into different clusters or
categories based on rules that it works out by itself. The data fed to such a model is usually unlabelled or
random, so the developer feeds the data directly into the machine and instructs it to build its own algorithm.
The machine then forms a pattern or cluster based on the training data and groups together the data points that
follow the same pattern. For example, a model can segregate people with long hair and short hair and form two
clusters based on this, as shown in the graph.
[Graph: two clusters, labelled "short hair people" and "long hair people"]
The best clustering is the one that minimises the error, that is, the one in which the data points within each
cluster are as similar to one another as possible. Clustering works with discrete groups: each data point is
placed in one of the clusters. For example, suppose you have random data about insects and reptiles and you are
unable to find any meaningful pattern in it yourself. You would feed this data into a clustering algorithm. The
algorithm would then analyse the data and divide it into clusters according to the similarities and trends it
notices. The clusters are then given as the output.
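The long hair and short hair example can be sketched in Python as follows. This is a minimal illustration, not the method used by any particular tool: it assumes a basic k-means style loop with two clusters, the hair lengths are made-up values, and the names cluster_two_groups and clustering_error are hypothetical helpers written for this sketch. The "error" here is assumed to be the sum of squared distances between each value and the centre of its cluster.

```python
def cluster_two_groups(values, iterations=10):
    """Split 1-D values into two clusters with a basic k-means style loop."""
    # Start with the smallest and largest values as the two cluster centres.
    centres = [min(values), max(values)]
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        # Assignment step: each value joins the cluster with the nearest centre.
        for v in values:
            nearest = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            clusters[nearest].append(v)
        # Update step: move each centre to the mean of its cluster.
        # An empty cluster keeps its old centre.
        centres = [
            sum(c) / len(c) if c else centres[i]
            for i, c in enumerate(clusters)
        ]
    return clusters, centres


def clustering_error(clusters, centres):
    """Sum of squared distances of every value from its cluster centre."""
    return sum(
        (v - centres[i]) ** 2
        for i, cluster in enumerate(clusters)
        for v in cluster
    )


# Hypothetical, unlabelled hair lengths in centimetres (made-up data).
hair_lengths_cm = [4, 6, 5, 40, 45, 7, 38, 50]

clusters, centres = cluster_two_groups(hair_lengths_cm)
error = clustering_error(clusters, centres)

print("Short hair cluster:", clusters[0])
print("Long hair cluster:", clusters[1])
print("Cluster centres:", centres)
print("Error:", error)
```

Notice that the data carries no labels such as "short" or "long". The groups come only from how close the values are to one another, and a lower error means the values in each cluster sit closer to their centre.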

