Page 317 - Artificial Intellegence_v2.0_Class_11
B. Fill in the blanks.
1. Classification is of two types: ________ and ________.
2. In the K-means algorithm, K has to be chosen ________.
3. ________ is the percentage of correct predictions out of all the samples.
4. A clustering algorithm uses the ________ to cluster data.
5. ________ clustering builds a tree of clusters.
C. State whether the following statements are True or False.
1. Outliers do not affect the clustering process.
2. K-Means algorithm tries to minimize the sum of distances between the points and their
respective cluster centroid.
3. An email spam filter uses an unsupervised learning algorithm.
4. In logistic regression, the resultant variable y can be either 1 or -1.
5. Clustering is used to identify patterns and structures in unlabelled data sets.
D. Match the following.
1. Clusters                                                | a. Labelling a dataset into different classes
2. Classification                                          | b. Gaussian distribution
3. Distribution-based Clustering                           | c. The actual value was negative and the classification model also predicted negative
4. The predicted value doesn’t tally with the actual value | d. Groups of similar items
5. True Negative                                           | e. False Positive
SECTION B (Subjective Type Questions)
A. Short answer type questions.
1. Why can’t linear regression be used in place of logistic regression for binary classification?
Ans. Linear regression predicts a continuous value. For example, linear regression is used to predict the sales
of a product. The predicted value is a real number that can range from negative infinity to positive infinity,
and the regression line is a straight line. Such an output can fall below 0 or above 1, so it cannot be read as
the probability of belonging to a class.
Logistic regression, in contrast, is used for classification problems. It predicts a probability value between
0 and 1. For example, logistic regression is used in a recommender system to predict whether or not a customer
will purchase a product. Its regression curve is a sigmoid (S-shaped) curve.
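The difference between the two outputs can be seen in a minimal Python sketch. The weight w = 0.5 and bias b = 0 are hypothetical values chosen only for illustration; they are not fitted to any data.

```python
import math

def linear(x, w=0.5, b=0.0):
    """Linear regression output: can be any real number."""
    return w * x + b

def sigmoid(z):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic(x, w=0.5, b=0.0):
    """Logistic regression output: the sigmoid of the linear output."""
    return sigmoid(linear(x, w, b))

if __name__ == "__main__":
    # w and b are illustrative assumptions, not learned parameters.
    for x in (-10, 0, 10):
        print(x, linear(x), round(logistic(x), 4))
```

For x = -10 the linear output is -5.0, which cannot be a probability, while the logistic output stays between 0 and 1.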
2. List the steps of K-means clustering algorithm.
Ans. Step 1: Decide the number of clusters (k)
Step 2: Select k random points from the data as centroids
Step 3: Assign each point to its nearest centroid
Step 4: Calculate the centroid of each newly formed cluster
Step 5: Repeat steps 3 and 4 until the centroids no longer change
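The steps above can be sketched in plain Python as follows. This is a minimal illustration, not a production implementation: the function name kmeans, the parameters max_iters and seed, and the use of squared Euclidean distance are assumptions made for this example.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iters=100, seed=0):
    """Cluster a list of tuples into k groups; k is chosen by the caller (Step 1)."""
    rng = random.Random(seed)
    # Step 2: select k random points from the data as the initial centroids.
    centroids = rng.sample(points, k)
    clusters = []
    for _ in range(max_iters):
        # Step 3: assign each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sq_dist(p, c) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Step 4: recompute each centroid as the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean = tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # empty cluster: keep the old centroid
        # Step 5: repeat until the centroids stop changing.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

if __name__ == "__main__":
    data = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
    centroids, clusters = kmeans(data, k=2)
    print(centroids)
    print(clusters)
```

Because the initial centroids are picked at random, different seeds can give different final clusters, so the algorithm is often run several times and the best result kept.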
3. Differentiate between classification and clustering.
Ans.
Classification                                 | Clustering
It is a supervised machine learning algorithm. | It is an unsupervised machine learning algorithm.
Classifies new data into known classes.        | Groups data into clusters based on similar patterns.
Uses labelled samples from a set of classes.   | The samples used are unlabelled.

