Page 317 - Artificial Intellegence_v2.0_Class_11

B.   Fill in the blanks.
                     1.   Classification is of two types: __________ and __________.
                     2.   In the K-means algorithm, K has to be chosen __________.
                     3.   __________ is the percentage of correct predictions out of all the samples.
                     4.   A clustering algorithm uses the __________ to cluster data.
                     5.   __________ clustering builds a tree of clusters.
                 C.   State whether the following statements are True or False.
                     1.   Outliers do not affect the clustering process.
                     2.   The K-means algorithm tries to minimize the sum of distances between the points and their
                          respective cluster centroids.
                     3.   An email spam filter uses an unsupervised learning algorithm.
                     4.   In logistic regression, the resultant variable y can either be 1 or -1.
                     5.   Clustering is used to identify patterns and structures in unlabelled data sets.

                 D.   Match the following.
                     1.   Clusters                                  a.   Labelling a dataset into different classes
                     2.   Classification                            b.   Gaussian distribution
                     3.   Distribution-based Clustering             c.   The actual value was negative and the classification
                                                                         model also predicted negative
                     4.   The predicted value doesn’t tally with    d.   Groups of similar items
                          the actual value
                     5.   True Negative                             e.   False Positive


                                                  SECTION B (Subjective Type Questions)
                 A.   Short answer type questions.
                       1.  Why can’t linear regression be used in place of logistic regression for binary classification?

                     Ans.  The output predicted by linear regression is a continuous value. For example, linear regression is used to predict
                          the sales of a product. The predicted value is a real number that can range from negative infinity to positive
                          infinity, so it cannot be read as a class label or as a probability. The regression line is a straight line.
                          Logistic regression, however, is used for classification problems. It predicts a probability value between
                          0 and 1. For example, logistic regression is used in a recommender system to predict whether a customer will
                          purchase a product or not. The regression curve is a sigmoid curve.
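
The contrast in this answer can be sketched in a few lines of Python. This is only an illustration: the function names and the slope and intercept values are assumptions chosen for the example, not values from the chapter. It shows that a straight-line output is unbounded, while passing the same value through the sigmoid function keeps the result between 0 and 1.

```python
import math

def linear_output(x, slope=0.5, intercept=0.0):
    """Straight-line prediction: the result can be any real number."""
    return slope * x + intercept

def sigmoid(z):
    """Map any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_output(x, slope=0.5, intercept=0.0):
    """Pass the straight-line value through the sigmoid to get a probability."""
    return sigmoid(linear_output(x, slope, intercept))

# Illustrative inputs (assumed values, not data from the book)
for x in (-10, 0, 10):
    print(x, linear_output(x), round(logistic_output(x), 4))
```

For x = -10 the linear output is negative, which has no meaning as a probability, whereas the logistic output stays inside (0, 1) and can be compared with a threshold such as 0.5 to decide the class.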
                       2.  List the steps of K-means clustering algorithm.
                     Ans.  Step 1:   Decide the number of clusters (k).
                          Step 2:   Select k random points from the data as the initial centroids.
                          Step 3:   Assign each point to its nearest centroid.
                          Step 4:   Recalculate the centroid of each newly formed cluster.
                          Step 5:   Repeat steps 3 and 4 until the centroids stop changing or a maximum number of iterations is reached.
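
The five steps can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not a reference implementation: the function names, the `max_iterations` and `seed` parameters, the sample points, and the choice to keep the old centroid when a cluster becomes empty are all assumptions made for illustration.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def k_means(points, k, max_iterations=100, seed=0):
    rng = random.Random(seed)
    # Step 2: select k distinct data points as the initial centroids
    centroids = rng.sample(points, k)
    clusters = []
    for _ in range(max_iterations):
        # Step 3: assign each point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 4: recalculate centroids (assumption: an empty cluster keeps its old centroid)
        new_centroids = [
            mean_point(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        # Step 5: repeat until the centroids stop changing
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Step 1: decide the number of clusters (k = 2 for this made-up data)
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = k_means(data, k=2)
print(centroids)
print(clusters)
```

With these six points, the three points near the origin end up in one cluster and the three points near (8, 9) in the other. The order of the two clusters depends on which points are picked as the initial centroids.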
                       3.  Differentiate between classification and clustering.
                     Ans.  Classification                                    Clustering
                           It is a supervised machine learning algorithm.    It is an unsupervised machine learning algorithm.
                           Classifies new data into known classes.           Groups data into clusters based on similar patterns.
                           Uses labelled samples from a set of classes.      The samples used are unlabelled.
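
The classification column can be illustrated with a small sketch. The chapter does not name a particular classifier here, so a 1-nearest-neighbour rule is used purely as an assumed example. The function name, the sample points, and the labels "small" and "large" are all hypothetical.

```python
def nearest_neighbour_classify(labelled_samples, new_point):
    """Return the label of the labelled sample closest to new_point."""
    def squared_distance(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Each sample is a (point, label) pair; pick the pair with the closest point
    _, label = min(labelled_samples, key=lambda s: squared_distance(s[0], new_point))
    return label

# Labelled samples: every point comes with a known class (made-up data)
labelled = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((8.0, 8.0), "large"),
    ((9.0, 9.5), "large"),
]

print(nearest_neighbour_classify(labelled, (2.0, 1.5)))  # prints "small"
print(nearest_neighbour_classify(labelled, (8.5, 9.0)))  # prints "large"
```

A clustering algorithm, by contrast, would receive only the points, with no labels attached, and would have to discover the two groups on its own.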




