
              [Figure: K-Means Clustering. Panels, in order: (1) Randomly Select K-Clusters (K=2);
              (2) Each Object Assigned To Similar Centroid; (3) Cluster Centres Updated Depending On
              Renewed Cluster Mean; then an Iterative Process: Re-Assign Data Points, Update Cluster
              Centres, Re-Assign Data Points. Both axes of every panel run from 0 to 100.]

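              The steps named in the figure (randomly select K centres, assign each object to the
              nearest centre, update each centre to the mean of its cluster, repeat) can be sketched
              in plain Python. This is a minimal illustration, not a reference implementation: the
              function names (`k_means`, `squared_distance`, `mean_point`), the sample points, the
              `seed` parameter and the stopping rule (stop when the centres no longer move) are all
              assumptions made for this example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k=2, max_iters=100, seed=0):
    """Illustrative K-Means sketch; returns (centres, clusters)."""
    rng = random.Random(seed)
    # Step 1: randomly select K of the data points as the initial centres.
    centres = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iters):
        # Step 2: assign each point to its nearest (most similar) centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centres[i]))
            clusters[nearest].append(p)
        # Step 3: move each centre to the mean of its cluster.
        # An empty cluster keeps its old centre (an assumption of this sketch).
        new_centres = [
            mean_point(c) if c else centres[i] for i, c in enumerate(clusters)
        ]
        # Stop once re-assignment no longer changes the centres.
        if new_centres == centres:
            break
        centres = new_centres
    return centres, clusters


# Hypothetical sample data on the same 0-100 scale as the figure.
points = [(10, 20), (15, 25), (20, 15), (80, 85), (85, 90), (90, 80)]
centres, clusters = k_means(points, k=2)
print(sorted(centres))  # [(15.0, 20.0), (85.0, 85.0)]
```

              For this small, well-separated data set the two groups are recovered whichever two
              starting points are picked; on less tidy data, different starting centres can lead to
              different final clusters.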
              Advantages of K-Means Clustering

              Some of the advantages of K-Means Clustering are:
              •  Easy to implement.
              •  Can handle large data sets.
              •  Initial positions of the centroids can be supplied, for example by choosing them randomly.
              •  Easily adapts to new data.
              •  Can be generalised (using modified versions of the algorithm) to clusters of different shapes and sizes, like elliptical clusters.
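
              One way to read "easily adapts to new data" is that a new point can be placed into an
              existing cluster simply by finding its nearest centre, without re-running the whole
              algorithm. The sketch below assumes two already-computed centres; the helper name
              `nearest_centre` and all the values are hypothetical.

```python
def nearest_centre(point, centres):
    """Return the index of the centre closest to the point (squared Euclidean distance)."""
    return min(
        range(len(centres)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centres[i])),
    )


# Assumed centres from an earlier clustering run (illustrative values).
centres = [(15.0, 20.0), (85.0, 85.0)]

new_point = (70, 90)
print(nearest_centre(new_point, centres))  # 1, i.e. the second cluster
```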

              Disadvantages of K-Means Clustering
              Some of the disadvantages of K-Means Clustering are:

              •  The value of K has to be chosen manually, and choosing a good value is not easy.
              •  The result depends on the initial centroid positions; different starting values can give different clusters.
              •  Outliers greatly affect the clustering process.
              •  The algorithm has trouble grouping data where clusters are of varying sizes and densities.
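
              The effect of outliers follows from the fact that each centre is a mean, and a mean is
              pulled strongly by extreme values. The small sketch below uses made-up points and a
              hypothetical helper `centroid` to show a single outlier dragging the centre away from
              the rest of its cluster.

```python
def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


# Three points that lie close together (illustrative values).
cluster = [(10, 10), (12, 14), (14, 12)]
print(centroid(cluster))  # (12.0, 12.0)

# Adding one far-away point moves the centre well outside the original group.
with_outlier = cluster + [(100, 100)]
print(centroid(with_outlier))  # (34.0, 34.0)
```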


                       Why is Clustering Unsupervised?


              Clustering is an unsupervised machine learning technique that automatically divides the data into clusters, or groups,
              of similar elements. The algorithm does this without any advance knowledge of what the groups should look like. So,
              clustering is used for the discovery of knowledge rather than for prediction. It gives an idea of the natural
              groupings that exist within the data.




                    350     Touchpad Artificial Intelligence (Ver. 3.0)-XI