Page 350 - AI Ver 3.0 Class 11

Medical Imaging Analysis

              Clustering is used to match patterns in medical images and identify those that
              indicate cancer. A mix of cancerous and non-cancerous datasets is analysed by the
              clustering algorithm to learn the different characteristics present in the data,
              producing resultant clusters.


              How Does Clustering Work?

              To cluster data, the following steps are carried out:
              1.  Data preparation: Data preparation means selecting effective features for the clustering algorithm. The
                  input dataset must include descriptive features, along with any new features derived from the original
                  set.

              2.  Creating a similarity metric: The algorithm needs to know how similar each pair of samples is. You quantify
                  the similarity between samples by creating a similarity metric. This requires a clear understanding of your
                  data and of how to derive similarity from the data features. For example, consider the pin codes of an Indian
                  state. If the difference between two pin codes is small, this suggests that the two regions denoted by the
                  pin codes are close to each other and therefore have a higher similarity. When you define the metric by hand
                  in this way, it is called a ‘manual similarity measure’.

              3.  Running the clustering algorithm: A clustering algorithm uses the similarity metric developed in step 2 to
                  cluster the data. Some clustering algorithms process large datasets efficiently. However, many need to compute
                  the similarity between all pairs of samples, which becomes costly as the dataset grows.

              4.  Result interpretation: As clustering is unsupervised, there are no labels to check the output against, so
                  interpreting the results is crucial and is usually done by a human expert. The results are verified against
                  expectations and, if improvement is required, the above steps are repeated.
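
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pin codes are made-up sample values, the `similarity` formula and its `scale` parameter are one possible manual similarity measure, and the greedy grouping rule in `cluster` is a deliberately simple stand-in for a real clustering algorithm.

```python
from itertools import combinations

# Step 1: data preparation. Hypothetical pin codes, chosen only for illustration.
pin_codes = [110001, 110003, 110010, 400001, 400005]


def similarity(a, b, scale=100):
    """Manual similarity measure (an assumption, not a standard formula):
    1.0 for identical pin codes, falling toward 0 as the difference grows."""
    return 1.0 / (1.0 + abs(a - b) / scale)


# Step 2: quantify the similarity of every pair of samples.
pair_scores = {(a, b): similarity(a, b) for a, b in combinations(pin_codes, 2)}


# Step 3: a very simple grouping rule. Each sample joins the first cluster
# that already holds a sufficiently similar member, else it starts a new one.
def cluster(samples, threshold=0.5):
    clusters = []
    for sample in samples:
        for members in clusters:
            if any(similarity(sample, m) >= threshold for m in members):
                members.append(sample)
                break
        else:
            clusters.append([sample])
    return clusters


clusters = cluster(pin_codes)

# Step 4: a person inspects the groups and decides whether they make sense.
for number, members in enumerate(clusters, start=1):
    print("Cluster", number, members)
```

With these sample values, the pin codes starting with 110 fall into one cluster and those starting with 400 into another. Note that step 2 computes a score for all 10 pairs of the 5 samples, which shows why pairwise comparison grows quickly with dataset size.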

              Types of Clustering

              There are several types of clustering algorithms. Let us learn about some of them.

              Centroid-based Clustering
              Centroid-based clustering arranges the data into non-hierarchical clusters, each
              represented by a central point called a centroid. K-means clustering is the most
              popular centroid-based clustering algorithm. Centroid-based algorithms are efficient
              but sensitive to the initial conditions and to outliers. This type of clustering is
              also called Partitioning Clustering.
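
As a rough illustration of the centroid-based idea, here is a minimal K-means sketch in plain Python. The function name `k_means`, the sample `points`, and the `iterations` and `seed` parameters are assumptions made for this example. In practice a library implementation would normally be used.

```python
import random


def k_means(points, k, iterations=20, seed=0):
    """Minimal K-means sketch: returns (centroids, clusters)."""
    rng = random.Random(seed)
    # Initial condition: pick k of the points as starting centroids.
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for members, old in zip(clusters, centroids):
            if members:
                new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # centroids stopped moving
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D data with two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
for members in clusters:
    print(members)
```

The starting centroids are chosen at random, which is the "initial conditions" sensitivity mentioned above. On this small, well-separated dataset every starting choice leads to the same two groups, but on real data different seeds can give different clusters.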





              Density-based Clustering

              Density-based clustering connects areas of high density into clusters. This allows
              clusters of arbitrary shape, as long as the dense areas can be connected. Data points
              in the low-density regions that separate the clusters are considered outliers and
              are not assigned to any cluster.
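
The density-based idea can be sketched as follows. The text above does not name a specific algorithm, so this example assumes a simplified DBSCAN-style procedure. The names `density_cluster` and `region_query`, the parameters `eps` and `min_points`, and the sample data are all illustrative.

```python
NOISE = -1  # label for outliers that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [
        j for j, q in enumerate(points)
        if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps ** 2
    ]


def density_cluster(points, eps, min_points):
    """Simplified DBSCAN-style sketch: returns one label per point."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_points:
            labels[i] = NOISE  # low-density region
            continue
        cluster_id += 1  # points[i] is in a dense area: start a new cluster
        labels[i] = cluster_id
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # edge point reached from a dense area
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_points:
                queue.extend(j_neighbours)  # keep connecting dense areas
    return labels


# Hypothetical data: two dense squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (5, 5), (5, 6), (6, 5), (6, 6),
          (10, 0)]
labels = density_cluster(points, eps=1.5, min_points=3)
print(labels)
```

For this data, the first four points receive label 0, the next four receive label 1, and the isolated point (10, 0) is labelled -1, meaning it is treated as an outlier and not assigned to any cluster.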





                    348     Touchpad Artificial Intelligence (Ver. 3.0)-XI