
The following steps are taken to clean/prepare the data:
                 •  Missing Data: Missing data refers to the absence of certain values in the dataset, which can result from various
                   causes. To handle missing data, strategies include removing rows or columns with missing values, imputing missing
                   values with estimates, or utilising algorithms that can manage missing data.
                 •  Outliers (extreme values): Outliers are data points that deviate significantly from most of the dataset, typically
                   due to errors or uncommon occurrences. Managing outliers includes detecting and excluding them, transforming
                   the data, or applying robust statistical techniques to minimise their influence.
                 •  Inconsistent Data: Inconsistent data, such as typographical errors or variations in data types, is rectified to ensure
                   uniformity and coherence across the dataset.
                 •  Duplicate Data: Duplicate data is identified and eliminated to maintain data integrity and accuracy.
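
                 The first two strategies above (imputing missing values and detecting outliers) can be illustrated with a
                 minimal Python sketch. The numbers are hypothetical and are not taken from this chapter. Median imputation
                 and the 1.5 x IQR rule are assumed here as one common choice each, not as the only valid approach.

```python
from statistics import median, quantiles

# Hypothetical readings (not from the chapter); None marks a missing value.
values = [10, 12, None, 11, 13, 95, 12, None, 14]

# Missing data: impute each gap with the median of the observed values.
observed = [v for v in values if v is not None]
fill = median(observed)
imputed = [fill if v is None else v for v in values]

# Outliers: flag points lying more than 1.5 * IQR beyond the quartiles.
q1, _, q3 = quantiles(imputed, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in imputed if v < low or v > high]
cleaned = [v for v in imputed if low <= v <= high]

print("fill value:", fill)
print("outliers:", outliers)
print("cleaned:", cleaned)
```

                 The median is used for imputation because, unlike the mean, it is not pulled towards an extreme value such
                 as 95.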

                 Cleaning Data in Excel

                 Cleaning data in Excel means carefully reviewing the dataset to correct errors and improve accuracy. This process
                 includes removing duplicate entries, ensuring consistent formatting, correcting mistakes, handling or filling in missing
                 values, and transforming the data into a more useful structure if required.
                 We shall be performing cleaning measures on the following data:

                 ID   Name       Region   Rating     Product         Quantity   Price Per Unit
                 1    AAYUSH     North    Good       Pasta sauce     10         ₹20.00
                 2    Jai        East     Excelent   Oil             15         ₹50.00
                 3    Kritika    West     Poor       Capsicum        0          na
                 4    Ananya     South    Average    Salt            25         ₹30.00
                 5    Chetna     East     Good       Bitter Gourd    30         ₹16.67
                 6    AnaNYa              Excelent   Coffee          0          na
                 7    Vivek      West     Poor       Origano         35         ₹10.00
                 8    Tarini     South    Average    Tea             40         ₹15.00
                 9    Anushka    East     Good       Notebooks       45         ₹12.22
                 10   Ananya     North    Excelent   Butter Paper    50         ₹14.00
                 11   Saanvi     West     Poor       Toothpaste      5          ₹160.00
                 12   Gehna      South    Average    Pepper          20         ₹45.00
                 13   Hardik     East     Good       Pickles         0          na
                 14   Harsimar            Excelent   Honey           30         ₹36.67
                 4    Ananya     South    Average    Salt            25         ₹30.00
                 5    Chetna     East     Good       Bitter Gourd    30         ₹16.67
                 6    AnaNYa              Excelent   Coffee          0          na
                 15   Mehak      West     Poor       Roohafza        35         ₹34.59
                 16   Pratty              Average    Tomato Sauce    0          na
                 17   Rajat      East     Good       Chocolates      40         ₹35.00
                 18   ANITA      North    Excelent   Plastic Bowls   45         ₹33.33
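
                 The problems visible in this data (repeated rows, inconsistent capitalisation in names, the rating spelt
                 "Excelent", blank regions and "na" prices) can also be handled outside Excel. The following is a minimal
                 Python sketch, not the procedure this chapter follows. It uses only a subset of the rows above, omits the
                 currency mark before each price, and assumes that "Excellent" is the intended spelling. The names RAW,
                 RATING_FIXES and clean are hypothetical.

```python
import csv
import io

# A subset of the rows from the table, copied with their errors intact.
# The currency mark is left out so that prices can be parsed as numbers.
RAW = """ID,Name,Region,Rating,Product,Quantity,Price Per Unit
1,AAYUSH,North,Good,Pasta sauce,10,20.00
3,Kritika,West,Poor,Capsicum,0,na
4,Ananya,South,Average,Salt,25,30.00
6,AnaNYa,,Excelent,Coffee,0,na
4,Ananya,South,Average,Salt,25,30.00
6,AnaNYa,,Excelent,Coffee,0,na
"""

# Assumed correction for the misspelt rating seen in the data.
RATING_FIXES = {"Excelent": "Excellent"}


def clean(rows):
    """Return the rows with consistent values and exact duplicates removed."""
    seen = set()
    result = []
    for row in rows:
        row = {key: value.strip() for key, value in row.items()}
        row["Name"] = row["Name"].title()        # AAYUSH -> Aayush
        row["Rating"] = RATING_FIXES.get(row["Rating"], row["Rating"])
        row["Region"] = row["Region"] or None    # blank region -> missing
        price = row["Price Per Unit"]
        row["Price Per Unit"] = None if price.lower() == "na" else float(price)
        row["Quantity"] = int(row["Quantity"])
        fingerprint = tuple(row.values())        # whole row identifies a duplicate
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        result.append(row)
    return result


cleaned = clean(csv.DictReader(io.StringIO(RAW)))
for row in cleaned:
    print(row)
```

                 Missing regions and prices are kept as None rather than filled in, since choosing a fill value depends on
                 how the data will be used. In Excel, comparable steps are usually carried out with the Remove Duplicates
                 command and the PROPER function.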


