Page 272 - Ai_417_V3.0_C9_Flipbook

Usability, Features, and Preprocessing of Data

              Data is a collection of information gathered through various means such as observations, measurements,
              research, surveys or analysis. This information can include a wide range of elements like facts, numbers, names,
              figures, or descriptions of things. To make data easier to understand and analyse, it is often organised into
              formats such as graphs, charts, or tables.


              Usability of Data
              Consider completing a school project. You need clear instructions, a neat workspace, and accurate
              information. Similarly, using data effectively relies on its clarity, organisation, and accuracy. There are three
              primary factors determining the usability of data:
              1.   Structure of Data: Structure defines how data is stored. Data should be organised in a clear, consistent
                  way that makes sense, so that it can be used effectively.
                  It is like your mother making sure, before she starts cooking your favourite food, that all the
                  ingredients are available and laid out in order, so that the cooking goes smoothly.

                  For example, the marks of students can be arranged in a spreadsheet.








                  Spreadsheet (good structure): Data is stored in a sheet, with the details of each individual
                  stored according to a set of rules.

                  Text document (poor structure): Data is stored in a text document with no set of organising
                  rules, for example: "Rohit Rawat a student with ID 10187 of Class 12 Section D has scored 72%."

              2.   Cleanliness: Clean data should be free of duplicates, missing values, outliers, and other anomalies, so
                  that its reliability and usefulness for analysis are not affected. In the given example, cleaning the data
                  removes the duplicate values.


















              3.   Accuracy: Accuracy indicates how well the data matches real-world values, and it is closely linked to
                  how reliable the data is. Accurate data closely reflects actual values without errors, which improves
                  the quality and trustworthiness of the dataset. When your measurements are accurate, your data can be
                  trusted. It is like earning a gold star on your homework: it shows you did a great job!
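
The three factors above can be tried out in a few lines of Python. The following is only a minimal sketch, not a standard method: the function names `parse_record`, `clean`, and `accuracy` are hypothetical, the extra students are made up for illustration, and the `reference` marks stand in for the "real-world" values that a teacher might read from the answer sheets. It uses the free-text sentence from the text-document example as its starting point.

```python
import re

# Free-text record from the poorly structured text-document example.
SENTENCE = "Rohit Rawat a student with ID 10187 of Class 12 Section D has scored 72%."


def parse_record(sentence):
    """Structure: turn one free-text sentence into a row with named fields."""
    pattern = (
        r"(?P<name>.+?) a student with ID (?P<id>\d+) of Class (?P<cls>\d+) "
        r"Section (?P<section>\w+) has scored (?P<marks>\d+)%"
    )
    match = re.search(pattern, sentence)
    if match is None:
        return None  # the sentence does not follow the expected wording
    return {
        "name": match["name"],
        "id": int(match["id"]),
        "class": int(match["cls"]),
        "section": match["section"],
        "marks": int(match["marks"]),
    }


def clean(rows):
    """Cleanliness: drop rows with missing marks and rows that repeat a student ID."""
    seen_ids = set()
    cleaned_rows = []
    for row in rows:
        if row["marks"] is None or row["id"] in seen_ids:
            continue
        seen_ids.add(row["id"])
        cleaned_rows.append(row)
    return cleaned_rows


def accuracy(rows, reference):
    """Accuracy: fraction of rows whose marks match the reference (real-world) marks."""
    if not rows:
        return 0.0
    correct = sum(1 for row in rows if reference.get(row["id"]) == row["marks"])
    return correct / len(rows)


rohit = parse_record(SENTENCE)

# Made-up rows for illustration: a duplicate of Rohit's row, a student with
# missing marks, and a student whose marks were typed in wrongly.
rows = [
    rohit,
    dict(rohit),
    {"name": "Asha Verma", "id": 10188, "class": 12, "section": "D", "marks": None},
    {"name": "Kabir Singh", "id": 10189, "class": 12, "section": "D", "marks": 85},
]
cleaned = clean(rows)

# Hypothetical real-world marks, assumed to come from the answer sheets.
reference = {10187: 72, 10189: 58}

print(rohit["name"], rohit["marks"])   # Rohit Rawat 72
print(len(rows), "->", len(cleaned))   # 4 -> 2
print(accuracy(cleaned, reference))    # 0.5
```

Once the sentence is stored as a row with named fields, checking for duplicates or comparing marks with the real values takes one short loop. With the original free text, every such check would first need the sentence to be read and split by hand.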



                    270     Touchpad Artificial Intelligence (Ver. 3.0)-IX