
Method                     Description                                    Example

Satellite Data Tracking    Collecting data on the Earth’s surface and     Monitoring deforestation and land use changes
                           atmosphere through satellite.                  using satellite imagery.

Online Data Platforms      Collecting data from websites that provide     Kaggle, GitHub, KDNuggets, Google Dataset
                           pre-compiled datasets for diverse purposes.    Search etc.

Books, Textbooks, and      Information that has been researched,          “Ancient Civilisations: A Comprehensive Guide”
Encyclopedias              compiled, and written by authors or editors.   written by an expert historian.




                               Brainy Fact


                       Here are some fun facts about data:
                          • It would take about 181 million years to download all of the information on the Internet.
                          • Every two days, we generate as much data as we did from the beginning of time until 2003.
                          • There are almost as many bits of digital information as there are stars in the universe.
                          • Less than 0.5 percent of the data we generate is ever used or evaluated.
                          • According to PragmaticWorks, poor-quality data costs global firms between 20 and 35 percent of
                            their operating revenue.
                          • If you burned all the data created in one day onto DVDs, you could stack them on top of each
                            other and reach the Moon twice.





                         Exploring Data


                 Exploration, one of the first steps in data preparation, is a way to get to know data before working with it.
                 Exploring data involves familiarising oneself with the data and understanding its values: whether they are
                 typical, unusual, widely spread, or extreme. This process not only helps in understanding the dataset better
                 but also provides an opportunity to detect and rectify data issues, such as outliers, that might affect the
                 analysis results. Outliers are data points that differ significantly from the majority of the data in a dataset.
                 They can indicate variability in measurement, experimental errors, or novel phenomena. By addressing such
                 issues during the exploration phase, one makes the conclusions drawn from the analysis more reliable and
                 accurate. Data exploration uses statistical methods and visualisation tools to:
                    • Evaluate the size and quality of the data.
                    • Detect outliers or anomalies.
                    • Identify possible links between data components, files, and tables.
                    • Look for similarities and patterns.
                    • Determine the relationships between different variables.
                 Data exploration can be applied in a variety of areas, including banking, healthcare, retail, and marketing.
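
A few of the exploration tasks listed above can be sketched in Python using only the standard library. This is a minimal illustration, not a prescribed method: the sample figures are invented, the function names (summarise, iqr_outliers, pearson) are hypothetical, and the 1.5 × IQR rule is just one common convention for flagging outliers.

```python
import statistics

# Hypothetical sample data, invented for illustration: daily sales amounts
# and the number of items sold. None marks a missing value.
amounts = [120.0, 135.5, 128.0, None, 131.0, 990.0, 125.5, 129.0]
items = [3, 4, 3, 2, 4, 30, 3, 4]


def summarise(values):
    """Size and quality check: count the values and how many are missing."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
    }


def iqr_outliers(values, k=1.5):
    """Return values more than k * IQR below the first or above the third quartile."""
    present = sorted(v for v in values if v is not None)
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in present if v < low or v > high]


def pearson(xs, ys):
    """Pearson correlation, using only pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    mean_x = statistics.mean([x for x, _ in pairs])
    mean_y = statistics.mean([y for _, y in pairs])
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / (var_x * var_y) ** 0.5


print(summarise(amounts))          # size, missing values, mean and median
print(iqr_outliers(amounts))       # values flagged as outliers
print(round(pearson(amounts, items), 3))  # strength of the linear relationship
```

In this made-up sample, the mean of the amounts sits far above the median, which is itself a hint that an extreme value is present. Whether a flagged value is an error or a genuine observation is a judgement that still has to be made by looking at where the data came from.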





                                                                   Data Literacy—Data Collection to Data Analysis  279