Page 233 - Touhpad Ai

Examples of Data Standardisation in Real Life
                In everyday systems, data often comes in different formats or styles, which can cause confusion or errors. Some
                examples of data standardisation are as follows:
                •  Student data: Ensuring roll numbers follow a fixed format (e.g., STU2025_001), names have proper capitalisation,
                    and contact numbers all have 10 digits.
                •  Product data in online stores: Ensuring all product sizes are written consistently as “Small”, “Medium”, “Large”
                    instead of using abbreviations like ‘S’, ‘M’, ‘L’ or other variations.
                •  Financial data: Ensuring currency values follow the same format (e.g., 100.00 instead of 100) and that dates are
                    written in a consistent style.
                •  Healthcare records: Standardising patient information such as blood type (e.g., A+, B−), medical codes, and date
                    formats to prevent errors in treatment or reporting.
                •  E-commerce addresses: Ensuring addresses follow a uniform format (e.g., street name, city, postcode) to improve
                    delivery accuracy.
                •  Survey responses: Standardising answers so that options like “Yes”, “YES”, “Y” all become “Yes”, and numerical
                    ratings follow a consistent scale (e.g., 1–5).
                •  Sensor data in IoT systems: Converting measurements to the same units (e.g., temperature in °C, distance in metres)
                    so data from different devices can be accurately compared.
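                A few of the examples above can be sketched in plain Python. This is only a minimal illustration: the function
                names, the lookup tables, and the list of accepted spellings are assumptions made for this sketch, and real
                data would usually need more rules than are shown here.

```python
import re

# Hypothetical lookup tables: the accepted variants are assumptions for illustration.
YES_NO = {"yes": "Yes", "y": "Yes", "no": "No", "n": "No"}
SIZES = {"s": "Small", "small": "Small",
         "m": "Medium", "medium": "Medium",
         "l": "Large", "large": "Large"}


def standardise_choice(value, mapping):
    """Map a free-text answer to its standard label; return None if it is unknown."""
    return mapping.get(value.strip().lower())


def standardise_name(name):
    """Remove extra spaces and capitalise each part of a name."""
    return " ".join(part.capitalize() for part in name.split())


def standardise_phone(value):
    """Keep only the digits and accept the number only if exactly 10 remain."""
    digits = re.sub(r"\D", "", value)
    return digits if len(digits) == 10 else None


def format_amount(value):
    """Write a currency value with exactly two decimal places, e.g. 100 -> '100.00'."""
    return f"{value:.2f}"


def fahrenheit_to_celsius(temp_f):
    """Convert a temperature reading from °F to °C so all sensors use the same unit."""
    return round((temp_f - 32) * 5 / 9, 2)


print(standardise_choice("YES", YES_NO))
print(standardise_choice("m", SIZES))
print(standardise_name("  aNITA   sharma"))
print(standardise_phone("98765-43210"))
print(format_amount(100))
print(fahrenheit_to_celsius(212))
```

                Returning None for unknown answers or invalid phone numbers is a design choice in this sketch: it marks values
                that need a human check instead of silently guessing a standard form.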

                Data Cleaning Vs Data Transformation Vs Data Standardisation
                Although all three techniques may seem similar, they differ in purpose and method. Let us understand the
                difference between the three.


                 Aspect          Data Cleaning               Data Transformation                Data Standardisation

                 What it means   Removing or correcting      Changing data into a suitable      Converting data into a uniform
                                 errors, duplicates, and     format or structure for analysis.  scale or format for better
                                 inconsistencies in data.                                       comparison and use.

                 Key purpose     To make data accurate       To reshape or reformat data for    To ensure consistency in how
                                 and reliable.               easier processing.                 data is presented or used.

                 Examples        Removing duplicate rows,    Changing date formats, splitting   Ensuring all ratings are out of
                                 correcting typos, filling   columns, and aggregating values.   5, converting text to lowercase,
                                 missing values.                                                using standard units.

                 Tools used      Pandas functions like       Functions like astype(),           Functions like str.lower(),
                                 dropna(), fillna(),         str.split(), merge(), etc.         converting units, scaling values
                                 drop_duplicates()                                              with formulas or functions.
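
                The difference between the three techniques is easier to see when they are applied one after another to the
                same small dataset. The sketch below uses the pandas functions named in the table; the survey data, column
                names, and answer spellings are invented for illustration, and the order of the steps is one reasonable
                choice, not a fixed rule.

```python
import pandas as pd

# Hypothetical survey data: the column names and values are invented for illustration.
raw = pd.DataFrame({
    "name": ["Asha", "Asha", "Ravi", "Meena"],
    "rating_out_of_10": ["8", "8", None, "6"],
    "answer": ["YES", "YES", "No", "y"],
})

# 1. Data cleaning: drop the duplicate row, then drop the row with a missing rating.
cleaned = (
    raw.drop_duplicates()
       .dropna(subset=["rating_out_of_10"])
       .reset_index(drop=True)
)

# 2. Data transformation: ratings were stored as text, so convert them to integers.
transformed = cleaned.assign(
    rating_out_of_10=cleaned["rating_out_of_10"].astype(int)
)

# 3. Data standardisation: rescale ratings to be out of 5 and use one spelling per answer.
answer_labels = {"yes": "Yes", "y": "Yes", "no": "No", "n": "No"}  # assumed variants
standardised = transformed.assign(
    rating_out_of_5=transformed["rating_out_of_10"] / 2,
    answer=transformed["answer"].str.lower().map(answer_labels),
)

print(standardised)
```

                Here the missing rating is removed with dropna(); filling it with fillna() would also count as data cleaning,
                and which one is better depends on the dataset.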




















                                                                      Theoretical and Practical Aspects of Data Processing  231