Page 233 - Touhpad Ai
Examples of Data Standardisation in Real Life
In everyday systems, data often comes in different formats or styles, which can cause confusion or errors. Some
examples of data standardisation are as follows:
• Student data: Ensuring roll numbers follow a fixed format (e.g., STU2025_001), names have proper capitalisation,
and contact numbers all have 10 digits.
• Product data in online stores: Ensuring all product sizes are written consistently as “Small”, “Medium”, “Large”
instead of using abbreviations like ‘S’, ‘M’, ‘L’ or other variations.
• Financial data: Ensuring currency values follow the same format (e.g., 100.00 instead of 100) and that dates are
written in a consistent style.
• Healthcare records: Standardising patient information such as blood type (e.g., A+, B−), medical codes, and date
formats to prevent errors in treatment or reporting.
• E-commerce addresses: Ensuring addresses follow a uniform format (e.g., street name, city, postcode) to improve
delivery accuracy.
• Survey responses: Standardising answers so that options like “Yes”, “YES”, “Y” all become “Yes”, and numerical
ratings follow a consistent scale (e.g., 1–5).
• Sensor data in IoT systems: Converting measurements to the same units (e.g., temperature in °C, distance in metres)
so data from different devices can be accurately compared.
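A few of the examples above can be sketched in plain Python. This is only an illustrative sketch: the function names, the size and answer mappings, and the sample values are assumptions made for this example and do not come from any particular system.

```python
# Hypothetical helper functions; names and mappings are assumptions for illustration.

SIZE_MAP = {"s": "Small", "m": "Medium", "l": "Large",
            "small": "Small", "medium": "Medium", "large": "Large"}


def standardise_name(name: str) -> str:
    """Remove extra spaces and capitalise each word of a name."""
    return " ".join(part.capitalize() for part in name.split())


def standardise_size(size: str) -> str:
    """Map abbreviations and mixed-case sizes to Small/Medium/Large."""
    cleaned = size.strip()
    # Unknown values are returned unchanged (only trimmed).
    return SIZE_MAP.get(cleaned.lower(), cleaned)


def standardise_answer(answer: str) -> str:
    """Turn variations such as 'YES' or 'y' into 'Yes' (and 'No' likewise)."""
    key = answer.strip().lower()
    if key in {"yes", "y"}:
        return "Yes"
    if key in {"no", "n"}:
        return "No"
    return answer.strip()


def standardise_amount(value: float) -> str:
    """Write a currency value with exactly two decimal places."""
    return f"{float(value):.2f}"


def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature reading from °F to °C."""
    return (temp_f - 32) * 5 / 9


print(standardise_name("  rAVI   kumar "))   # Ravi Kumar
print(standardise_size("m"))                 # Medium
print(standardise_answer("YES"))             # Yes
print(standardise_amount(100))               # 100.00
print(fahrenheit_to_celsius(212))            # 100.0
```

Each function handles one rule only, so the same rule can be applied to every record in a dataset and the output always follows one format.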
Data Cleaning Vs Data Transformation Vs Data Standardisation
Although all three techniques may seem similar, they differ in purpose and method. Let us understand the
difference between the three.
| Aspect | Data Cleaning | Data Transformation | Data Standardisation |
|---|---|---|---|
| What it means | Removing or correcting errors, duplicates, and inconsistencies in data. | Changing data into a suitable format or structure for analysis. | Converting data into a uniform scale or format for better comparison and use. |
| Key purpose | To make data accurate and reliable. | To reshape or reformat data for easier processing. | To ensure consistency in how data is presented or used. |
| Examples | Removing duplicate rows, correcting typos, filling missing values. | Changing date formats, splitting columns, and aggregating values. | Ensuring all ratings are out of 5, converting text to lowercase, standard units. |
| Tools used | Pandas functions like dropna(), fillna(), drop_duplicates() | Functions like astype(), split(), merge(), etc. | Functions like str.lower(), converting units, scaling values with formulas or functions. |
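The difference between the three techniques can be seen by applying them one after another to the same small table. The sketch below uses Pandas; the sample data, the column names, and the assumption that ratings were collected out of 10 are all made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: one duplicate row, one missing rating,
# inconsistent city spelling, and ratings assumed to be out of 10.
df = pd.DataFrame({
    "name": ["Asha", "Asha", "Ravi", "Meena"],
    "joined": ["2025-01-05", "2025-01-05", "2025-02-10", "2025-03-15"],
    "rating": [8.0, 8.0, None, 6.0],
    "city": ["DELHI", "DELHI", "Mumbai", "pune"],
})

# 1. Data cleaning: remove the duplicate row and fill the missing rating
#    with the mean of the remaining ratings.
cleaned = df.drop_duplicates().copy()
cleaned["rating"] = cleaned["rating"].fillna(cleaned["rating"].mean())

# 2. Data transformation: change the text dates into a date type and
#    derive a new column holding the month of joining.
transformed = cleaned.copy()
transformed["joined"] = pd.to_datetime(transformed["joined"])
transformed["join_month"] = transformed["joined"].dt.month

# 3. Data standardisation: write every city in lowercase and rescale
#    ratings from "out of 10" to "out of 5".
standardised = transformed.copy()
standardised["city"] = standardised["city"].str.lower()
standardised["rating"] = standardised["rating"] / 2

print(standardised)
```

Cleaning changes which values are trusted, transformation changes the structure of the table, and standardisation changes only how the values are expressed.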

