Page 228 - Touchpad AI
Data Transformation and Standardisation
Data transformation and standardisation are important steps in preparing data before performing analysis or
building models.
Transformation means changing how the data appears or is organised, for example by adjusting values, formats, or
structures to make them consistent and meaningful. Standardisation, on the other hand, means converting data into
a common scale or format, for example by rescaling values so that different columns can be compared directly.
These processes help improve the quality of data, ensure consistency, and make data analysis or model building more
accurate and effective.
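To make the idea of a common scale concrete, here is a minimal sketch of one widely used method, z-score standardisation. The text above does not name a specific method, so this choice is an assumption, and so are the column names and values, which are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: two columns on very different scales
df = pd.DataFrame({'Height_cm': [150, 160, 170, 180],
                   'Income': [20000, 40000, 60000, 80000]})

# Z-score standardisation (an assumed method, one of several options):
# subtract each column's mean and divide by its standard deviation.
# ddof=0 selects the population standard deviation.
standardised = (df - df.mean()) / df.std(ddof=0)
print(standardised)
```

After this step each column has a mean of 0 and a standard deviation of 1, so height and income can be compared on the same scale.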
Data Transformation
Data transformation refers to converting data from one format, structure, or type to another. It helps make the data
easier to interpret, improves its quality, and allows integration of data from different sources.
Examples:
• Changing date formats (like from 12/07/2025 to 07/12/2025)
• Converting units (like miles to kilometers)
• Turning categorical text into numerical values (like “Yes” to 1, “No” to 0)
• Filling missing values using averages or other imputation methods
• Aggregating data (for example, calculating yearly sales from monthly sales)
Let us learn to do the above in Python.
Program 26: To change date format
import pandas as pd
# Sample data
df = pd.DataFrame({'Date': ['07/12/2024', '08/12/2024']})
# Parse the day/month/year strings into datetime values (pandas displays them as YYYY-MM-DD)
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')
print(df)
Output:
        Date
0 2024-12-07
1 2024-12-08
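Once the column holds datetime values, it can also be written back out as text in any other layout. The sketch below assumes a month/day/year layout is wanted and uses a hypothetical column name, Date_US, to hold the result.

```python
import pandas as pd

# Same sample data as Program 26
df = pd.DataFrame({'Date': ['07/12/2024', '08/12/2024']})

# Parse the day/month/year strings into datetime values
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')

# Write the dates back out as text in month/day/year order
# (Date_US is a hypothetical column name chosen for this sketch)
df['Date_US'] = df['Date'].dt.strftime('%m/%d/%Y')
print(df['Date_US'].tolist())  # ['12/07/2024', '12/08/2024']
```

Note that the new column contains plain text, not datetime values, so it is best created only for display or export.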
Program 27: To convert units from miles to kilometers
import pandas as pd
# 1 mile = 1.60934 kilometers
df = pd.DataFrame({'Miles': [15, 20, 25]})
df['Kilometers'] = df['Miles'] * 1.60934
print(df)
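The other transformations listed earlier (encoding categorical text, filling missing values, and aggregating) can be sketched in the same way. This is only an illustrative sketch: all column names and values are made up, and the mean is used for imputation as one possible choice.

```python
import pandas as pd

# Hypothetical sample data (None marks a missing value)
df = pd.DataFrame({
    'Subscribed': ['Yes', 'No', 'Yes', 'No'],
    'Score': [80.0, None, 90.0, 70.0],
})

# Categorical text to numbers: "Yes" becomes 1, "No" becomes 0
df['Subscribed'] = df['Subscribed'].map({'Yes': 1, 'No': 0})

# Mean imputation: replace the missing score with the column average
df['Score'] = df['Score'].fillna(df['Score'].mean())
print(df)

# Aggregation: add up the monthly figures for each year
sales = pd.DataFrame({
    'Year': [2023, 2023, 2024, 2024],
    'Monthly_Sales': [100, 150, 200, 250],
})
yearly_sales = sales.groupby('Year')['Monthly_Sales'].sum()
print(yearly_sales)
```

Here the missing score is filled with 80.0, the average of the three known scores, and the yearly totals come out as 250 for 2023 and 450 for 2024.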
226 Touchpad Artificial Intelligence - XI

