Page 217 - Touhpad Ai
Program 25: To replace wrong values in a DataFrame with correct values
import pandas as pd
# Sample dataset with incorrect gender entries
data = {
    'ID': [1, 2, 3],
    'Name': ['Arjun', 'Priya', 'Nalini'],
    'Gender': ['Male', 'Feemale', 'Feemale']  # "Feemale" is a typo
}
# Create DataFrame
df = pd.DataFrame(data)
# Show the original DataFrame
print("Original DataFrame:")
print(df)
# Correct spelling mistakes in the 'Gender' column using replace()
df['Gender'] = df['Gender'].replace('Feemale', 'Female')
# Show the updated DataFrame
print("\nUpdated DataFrame:")
print(df)
Output:
Original DataFrame:
   ID    Name   Gender
0   1   Arjun     Male
1   2   Priya  Feemale
2   3  Nalini  Feemale

Updated DataFrame:
   ID    Name  Gender
0   1   Arjun    Male
1   2   Priya  Female
2   3  Nalini  Female
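Program 25 corrects a single wrong value. When a column contains several different wrong values, replace() also accepts a dictionary that maps each wrong value to its correct value. The following is a minimal sketch of that idea; the extra row ('Ravi'), the wrong entries 'F' and 'Mle', and the corrections dictionary are assumptions made up for illustration and are not part of Program 25.

```python
import pandas as pd

# Hypothetical sample data with several different wrong entries in one column
df = pd.DataFrame({
    'ID': [1, 2, 3, 4],
    'Name': ['Arjun', 'Priya', 'Nalini', 'Ravi'],
    'Gender': ['Male', 'Feemale', 'F', 'Mle'],
})

# Map each wrong value to its correct value (assumed corrections)
corrections = {'Feemale': 'Female', 'F': 'Female', 'Mle': 'Male'}

# replace() with a dictionary fixes every listed value in one call;
# values that are not keys in the dictionary (such as 'Male') stay unchanged
df['Gender'] = df['Gender'].replace(corrections)

print(df)
```

By default replace() matches whole cell values only, so the key 'F' changes a cell that contains exactly 'F' and does not touch the letter F inside other words.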
VIDEO SESSION
21st Century Skills #Experiential Learning
Scan the QR code or visit the following link to watch the video:
Real World Data Cleaning in Python Pandas (Step By Step)
https://www.youtube.com/watch?v=iaZQF8SLHJs&t=1s
After watching the video, answer the following questions:
• What were the most common problems found in the raw dataset used in the video?
• How did cleaning the data help improve the quality or usefulness of the dataset?
Theoretical and Practical Aspects of Data Processing 215

