Page 214 - Touchpad AI
print("Original Data (with missing values):")
print(df)
# Method 1: Remove rows with any missing values
df_dropped = df.dropna()
print("\nCleaned Data (after dropping missing values):")
print(df_dropped)
# Method 2: Fill missing values with a replacement value
df_filled = df.fillna({
    'Name': 'Unknown',
    'Age': df['Age'].mean()  # fill with the average age
})
print("\nCleaned Data (after filling missing values):")
print(df_filled)
Output:
Original Data (with missing values):
    Name   Age
0   Aman  17.0
1   Riya   NaN
2  Karan  16.0
3    Sia  18.0
4   None  17.0
Cleaned Data (after dropping missing values):
    Name   Age
0   Aman  17.0
2  Karan  16.0
3    Sia  18.0
Cleaned Data (after filling missing values):
      Name   Age
0     Aman  17.0
1     Riya  17.0
2    Karan  16.0
3      Sia  18.0
4  Unknown  17.0
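The listing above uses a DataFrame named df whose definition is not shown on this page. As a minimal sketch, assuming the data matches the printed output, df can be rebuilt as follows. The variable missing_counts is an illustrative addition that counts the missing values in each column before a cleaning method is chosen.

```python
import numpy as np
import pandas as pd

# Assumption: this DataFrame is rebuilt from the printed output above;
# the original definition is not shown on this page.
df = pd.DataFrame({
    'Name': ['Aman', 'Riya', 'Karan', 'Sia', None],
    'Age': [17, np.nan, 16, 18, 17]
})

# Count the missing values in each column before cleaning
missing_counts = df.isna().sum()
print(missing_counts)

# Method 1: remove rows that contain any missing value
df_dropped = df.dropna()

# Method 2: fill missing values with a replacement value per column
df_filled = df.fillna({
    'Name': 'Unknown',
    'Age': df['Age'].mean()  # average of the non-missing ages
})

print(df_dropped)
print(df_filled)
```

Dropping rows loses the records for Riya and the unnamed student, while filling keeps all five rows but replaces the gaps with estimated values.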
Fixing Data Formats
Sometimes dates or numbers are stored in the wrong format, most often as text. In Python, we can fix this using Pandas.
Convert a column to datetime:
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
This command converts the values in the Date column to a proper datetime format. Because of errors='coerce', any
value that cannot be converted becomes NaT (Not a Time) instead of causing an error.
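A minimal sketch of this conversion on a small made-up table. The name df_dates and the sample values are assumptions for illustration and do not come from the text.

```python
import pandas as pd

# Hypothetical sample data: dates stored as text, one entry is not a date
df_dates = pd.DataFrame({
    'Name': ['Aman', 'Riya', 'Karan'],
    'Date': ['2024-01-15', 'not a date', '2024-03-10']
})

# Convert the text column to datetime; invalid entries become NaT
df_dates['Date'] = pd.to_datetime(df_dates['Date'], errors='coerce')

print(df_dates)
print(df_dates['Date'].isna().sum())  # number of values that could not be converted
```

Counting the NaT values after the conversion is a quick way to see how many entries need to be corrected by hand.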
Convert a column to numeric:
df['Marks'] = pd.to_numeric(df['Marks'], errors='coerce')
This converts text values to numbers. With errors='coerce', a value that cannot be converted becomes NaN (Not a Number).
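A minimal sketch of the numeric conversion. The name df_marks and the sample values are assumptions for illustration and do not come from the text.

```python
import pandas as pd

# Hypothetical sample data: marks stored as text, one entry is not a number
df_marks = pd.DataFrame({
    'Name': ['Aman', 'Riya', 'Karan'],
    'Marks': ['85', 'absent', '72.5']
})

# Convert the text column to numbers; invalid entries become NaN
df_marks['Marks'] = pd.to_numeric(df_marks['Marks'], errors='coerce')

print(df_marks)

# Calculations now work, and NaN values are skipped by default
average_marks = df_marks['Marks'].mean()
print(average_marks)
```

Once the column is numeric, functions such as mean() can be used on it, which is not possible while the values are stored as text.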
212 Touchpad Artificial Intelligence - XI

