• Dropping: Dropping removes the rows or columns that contain missing values. This method is useful when
missing values are sparse and removing them does not significantly affect the analysis.
Pandas provides the dropna() function for this purpose. For example, df.dropna() drops every row of a
DataFrame df that contains at least one missing value.
• Interpolation: Interpolation estimates missing values from the existing data. Pandas provides the
interpolate() method for this purpose. For example, df.interpolate() fills the gaps in a numeric
DataFrame df using linear interpolation, which is the default method. This approach is particularly useful
for time series or other ordered data, where a missing value can be inferred from its neighbouring values.
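The two methods above can be sketched on small, made-up data. The temperature readings and the column names a and b below are hypothetical and chosen only for illustration. The sketch shows linear interpolation on an ordered Series, and dropna() with axis=1, which drops columns instead of rows.

```python
import numpy as np
import pandas as pd

# Hypothetical daily temperature readings with two gaps
readings = pd.Series(
    [20.0, np.nan, 24.0, np.nan, 30.0],
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
    name="temperature",
)

# Linear interpolation (the default method): each gap is filled with a
# value lying on the straight line between its two neighbours
filled = readings.interpolate()
print(filled)

# Hypothetical DataFrame in which only column 'b' has a missing value
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [4.0, np.nan, 6.0],
})

# axis=1 drops the columns that contain NaN, rather than the rows
cols_dropped = df.dropna(axis=1)
print(cols_dropped)
```

Here the gaps are filled with 22.0 and 27.0, the midpoints of their neighbours, and only column a survives the column-wise drop. Dropping columns is usually worth considering only when a column is mostly empty; otherwise dropping rows or filling values tends to lose less information.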
Program 54: To find, drop and fill the missing values in a DataFrame
import pandas as pd
import numpy as np
# Define a dictionary containing data with some missing values
data = {
    'Name': ['Adit', 'Ekam', 'Sakshi', 'Anu'],
    'Age': [27, np.nan, 25, 30],
    'Address': ['Delhi', 'Kanpur', np.nan, 'Indore'],
    'Qualification': ['M.Sc.', 'MA', 'MCA', 'Ph.D.']
}
# Convert the dictionary into a DataFrame
df = pd.DataFrame(data)
# Display the original DataFrame
print("Original DataFrame:")
print(df)
# Counting the missing values in each column
print("\nMissing values in each column:")
print(df.isnull().sum())
# Finding the total number of NaN values
print("\nTotal number of NaN values:")
print(df.isnull().sum().sum())
# Dropping every row that contains a NaN value
df_dropped = df.dropna()
print("\nDataFrame after dropping rows with NaN values:")
print(df_dropped)
# Filling NaN values with the mean of the 'Age' column and with 'Chennai' in the
# 'Address' column. The mean is rounded to the nearest integer.
df_filled = df.fillna({'Age': round(df['Age'].mean()), 'Address': 'Chennai'})
print("\nDataFrame after filling NaN values:")
print(df_filled)
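One detail of the fill step may be worth checking: a column that has held NaN is stored as floating-point numbers, so the filled ages are displayed as 27.0 rather than 27. The following is a minimal sketch, not part of Program 54, that reuses the same Age values in a standalone Series (the variable names are illustrative) and converts the filled result to integers.

```python
import numpy as np
import pandas as pd

# Same 'Age' values as in Program 54
ages = pd.Series([27, np.nan, 25, 30], name="Age")

# The mean of 27, 25 and 30 is about 27.33, which rounds to 27
mean_age = round(ages.mean())
filled = ages.fillna(mean_age)

# The dtype is still float, because the column originally held NaN
print(filled.dtype)

# Converting to int is safe once no NaN values remain
ages_int = filled.astype(int)
print(ages_int.tolist())
```

The conversion with astype(int) would fail if any NaN were still present, so it should only be applied after the fill.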

