
                  •    Dropping: Dropping removes the rows or columns that contain missing values. This method is useful when
                     the missing values are sparse and removing them does not significantly affect the analysis.
                     Pandas provides the dropna() function for this purpose. For example, df.dropna() drops every row of a
                     DataFrame df that contains at least one missing value.
                  •    Interpolation: Interpolation estimates missing values from the existing data. Pandas provides the
                     interpolate() method for this purpose. For example, df.interpolate() performs linear interpolation, the
                     default method, on a DataFrame df. This method is particularly useful for time series or ordered data,
                     where missing values can be inferred from neighbouring values.
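The two approaches above can be sketched on a small DataFrame. This is only an illustration: the column names ('Day', 'Temp', 'Note') and the values are hypothetical and are not taken from the text. The sketch shows dropping by row, dropping by a chosen column with the subset parameter, dropping by column with axis=1, and linear interpolation of a numeric column.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (an assumption for illustration only):
# readings with gaps in 'Temp' and an optional text column 'Note'.
df = pd.DataFrame({
    'Day': [1, 2, 3, 4, 5],
    'Temp': [20.0, np.nan, 24.0, np.nan, 30.0],
    'Note': [np.nan, 'cloudy', np.nan, np.nan, 'sunny'],
})

# Drop every row that has a missing value in any column.
rows_any = df.dropna()

# Drop a row only when 'Temp' is missing; gaps in 'Note' are ignored.
rows_temp = df.dropna(subset=['Temp'])

# Drop every column (axis=1) that contains a missing value.
cols_any = df.dropna(axis=1)

# Linear interpolation (the default method) on the numeric column:
# each single gap is filled with the value midway between its neighbours.
temp_filled = df['Temp'].interpolate()

print(rows_any)
print(rows_temp)
print(cols_any)
print(temp_filled.tolist())
```

Restricting dropna() with subset usually keeps more data than a plain df.dropna(), because a row is removed only when the column that matters for the analysis is empty.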

                  Program 54: To find, drop, and fill missing values in a DataFrame

                     import pandas as pd
                     import numpy as np
                     # Define a dictionary containing data with some missing values
                     data = {
                         'Name': ['Adit', 'Ekam', 'Sakshi', 'Anu'],
                         'Age': [27, np.nan, 25, 30],
                         'Address': ['Delhi', 'Kanpur', np.nan, 'Indore'],
                         'Qualification': ['M.Sc.', 'MA', 'MCA', 'Ph.D.']
                     }
                     # Convert the dictionary into a DataFrame
                     df = pd.DataFrame(data)

                     # Display the original DataFrame
                     print("Original DataFrame:")
                     print(df)

                     # Count the missing values in each column
                     print("\nMissing values in each column:")
                     print(df.isnull().sum())

                     # Count the total number of NaN values in the DataFrame
                     print("\nTotal number of NaN values:")
                     print(df.isnull().sum().sum())

                     # Drop every row that contains at least one NaN value
                     df_dropped = df.dropna()
                     print("\nDataFrame after dropping rows with NaN values:")
                     print(df_dropped)

                     # Fill NaN in 'Age' with the column mean (rounded to the nearest
                     # integer) and NaN in 'Address' with 'Chennai'
                     df_filled = df.fillna({'Age': round(df['Age'].mean()), 'Address': 'Chennai'})
                     print("\nDataFrame after filling NaN values:")
                     print(df_filled)

