
Reboot


                    1.  Fill in the blanks.
                       a.  The two data structures that are supported by Pandas are ………………………. and ………………………. .
                       b.  The ………………………. library in Python excels in creating N-dimensional data objects.
                       c.  The statement to install NumPy is ………………………. .

                       d.  You can check the shape of an array by using the ………………………. attribute in NumPy.
                    2.  Answer the following questions:
                       a.  What is a DataFrame in Pandas?
                           ……………………….……………………….……………………….……………………….……………………..................….……………………….
                       b.  Give one advantage of using NumPy arrays over lists.

                           ……………………….……………………….……………………….……………………….……………………..................….……………………….





              Understanding Missing Values
              Understanding missing values in a DataFrame is crucial for data analysis, cleaning and preprocessing. Missing values,
              often denoted as NaN (Not a Number) or None, can occur for various reasons, such as data entry errors, incomplete
              data, or data transformation processes. For example, while reviewing a product online, some customers may not
              provide feedback on every aspect of the product if they did not use all of its features. Such gaps must be detected
              and handled before the data can be analysed reliably.

              The isnull() function in Pandas is used to detect missing (NaN or None) values within a DataFrame. It returns
              a DataFrame of the same shape as the original DataFrame, where each element is a Boolean value: True if the
              corresponding value is missing and False if it is not.

                                Employee Name      Salary                 Employee Name      Salary
                                Rohit              50000                  False              False
                                Pankaj             52000                  False              False
                                Vivan              NaN        isnull()    False              True
                                Chirag             53000                  False              False
                                Sanjay             NaN                    False              True
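
              The table above can be reproduced with a short sketch. The variable names df and missing_mask are chosen
              here only for illustration; the data is the same as in the table.

```python
import numpy as np
import pandas as pd

# Sample data from the table above; np.nan marks a missing salary.
df = pd.DataFrame({
    "Employee Name": ["Rohit", "Pankaj", "Vivan", "Chirag", "Sanjay"],
    "Salary": [50000, 52000, np.nan, 53000, np.nan],
})

# isnull() returns a Boolean DataFrame with the same shape as df:
# True where a value is missing, False otherwise.
missing_mask = df.isnull()
print(missing_mask)
```

              Only the Salary entries for Vivan and Sanjay come out as True, because those are the only missing values.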

              You can count the number of missing values in each column or row.
               •  df.isnull().sum(): Returns the number of missing values in each column.
               •  df.isnull().sum(axis=1): Returns the number of missing values in each row.
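
              As a minimal sketch, the two calls can be tried on the employee data used earlier. The names col_counts
              and row_counts are illustrative only.

```python
import numpy as np
import pandas as pd

# Same employee data as in the isnull() table.
df = pd.DataFrame({
    "Employee Name": ["Rohit", "Pankaj", "Vivan", "Chirag", "Sanjay"],
    "Salary": [50000, 52000, np.nan, 53000, np.nan],
})

# Missing values per column (one count for each column name).
col_counts = df.isnull().sum()
print(col_counts)

# Missing values per row (one count for each row index).
row_counts = df.isnull().sum(axis=1)
print(row_counts)
```

              Here the Salary column has two missing values, and the rows for Vivan and Sanjay have one missing value each.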

              There are several ways to handle missing values in a DataFrame:
               •  Imputation: Imputation involves replacing missing values with a specific value. Common strategies include replacing
                  missing values with the mean, median, or mode of the column. This method retains the structure of the
                  dataset and avoids losing valuable information.
                  Pandas provides methods like fillna() to perform imputation. For example, you can fill the missing values in a
                  DataFrame df with the mean of each column using df.fillna(df.mean()). If df also contains non-numeric columns,
                  recent versions of Pandas need df.mean(numeric_only=True) here.
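
              Mean imputation can be sketched on the same employee data. Because the Employee Name column holds text,
              this sketch assumes a recent Pandas version and passes numeric_only=True so that only the Salary column is
              averaged. The names filled_mean and filled_median are illustrative.

```python
import numpy as np
import pandas as pd

# Same employee data as before; two salaries are missing.
df = pd.DataFrame({
    "Employee Name": ["Rohit", "Pankaj", "Vivan", "Chirag", "Sanjay"],
    "Salary": [50000, 52000, np.nan, 53000, np.nan],
})

# Replace each missing salary with the mean of the known salaries.
# numeric_only=True skips the text column when computing the mean.
filled_mean = df.fillna(df.mean(numeric_only=True))
print(filled_mean)

# The median is a common alternative when the column has extreme values.
filled_median = df.fillna(df.median(numeric_only=True))
print(filled_median)
```

              fillna() returns a new DataFrame, so the original df still contains its NaN values unless the result is
              assigned back to it.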




                    218     Touchpad Artificial Intelligence (Ver. 3.0)-XI