
                 2     Biscuit     Food     20         2
                 3  Bourn-Vita     Food     70         1
                 4        Soap  Hygiene     40         4

                 Day 2 Purchases:
                         Product   Category  Price  Quantity
                 0         Jeans    Clothes    400         2
                 1     Chocolate       Food     50         4
                 2  Air Freshner  Household     80         2
                 3        Coffee       Food    120         1
            Now that groceryDF1 and groceryDF2 have been constructed, let us use the Pandas function concat() to concatenate
            them into a new DataFrame groceryDF comprising the purchases of both days. The argument ignore_index = True
            renumbers the rows of the result from 0 onwards, as shown below:
             >>> groceryDF = pd.concat([groceryDF1, groceryDF2], ignore_index = True)
             >>> print(groceryDF)
                          Product   Category  Price  Quantity
                 0          Bread       Food     20         2
                 1           Milk       Food     60         5
                 2        Biscuit       Food     20         2
                 3     Bourn-Vita       Food     70         1
                 4           Soap    Hygiene     40         4
                 5          Brush    Hygiene     30         2
                 6      Detergent  Household     80         1
                 7        Tissues    Hygiene     30         5
                 8          Jeans    Clothes    400         2
                 9      Chocolate       Food     50         4
                 10  Air Freshner  Household     80         2
                 11        Coffee       Food    120         1
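
            The effect of ignore_index can be seen by concatenating two small DataFrames both ways. The following is
            only a minimal sketch: the names day1, day2, kept and renumbered, and the two-row contents, are
            illustrative assumptions and are not the groceryDF1 and groceryDF2 constructed earlier.

```python
import pandas as pd

# Hypothetical two-row frames standing in for the purchases of two days
day1 = pd.DataFrame({'Product': ['Bread', 'Milk'], 'Price': [20, 60]})
day2 = pd.DataFrame({'Product': ['Jeans', 'Coffee'], 'Price': [400, 120]})

# Default behaviour: each frame keeps its own row labels, so labels repeat
kept = pd.concat([day1, day2])
print(kept.index.tolist())        # [0, 1, 0, 1]

# ignore_index=True discards the old labels and numbers the rows afresh
renumbered = pd.concat([day1, day2], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```

            Without ignore_index = True, a lookup such as kept.loc[0] would return two rows, one from each day,
            which is usually not what is wanted when the rows of both days are to be treated as one list.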
            Concatenating DataFrames having Mismatched Column Names

            In the above example, the two DataFrames had identical column names. However, suppose that, while concatenating
            two DataFrames (say df1 and df2), the DataFrame df2 contains a column df2Col that does not belong to the
            DataFrame df1. The concatenated DataFrame will then have the column df2Col, but the rows that came from df1
            have no value for it. These rows will have the value NaN (Not a Number) in the column df2Col of the
            concatenated DataFrame. Similarly, if the DataFrame df1 contains a column df1Col that does not belong to the
            DataFrame df2, then the concatenated DataFrame will have the column df1Col, and the rows that came from df2
            will have the value NaN in it.
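
            The two-sided case described above can be sketched as follows. The frames and their values are made up
            for illustration (the shared column ID is an assumption; only the names df1, df2, df1Col and df2Col come
            from the text).

```python
import pandas as pd

# Illustrative frames: 'ID' is shared, df1Col exists only in df1,
# and df2Col exists only in df2
df1 = pd.DataFrame({'ID': [1, 2], 'df1Col': [10, 20]})
df2 = pd.DataFrame({'ID': [3], 'df2Col': [30]})

# The result has the union of the columns: ID, df1Col, df2Col
result = pd.concat([df1, df2], ignore_index=True)
print(result)

# True marks the cells that were filled with NaN
print(result.isna())
```

            In the result, df2Col is NaN for the two rows that came from df1, and df1Col is NaN for the row that came
            from df2. Since NaN is a floating-point value, the integer columns df1Col and df2Col are converted to
            float in the concatenated DataFrame.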

             >>> # Create the first DataFrame for GDP data
             >>> gdp1 = {'Year': [2018, 2019, 2020],
                          'Gross Domestic Product': [21.3, 22.6, 20.9],
                          'Inflation Rate': [2.1, 1.8, 2.5]}
             >>> gdpDF1 = pd.DataFrame(gdp1)

             >>> # Create the second DataFrame for extended GDP data
             >>> gdp2 = {'Year': [2021, 2022],
                          'Gross Domestic Product': [23.2, 24.6],
                          'Inflation Rate': [2.5, 2.0],
                          'Unemployment Rate': [4.2, 4.4]}
             >>> gdpDF2 = pd.DataFrame(gdp2)
            Note that the DataFrame gdpDF2 contains a column, Unemployment Rate, that does not belong to the DataFrame
            gdpDF1. So, the concatenated DataFrame will have the column Unemployment Rate, but the rows that come from
            gdpDF1 will have the value NaN in that column.
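
            One way to check this is to concatenate the two DataFrames and inspect the new column. The sketch below
            rebuilds gdpDF1 and gdpDF2 from the data given above so that it can be run on its own; the name gdpDF
            for the result is an assumption made here.

```python
import pandas as pd

# Same data as gdpDF1 and gdpDF2 in the text
gdpDF1 = pd.DataFrame({'Year': [2018, 2019, 2020],
                       'Gross Domestic Product': [21.3, 22.6, 20.9],
                       'Inflation Rate': [2.1, 1.8, 2.5]})
gdpDF2 = pd.DataFrame({'Year': [2021, 2022],
                       'Gross Domestic Product': [23.2, 24.6],
                       'Inflation Rate': [2.5, 2.0],
                       'Unemployment Rate': [4.2, 4.4]})

# gdpDF is a name assumed for this sketch
gdpDF = pd.concat([gdpDF1, gdpDF2], ignore_index=True)
print(gdpDF)

# The three rows from gdpDF1 (2018 to 2020) have no unemployment data
print(gdpDF['Unemployment Rate'].tolist())      # [nan, nan, nan, 4.2, 4.4]
print(gdpDF['Unemployment Rate'].isna().sum())  # 3
```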



