Page 68 - Informatics_Practices_Fliipbook_Class12

2.8 Data Manipulation

        2.8.1 Adding and Dropping Columns and Rows from a DataFrame

        Often, there is a need to modify an existing DataFrame. For example, we may want to remove the column
        containing the serial number. Columns with a large number of missing values are also usually dropped while
        analysing the data. Similarly, we may add new columns such as age (derived from date of birth) or total price (the
        product of quantity and price per unit). Even though the information in these new columns can already be derived
        from the DataFrame, storing it explicitly makes analysis and visualisation easier. We may also wish to modify the
        existing data in a DataFrame. For example, an organisation may wish to increment its employees' salaries by a
        certain amount, which would require updating the employees' DataFrame. Pandas supports manipulating a DataFrame
        by adding, dropping, or modifying rows and columns.
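The three kinds of change described above can be sketched as follows. This is only a minimal illustration: the DataFrame `employeeDF`, its column names, and the increment of 2000 are hypothetical and are not taken from the text.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
employeeDF = pd.DataFrame({
    'SerialNo': [1, 2, 3],
    'Name': ['Asha', 'Ravi', 'Meena'],
    'Salary': [30000, 42000, 38000],
})

# Dropping a column: drop() returns a new DataFrame by default,
# so the result is assigned back to the same name
employeeDF = employeeDF.drop(columns=['SerialNo'])

# Modifying existing data: increment every salary by a fixed amount
employeeDF['Salary'] = employeeDF['Salary'] + 2000

# Adding a column derived from an existing one
employeeDF['AnnualSalary'] = employeeDF['Salary'] * 12

print(employeeDF)
```

Each step here works on whole columns at once, so no explicit loop over the rows is needed.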

        Adding Columns
        Before proceeding further, let us have another quick look at the first few rows of the grocery DataFrame:

         >>> groceryDF.head()
        output:
                    Product Category  Price  Quantity
              0       Bread     Food     20         2
              1        Milk     Food     60         5
              2     Biscuit     Food     20         2
              3  Bourn-Vita     Food     70         1
              4        Soap  Hygiene     40         4
        Suppose we wish to compute the total price paid for each product. In each row, we need to multiply the
        Price by the Quantity to obtain the Total Price. Fortunately, Pandas allows us to apply the multiplication
        operator (*) element-wise to the corresponding values in the selected columns. So, the operation
        groceryDF['Price']*groceryDF['Quantity'] constructs a Series of the element-wise products
        of the columns groceryDF['Price'] and groceryDF['Quantity'].
         >>> # Multiplying two Columns to get Total Price for an item
         >>> TotalPrice = groceryDF['Price']*groceryDF['Quantity']
         >>> print('type(TotalPrice):', type(TotalPrice))
         >>> print(TotalPrice.head())
              type(TotalPrice): <class 'pandas.core.series.Series'>
              0     40
              1    300
              2     40
              3     70
              4    160
              dtype: int64
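For readers who want to run this step on its own, here is a self-contained sketch. It rebuilds only the five rows shown by groceryDF.head() (the full data is loaded earlier in the book) and also shows the method form `Series.mul`, which is not used in the text but performs the same element-wise multiplication.

```python
import pandas as pd

# First five rows of the grocery data, copied from the output of groceryDF.head()
groceryDF = pd.DataFrame({
    'Product': ['Bread', 'Milk', 'Biscuit', 'Bourn-Vita', 'Soap'],
    'Category': ['Food', 'Food', 'Food', 'Food', 'Hygiene'],
    'Price': [20, 60, 20, 70, 40],
    'Quantity': [2, 5, 2, 1, 4],
})

# Operator form: values are multiplied row by row, matched on the index labels
TotalPrice = groceryDF['Price'] * groceryDF['Quantity']

# Method form of the same operation
TotalPriceViaMul = groceryDF['Price'].mul(groceryDF['Quantity'])

print(TotalPrice.tolist())                  # [40, 300, 40, 70, 160]
print(TotalPrice.equals(TotalPriceViaMul))  # True
```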
        Next, let us add a new column named TotalPrice to the DataFrame groceryDF, comprising the row-wise products of
        the values in the columns groceryDF['Price'] and groceryDF['Quantity']:
         >>> # Adding a new column to the DataFrame
         >>> groceryDF['TotalPrice'] = TotalPrice
         >>> print(groceryDF)
                    Product   Category  Price  Quantity  TotalPrice
              0       Bread       Food     20         2          40
              1        Milk       Food     60         5         300
              2     Biscuit       Food     20         2          40
              3  Bourn-Vita       Food     70         1          70
              4        Soap    Hygiene     40         4         160
              5       Brush    Hygiene     30         2          60
              6   Detergent  Household     80         1          80
              7     Tissues    Hygiene     30         5         150
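Assigning to groceryDF['TotalPrice'] changes groceryDF itself. As a possible alternative that the text does not use, `DataFrame.assign` returns a new DataFrame with the extra column and leaves the original untouched. The sketch below rebuilds the eight rows printed above; the name `withTotal` is introduced here only for illustration.

```python
import pandas as pd

# Grocery data rebuilt from the rows printed above (without TotalPrice)
groceryDF = pd.DataFrame({
    'Product': ['Bread', 'Milk', 'Biscuit', 'Bourn-Vita',
                'Soap', 'Brush', 'Detergent', 'Tissues'],
    'Category': ['Food', 'Food', 'Food', 'Food',
                 'Hygiene', 'Hygiene', 'Household', 'Hygiene'],
    'Price': [20, 60, 20, 70, 40, 30, 80, 30],
    'Quantity': [2, 5, 2, 1, 4, 2, 1, 5],
})

# assign() builds a new DataFrame that includes the extra column
withTotal = groceryDF.assign(TotalPrice=groceryDF['Price'] * groceryDF['Quantity'])

print('TotalPrice' in groceryDF.columns)   # False: the original is unchanged
print(withTotal['TotalPrice'].tolist())    # [40, 300, 40, 70, 160, 60, 80, 150]
```

Which form to prefer depends on whether the original DataFrame should be kept as it was for later steps.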

          54   Touchpad Informatics Practices-XII