Page 68 - Informatics_Practices_Fliipbook_Class12
2.8 Data Manipulation
2.8.1 Adding and Dropping Columns and Rows from a DataFrame
Often, there is a need to modify an existing DataFrame. For example, we may want to remove the column
containing the serial number. Columns having a large number of missing values are also usually dropped while
analysing the data. Similarly, we may add new columns such as age (derived from date of birth), or total price (product of
quantity and price per unit). Even though the information in these new columns can already be derived from the DataFrame,
storing it explicitly makes analysis and visualisation easier. We may also like to modify the existing data in a DataFrame. For example,
an organisation may like to increment employees' salaries by a certain amount. This would require updating the
employees' DataFrame. Pandas supports manipulating a DataFrame by adding, dropping, or modifying rows
and columns.
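The three kinds of change described above can be sketched on a small DataFrame. This is only a minimal illustration: the DataFrame employeeDF, its column names, and its values are hypothetical and do not come from the text.

```python
import pandas as pd

# Hypothetical employee data, invented purely for illustration
employeeDF = pd.DataFrame({
    'SerialNo': [1, 2, 3],
    'Name': ['Asha', 'Ravi', 'Meena'],
    'Salary': [30000, 45000, 52000],
})

# Dropping: remove the serial number column, which carries no analytical value
employeeDF = employeeDF.drop(columns=['SerialNo'])

# Modifying: increment every salary by a fixed amount
employeeDF['Salary'] = employeeDF['Salary'] + 5000

# Adding: create a new column derived from an existing one
employeeDF['AnnualSalary'] = employeeDF['Salary'] * 12

print(employeeDF)
```

Note that drop() returns a new DataFrame by default, which is why its result is assigned back to employeeDF.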
Adding Columns
Before proceeding further, let us have another quick look at the first few rows of the grocery DataFrame:
>>> groceryDF.head()
output:
      Product  Category  Price  Quantity
0       Bread      Food     20         2
1        Milk      Food     60         5
2     Biscuit      Food     20         2
3  Bourn-Vita      Food     70         1
4        Soap   Hygiene     40         4
Suppose we wish to compute the total price paid for each product. In each row, we need to multiply the
Price by the Quantity to obtain the total price. Pandas allows us to apply the multiplication
operator (*) element-wise to the corresponding values in the selected columns. So, the expression
groceryDF['Price']*groceryDF['Quantity'] constructs a Series of the element-wise products
of the columns groceryDF['Price'] and groceryDF['Quantity'].
>>> # Multiplying two columns to get the total price for each item
>>> TotalPrice = groceryDF['Price']*groceryDF['Quantity']
>>> print('type(TotalPrice):', type(TotalPrice))
type(TotalPrice): <class 'pandas.core.series.Series'>
>>> print(TotalPrice.head())
0     40
1    300
2     40
3     70
4    160
dtype: int64
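For readers who wish to reproduce this step outside the session above, the following self-contained sketch rebuilds groceryDF from the eight rows printed in this section (the way the DataFrame is constructed here is an assumption; the text may have loaded it differently). It also shows the Series method mul(), which is an equivalent way of writing the same element-wise product.

```python
import pandas as pd

# Rebuild groceryDF from the rows printed in this section
groceryDF = pd.DataFrame({
    'Product': ['Bread', 'Milk', 'Biscuit', 'Bourn-Vita',
                'Soap', 'Brush', 'Detergent', 'Tissues'],
    'Category': ['Food', 'Food', 'Food', 'Food',
                 'Hygiene', 'Hygiene', 'Household', 'Hygiene'],
    'Price': [20, 60, 20, 70, 40, 30, 80, 30],
    'Quantity': [2, 5, 2, 1, 4, 2, 1, 5],
})

# Element-wise product using the * operator
TotalPrice = groceryDF['Price'] * groceryDF['Quantity']

# The same product written with the Series method mul()
TotalPriceAlt = groceryDF['Price'].mul(groceryDF['Quantity'])

print(TotalPrice.tolist())
print(TotalPrice.equals(TotalPriceAlt))
```

Both forms pair up values that share the same row label (index), so the result has one entry per row of groceryDF.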
Next, let us add the column TotalPrice, comprising the row-wise products of the values in the columns
groceryDF['Price'] and groceryDF['Quantity'], to the DataFrame groceryDF:
>>> # Adding a new column to the DataFrame
>>> groceryDF['TotalPrice'] = TotalPrice
>>> print(groceryDF)
      Product   Category  Price  Quantity  TotalPrice
0       Bread       Food     20         2          40
1        Milk       Food     60         5         300
2     Biscuit       Food     20         2          40
3  Bourn-Vita       Food     70         1          70
4        Soap    Hygiene     40         4         160
5       Brush    Hygiene     30         2          60
6   Detergent  Household     80         1          80
7     Tissues    Hygiene     30         5         150
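Assigning to groceryDF['TotalPrice'] changes groceryDF itself. As an aside that is not part of the text above, pandas also provides the DataFrame method assign(), which returns a new DataFrame with the extra column and leaves the original untouched. The sketch below uses only the first three rows of the grocery data, and the name withTotalDF is hypothetical.

```python
import pandas as pd

# First three rows of the grocery data shown in this section
groceryDF = pd.DataFrame({
    'Product': ['Bread', 'Milk', 'Biscuit'],
    'Category': ['Food', 'Food', 'Food'],
    'Price': [20, 60, 20],
    'Quantity': [2, 5, 2],
})

# assign() returns a new DataFrame that includes the TotalPrice column
withTotalDF = groceryDF.assign(
    TotalPrice=groceryDF['Price'] * groceryDF['Quantity']
)

print(withTotalDF)

# The original DataFrame does not gain the new column
print('TotalPrice' in groceryDF.columns)
```

This form may be convenient when the original DataFrame must be preserved for later steps.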

