Page 60 - Informatics_Practices_Fliipbook_Class12
Note that the rows where the Price column has a value greater than 50 have been marked True. The remaining
rows have been marked False. Now we can use this mask to select only the rows of the DataFrame that have been
marked True:
>>> higherGroceryDF = groceryDF[mask]
>>> print(higherGroceryDF)
output:
      Product   Category  Price  Quantity
1        Milk       Food     60         5
3  Bourn-Vita       Food     70         1
6   Detergent  Household     80         1
Note that the DataFrame higherGroceryDF contains only the rows where the value in the Price column is greater
than 50. To express complex conditions, Boolean masks can be combined using the element-wise operators not (~), and (&),
and or (|). The Python keywords not, and, and or raise an error when applied to masks. For example, to select the rows
having more than 2 units of the products whose price exceeds 50, we first create priceMask using the condition
groceryDF['Price']>50. Next, we create quantityMask using the condition groceryDF['Quantity']>2. Finally, we
combine priceMask and quantityMask using the and (&) operator:
>>> priceMask = groceryDF['Price']>50
>>> quantityMask = groceryDF['Quantity']>2
>>> mask = priceMask & quantityMask
>>> selectedGroceryDF = groceryDF[mask]
>>> print(selectedGroceryDF)
output:
   Product  Category  Price  Quantity
1     Milk      Food     60         5
Here, the resulting DataFrame contains only the rows where the value in the 'Price' column is greater than 50 and the
value in the 'Quantity' column is greater than 2.
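The or (|) and not (~) operators work in the same way. The following is a minimal sketch, not the book's own listing: the product names and prices match the outputs shown in this section, but the Category and Quantity values for rows other than 1, 3 and 6 are assumed purely for illustration. The names eitherDF, cheapDF and bothDF are likewise hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for groceryDF. Product names and prices follow this
# section's outputs; Category and Quantity for rows other than 1, 3 and 6
# are made-up values used only for illustration.
groceryDF = pd.DataFrame({
    'Product':  ['Bread', 'Milk', 'Biscuit', 'Bourn-Vita',
                 'Soap', 'Brush', 'Detergent', 'Tissues'],
    'Category': ['Food', 'Food', 'Food', 'Food',
                 'Hygiene', 'Hygiene', 'Household', 'Household'],
    'Price':    [20, 60, 20, 70, 40, 30, 80, 30],
    'Quantity': [2, 5, 3, 1, 4, 2, 1, 5],
})

priceMask = groceryDF['Price'] > 50
quantityMask = groceryDF['Quantity'] > 2

# or (|): rows that satisfy at least one of the two conditions
eitherDF = groceryDF[priceMask | quantityMask]

# not (~): rows whose price does NOT exceed 50
cheapDF = groceryDF[~priceMask]

# Inline form: each comparison needs its own parentheses because
# & binds more tightly than > in Python
bothDF = groceryDF[(groceryDF['Price'] > 50) & (groceryDF['Quantity'] > 2)]

print(eitherDF)
print(cheapDF)
print(bothDF)
```

Naming each mask first, as done with priceMask and quantityMask, avoids the parentheses and usually keeps longer conditions easier to read.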
Mask using startswith Condition
Boolean indexing can also be used to select columns of a Pandas DataFrame. A mask built over the column names picks
out the columns that satisfy a given condition. For example, a mask to extract the columns whose names begin with the
letter 'P' may be created as follows:
>>> mask = groceryDF.columns.str.startswith('P')
>>> print(mask)
[ True False  True False]
Once the mask is created, it is passed to loc as the column selector to build the new DataFrame:
>>> ProductPriceDF = groceryDF.loc[:, mask]
>>> print(ProductPriceDF)
output:
      Product  Price
0       Bread     20
1        Milk     60
2     Biscuit     20
3  Bourn-Vita     70
4        Soap     40
5       Brush     30
6   Detergent     80
7     Tissues     30
Boolean indexing on columns is used to select the desired columns from a DataFrame.
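Column masks can be combined with the same ~, & and | operators that are used for row masks. The sketch below assumes a small three-row stand-in for groceryDF built from rows that appear in the outputs above. The names startsWithP, startsWithQ, pqDF and nonPDF are illustrative and not from the text.

```python
import pandas as pd

# Hypothetical three-row stand-in for groceryDF, using values that appear
# in this section's outputs (the index is renumbered 0 to 2 here).
groceryDF = pd.DataFrame({
    'Product':  ['Milk', 'Bourn-Vita', 'Detergent'],
    'Category': ['Food', 'Food', 'Household'],
    'Price':    [60, 70, 80],
    'Quantity': [5, 1, 1],
})

# One Boolean value per column name
startsWithP = groceryDF.columns.str.startswith('P')
startsWithQ = groceryDF.columns.str.startswith('Q')

# Names of the columns picked by a mask
selectedNames = groceryDF.columns[startsWithP].tolist()

# Columns whose names begin with 'P' or 'Q'
pqDF = groceryDF.loc[:, startsWithP | startsWithQ]

# Every column except those whose names begin with 'P'
nonPDF = groceryDF.loc[:, ~startsWithP]

print(selectedNames)
print(pqDF)
print(nonPDF)
```

Listing the selected names before slicing is a quick way to confirm that the mask picks the intended columns.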

