Milk,Food,60,5
Biscuit,Food,20,2
Bourn-Vita,Food,70,1
Soap,Hygiene,40,4
Brush,Hygiene,30,2
Detergent,Household,80,1
Tissues,Hygiene,30,5
Hence, the following rows are not included in the DataFrame:
Milk,Food,60,5
Bourn-Vita,Food,70,1
Soap,Hygiene,40,4
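To skip the first few (say, n) rows of data from a CSV file, we use the keyword argument skiprows. An integer value n
skips the first n lines of the file, including the header line. To keep the column names, we instead pass the line
numbers of the rows to be skipped (line 0 is the header) as follows:
>>> import pandas as pd
>>> groceryDF = pd.read_csv('Grocery.csv', skiprows=[1, 2])
>>> print(groceryDF)
output:
      Product   Category  Price  Quantity
0     Biscuit       Food     20         2
1  Bourn-Vita       Food     70         1
2        Soap    Hygiene     40         4
3       Brush    Hygiene     30         2
4   Detergent  Household     80         1
5     Tissues    Hygiene     30         5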
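The difference between an integer and a list of line numbers for skiprows can be checked with a minimal sketch like the one below. It assumes the file holds the header plus the eight rows listed in this section, and it reads them from an in-memory string (the names CSV_TEXT and load are only illustrative) so that it runs without Grocery.csv on disk.

```python
import io
import pandas as pd

# Assumed contents of Grocery.csv, taken from the rows listed in this section.
CSV_TEXT = """Product,Category,Price,Quantity
Bread,Food,20,2
Milk,Food,60,5
Biscuit,Food,20,2
Bourn-Vita,Food,70,1
Soap,Hygiene,40,4
Brush,Hygiene,30,2
Detergent,Household,80,1
Tissues,Hygiene,30,5
"""

def load(skiprows):
    """Read the in-memory CSV text with the given skiprows value."""
    return pd.read_csv(io.StringIO(CSV_TEXT), skiprows=skiprows)

# An integer skips that many lines from the top of the file, so the
# header line goes too and 'Milk,Food,60,5' is treated as the header.
by_count = load(2)
print(list(by_count.columns))

# A list gives the 0-based line numbers to skip; leaving out 0 keeps the header.
keep_header = load([1, 2])
print(keep_header)

# The listed lines need not be consecutive: this drops Milk, Bourn-Vita and Soap.
scattered = load([2, 4, 5])
print(list(scattered['Product']))
```

With real data on disk, the same values can be passed directly to pd.read_csv('Grocery.csv', skiprows=...).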
2.3.3 Setting the Data Type of a Column in a DataFrame
Sometimes, the data type of a column in a DataFrame needs to be different from the one inferred from the CSV file. For
example, we may want Price to hold decimal (floating-point) values instead of the integer values stored in the CSV
file. For this purpose, we specify the required data types as a dictionary of the form {column_name: data_type} using
the keyword argument dtype. This is illustrated below:
>>> import pandas as pd
>>> groceryDF = pd.read_csv('Grocery.csv', dtype={'Price': float})
>>> print(groceryDF)
output:
      Product   Category  Price  Quantity
0       Bread       Food   20.0         2
1        Milk       Food   60.0         5
2     Biscuit       Food   20.0         2
3  Bourn-Vita       Food   70.0         1
4        Soap    Hygiene   40.0         4
5       Brush    Hygiene   30.0         2
6   Detergent  Household   80.0         1
7     Tissues    Hygiene   30.0         5
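One way to confirm that dtype changed only the Price column is to compare the column types with and without the argument. The sketch below is illustrative rather than part of the original example: it assumes the same eight rows shown in the output above and reads them from an in-memory string (CSV_TEXT is a made-up name) instead of Grocery.csv.

```python
import io
import pandas as pd

# Assumed contents of Grocery.csv, matching the rows in the output above.
CSV_TEXT = """Product,Category,Price,Quantity
Bread,Food,20,2
Milk,Food,60,5
Biscuit,Food,20,2
Bourn-Vita,Food,70,1
Soap,Hygiene,40,4
Brush,Hygiene,30,2
Detergent,Household,80,1
Tissues,Hygiene,30,5
"""

# Without dtype, pandas infers integer columns for Price and Quantity.
default_df = pd.read_csv(io.StringIO(CSV_TEXT))

# With dtype, only the named column is converted; the others are still inferred.
float_df = pd.read_csv(io.StringIO(CSV_TEXT), dtype={'Price': float})

print(default_df.dtypes)   # Price is reported as int64
print(float_df.dtypes)     # Price is reported as float64, Quantity stays int64
```

More than one column can be listed in the same dictionary, for example dtype={'Price': float, 'Quantity': float}.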
What is the purpose of the following code segment?
df = pd.read_csv('Grocery.csv', delimiter=';', header=None, index_col=0,
                 usecols=[0, 2, 3], dtype={'Quantity': str})
print(df)

