pd.read_csv(): Reads a CSV file into a Pandas DataFrame. It takes the path of the CSV file as input and returns a
Pandas DataFrame object.
• Keyword argument sep (or its alias delimiter) of pd.read_csv(): Specifies the delimiter explicitly; the default is a comma.
• Keyword argument usecols of pd.read_csv(): Specifies the list of required columns.
• Keyword argument dtype of pd.read_csv(): Specifies the data types of columns in the form of a
dictionary {column_name: data_type}.
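The three keyword arguments summarised above can be combined in a single call. The following is a minimal sketch, not taken from the text: the semicolon-separated data is hypothetical and is read from an in-memory string (via io.StringIO) instead of a file on disk, so that the example runs on its own.

```python
import io
import pandas as pd

# Hypothetical semicolon-separated data standing in for a CSV file on disk
csv_text = (
    "Product;Category;Price;Quantity\n"
    "Bread;Food;20;2\n"
    "Milk;Beverages;60;5\n"
    "Soap;Hygiene;35;3\n"
)

# sep     : the delimiter used in the data
# usecols : the columns to keep
# dtype   : dictionary mapping a column name to its data type
df = pd.read_csv(
    io.StringIO(csv_text),
    sep=";",
    usecols=["Product", "Price"],
    dtype={"Price": float},
)

print(df)
print(df.dtypes)
```

Only the columns Product and Price are loaded, and Price is stored as a floating-point column even though the data contains whole numbers.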
2.4 Dimensions of a DataFrame
We know that a Pandas DataFrame is a two-dimensional tabular structure. The ndim attribute of a DataFrame yields
its number of dimensions, as shown below:
>>> import pandas as pd
>>> # Reading a CSV file into a DataFrame
>>> groceryDF = pd.read_csv('Grocery.csv')
>>> print(groceryDF.ndim)
2
When a DataFrame is constructed from a CSV file, we may not know in advance the number of rows and columns it
contains. The shape attribute of the DataFrame returns a tuple of the form (n_rows, n_cols), as shown
below:
>>> print(groceryDF.shape)
(8, 4)
>>> (nRows, nCols) = groceryDF.shape
>>> print("Number of rows:", nRows)
Number of rows: 8
>>> print("Number of columns:", nCols)
Number of columns: 4
Note that since the shape attribute of a DataFrame is a tuple, we could also have accessed the number of rows in the
DataFrame groceryDF directly as groceryDF.shape[0] and the number of columns as groceryDF.shape[1]:
>>> nRows = groceryDF.shape[0]
>>> nCols = groceryDF.shape[1]
>>> print("Number of rows:", nRows)
Number of rows: 8
>>> print("Number of columns:", nCols)
Number of columns: 4
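As a cross-check on the values obtained from shape, the built-in len() function may also be used. The sketch below assumes a small, hypothetical DataFrame built in memory (it is not the Grocery.csv data), so it can be run without any file.

```python
import pandas as pd

# Hypothetical in-memory data; the column names echo the grocery example
df = pd.DataFrame({
    "Product": ["Bread", "Milk", "Soap"],
    "Price": [20, 60, 35],
})

# Tuple unpacking of shape, as in the text
n_rows, n_cols = df.shape

# len() of a DataFrame counts its rows; len() of df.columns counts its columns
print(n_rows == len(df))          # True
print(n_cols == len(df.columns))  # True
print(df.ndim, n_rows, n_cols)    # 2 3 2
```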
To know the labels along the rows (row labels) and the labels across the columns (column names) of the DataFrame, we
use the attributes index and columns, respectively:
>>> # Retrieve row labels
>>> print('Row Labels:', groceryDF.index)
Row Labels: RangeIndex(start=0, stop=8, step=1)
>>> # Retrieve column labels
>>> print('Column names:', groceryDF.columns)
Column names: Index(['Product', 'Category', 'Price', 'Quantity'], dtype='object')
As shown in the output, there are 8 row labels ranging from 0 to 7. Also, there are four columns, namely, 'Product',
'Category', 'Price', and 'Quantity'.
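The attributes index and columns return Index objects, not plain Python lists. If ordinary lists are more convenient, they may be obtained with list(). The following sketch uses a hypothetical two-row DataFrame, not the Grocery.csv data.

```python
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({"Product": ["Bread", "Milk"], "Price": [20, 60]})

# Convert the Index objects into ordinary Python lists
row_labels = list(df.index)
col_names = list(df.columns)

print(row_labels)   # [0, 1]
print(col_names)    # ['Product', 'Price']

# The lengths of these lists agree with the shape of the DataFrame
print((len(row_labels), len(col_names)) == df.shape)   # True
```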

