Page 240 - Touchpad AI
3. __________ is a Python library widely used for data manipulation and analysis.
4. Data __________ reshape or reformat data for easier processing.
5. __________ allow us to select specific rows or columns based on various criteria.
6. A new row can be added to a DataFrame by utilising the __________ method.
7. You can import a CSV file into a DataFrame using the __________ function.
8. The __________ function is used to remove duplicate rows.
9. __________ means changing how the data appears or is organised.
10. __________ scaling changes values to a fixed range, usually between 0 and 1.
C. State whether the following statements are true or false.
1. Data cleaning is optional before building an AI model.
2. Duplicate entries in a dataset can affect results.
3. Encoding “Yes” as 1 and “No” as 0 is helpful for machine understanding.
4. You can use str.replace() to correct spelling errors in a column.
5. Feature scaling changes all values to a range of 5 to 10.
6. Missing or incomplete information reduces the quality of data.
7. Outliers are unusual or extreme values that do not fit with the rest of the data.
8. To select a particular row from the DataFrame, you can use the .iloc[] method.
9. To remove rows and columns from a DataFrame, we specify the labels' names and the axis
(0 for rows, 1 for columns).
10. When data is consistent, it can lead to incorrect analysis and poor decision-making.
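The pandas operations named in statements 4, 8 and 9 can be tried out with a minimal sketch like the one below. The DataFrame df, its column names and its values are hypothetical and were made up only for illustration.

```python
import pandas as pd

# Hypothetical sample data (not taken from the textbook)
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meena"],
    "city": ["Dehli", "Mumbai", "Dehli"],
    "score": [82, 91, 77],
})

# str.replace() rewrites matching text in a string column
df["city"] = df["city"].str.replace("Dehli", "Delhi", regex=False)

# .iloc[] selects by integer position; here, the second row
second_row = df.iloc[1]

# drop() removes labels along an axis: 0 for rows, 1 for columns
without_first_row = df.drop(0, axis=0)
without_score = df.drop("score", axis=1)

print(df)
print(second_row["name"])
print(without_first_row.shape, without_score.shape)
```

Note that drop() returns a new DataFrame here, so df itself still holds all three rows and all three columns.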
SECTION B (Subjective Type Questions)
A. Short answer type questions.
1. List any two functions that are used for handling duplicate values.
2. What is meant by “formatting the data” in the cleaning process?
3. What does the drop_duplicates() function do in Pandas?
4. What is meant by data transformation?
5. What is Min-Max scaling?
6. List down any three features of Kaggle.
7. List any two features of data standardisation.
8. What is the difference between data cleaning and data standardisation?
9. What is a DataFrame?
10. What is the use of the pd.merge(df1, df2, on='common_column') function?
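Questions 3, 5 and 10 above refer to drop_duplicates(), Min-Max scaling and pd.merge(). The following is a minimal sketch of how these might be tried out. The tables df1 and df2 and their values are hypothetical, and the Min-Max step assumes the usual formula (x - min) / (max - min).

```python
import pandas as pd

# Hypothetical tables that share a key column called "common_column"
df1 = pd.DataFrame({"common_column": [1, 2, 2, 3], "marks": [40, 55, 55, 70]})
df2 = pd.DataFrame({"common_column": [1, 2, 3], "grade": ["C", "B", "A"]})

# drop_duplicates() keeps the first copy of each fully repeated row
unique_rows = df1.drop_duplicates()

# Min-Max scaling: (x - min) / (max - min) maps values into the range 0 to 1
marks = unique_rows["marks"]
scaled = (marks - marks.min()) / (marks.max() - marks.min())

# pd.merge() joins rows whose "common_column" values match in both tables
merged = pd.merge(unique_rows, df2, on="common_column")

print(unique_rows)
print(scaled.tolist())
print(merged)
```

By default pd.merge() performs an inner join, so only keys present in both tables appear in the result.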
B. Long answer type questions.
1. Explain the steps in the data cleaning process.
2. List the different operations that can be performed on rows and columns in a DataFrame.
3. Describe any three methods of data standardisation with suitable examples.
4. Write a brief note on "Importing a CSV file into a DataFrame".
5. Write down the steps to explore a dataset on Kaggle.
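As a starting point for question 4, importing CSV data into a DataFrame might look like the minimal sketch below. The CSV text is hypothetical and is read from an in-memory string so that the example runs without a file; in practice a file path would normally be passed to pd.read_csv() instead.

```python
import io

import pandas as pd

# Hypothetical CSV content; a real program would usually read from a file path
csv_text = "name,age\nAsha,16\nRavi,17\n"

# read_csv() parses the header row into column names and the rest into rows
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first few rows
print(df.shape)    # (number of rows, number of columns)
```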
238 Touchpad Artificial Intelligence - XI

