35. What is data cleaning?
Ans: Finding and then correcting or removing incorrect, incomplete, or duplicate data entries.
36. Command to install Matplotlib?
Ans: pip install matplotlib.
37. What does sns.barplot() do?
Ans: Draws a bar chart in Seaborn, where each bar shows a summary value (the mean by default) for one category.
38. Why reduce dimensionality?
Ans: To simplify the data, speed up computation, and make the data easier to visualize.
39. Give a real-life example of visualization.
Ans: COVID-19 dashboards or school attendance charts.
40. What is a dashboard?
Ans: Interface displaying multiple charts and summaries.
41. What is the first step of data cleaning?
Ans: Identifying missing, duplicate, or inconsistent values.
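The first step above can be sketched in Pandas as follows. This is only a minimal illustration: the table and its column names (name, city, marks) are made up for the example and are not from any real dataset.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Ravi", "Meena"],
    "city": ["Delhi", "delhi", "delhi", None],
    "marks": [82.0, 75.0, 75.0, None],
})

# Count of missing values in each column.
missing_per_column = df.isna().sum()

# Number of rows that exactly repeat an earlier row.
duplicate_rows = int(df.duplicated().sum())

# Distinct spellings in a text column can reveal inconsistent entries.
city_spellings = df["city"].dropna().unique()

print(missing_per_column)
print("Duplicate rows:", duplicate_rows)
print("City spellings:", list(city_spellings))
```

Nothing is changed at this stage. The checks only show where the problems are, so that the later cleaning steps can be chosen.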
42. Which Python library is used for cleaning?
Ans: Pandas.
43. What does drop_duplicates() do?
Ans: Removes duplicate rows in data.
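A small sketch of drop_duplicates() on a made-up table (the column names roll_no and name are assumptions for the example):

```python
import pandas as pd

# Hypothetical sample data: the row for roll number 2 appears twice.
df = pd.DataFrame({
    "roll_no": [1, 2, 2, 3],
    "name": ["Asha", "Ravi", "Ravi", "Meena"],
})

# By default the first copy of each repeated row is kept.
# drop_duplicates() returns a new DataFrame; df itself is not changed.
deduped = df.drop_duplicates().reset_index(drop=True)

print(deduped)
```

Here reset_index(drop=True) is optional. It only renumbers the rows after the duplicate is removed.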
44. What does formatting data mean?
Ans: Making data consistent in structure and format.
45. What is fillna() used for?
Ans: Filling missing data with specified values.
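A small sketch of fillna() on a made-up table. Filling marks with the column average and city with the word "Unknown" is just one possible choice, assumed here for illustration:

```python
import pandas as pd

# Hypothetical sample data with one missing value in each column.
df = pd.DataFrame({
    "marks": [80.0, None, 70.0],
    "city": ["Delhi", None, "Mumbai"],
})

# A dictionary gives a different fill value for each column.
# mean() ignores missing values, so the average here is taken over 80 and 70.
filled = df.fillna({"marks": df["marks"].mean(), "city": "Unknown"})

print(filled)
```

The right fill value depends on the data. An average may suit numeric columns, while a fixed label may suit text columns.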
46. Why validate cleaned data?
Ans: To ensure it is correct and reliable.
47. What is Kaggle?
Ans: An online platform for datasets and AI competitions.
48. Example of inconsistent data?
Ans: ‘Delhi’ and ‘delhi’ written differently.
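One possible way to fix such entries in Pandas is to trim extra spaces and use the same letter case everywhere. The values below are made up for the example:

```python
import pandas as pd

# Hypothetical sample data: the same city written in three different ways.
cities = pd.Series(["Delhi", "delhi", " DELHI ", "Mumbai"])

# strip() removes spaces at both ends; title() capitalizes the first letter of each word.
cleaned = cities.str.strip().str.title()

print(cleaned.tolist())
print("Distinct cities:", cleaned.nunique())
```

This simple rule works for plain spelling differences in case and spacing. Misspelled names would need a separate correction step.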
49. Command to import Pandas?
Ans: import pandas as pd.
50. Why is clean data important?
Ans: Because poor data leads to wrong predictions.
51. What is the purpose of data modelling?
Ans: To plan how data is stored and connected.
52. Name main types of data models.
Ans: Hierarchical, Relational, Entity-Relationship (ER), and Dimensional.
53. What is a relational model?
Ans: Data stored in linked tables using common fields.
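The idea of linked tables can be sketched with two Pandas DataFrames joined on a common field. The tables and the field name student_id are assumptions made for this example. A real relational database would store such tables and link them with keys.

```python
import pandas as pd

# Hypothetical table 1: one row per student.
students = pd.DataFrame({
    "student_id": [1, 2, 3],
    "name": ["Asha", "Ravi", "Meena"],
})

# Hypothetical table 2: one row per score, linked to students by student_id.
marks = pd.DataFrame({
    "student_id": [1, 1, 2],
    "subject": ["Maths", "Science", "Maths"],
    "score": [82, 91, 75],
})

# merge() links the two tables through the common field student_id.
# how="inner" keeps only students who have at least one score.
report = students.merge(marks, on="student_id", how="inner")

print(report)
```

Each student's name is stored only once, in the students table. The common field brings the name and the scores together when they are needed.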

