Page 169 - Touchpad AI
The following steps are taken to clean/prepare the data:
• Missing Data: Missing data refers to the absence of certain values in the dataset, which can result from various
  causes. To handle missing data, strategies include removing rows or columns with missing values, imputing missing
  values with estimates, or utilising algorithms that can manage missing data.
• Outliers (extreme values): Outliers are data points that deviate significantly from most of the dataset, typically
  due to errors or uncommon occurrences. Managing outliers includes detecting and excluding them, transforming
  the data, or applying robust statistical techniques to minimise their influence.
• Inconsistent Data: Inconsistent data, such as typographical errors or variations in data types, is rectified to ensure
  uniformity and coherence across the dataset.
• Duplicate Data: Duplicate data is identified and eliminated to maintain data integrity and accuracy.
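As an illustration of the missing-data and outlier steps, here is a minimal Python sketch. It is not part of the Excel workflow described in this chapter. The sample quantities are hypothetical, the median is just one possible imputation estimate, and the 1.5 × IQR rule is one common (assumed) way to detect outliers.

```python
from statistics import median, quantiles

# Hypothetical sample: None marks a missing value, 500 is an extreme value.
quantities = [10, 15, None, 25, 30, 35, 40, 500]

# Missing data: impute each missing value with the median of the known values.
known = [q for q in quantities if q is not None]
fill = median(known)
imputed = [fill if q is None else q for q in quantities]

# Outliers: flag values lying more than 1.5 * IQR beyond the quartiles.
q1, _, q3 = quantiles(imputed, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [q for q in imputed if q < low or q > high]
cleaned = [q for q in imputed if low <= q <= high]

print("fill value:", fill)
print("outliers:", outliers)
print("cleaned:", cleaned)
```

Whether a flagged value should be excluded, transformed, or kept depends on whether it is an error or a genuine but uncommon occurrence.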
Cleaning Data in Excel
Cleaning data in Excel means carefully reviewing the dataset to correct errors and improve accuracy. This process
includes removing duplicate entries, ensuring consistent formatting, correcting mistakes, handling or filling in missing
values, and transforming the data into a more useful structure if required.
We shall be performing cleaning measures on the following data:
ID | Name     | Region | Rating   | Product       | Quantity | Price Per Unit
1  | AAYUSH   | North  | Good     | Pasta sauce   | 10       | ₹20.00
2  | Jai      | East   | Excelent | Oil           | 15       | ₹50.00
3  | Kritika  | West   | Poor     | Capsicum      | 0        | na
4  | Ananya   | South  | Average  | Salt          | 25       | ₹30.00
5  | Chetna   | East   | Good     | Bitter Gourd  | 30       | ₹16.67
6  | AnaNYa   |        | Excelent | Coffee        | 0        | na
7  | Vivek    | West   | Poor     | Origano       | 35       | ₹10.00
8  | Tarini   | South  | Average  | Tea           | 40       | ₹15.00
9  | Anushka  | East   | Good     | Notebooks     | 45       | ₹12.22
10 | Ananya   | North  | Excelent | Butter Paper  | 50       | ₹14.00
11 | Saanvi   | West   | Poor     | Toothpaste    | 5        | ₹160.00
12 | Gehna    | South  | Average  | Pepper        | 20       | ₹45.00
13 | Hardik   | East   | Good     | Pickles       | 0        | na
14 | Harsimar |        | Excelent | Honey         | 30       | ₹36.67
4  | Ananya   | South  | Average  | Salt          | 25       | ₹30.00
5  | Chetna   | East   | Good     | Bitter Gourd  | 30       | ₹16.67
6  | AnaNYa   |        | Excelent | Coffee        | 0        | na
15 | Mehak    | West   | Poor     | Roohafza      | 35       | ₹34.59
16 | Pratty   |        | Average  | Tomato Sauce  | 0        | na
17 | Rajat    | East   | Good     | Chocolates    | 40       | ₹35.00
18 | ANITA    | North  | Excelent | Plastic Bowls | 45       | ₹33.33
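The problems visible in this data (mixed capitalisation in names, the misspelt rating "Excelent", blank regions, "na" prices, and repeated rows) can also be handled outside Excel. The following is a rough Python sketch over a few rows copied from the table, with the currency mark dropped from the prices. The `RATING_FIXES` map and the `clean_row` helper are hypothetical names introduced only for this illustration, and treating "na" and blank cells as missing is an assumption.

```python
# A few rows from the table:
# (ID, Name, Region, Rating, Product, Quantity, Price Per Unit)
rows = [
    (1, "AAYUSH", "North", "Good", "Pasta sauce", 10, "20.00"),
    (2, "Jai", "East", "Excelent", "Oil", 15, "50.00"),
    (3, "Kritika", "West", "Poor", "Capsicum", 0, "na"),
    (4, "Ananya", "South", "Average", "Salt", 25, "30.00"),
    (6, "AnaNYa", "", "Excelent", "Coffee", 0, "na"),
    (4, "Ananya", "South", "Average", "Salt", 25, "30.00"),  # repeated row
    (6, "AnaNYa", "", "Excelent", "Coffee", 0, "na"),        # repeated row
]

# Assumed correction map for misspelt ratings.
RATING_FIXES = {"Excelent": "Excellent"}


def clean_row(row):
    """Return one row with consistent text and explicit missing values."""
    id_, name, region, rating, product, qty, price = row
    return (
        id_,
        name.strip().title(),                     # "AnaNYa" -> "Ananya"
        region.strip() or None,                   # blank region -> missing
        RATING_FIXES.get(rating, rating),         # fix known misspellings
        product,
        qty,
        None if price == "na" else float(price),  # "na" -> missing
    )


# Duplicate data: keep only the first occurrence of each cleaned row.
seen = set()
cleaned = []
for row in map(clean_row, rows):
    if row not in seen:
        seen.add(row)
        cleaned.append(row)

for row in cleaned:
    print(row)
```

Rows are cleaned before duplicates are removed, so two entries that differ only in capitalisation or spelling are recognised as the same record.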

