'Forrest Gump', 'Fight Club'], 'release_year': [2010, 1997, 2009, 2008, 1972, 1994,
1994, 2003, 1994, 1999],'rating': [8.8, 7.8, 7.8, 9.0, 9.2, 9.3, 8.9, 8.9, 8.8,
15.0],'duration': [148, 195, 162, 152, 175, 142, 154, 201, 142, 500]}
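df = pd.DataFrame(data)
# Coerce both columns to numeric; values that cannot be parsed become NaN
df['duration'] = pd.to_numeric(df['duration'], errors='coerce')
df['rating'] = pd.to_numeric(df['rating'], errors='coerce')
numerical_data = df[['rating', 'duration']]
# Interquartile range (IQR) for each column
Q1 = numerical_data.quantile(0.25)
Q3 = numerical_data.quantile(0.75)
IQR = Q3 - Q1
# Values more than 1.5 * IQR beyond the quartiles are treated as outliers
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR
outliers_iqr = (numerical_data < lower_bound) | (numerical_data > upper_bound)
print("Outliers based on IQR method:")
# Show every row that has an outlier in at least one column
print(df[outliers_iqr.any(axis=1)])
# Box plot of both columns; outliers appear as points beyond the whiskers
plt.figure(figsize=(10, 6))
sns.boxplot(data=numerical_data)
plt.title('Boxplot to Visualize Outliers')
plt.show()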
Output:
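As a hedged cross-check of the IQR rule used above, the sketch below rebuilds only the 'rating' and 'duration' columns, which appear in full in the listing (the movie titles are cut off, so they are left out), and wraps the fence calculation in a helper. The function name iqr_fences is an assumption made for illustration; it is not part of the original program.

```python
import pandas as pd

# Only the two numeric columns that appear in full in the listing above.
data = {
    'rating': [8.8, 7.8, 7.8, 9.0, 9.2, 9.3, 8.9, 8.9, 8.8, 15.0],
    'duration': [148, 195, 162, 152, 175, 142, 154, 201, 142, 500],
}
df = pd.DataFrame(data)


def iqr_fences(frame, k=1.5):
    """Return (lower, upper) fences per column: Q1 - k*IQR and Q3 + k*IQR."""
    q1 = frame.quantile(0.25)
    q3 = frame.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


lower, upper = iqr_fences(df)
# A cell is an outlier if it falls outside its own column's fences.
mask = (df < lower) | (df > upper)
outlier_rows = df[mask.any(axis=1)]

print(pd.DataFrame({'lower': lower, 'upper': upper}))
print(outlier_rows)
```

With these values the fences come out at roughly 8.28 to 9.68 for rating and 87.5 to 251.5 for duration, so rows 1, 2 and 9 are flagged: the two 7.8 ratings fall below the lower rating fence, and the last row exceeds both upper fences.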
5. Write a Python program to create a box plot using Seaborn to visualize the distribution of a numerical variable.
Download the dataset using the given link or scan the QR code:
https://www.kaggle.com/code/nancyalaswad90/diamonds-prices-ideas/input
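import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load the downloaded dataset (adjust the path to where the file is saved)
df = pd.read_csv('D:\\Data\\Diamonds.csv')
# Horizontal box plot of the 'price' column
sns.boxplot(x=df['price'])
plt.title('Box Plot of Diamond Price Distribution')
plt.xlabel('Price')
# No y-axis label: a single horizontal box has no values on the y-axis
plt.show()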
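To connect the plot to numbers, the sketch below computes the values a box plot typically draws: the quartiles, the median, and the whisker ends under the 1.5 x IQR convention (the default whis=1.5 in Seaborn and Matplotlib). The helper name box_plot_summary and the sample prices are hypothetical stand-ins, since the Diamonds file is not reproduced here; with the real data one would pass df['price'] instead.

```python
import pandas as pd


def box_plot_summary(values, whis=1.5):
    """Compute the statistics a box plot draws for one numeric variable."""
    s = pd.Series(values, dtype='float64').dropna()
    q1, median, q3 = s.quantile([0.25, 0.5, 0.75])
    iqr = q3 - q1
    lower_fence = q1 - whis * iqr
    upper_fence = q3 + whis * iqr
    # Whiskers stop at the most extreme data points still inside the fences.
    inside = s[(s >= lower_fence) & (s <= upper_fence)]
    return {
        'q1': q1,
        'median': median,
        'q3': q3,
        'whisker_low': inside.min(),
        'whisker_high': inside.max(),
        # Points beyond the fences are drawn individually as outliers.
        'outliers': sorted(s[(s < lower_fence) | (s > upper_fence)]),
    }


# Hypothetical prices for illustration only (not taken from the Diamonds dataset).
sample_prices = [300, 400, 500, 600, 700, 800, 900, 5000]
summary = box_plot_summary(sample_prices)
for name, value in summary.items():
    print(f'{name}: {value}')
```

For the made-up sample, the box would span 475 to 825 with the median at 650, the whiskers would end at 300 and 900, and 5000 would be plotted as a separate outlier point.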
298 Touchpad Artificial Intelligence - XI

