
Data Transformation
                 The following techniques are used to make data more suitable for modelling or analysis:
                  Technique                                 Description
                  Normalisation                             Rescales values to a fixed range, typically 0 to 1.
                  Standardisation                           Rescales values to have a mean of 0 and a standard deviation of 1.
                  Binning                                   Groups continuous values into categories or intervals.
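
                 The three techniques above can be tried out on a small series of numbers. The following is a minimal
                 sketch using Pandas; the sample values and the bin edges (0, 1, 3 and 24 hours) are assumptions chosen
                 only for illustration.

```python
import pandas as pd

# Illustrative sample values (assumed for this sketch)
hours = pd.Series([0.9, 3.4, 3.6, 2.9, 0.9], name="Daily_Usage_Hours")

# Normalisation (min-max scaling): smallest value becomes 0, largest becomes 1
normalised = (hours - hours.min()) / (hours.max() - hours.min())

# Standardisation (z-score): subtract the mean, divide by the standard deviation.
# ddof=0 uses the population standard deviation; Pandas defaults to ddof=1.
standardised = (hours - hours.mean()) / hours.std(ddof=0)

# Binning: group the continuous values into labelled intervals
# (0, 1] -> Low, (1, 3] -> Medium, (3, 24] -> High
binned = pd.cut(hours, bins=[0, 1, 3, 24], labels=["Low", "Medium", "High"])

print(pd.DataFrame({
    "hours": hours,
    "normalised": normalised,
    "standardised": standardised,
    "binned": binned,
}))
```

                 Note that the choice of bin edges is a modelling decision: different edges give different categories
                 for the same data.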

                 Creating and Manipulating DataFrames from Kaggle Datasets

                 Once you have downloaded a dataset (usually in CSV format), you can use Python and the Pandas library to work with it.
                 For example, suppose we want to study the usage of AI tools by college students in India. We search for a suitable
                 dataset on Kaggle and download it. Let us now study and analyse this dataset using Pandas.
                 Scan the QR code or visit the link to get the dataset:
                 https://www.kaggle.com/datasets/rakeshkapilavai/ai-tool-usage-by-indian-college-students-2025
                 1.  Import Libraries
                   import pandas as pd
                   pd.set_option('display.max_columns', None)   # Show all columns
                 2.  Load the Dataset into a DataFrame
                   df = pd.read_csv('Students.csv')  # Replace with your file path
                 3.  View the First Few Rows
                   print(df.head())
                    Output:
                       Student_Name                                     College_Name       Stream  \
                    0         Aarav       Indian Institute of Information Technology  Engineering
                    1        Vivaan   Government Ram Bhajan Rai NES College, Jashpur     Commerce
                    2        Aditya     Dolphin PG Institute of BioMedical & Natural      Science
                    3        Vihaan  Shaheed Rajguru College of Applied Sciences for         Arts
                    4         Arjun                   Roorkee College of Engineering      Science
                       Year_of_Study  AI_Tools_Used  Daily_Usage_Hours  \
                    0              4         Gemini                0.9
                    1              2        ChatGPT                3.4
                    2              2        Copilot                3.6
                    3              2        Copilot                2.9
                    4              1         Gemini                0.9
                                           Use_Cases  Trust_in_AI_Tools  Impact_on_Grades  \
                    0       Assignments, Coding Help                  2                 2
                    1            Learning new topics                  3                -3
                    2         MCQ Practice, Projects                  5                 0
                    3                Content Writing                  5                 2
                    4  Doubt Solving, Resume Writing                  1                 3
                       Do_Professors_Allow_Use  Preferred_AI_Tool  Awareness_Level  \
                    0                       No            Copilot                9
                    1                      Yes              Other                6
                    2                       No             Gemini                1
                    3                      Yes             Gemini                5
                    4                      Yes              Other                8
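
                 Once the data is in a DataFrame, it can be selected, filtered, extended and summarised. The following is
                 a minimal sketch of these operations. To keep it runnable without Students.csv, it rebuilds a few columns
                 of the five rows printed above by hand; the column name Heavy_User and the 3-hour threshold are
                 assumptions made only for illustration.

```python
import pandas as pd

# A few columns of the five rows shown in the output above, typed in by hand
sample = pd.DataFrame({
    "Student_Name": ["Aarav", "Vivaan", "Aditya", "Vihaan", "Arjun"],
    "Stream": ["Engineering", "Commerce", "Science", "Arts", "Science"],
    "AI_Tools_Used": ["Gemini", "ChatGPT", "Copilot", "Copilot", "Gemini"],
    "Daily_Usage_Hours": [0.9, 3.4, 3.6, 2.9, 0.9],
    "Trust_in_AI_Tools": [2, 3, 5, 5, 1],
})

# Number of rows and columns as a (rows, columns) tuple
print(sample.shape)

# Select only some columns
subset = sample[["Student_Name", "Daily_Usage_Hours"]]
print(subset)

# Filter rows: students who use AI tools for more than 3 hours a day
heavy = sample[sample["Daily_Usage_Hours"] > 3]
print(heavy)

# Add a new column (hypothetical name) derived from an existing one
sample["Heavy_User"] = sample["Daily_Usage_Hours"] > 3

# Group by tool and compute the average daily usage for each group
mean_hours = sample.groupby("AI_Tools_Used")["Daily_Usage_Hours"].mean()
print(mean_hours)
```

                 The same statements work on the full dataset if sample is replaced with the df loaded in step 2.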


                                                                      Theoretical and Practical Aspects of Data Processing  221