
                           'Mathematics': [90, 88, 65, 40],
                           'Economics': [92, 78, 55, 50],
                           'History': [86, 77, 60, 55],
                           'Psychology': [79, 70, 62, 51]}
            Having created the dictionary of the students' marks in different subjects, we now create a DataFrame
            from it:

             >>> studentDF = pd.DataFrame(students)
             >>> print(studentDF)
                     RollNo  English  Mathematics  Economics  History  Psychology
                  0     301       78           90         92       86          79
                  1     302       68           88         78       77          70
                  2     303       57           65         55       60          62
                  3     304       45           40         50       55          51
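For reference, the steps so far can be put together as one runnable sketch. The opening lines of the dictionary are not visible on this page, so the RollNo and English values below are an assumption read off the printed DataFrame; the other columns are copied from the dictionary above.

```python
import pandas as pd

# 'RollNo' and 'English' are reconstructed from the printed DataFrame
# (assumption); the remaining columns come from the dictionary in the text.
students = {'RollNo': [301, 302, 303, 304],
            'English': [78, 68, 57, 45],
            'Mathematics': [90, 88, 65, 40],
            'Economics': [92, 78, 55, 50],
            'History': [86, 77, 60, 55],
            'Psychology': [79, 70, 62, 51]}

# Each dictionary key becomes a column; each list supplies that column's values.
studentDF = pd.DataFrame(students)
print(studentDF.shape)            # (number of rows, number of columns)
print(list(studentDF.columns))
```

At this point RollNo is an ordinary column, and the rows carry the default index 0 to 3.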
            Suppose that, for each student, we wish to find the maximum marks across the subjects. It seems natural to find the
            maximum value row-wise, i.e., along axis 1. Let us try it out:
             >>> studentDF.max(axis=1)
            output:
                  0    301
                  1    302
                  2    303
                  3    304
                  dtype: int64

            You will notice that Pandas outputs the roll numbers of the students, not the maximum marks they obtained across
            the subjects. So, where did we go wrong? We entered RollNo as an ordinary attribute, just like the marks in the
            different subjects, so it takes part in the row-wise maximum. Since every roll number is larger than every mark,
            it is always the largest value in its row. In our application, however, we want to treat the attribute RollNo as
            an index attribute. Pandas allows us to set an attribute as the index using the method set_index(), as shown below:

             >>> studentDF.set_index('RollNo', inplace=True)
             >>> print(studentDF)
                          English  Mathematics  Economics  History  Psychology
                  RollNo
                  301          78           90         92       86          79
                  302          68           88         78       77          70
                  303          57           65         55       60          62
                  304          45           40         50       55          51
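The call above uses inplace=True, which modifies studentDF itself. As a small sketch of the alternative, set_index() without inplace returns a new DataFrame and leaves the original untouched. The name indexedDF is hypothetical, and the RollNo and English values are assumed from the printed output.

```python
import pandas as pd

# 'RollNo' and 'English' values are assumed from the printed DataFrame.
students = {'RollNo': [301, 302, 303, 304],
            'English': [78, 68, 57, 45],
            'Mathematics': [90, 88, 65, 40],
            'Economics': [92, 78, 55, 50],
            'History': [86, 77, 60, 55],
            'Psychology': [79, 70, 62, 51]}
studentDF = pd.DataFrame(students)

# Without inplace=True, set_index() returns a new DataFrame.
indexedDF = studentDF.set_index('RollNo')   # indexedDF is a hypothetical name

print(list(studentDF.columns))   # the original still has RollNo as a column
print(indexedDF.index.name)      # RollNo is now the index of the new DataFrame
print(list(indexedDF.columns))   # only the subject columns remain
```

Either form works; keeping the original DataFrame can be convenient when the roll numbers are still needed as a regular column elsewhere.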

            Now, let us apply the max() function on the DataFrame studentDF, row-wise (along axis 1), one more time:
             >>> print("Maximum Score among Five Subjects for each Student:")
             >>> print(studentDF.max(axis=1))
                  Maximum Score among Five Subjects for each Student:
                  RollNo
                  301    92
                  302    88
                  303    65
                  304    55
                  dtype: int64
            As seen above, the function max() now considers only the subject scores, and RollNo serves as the index of the
            resulting Series. Suppose we also wish to record the name of each student in the DataFrame studentDF. We can
            add the column Name to the DataFrame, as demonstrated below:

             >>> studentDF['Name'] = ['Arya', 'Rashmi', 'Naira', 'Samridhi']
             >>> studentDF
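Once the text column Name is part of studentDF, the row-wise maximum has to deal with mixed numeric and string values. The sketch below is an illustration rather than the text's own continuation: it passes numeric_only=True so that only the marks are compared. Depending on the pandas version, calling max(axis=1) without this argument may raise an error or silently skip the text column. The name bestScores is hypothetical, and the RollNo and English values are assumed from the printed output.

```python
import pandas as pd

# 'RollNo' and 'English' values are assumed from the printed DataFrame.
students = {'RollNo': [301, 302, 303, 304],
            'English': [78, 68, 57, 45],
            'Mathematics': [90, 88, 65, 40],
            'Economics': [92, 78, 55, 50],
            'History': [86, 77, 60, 55],
            'Psychology': [79, 70, 62, 51]}
studentDF = pd.DataFrame(students)
studentDF.set_index('RollNo', inplace=True)
studentDF['Name'] = ['Arya', 'Rashmi', 'Naira', 'Samridhi']

# numeric_only=True restricts the reduction to numeric columns,
# so the text column Name is left out of the row-wise maximum.
bestScores = studentDF.max(axis=1, numeric_only=True)   # hypothetical name
print(bestScores)
```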



