Page 66 - Informatics_Practices_Fliipbook_Class12
output:
        English  Mathematics  Economics  History  Psychology      Name
RollNo
301          78           90         92       86          79      Arya
302          68           88         78       77          70    Rashmi
303          57           65         55       60          62     Naira
304          45           40         50       55          51  Samridhi
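Note that the function call studentDF.max(axis=1) still computes the maximum marks across the subjects for each
student, as shown below. This is because the string column Name is ignored while computing the maximum value.
As the FutureWarning in the output indicates, this behaviour is deprecated: later versions of pandas raise a
TypeError instead, so it is safer to select only the numeric columns before calling max().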
>>> studentDF.max(axis=1)
<ipython-input-118-49c2fff99bae>:1: FutureWarning: Dropping of nuisance columns in
DataFrame reductions (with 'numeric_only=None') is deprecated; in a future version
this will raise TypeError. Select only valid columns before calling the reduction.
studentDF.max(axis=1)
output:
RollNo
301 92
302 88
303 65
304 55
dtype: int64
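Since the warning above says that silently dropping the Name column is deprecated, one way to avoid relying on it is to restrict the reduction to the numeric columns explicitly. The following is a minimal sketch, not the book's own code; it assumes studentDF holds the marks shown in the table above, and the DataFrame is rebuilt here only to keep the example self-contained. The names maxMarks and maxMarksAlt are illustrative.

```python
import pandas as pd

# Rebuild studentDF from the table shown above (assumed construction).
studentDF = pd.DataFrame(
    {
        'English': [78, 68, 57, 45],
        'Mathematics': [90, 88, 65, 40],
        'Economics': [92, 78, 55, 50],
        'History': [86, 77, 60, 55],
        'Psychology': [79, 70, 62, 51],
        'Name': ['Arya', 'Rashmi', 'Naira', 'Samridhi'],
    },
    index=pd.Index([301, 302, 303, 304], name='RollNo'),
)

# Option 1: ask the reduction to consider numeric columns only.
maxMarks = studentDF.max(axis=1, numeric_only=True)

# Option 2: select the numeric columns first, then reduce row-wise.
maxMarksAlt = studentDF.select_dtypes(include='number').max(axis=1)

print(maxMarks)
```

Both options give the same per-student maximum without depending on the deprecated behaviour.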
Suppose a university admits students on the basis of their marks in four subjects: English, plus the three other
subjects in which they performed best in the board examination. So, to award admission to an undergraduate course,
we need to find each student's marks in English and in their three best-performed remaining subjects. This may be
achieved as follows:
1. Create a single-column DataFrame engMarks comprising the marks obtained by the students in English.
2. Write a function getBest3 which arranges one student's marks in the remaining subjects in descending order by
setting ascending=False and picks the top 3 values. Apply it to each row (axis=1) to obtain best3Marks.
3. Concatenate the DataFrames engMarks and best3Marks column-wise (axis=1) to yield best4Marks.
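>>> engMarks = studentDF[['English']]   # double brackets keep a single-column DataFrame
>>> def getBest3(row):
...     # Sort one student's marks in descending order and keep the top three
...     return row.sort_values(ascending=False)[:3]
...
>>> otherSubjects = ['Mathematics', 'Economics', 'History', 'Psychology']
>>> best3Marks = studentDF[otherSubjects].apply(getBest3, axis=1)
>>> # axis=1 places the two DataFrames side by side, aligned on RollNo
>>> best4Marks = pd.concat([engMarks, best3Marks], axis=1)
>>> print(best4Marks)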
        English  Economics  History  Mathematics  Psychology
RollNo
301          78       92.0     86.0         90.0         NaN
302          68       78.0     77.0         88.0         NaN
303          57        NaN     60.0         65.0        62.0
304          45       50.0     55.0          NaN        51.0
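In the output above, NaN marks the one subject that is not among a student's best three. If only the aggregate of the best four subjects is needed for admission, it can be computed directly. The sketch below is one possible approach, not the book's method: it swaps sort_values for Series.nlargest, the helper best3Total and the result best4Total are hypothetical names, and studentDF is rebuilt from the marks shown earlier.

```python
import pandas as pd

# Marks from the table shown earlier (assumed construction of studentDF).
studentDF = pd.DataFrame(
    {
        'English': [78, 68, 57, 45],
        'Mathematics': [90, 88, 65, 40],
        'Economics': [92, 78, 55, 50],
        'History': [86, 77, 60, 55],
        'Psychology': [79, 70, 62, 51],
    },
    index=pd.Index([301, 302, 303, 304], name='RollNo'),
)

otherSubjects = ['Mathematics', 'Economics', 'History', 'Psychology']

def best3Total(row):
    # nlargest(3) keeps the three highest marks in this student's row;
    # sum() then adds them up into a single number.
    return row.nlargest(3).sum()

# English is compulsory, so add it to the total of the best three others.
best4Total = studentDF['English'] + studentDF[otherSubjects].apply(best3Total, axis=1)

print(best4Total)
```

For student 301, for example, this adds English (78) to the three best remaining marks (92, 90 and 86), giving 346.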
Pandas provides various aggregation functions, such as sum(), mean(), max(), and min(), which allow us to
obtain summary statistics for different groupings of data.
The default axis along which the operation is performed is 'index' (or 0), which means that the operation is
applied column-wise, across the rows, giving one result per column. Alternatively, we may apply the operation
row-wise, across the columns, giving one result per row, by setting axis to 'columns' (or 1).
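The effect of the axis argument is easiest to see side by side. The sketch below is illustrative only: marksDF is a hypothetical two-subject DataFrame that reuses the English and Mathematics marks from the earlier table.

```python
import pandas as pd

# Hypothetical two-subject DataFrame reusing marks from the earlier table.
marksDF = pd.DataFrame(
    {'English': [78, 68, 57, 45], 'Mathematics': [90, 88, 65, 40]},
    index=pd.Index([301, 302, 303, 304], name='RollNo'),
)

# axis=0 (or 'index'), the default: one total per column, computed down the rows.
columnTotals = marksDF.sum(axis=0)

# axis=1 (or 'columns'): one total per row, computed across the columns.
rowTotals = marksDF.sum(axis='columns')

print(columnTotals)
print(rowTotals)
```

Here columnTotals is indexed by subject (English 248, Mathematics 283), whereas rowTotals is indexed by RollNo (168, 156, 122 and 85).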
Consider the DataFrame employeeDF and apply the aggregate functions sum(), mean(), max(),
and min() to the Salary column.
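As a rough illustration of what this looks like in code, the sketch below applies the four functions to a Salary column. The contents of employeeDF are not shown on this page, so the names and salaries used here are hypothetical placeholder data, not the book's dataset.

```python
import pandas as pd

# Hypothetical data: the real contents of employeeDF are not shown on this page.
employeeDF = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D'],
    'Salary': [50000, 62000, 48000, 70000],
})

salary = employeeDF['Salary']

# Each aggregate function reduces the whole column to a single value.
print('Total  :', salary.sum())
print('Average:', salary.mean())
print('Highest:', salary.max())
print('Lowest :', salary.min())

# agg() computes several aggregates in one call and returns them as a Series.
summary = salary.agg(['sum', 'mean', 'max', 'min'])
print(summary)
```

Calling agg() with a list of function names is convenient when several statistics are needed together.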

