Reputation: 67
I have a list of students in a csv file. I want (using Python) to display four columns in that I want to display the male students who have higher marks in Maths, Computer, and Physics.
I tried to use pandas
library.
marks = pd.concat([data['name'],
data.loc[data['students']==1, 'maths'].nlargest(n=10)], 'computer'].nlargest(n=10)], 'physics'].nlargest(n=10)])
I used 1 for male students and 0 for female students. It gives me an error saying: Invalid syntax.
Upvotes: 1
Views: 85
Reputation: 42875
Here's a way to show the top 10 students in each of the disciplines. You could of course just sum the three scores and select the students with the highest total if you want the combined as opposed to the individual performance (see illustration below).
df1 = pd.DataFrame(data={'name': [''.join(random.choice('abcdefgh') for _ in range(8)) for i in range(100)],
'students': np.random.randint(0, 2, size=100)})
df2 = pd.DataFrame(data=np.random.randint(0, 10, size=(100, 3)), columns=['math', 'physics', 'computers'])
data = pd.concat([df1, df2], axis=1)
data.info()
RangeIndex: 100 entries, 0 to 99
Data columns (total 5 columns):
name 100 non-null object
students 100 non-null int64
math 100 non-null int64
physics 100 non-null int64
computers 100 non-null int64
dtypes: int64(4), object(1)
memory usage: 4.0+ KB
res = pd.concat([data.loc[:, ['name']], data.loc[data['students'] == 1, 'math'].nlargest(n=10), data.loc[data['students'] == 1, 'physics'].nlargest(n=10), data.loc[data['students'] == 1, 'computers'].nlargest(n=10)], axis=1)
res.dropna(how='all', subset=['math', 'physics', 'computers'])
name math physics computers
0 geghhbce NaN 9.0 NaN
1 hbbdhcef NaN 7.0 NaN
4 ghgffgga NaN NaN 8.0
6 hfcaccgg 8.0 NaN NaN
14 feechdec NaN NaN 8.0
15 dfaabcgh 9.0 NaN NaN
16 ghbchgdg 9.0 NaN NaN
23 fbeggcha NaN NaN 9.0
27 agechbcf 8.0 NaN NaN
28 bcddedeg NaN NaN 9.0
30 hcdgbgdg NaN 8.0 NaN
38 fgdfeefd NaN NaN 9.0
39 fbcgbeda 9.0 NaN NaN
41 agbdaegg 8.0 NaN 9.0
49 adgbefgg NaN 8.0 NaN
50 dehdhhhh NaN NaN 9.0
55 ccbaaagc NaN 8.0 NaN
68 hhggfffe 8.0 9.0 NaN
71 bhggbheg NaN 9.0 NaN
84 aabcefhf NaN NaN 9.0
85 feeeefbd 9.0 NaN NaN
86 hgeecacc NaN 8.0 NaN
88 ggedgfeg 9.0 8.0 NaN
89 faafgbfe 9.0 NaN 9.0
94 degegegd NaN 8.0 NaN
99 beadccdb NaN NaN 9.0
data['total'] = data.loc[:, ['math', 'physics', 'computers']].sum(axis=1)
data[data.students==1].nlargest(10, 'total').sort_values('total', ascending=False)
name students math physics computers total
29 fahddafg 1 8 8 8 24
79 acchhcdb 1 8 9 7 24
9 ecacceff 1 7 9 7 23
16 dccefaeb 1 9 9 4 22
92 dhaechfb 1 4 9 9 22
47 eefbfeef 1 8 8 5 21
60 bbfaaada 1 4 7 9 20
82 fbbbehbf 1 9 3 8 20
18 dhhfgcbb 1 8 8 3 19
1 ehfdhegg 1 5 7 6 18
Upvotes: 1