Khalil Al Hooti
Khalil Al Hooti

Reputation: 4526

Pandas column values between values from another dataframe column

I have two pandas data-frames as follows:

import pandas as pd 
import numpy as np
import string

size = 5

student_names = [''.join(np.random.choice(list(string.ascii_lowercase), size=4)) for i in range(size)]
marks = list(np.random.randint(50, high=100, size=size))
df1 = pd.DataFrame({'Student Names': student_names, 'Total': marks})
grade_leters = ['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D',
                 'D-', 'F']
grade_minimum_value =  [95, 90, 85, 80, 75, 70, 65, 60, 55, 50, 45, 40, 0]
df2 = pd.DataFrame({'Grade Letters': grade_leters, 'Minimums': grade_minimum_value})

df1

  Student Names  Total
0          cjpv     83
1          iywm     98
2          jhhb     87
3          qwau     70
4          ppai     82

df2

   Grade Letters  Minimums
0             A+        95
1              A        90
2             A-        85
3             B+        80
4              B        75
5             B-        70
6             C+        65
7              C        60
8             C-        55
9             D+        50
10             D        45
11            D-        40
12             F         0

I want to give the grade letter as a new column to df1. For example, student cjpv having a total mark of 83 will receive a grade letter of B+, since 83 is between 80 (inclusive) and 85 (exclusive).

The desired output is as follows.

  Student Names  Total  Grade
0          cjpv     83   B+
1          iywm     98   A+
2          jhhb     87   A-
3          qwau     70   B-
4          ppai     82   B+

Thanks in advance. My apologies if there is a similar question to this, However, I could not find one after a long search.

Upvotes: 2

Views: 118

Answers (1)

jezrael
jezrael

Reputation: 863246

Use cut with dynamic values bins and labels from df2 columns, also is added right=False for left closed bins:

np.random.seed(123)
size = 20

student_names = [''.join(np.random.choice(list(string.ascii_lowercase), size=4)) for i in range(size)]
marks = list(np.random.randint(50, high=100, size=size))
df1 = pd.DataFrame({'Student Names': student_names, 'Total': marks})
grade_leters = ['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D',
                 'D-', 'F']
grade_minimum_value =  [95, 90, 85, 80, 75, 70, 65, 60, 55, 50, 45, 40, 0]
df2 = pd.DataFrame({'Grade Letters': grade_leters, 'Minimums': grade_minimum_value})

df2 = df2.sort_values('Minimums')
df1['new'] = pd.cut(df1['Total'], 
                    bins=df2['Minimums'].tolist() + [np.inf], 
                    labels=df2['Grade Letters'],
                    right=False)
print (df1)

   Student Names  Total new
0           nccg     70  B-
1           rtkz     99  A+
2           wbar     62   C
3           pjao     68  C+
4           apzt     67  C+
5           oeaq     51  D+
6           erxd     94   A
7           cuhc     91   A
8           upyq     98  A+
9           hjdu     77   B
10          gbvw     99  A+
11          cbmi     72  B-
12          dkfa     53  D+
13          lckw     53  D+
14          nsep     61   C
15          lmug     71  B-
16          ntqg     75   B
17          ouhl     89  A-
18          whbl     91   A
19          fxzs     84  B+

Like @Henry Yik commented here is possible use merge_asof:

df1 = pd.merge_asof(df1.sort_values('Total'), df2, left_on='Total', right_on='Minimums')
print (df1)
   Student Names  Total new Grade Letters  Minimums
0           oeaq     51  D+            D+        50
1           lckw     53  D+            D+        50
2           dkfa     53  D+            D+        50
3           nsep     61   C             C        60
4           wbar     62   C             C        60
5           apzt     67  C+            C+        65
6           pjao     68  C+            C+        65
7           nccg     70  B-            B-        70
8           lmug     71  B-            B-        70
9           cbmi     72  B-            B-        70
10          ntqg     75   B             B        75
11          hjdu     77   B             B        75
12          fxzs     84  B+            B+        80
13          ouhl     89  A-            A-        85
14          whbl     91   A             A        90
15          cuhc     91   A             A        90
16          erxd     94   A             A        90
17          upyq     98  A+            A+        95
18          rtkz     99  A+            A+        95
19          gbvw     99  A+            A+        95

Upvotes: 3

Related Questions