Reputation: 1037
i have data frame like this.
import pandas as pd
raw_data = {'Sub1':['A','B','C','D','E'],
'Sub2':['F','G','H','I','J'],
'Sub3':['K','L','M','N','O'],
'S_score1': [1, 0, 0, 6,0],
'S_score2': [0, 1, 0, 6,0],
'S_score3': [0, 1, 0, 6,0],
}
df2 = pd.DataFrame(raw_data, columns = ['Sub1','Sub2','Sub3','S_score1', 'S_score2', 'S_score3'])
Have data frame
i wants to check score columns and check if the score is greater than 1 then take the respective subject in text.
Wanted Output:
Upvotes: 1
Views: 248
Reputation: 31011
One of possible solutions consists of the following steps:
subs
table containing Sub1
, Sub2
and Sub3
.msk
(mask) table containing S_score...
columns and
change column names to Sub1
, Sub2
and Sub3
.subs.where(msk)
and apply the above function to each row.
Note that for False elements in mask, the respective output element
is None, so the function applied will not include it in join.So the whole script can look like below:
def txt(x):
tbl = list(filter(lambda elem: not pd.isnull(elem), x))
if len(tbl) > 0:
return 'You have scored on ' + ', '.join(tbl)
else:
return 'You have not scored any subject'
subs = df.loc[:, :'Sub3']
msk = df.loc[:, 'S_score1':] > 0
msk.columns = ['Sub1', 'Sub2', 'Sub3']
df['s_text'] = subs.where(msk).apply(txt, axis=1)
Upvotes: 0
Reputation: 403050
First, separate out the grade columns from the one hot columns.
u = df2.filter(like='Sub')
v = df2.filter(like='S_score').astype(bool)
Next, aggregate letter grades by multiplication, and set column values.
r = (u.mul(v.values)
.agg(','.join, axis=1)
.str.strip(',')
.str.replace(',{2,}', ','))
df2['s_text'] = np.where(r.str.len() > 0, 'You scored ' + r, 'N/A')
df2
Sub1 Sub2 Sub3 S_score1 S_score2 S_score3 s_text
0 A F K 1 0 0 You scored A
1 B G L 0 1 1 You scored G,L
2 C H M 0 0 0 N/A
3 D I N 6 6 6 You scored D,I,N
4 E J O 0 0 0 N/A
To make the last separator different, you will need a custom function.
def join(lst):
lst = lst[lst != '']
if len(lst) > 1:
return 'You scored ' + ', '.join(lst[:-1]) + ' and ' + lst[-1]
elif len(lst) > 0:
return 'You scored ' + ', '.join(lst)
return 'N/A'
df2['s_text'] = u.mul(v.values).agg(join, axis=1)
df2
Sub1 Sub2 Sub3 S_score1 S_score2 S_score3 s_text
0 A F K 1 0 0 You scored A
1 B G L 0 1 1 You scored G and L
2 C H M 0 0 0 N/A
3 D I N 6 6 6 You scored D, I and N
4 E J O 0 0 0 N/A
Upvotes: 2
Reputation: 323366
Doing with join
after multiple
s=(df2.filter(like='Sub')*df2.filter(like='S_').ge(1).values).apply( lambda x : ','.join([y for y in x if y is not '']),axis=1)
s
Out[324]:
0 A
1 G,L
2
3 D,I,N
4
dtype: object
Then chain with np.where
np.where(s=='','You do not have score','You have'+s)
Out[326]:
array(['You haveA', 'You haveG,L', 'You do not have score',
'You haveD,I,N', 'You do not have score'], dtype=object)
#Assign it back
df2['s_txt']=np.where(s=='','You do not have score','You have'+s)
df2
Out[328]:
Sub1 Sub2 ... S_score3 s_txt
0 A F ... 0 You haveA
1 B G ... 1 You haveG,L
2 C H ... 0 You do not have score
3 D I ... 6 You haveD,I,N
4 E J ... 0 You do not have score
[5 rows x 7 columns]
Upvotes: 2