How to turn a pandas dataframe row into a comma separated value with condition

Question

I have a pandas dataframe as such:

id =[30,30,40,40,30,40,55,30]
month =[1,3,11,4,10,2,12,12]
average=[90,80,50,92,18,15,16,55]
sec =['id1','id1','id3','id4','id2','id2','id1','id1']

df = pd.DataFrame(list(zip(id,sec,month,average)),columns =['id','sec','month','Average'])

We want to add one more column having comma separated months of below conditions

Need to exclude id2 sec
and below 90 average

Desired Output

I have tried below code but not getting desired output

final=pd.DataFrame()
for i in set(sec):
  if i !='id2': #Exclude id2
    d2 =df[df['sec']==i]
    d2=df[df['average']<90]  # apply below 90 condition
    d2=d2[['id','month']].groupby(['id'], as_index=False).agg(lambda x: ', '.join(sorted(set(x.astype(str))))) #comma seperated data
    d2.rename(columns={'month':'problematic_month'},inplace=True)
    d2['sec']=i
    tab =df.merge(d2,on =['id','sec'], how ='inner')
    final =final.append(tab)
  else:
    d2 =df[df['sec']==i]
    d2['problematic_month']=np.NaN
    final =final.append(d2)

Kindly suggest any other way(without merge) to get the desired output

anky · Accepted Answer

Another way using groupby+transform

import calendar
d = dict(enumerate(calendar.month_abbr))

s = df['month'].map(d).where(df['sec'].ne("id2")& (df['Average'].lt(90)))
col = s.groupby([df["id"],df['sec']]).transform(lambda x: ','.join(x.dropna()))

out = df.assign(problematic_column=col.replace("",np.nan)).sort_values(['id','sec'])

print(out)

   id  sec  month  Average problematic_column
0  30  id1      1       90            Mar,Dec
1  30  id1      3       80            Mar,Dec
7  30  id1     12       55            Mar,Dec
4  30  id2     10       18                NaN
5  40  id2      2       15                NaN
2  40  id3     11       50                Nov
3  40  id4      4       92                NaN
6  55  id1     12       16                Dec

Steps:

Map the month column to the calender to get month abbreviation.
Retain values only when the condition matches.
Use groupby and transform to dropna and join by comma.

How to turn a pandas dataframe row into a comma separated value with condition

Answers (2)

Related Questions