Reputation: 111
Suppose I have this dataframe
name age gender travelled_abroad for_work for_entertainment set_at_home
John 32 male True True False False
Ieren 52 female False False False True
Kris 13 male True False True False
I want to create a new column that has the columns that their values are True
name age gender travelled_abroad for_work for_entertainment set_at_home final
John 32 male True True False False travelled_Abroad, for_work
Ieren 52 female False False False True set_at_home
Kris 13 male True False True False travelled_abroad, for_entertainment
I've tried
def new_column(row):
for col in df:
if row == True:
return ','.join(row)
df['final'] = df.applymap(new_column)
but I got nothing. any help?
Upvotes: 0
Views: 84
Reputation: 103
Just as Mehdi has done with a tiny amendment:
data ={"name": ['John', 'Ieren', 'Kris', 'Houda'], "age": [32, 52, 13, 28], "travelled_abroad": [True, False, True, True], "for_work":[True, False, False, True], \
"for_entertainment": [False, False, True, True], "set_at_home": [False, True, False, False]}
df = pd.DataFrame(data)
cols = ['travelled_abroad','for_work','for_entertainment','set_at_home']
df = df.assign(final = lambda x: x[cols].apply(lambda s: ','.join(np.array(cols)[s.astype(bool)]),axis=1))
df
Another lengthier approach but tidier one from my point of view is to do some melting and merging to make the data more elegant:
Create a travel status column
df_travel_status = pd.melt(df, id_vars= ['name', 'age'], value_vars=['travelled_abroad', 'set_at_home'], var_name="travel_status", value_name="to_be_dropped" )
df_travel_status = df_travel_status[df_travel_status['to_be_dropped']]
df_travel_status
Then a travel reason column:
df_travel_reason = pd.melt(df, id_vars= ['name', 'age'], value_vars=['for_work', 'for_entertainment'], var_name='travel_reason', value_name="to_be_dropped_2" ) df_travel_reason = df_travel_reason[df_travel_reason["to_be_dropped_2"]] df_travel_reason
Merging and dropping the extra columns
df_modified = df_travel_status.merge(df_travel_reason, on=['name', 'age'], how='left') df_modified.drop(columns=['to_be_dropped', "to_be_dropped_2"], inplace=True) df_modified
Finally reaching to the intended dataframe and clearing any duplicates
df_modified = df_modified.replace(np.nan, "") df_modified.groupby(['name', "age", "travel_status"])['travel_reason'].apply(", ".join).reset_index()
Upvotes: 2
Reputation: 22493
You can use list comprehension:
df["new"] = [", ".join(df.columns[3:][i]) for i in df.iloc[:, 3:].to_numpy()]
print (df)
name age gender travelled_abroad for_work for_entertainment set_at_home new
0 John 32 male True True False False travelled_abroad, for_work
1 Ieren 52 female False False False True set_at_home
2 Kris 13 male True False True False travelled_abroad, for_entertainment
Upvotes: 2
Reputation: 2583
Use assign function:
cols = ['travelled_abroad','for_work','for_entertainment','set_at_home']
df = df.assign(final = lambda x: x['travelled_abroad','for_work','for_entertainment','set_at_home'].apply(lambda s: ','.join(np.array(cols)[s.astype(bool)]),axis=1))
Upvotes: 3