Reputation: 69
Say I have a dataframe
I want to re-shape it AND concatenate the strings
I can reshape it using melt but I lose the description. I've tried transform but no luck Any ideas?
Code:
import pandas as pd
x = [['a', 'Electronics', 'TV', '42" plasma'], ['a', 'Electronics', 'TV', '36" LCD'], ['a', 'Electronics', 'hifi', 'cd player'], ['a', 'Electronics', 'hifi', 'record player'], ['b', 'Sports', 'Soccer', 'mens trainers'], ['b', 'Sports', 'Soccer', 'womens trainers'], ['b', 'Sports', 'golf', '9 iron']]
df = pd.DataFrame(x, columns =['id', 'category','sub_category','description'])
y = pd.melt(df, id_vars=['id'],value_vars=['category','sub category'])['description'].transform(lambda x : ' '.join(x))
Upvotes: 2
Views: 463
Reputation: 862591
There is first problem melt
, need add description
column to id_vars
and then aggregate join
with groupby
, so all togehter is:
y = (pd.melt(df,
id_vars=['id','description'],
value_vars=['category','sub_category'],
value_name='Category')
.groupby(['id','Category'])['description']
.agg(' '.join)
.reset_index())
print (y)
id Category description
0 a Electronics 42" plasma 36" LCD cd player record player
1 a TV 42" plasma 36" LCD
2 a hifi cd player record player
3 b Soccer mens trainers womens trainers
4 b Sports mens trainers womens trainers 9 iron
5 b golf 9 iron
Upvotes: 2