Reputation: 1694
There is a dataframe like as following:
date id device t1 t2 text y1 y2
2010-1-1 1 pc yes1 I am1 This is a test1 5 3
2010-1-1 1 smart yes1 I am1 This is a test1 6 4
2010-1-1 1 table yes1 I am1 This is a test1 7 5
2010-1-1 2 pc yes2 I am1 This is a test2 8 2
2010-1-1 2 smart yes2 I am1 This is a test2 8 3
2010-1-1 2 table yes2 I am1 This is a test2 9 4
2010-1-1 3 pc yes3 I am3 This is a test3 10 3
2010-1-1 3 smart yes3 I am3 This is a tes3 11 2
........................
Now I want to merge a new dataframe:
(1). sum the y1, and y2, when the id and date, t1, t2, text are same.
(2). join the device str when when the id and date, t1, t2, text are same.
(3). merge the common rows(which have same id, date, text, t1, t2) as one row,
and the new dataframe like as following:
date id device t1 t2 text y1 y2
2010-1-1 1 pc,smart,table yes1 I am1 This is a test1 18 12
2010-1-1 2 pc,smart,table yes2 I am2 This is a test2 25 9
2010-1-1 3 pc,smart yes3 I am3 This is a test3 21 5
Upvotes: 1
Views: 170
Reputation: 862731
Use groupby
by all columns with same values per groups and aggregate by agg
with dictionary, last add reindex
for same ordering of final columns:
df = (df.groupby(['date','id', 't1', 't2', 'text'], as_index=False)
.agg({'y1':'sum', 'y2':'sum', 'device': ', '.join})
.reindex(columns=df.columns))
print (df)
date id device t1 t2 text y1 y2
0 2010-1-1 1 pc, smart, table yes1 I am1 This is a test1 18 12
1 2010-1-1 2 pc, smart, table yes2 I am1 This is a test2 25 9
2 2010-1-1 3 pc, smart yes3 I am3 This is a test3 21 5
Upvotes: 1
Reputation: 76927
Use
In [294]: (df.groupby(['date', 'id', 't1', 't2', 'text'], as_index=False)
.agg({'device': ','.join, 'y1': sum, 'y2': sum}))
Out[294]:
date id t1 t2 text device y1 y2
0 2010-1-1 1 yes1 I am1 This is a test1 pc,smart,table 18 12
1 2010-1-1 2 yes2 I am1 This is a test2 pc,smart,table 25 9
2 2010-1-1 3 yes3 I am3 This is a test3 pc,smart 21 5
Upvotes: 1