Reputation: 85
+----+----------------+----------+---------------------+
| id | characteristic | location | value               |
+----+----------------+----------+---------------------+
|  1 | start          | loc1     | 01/01/2020 00:00:00 |
|  1 | end            | loc1     | 01/01/2020 00:00:20 |
|  1 | start          | loc2     | 01/01/2020 00:00:20 |
|  1 | end            | loc2     | 01/01/2020 00:00:40 |
|  2 | start          | loc1     | 01/01/2020 00:00:40 |
|  2 | end            | loc1     | 01/01/2020 00:01:00 |
+----+----------------+----------+---------------------+
I have the table above and would like to convert it into something like the following:
+----+---------------------+---------------------+----------+
| id | start               | end                 | location |
+----+---------------------+---------------------+----------+
|  1 | 01/01/2020 00:00:00 | 01/01/2020 00:00:20 | loc1     |
|  1 | 01/01/2020 00:00:20 | 01/01/2020 00:00:40 | loc2     |
|  2 | 01/01/2020 00:00:40 | 01/01/2020 00:01:00 | loc1     |
+----+---------------------+---------------------+----------+
Please advise on how you would solve this. Thank you!
Upvotes: 3
Views: 557
Reputation: 97
You can use the pivot_table function from pandas:
pd.pivot_table(df, values='value', index=['id', 'location'], columns=['characteristic'], aggfunc='first')
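A minimal runnable sketch of this approach, using sample data reconstructed from the question. A reset_index and a column reorder are added here to flatten the pivoted result into the exact desired shape; note this assumes each (id, location) pair occurs only once:

```python
import pandas as pd

# Sample data reconstructed from the question's first table
df = pd.DataFrame({
    'id': [1, 1, 1, 1, 2, 2],
    'characteristic': ['start', 'end', 'start', 'end', 'start', 'end'],
    'location': ['loc1', 'loc1', 'loc2', 'loc2', 'loc1', 'loc1'],
    'value': ['01/01/2020 00:00:00', '01/01/2020 00:00:20',
              '01/01/2020 00:00:20', '01/01/2020 00:00:40',
              '01/01/2020 00:00:40', '01/01/2020 00:01:00'],
})

out = (pd.pivot_table(df, values='value',
                      index=['id', 'location'],
                      columns=['characteristic'], aggfunc='first')
         .reset_index()                 # turn the (id, location) index back into columns
         .rename_axis(columns=None))   # drop the leftover 'characteristic' axis label
out = out[['id', 'start', 'end', 'location']]  # match the desired column order
print(out)
```

If an id can revisit the same location, this collapses those visits into one row; the cumcount-based answer below the others handles that case.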
Upvotes: 3
Reputation: 4284
You can use groupby and unstack to solve this. The sum call is only there because an aggregation step is needed between the groupby and the unstack:
(df.groupby(['id', 'location', 'characteristic'])['value']
   .sum()                      # each group holds a single value, so min or max would also work here
   .unstack('characteristic')  # creates one column per characteristic value
   .reset_index())
#   id location                  end                start
#0   1     loc1  01/01/2020 00:00:20  01/01/2020 00:00:00
#1   1     loc2  01/01/2020 00:00:40  01/01/2020 00:00:20
#2   2     loc1  01/01/2020 00:01:00  01/01/2020 00:00:40
Upvotes: 1
Reputation: 323226
We can use cumcount to create a helper key; then this becomes a pivot problem:
df['helpkey'] = df.groupby(['id', 'characteristic']).cumcount()
s = (df.set_index(['id', 'location', 'helpkey', 'characteristic'])['value']
       .unstack(level=3)
       .reset_index()
       .drop('helpkey', axis=1))
s
characteristic id location end start
0 1 loc1 01/01/2020 00:00:20 01/01/2020 00:00:00
1 1 loc2 01/01/2020 00:00:40 01/01/2020 00:00:20
2 2 loc1 01/01/2020 00:01:00 01/01/2020 00:00:40
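To see why the helper key matters, here is a sketch with hypothetical data in which id 2 visits loc1 twice. Pivoting on (id, location) alone would collide on those two visits, while cumcount numbers them 0 and 1 so each visit gets its own row:

```python
import pandas as pd

# Hypothetical data: id 2 passes through loc1 two separate times
df = pd.DataFrame({
    'id': [2, 2, 2, 2],
    'characteristic': ['start', 'end', 'start', 'end'],
    'location': ['loc1', 'loc1', 'loc1', 'loc1'],
    'value': ['00:00', '00:10', '00:20', '00:30'],
})

# cumcount numbers the rows within each (id, characteristic) group: 0, 1, ...
df['helpkey'] = df.groupby(['id', 'characteristic']).cumcount()
s = (df.set_index(['id', 'location', 'helpkey', 'characteristic'])['value']
       .unstack(level=3)       # one column per characteristic
       .reset_index()
       .drop('helpkey', axis=1))
print(s)
```

The result keeps two rows for (2, loc1), one per visit, instead of collapsing them.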
Upvotes: 2