Reputation: 131
I've following df
ID Status Date
1 A 01-09-2020
1 B 03-09-2020
2 A 10-12-2020
2 B -
And would like to convert to this:
ID Status1 Status2 Date1 Date2
1 A B 01-09-2020 03-09-2020
2 A B 10-12-2020 -
I've think pivot doesn't apply here since I'm not really aggregating something and I've managed somewhat by using a group_by function where I get the min and max date for each ID, and afterwards joining it, but that seems somehow very devious and doesn't give me the status columns for which I cant use the min or max function since this is a numeric value.
I've tried following solution (answer 10 as someone suggested) How to pivot a dataframe?, which would look like this
df.insert(0, 'count', df.groupby('ID').cumcount())
pivot = df.pivot(index='count', columns='ID', values='STATUS')
but this resulted in following df:
1 2
A A
B B
I've also tried How to do a transpose a dataframe group by key on pandas? but this gives me the error
Index contains duplicate entries, cannot reshape
Also if I use pd.pivot_table() instead of df.pivot as someone else suggested in another post.
Upvotes: 1
Views: 50
Reputation: 191
In my opinion you must create two aux dataframes before, the first one grouping by ID and getting the first and last Status (if you are going to have A and B everytime you could use a pivot table) and the second one, you should get the first and last date. Something like that:
import pandas as pd
df = (
pd.DataFrame(
{
"ID": [1, 1, 2, 2],
"Status": ["A", "B", "A", "B"],
"Date": ["01-09-2020", "03-09-2020", "10-12-2020", "11-12-2020"]
}
)
.assign(Date=lambda x: pd.to_datetime(x["Date"])
)
aux1 = df.groupby("ID").agg(Status1=("Status", "first"), Status2=("Status", "last"))
aux2 = df.groupby("ID").agg(Date1=("Date", "min"), Date2=("Date", "max"))
output = pd.merge(aux1, aux2, left_index=True, right_index=True)
print(output)
Upvotes: 1